Model Overview
EpistemeAI/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta is an 8-billion-parameter language model developed by EpistemeAI2. It is a fine-tuned version of the EpistemeAI2/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math base model, trained with KTO (Kahneman-Tversky Optimization), a preference-alignment method. This fine-tuning stage aims to improve the model's performance on reasoning and mathematical problem solving.
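For quick experimentation, a minimal inference sketch using the transformers library is shown below. This is not taken from the official model card: the prompt, dtype, and generation settings are illustrative assumptions, and the standard Llama 3.1 chat template is assumed to apply.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EpistemeAI/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision so the 8B model fits on one GPU
    device_map="auto",
)

# Example prompt (illustrative only), formatted with the chat template.
messages = [
    {"role": "user", "content": "Prove that the sum of two even integers is even."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```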
Key Characteristics
- KTO Fine-tuning: Applies Kahneman-Tversky Optimization, a preference-alignment method that learns from unpaired examples labeled simply as desirable or undesirable (a training sketch follows this list).
- Llama 3.1 Base: Built upon the Llama 3.1 architecture, providing a strong foundation.
- Efficient Training: Trained 2x faster using Unsloth and Hugging Face's TRL library.
- Focus Areas: Designed to excel in philosophical and mathematical reasoning tasks.
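As a hedged illustration of how a KTO stage like this can be run with TRL's KTOTrainer: EpistemeAI's exact dataset, hyperparameters, and Unsloth integration are not published here, so the dataset and settings below are assumptions. KTO expects unpaired feedback, where each row has a "prompt", a "completion", and a boolean "label".

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

# Base model named on this card; all training settings below are illustrative.
base_id = "EpistemeAI2/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math"
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Stand-in dataset already in KTO format (prompt / completion / label columns);
# not the data EpistemeAI actually used.
dataset = load_dataset("trl-lib/kto-mix-14k", split="train")

training_args = KTOConfig(
    output_dir="fireball-kto-beta",
    per_device_train_batch_size=4,
    learning_rate=5e-7,
    beta=0.1,  # weight of the KL penalty against the reference model
)

trainer = KTOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```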
Performance Highlights
Evaluation on the Open LLM Leaderboard reports the following benchmark scores:
- Avg. Score: 24.90
- IFEval (0-shot): 72.74
- BBH (3-shot): 26.90
- MATH Lvl 5 (4-shot): 13.22
- MMLU-PRO (5-shot): 28.26
Detailed evaluation results are available on the Hugging Face Open LLM Leaderboard.
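Scores of this kind can be reproduced locally with EleutherAI's lm-evaluation-harness, the backend used by the leaderboard. The sketch below is an assumption about suitable task names and settings, not the leaderboard's exact configuration (the leaderboard tasks carry their own few-shot counts):

```python
import lm_eval

# Run the leaderboard task variants matching the benchmarks listed above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=EpistemeAI/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta,"
        "dtype=bfloat16"
    ),
    tasks=[
        "leaderboard_ifeval",
        "leaderboard_bbh",
        "leaderboard_math_hard",
        "leaderboard_mmlu_pro",
    ],
    batch_size=8,
)
print(results["results"])
```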
Use Cases
This model is particularly well-suited for applications requiring:
- Mathematical Problem Solving: Its KTO fine-tuning and base model focus suggest strengths in mathematical reasoning.
- Philosophical Inquiry: The "Philos" in its name indicates an orientation towards philosophical text understanding and generation.
- Reasoning Tasks: KTO preference alignment on top of the Philos-Math base model is intended to improve general reasoning quality.