EpistemeAI/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta

Hosted on Hugging Face

  • Task: Text generation
  • Model size: 8B
  • Quantization: FP8
  • Context length: 32k
  • Published: Sep 12, 2024
  • License: apache-2.0
  • Architecture: Transformer (open weights)

EpistemeAI's Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta is an 8-billion-parameter, Llama 3.1-based model fine-tuned with the KTO (Kahneman-Tversky Optimization) method. Developed by EpistemeAI2, it builds on the Fireball-Alpaca-Llama3.1.07-8B-Philos-Math base model and is optimized for reasoning and mathematical tasks. Training ran about 2x faster using Unsloth and Hugging Face's TRL library, and the model targets applications that require mathematical and philosophical reasoning.


Model Overview

EpistemeAI/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta is an 8 billion parameter language model developed by EpistemeAI2. It is a fine-tuned version of the EpistemeAI2/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math base model, utilizing the KTO (Kahneman-Tversky Optimization) method. This fine-tuning approach aims to enhance the model's performance, particularly in areas related to reasoning and mathematical problem-solving.

Key Characteristics

  • KTO Fine-tuning: Leverages the Kahneman-Tversky Optimization method for improved performance.
  • Llama 3.1 Base: Built upon the Llama 3.1 architecture, providing a strong foundation.
  • Efficient Training: Trained about 2x faster using Unsloth and Hugging Face's TRL library (a training sketch follows this list).
  • Focus Areas: Designed to excel in philosophical and mathematical reasoning tasks.
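The model card does not publish the exact training script, but a minimal sketch of the Unsloth + TRL KTO setup it describes might look like the following. The dataset rows, LoRA settings, sequence length, and hyperparameters are illustrative assumptions, not the values EpistemeAI2 used; recent TRL releases pass the tokenizer via processing_class (older ones use tokenizer=).

```python
# Hedged sketch of a KTO fine-tuning run with Unsloth + TRL.
# Dataset rows, LoRA settings, and hyperparameters below are illustrative only.
from datasets import Dataset
from trl import KTOConfig, KTOTrainer
from unsloth import FastLanguageModel

# Load the base model with Unsloth's optimized loader (4-bit to fit on one GPU).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="EpistemeAI2/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# KTO uses unpaired feedback: each row is a prompt, a completion, and a
# boolean label marking the completion as desirable or undesirable.
train_dataset = Dataset.from_list([
    {"prompt": "Prove that the sum of two even integers is even.",
     "completion": "Let a = 2m and b = 2n. Then a + b = 2(m + n), which is even.",
     "label": True},
    {"prompt": "Prove that the sum of two even integers is even.",
     "completion": "The sum is even because even numbers stay even.",
     "label": False},
])

trainer = KTOTrainer(
    model=model,
    args=KTOConfig(output_dir="kto-philos-math",
                   beta=0.1,
                   per_device_train_batch_size=2),
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older TRL releases use tokenizer= instead
)
trainer.train()
```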

Performance Highlights

Evaluations on the Open LLM Leaderboard report the following scores:

  • Avg. Score: 24.90
  • IFEval (0-Shot): 72.74
  • BBH (3-Shot): 26.90
  • MATH Lvl 5 (4-Shot): 13.22
  • MMLU-PRO (5-shot): 28.26

Detailed evaluation results are available on the Hugging Face Open LLM Leaderboard.

Use Cases

This model is particularly well-suited for applications requiring:

  • Mathematical Problem Solving: Its KTO fine-tuning and base model focus suggest strengths in mathematical reasoning (see the usage sketch after this list).
  • Philosophical Inquiry: The "Philos" in its name indicates an orientation towards philosophical text understanding and generation.
  • Reasoning Tasks: General reasoning capabilities are enhanced through its specialized training.
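For a concrete starting point, a minimal inference sketch with the transformers library is shown below. It assumes the checkpoint loads with the standard Llama 3.1 chat template; the system prompt, question, and generation settings are illustrative.

```python
# Minimal inference sketch, assuming the published checkpoint works with the
# standard transformers + Llama 3.1 chat-template workflow.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EpistemeAI/Fireball-Alpaca-Llama3.1.07-8B-Philos-Math-KTO-beta"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a careful mathematical reasoner. Show your steps."},
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed in km/h?"},
]
# Build the prompt with the model's chat template and generate an answer.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, temperature=0.2, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```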