EpistemeAI/Math-Code-Llama3.1-8B

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Aug 12, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

EpistemeAI/Math-Code-Llama3.1-8B is an 8 billion parameter Llama 3.1-based language model developed by EpistemeAI, fine-tuned from EpistemeAI/MathLlama3.1-8B-16bit. This model is optimized for multilingual text and code generation, supporting a 32768-token context length. It was trained using Unsloth and Huggingface's TRL library, and further fine-tuned with the ORPO technique for improved performance and alignment.

Loading preview...

Model Overview

EpistemeAI/Math-Code-Llama3.1-8B is an 8 billion parameter model built upon the Meta Llama 3.1 architecture, developed by EpistemeAI. It is fine-tuned from EpistemeAI/MathLlama3.1-8B-16bit, leveraging Unsloth for 2x faster training and Huggingface's TRL library. A key differentiator is its use of the ORPO (Optimized Reward Prompting) technique, which combines supervised fine-tuning and preference alignment into a single, more efficient process, empirically outperforming other alignment methods.

Key Capabilities

  • Multilingual Text and Code Generation: Supports output in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with potential for other languages through fine-tuning.
  • Extended Context Window: Features a substantial 32768-token context length, enabling processing of longer inputs and generating more coherent, extended responses.
  • Optimized Architecture: Utilizes an optimized transformer architecture with Grouped-Query Attention (GQA) for improved inference scalability.
  • Instruction Following: Instruction-tuned for assistant-like chat and various natural language generation tasks.

Intended Use Cases

  • Commercial and Research Applications: Suitable for a wide range of commercial and research purposes.
  • Multilingual Dialogue Systems: Optimized for multilingual chat and assistant-like interactions.
  • Code Generation: Capable of generating code, as indicated by its output modalities.
  • Synthetic Data Generation: Can be used to generate synthetic data to improve other models.