axolotl-ai-co/llama-3-8b-chatml

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:Apr 19, 2024License:llama3Architecture:Transformer Cold

The axolotl-ai-co/llama-3-8b-chatml is an 8 billion parameter instruction-tuned generative text model developed by Meta, part of the Llama 3 family. Optimized for dialogue use cases, it utilizes an optimized transformer architecture with Grouped-Query Attention (GQA) and was fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF). This model excels in general language understanding, reasoning, and code generation, outperforming many open-source chat models on common industry benchmarks.

Loading preview...

Model Overview

This is the 8 billion parameter instruction-tuned variant of Meta's Llama 3 family of large language models. It is built upon an optimized transformer architecture and incorporates Grouped-Query Attention (GQA) for enhanced inference scalability. The model was developed using a combination of supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

Key Capabilities

  • Dialogue Optimization: Specifically tuned for chat and assistant-like conversational use cases.
  • Strong Performance: Outperforms many other open-source chat models across various industry benchmarks, including MMLU (68.4), HumanEval (62.2), and GSM-8K (79.6).
  • Broad General Intelligence: Demonstrates strong capabilities in general reasoning, common sense, and knowledge retrieval.
  • Code Generation: Achieves a HumanEval score of 62.2, indicating proficiency in code generation tasks.
  • Safety & Alignment: Developed with a focus on helpfulness and safety, incorporating extensive red teaming and adversarial evaluations.

Good For

  • Commercial and Research Applications: Intended for a wide range of uses in English-speaking contexts.
  • Assistant-like Chatbots: Ideal for building conversational AI agents and dialogue systems.
  • Natural Language Generation: Suitable for various text generation tasks where instruction following is crucial.
  • Code Assistance: Can be leveraged for tasks requiring code generation or understanding.