Overview
Overview
Meta's Llama 3.1 8B Instruct is an 8 billion parameter instruction-tuned language model, part of the Llama 3.1 family, designed for multilingual dialogue and general-purpose applications. Developed by Meta, it features an optimized transformer architecture with Grouped-Query Attention (GQA) for improved inference scalability and a substantial 128k context length. The model was trained on over 15 trillion tokens of publicly available online data with a knowledge cutoff of December 2023, and fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
Key Capabilities
- Multilingual Support: Optimized for English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with potential for fine-tuning in other languages.
- Tool Use: Supports advanced tool use formats, enabling integration with external functions and services.
- Strong Performance: Demonstrates competitive results across various benchmarks, including MMLU, HumanEval (72.6% pass@1), GSM-8K (84.5% em_maj1@1), and API-Bank (82.6% acc).
- Instruction Following: Instruction-tuned for assistant-like chat and natural language generation tasks.
Good For
- Commercial and Research Use: Intended for a wide range of applications in both commercial and research settings.
- Assistant-like Chat: Optimized for conversational AI and dialogue systems.
- Code Generation: Shows strong capabilities in coding tasks, as evidenced by HumanEval and MBPP++ benchmarks.
- Tool-Augmented Applications: Ideal for scenarios requiring interaction with external tools and APIs.