Overview
This model is the 8-billion-parameter instruction-tuned variant from Meta's Llama 3.1 collection, released on July 23, 2024. It is built on an optimized transformer architecture that uses Grouped-Query Attention (GQA) for improved inference scalability. The model supports a 128k-token context length and was pretrained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023. Fine-tuning combined supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align the model with human preferences for helpfulness and safety.
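To make the GQA design concrete, here is a minimal NumPy sketch of grouped-query attention: several query heads share each key/value head, shrinking the KV cache at inference time. The head counts below are toy values chosen for illustration (Llama 3.1 8B itself uses 32 query heads and 8 KV heads), and the function is a simplified single-pass attention, not the model's actual implementation.

```python
import numpy as np

def gqa_attention(q, k, v, group_size):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).

    Each KV head serves `group_size` consecutive query heads.
    """
    _, _, d = q.shape
    # Broadcast each KV head to its group of query heads.
    k = np.repeat(k, group_size, axis=0)
    v = np.repeat(v, group_size, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n_q, n_kv, seq, d = 8, 2, 4, 16   # 8 query heads share 2 KV heads
q = rng.standard_normal((n_q, seq, d))
k = rng.standard_normal((n_kv, seq, d))
v = rng.standard_normal((n_kv, seq, d))
out = gqa_attention(q, k, v, n_q // n_kv)
print(out.shape)  # (8, 4, 16): one output per query head
```

With `n_kv` of 2 instead of 8, the KV cache is 4x smaller while every query head still attends over the full sequence, which is the trade-off GQA makes for inference scalability.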
Key Capabilities
- Multilingual Dialogue: Optimized for assistant-like chat in multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
- Extended Context: Supports a 128k token context window, enabling processing of longer inputs and generating more coherent, extended responses.
- Advanced Tool Use: Features robust support for tool use and function calling, with detailed prompt formatting guidelines and integration with Hugging Face Transformers chat templates.
- Strong Benchmarks: Demonstrates significant improvements over Llama 3 8B Instruct across various benchmarks, particularly in MMLU (CoT), HumanEval (pass@1), GSM-8K (CoT), MATH (CoT), and API-Bank for tool use.
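As a rough illustration of the prompt formatting mentioned above, the sketch below hand-builds a Llama 3.1-style chat prompt, where each turn is wrapped in `<|start_header_id|>`/`<|end_header_id|>` role headers and terminated with `<|eot_id|>`. In practice you would let `tokenizer.apply_chat_template` from Hugging Face Transformers produce this string; the manual construction here only shows the layout.

```python
def build_prompt(messages):
    """Assemble a Llama 3.1-style chat prompt from role/content dicts."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>"
            f"\n\n{m['content']}<|eot_id|>"
        )
    # Open the assistant header so the model generates the next turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
print(prompt)
```

Because the real template also handles details such as tool-call turns and date headers, treat this as a readable approximation and prefer the tokenizer's built-in chat template for production use.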
Good For
- Developing multilingual chat assistants and conversational AI applications.
- Tasks requiring long-context understanding and generation.
- Applications leveraging tool use and function calling for enhanced capabilities.
- Research and commercial use in natural language generation across supported languages.