NousResearch/Meta-Llama-3.1-8B is an 8-billion-parameter instruction-tuned generative language model developed by Meta as part of the Llama 3.1 collection. Optimized for multilingual dialogue, it offers a 128K-token context window and was trained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023. The model targets assistant-like chat applications and supports commercial and research use across multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Overview
Meta-Llama-3.1-8B is an 8-billion-parameter instruction-tuned model from Meta's Llama 3.1 family, designed for multilingual dialogue. It uses an optimized transformer architecture and is aligned for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). The model provides a 128K-token context window and was trained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023.
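As an instruction-tuned model, it expects conversations rendered in the Llama 3.1 chat template. The sketch below illustrates that layout under the assumption that the special tokens match the published Llama 3 template (`<|begin_of_text|>`, `<|start_header_id|>`, `<|eot_id|>`); in practice, `tokenizer.apply_chat_template` is the authoritative way to do this.

```python
# Minimal sketch of the Llama 3.1 chat prompt layout. The special tokens
# are assumed from the Llama 3 template; prefer tokenizer.apply_chat_template
# in real code, which is always correct for the loaded tokenizer.

def format_llama31_prompt(messages):
    """Render a list of {"role", "content"} dicts into one prompt string."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n")
        parts.append(msg["content"].strip() + "<|eot_id|>")
    # Leave the prompt open at the assistant header so the model completes it.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama31_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Bonjour ! Peux-tu m'aider ?"},
])
```

Ending the prompt at an open assistant header is what makes the model generate the assistant turn rather than continuing the user's text.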
Key Capabilities
- Multilingual Support: Optimized for English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with potential for fine-tuning in other languages.
- Enhanced Performance: Instruction-tuned versions show improvements across various benchmarks, including MMLU, IFEval, HumanEval, and MATH, compared to Llama 3 8B Instruct.
- Tool Use: Demonstrates significant advancements in tool-use benchmarks like API-Bank and BFCL.
- Long Context Window: Supports a 128K-token context window, enabling processing of long documents and extended multi-turn conversations.
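Even with a 128K-token window, the prompt and the generation budget share that limit, so long chat histories eventually need truncation. The helper below is a sketch of one reasonable policy (drop oldest turns first); only the 128K figure comes from the model card, the rest is illustrative.

```python
# Illustrative token budgeting for a 128K-token context window.
# The window size is from the model card; the drop-oldest-first
# truncation policy is just one common choice, not Meta's.

CONTEXT_WINDOW = 128_000

def fit_history(turn_lengths, max_new_tokens=1_024):
    """Drop oldest turns until prompt + generation fit in the window.

    turn_lengths: token counts per conversation turn, oldest first.
    Returns the retained turn lengths (newest turns kept).
    """
    budget = CONTEXT_WINDOW - max_new_tokens
    kept, total = [], 0
    for n in reversed(turn_lengths):  # walk newest -> oldest
        if total + n > budget:
            break
        kept.append(n)
        total += n
    return list(reversed(kept))
```

For example, a history of three 60K-token turns keeps only the two newest, since all three would exceed the window once generation headroom is reserved.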
Good For
- Assistant-like Chat: Ideal for building conversational AI applications and chatbots.
- Multilingual Applications: Suitable for commercial and research use cases requiring multilingual text generation and understanding.
- Code Generation: Shows strong performance in coding benchmarks like HumanEval and MBPP++.
- Reasoning and Math: Improved capabilities in complex reasoning and mathematical problem-solving tasks.
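For the assistant-chat use case above, a typical setup is the Hugging Face `transformers` text-generation pipeline. The sketch below uses the model id from this card; loading the 8B weights downloads many gigabytes, so the inference part is kept behind a flag, and the exact output structure can vary by `transformers` version.

```python
# Sketch of assistant-style chat via the Hugging Face transformers
# pipeline API. RUN_DEMO stays False here because loading the 8B
# checkpoint requires a large download and a capable GPU.

RUN_DEMO = False

def build_messages(user_text, system_text="You are a helpful assistant."):
    """Assemble the chat-format message list the pipeline expects."""
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]

if RUN_DEMO:
    import torch
    from transformers import pipeline

    chat = pipeline(
        "text-generation",
        model="NousResearch/Meta-Llama-3.1-8B",
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    out = chat(
        build_messages("Summarize RLHF in two sentences."),
        max_new_tokens=128,
    )
    print(out)
```

Passing a message list (rather than a raw string) lets the pipeline apply the model's chat template automatically, which is the safest way to match the format the instruction tuning expects.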