RedHatAI/Llama-3.1-8B-Instruct Overview
This model is the 8-billion-parameter instruction-tuned variant of Meta's Llama 3.1 family of multilingual large language models. It is built on an optimized transformer architecture and aligned with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). The model was trained on over 15 trillion tokens of publicly available online data with a knowledge cutoff of December 2023, and supports a context length of up to 128K tokens (though deployment examples for the 8B model use 32K).
Key Capabilities
- Multilingual Dialogue: Optimized for assistant-like chat in multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
- Enhanced Performance: Outperforms many open-source and closed chat models on common industry benchmarks, showing improvements in MMLU, reasoning (ARC-C), code generation (HumanEval, MBPP++), and mathematics (GSM-8K, MATH) compared to its Llama 3 8B Instruct predecessor.
- Tool Use: Supports various tool use formats, enabling integration with external functions and services, with examples provided for Transformers chat templates.
- Scalable Inference: Uses Grouped-Query Attention (GQA), which shrinks the key/value cache and improves inference scalability across the Llama 3.1 family.
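The GQA mechanism mentioned above can be illustrated with a small NumPy sketch: each key/value head is shared by a group of query heads, so the KV cache is smaller than in full multi-head attention by the group factor. The head counts and dimensions below are toy values for illustration, not Llama 3.1's actual configuration.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Attention where several query heads share one key/value head.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    n_q_heads must be a multiple of n_kv_heads.
    """
    n_q, seq, d = q.shape
    n_kv = k.shape[0]
    group = n_q // n_kv
    # Each KV head serves `group` consecutive query heads, so the KV
    # cache is n_q / n_kv times smaller than standard multi-head attention.
    k_rep = np.repeat(k, group, axis=0)   # (n_q_heads, seq, d)
    v_rep = np.repeat(v, group, axis=0)
    scores = q @ k_rep.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v_rep

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4, 16))   # 8 query heads
k = rng.normal(size=(2, 4, 16))   # only 2 KV heads -> 4x smaller KV cache
v = rng.normal(size=(2, 4, 16))
out = grouped_query_attention(q, k, v)
print(out.shape)  # → (8, 4, 16)
```

Setting the number of KV heads equal to the number of query heads recovers standard multi-head attention; setting it to 1 recovers multi-query attention, with GQA covering the middle ground.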
Good For
- Assistant-like Chatbots: Its instruction-tuned nature makes it ideal for conversational AI applications.
- Multilingual Applications: Suitable for use cases requiring interaction in the supported languages.
- Code Generation and Mathematical Reasoning: Demonstrates strong performance in these technical domains.
- Research and Commercial Use: Intended for a broad range of applications under the Llama 3.1 Community License, including synthetic data generation and model distillation.
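The assistant-style chat and tool-use capabilities described above are typically driven through Transformers chat templates. A minimal sketch of the input structures follows; the `get_current_temperature` tool is hypothetical, and the template call itself is shown commented out because it requires downloading the (gated) model weights.

```python
# Sketch of chat + tool-use inputs for a Transformers chat template.
# `get_current_temperature` is a hypothetical tool for illustration only.
def get_current_temperature(location: str) -> float:
    """Get the current temperature at a location.

    Args:
        location: City name, e.g. "Paris, France".
    """
    return 22.0  # placeholder reading

messages = [
    {"role": "system", "content": "You are a helpful multilingual assistant."},
    {"role": "user", "content": "What is the temperature in Paris right now?"},
]

# With access to the weights, the prompt would be rendered like this:
# from transformers import AutoTokenizer
# tok = AutoTokenizer.from_pretrained("RedHatAI/Llama-3.1-8B-Instruct")
# prompt = tok.apply_chat_template(
#     messages,
#     tools=[get_current_temperature],  # docstring becomes the tool schema
#     add_generation_prompt=True,
#     tokenize=False,
# )

print(get_current_temperature("Paris, France"))  # → 22.0
```

When the model emits a tool call, the convention is to append it as an assistant message, run the function, append the result as a `tool`-role message, and re-apply the template for the final answer.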