Overview

Meta Llama 3 8B is an 8 billion parameter instruction-tuned large language model developed by Meta. It is built on an optimized transformer architecture, incorporating Grouped-Query Attention (GQA) for enhanced inference scalability. The model is designed for dialogue use cases and has been fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. It supports a context length of 8192 tokens and was trained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of March 2023.

Key Capabilities

Instruction Following: Optimized for assistant-like chat and dialogue applications.
Performance: Outperforms many open-source chat models on industry benchmarks, showing significant improvements over Llama 2 7B across various tasks like MMLU, AGIEval, and HumanEval.
Safety: Developed with extensive red teaming, adversarial evaluations, and safety mitigations, including reduced false refusal rates compared to Llama 2.
Code Generation: Achieves a HumanEval score of 62.2, indicating strong code generation capabilities.

Good for

Commercial and Research Use: Intended for a wide range of applications in English.
Assistant-like Chatbots: Excels in conversational AI scenarios due to its instruction-tuned nature.
Natural Language Generation: Adaptable for various text generation tasks.
Developers: Provides resources and guidance for responsible AI development, including integration with tools like Meta Llama Guard 2 and Code Shield.

Overview

Overview

Key Capabilities

Good for

Full Model Card (README)