bofenghuang/Meta-Llama-3-8B

TEXT GENERATION · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 8K · Published: Apr 23, 2024 · License: llama3 · Architecture: Transformer

Meta-Llama-3-8B is an 8 billion parameter instruction-tuned generative text model developed by Meta, part of the Llama 3 family. It uses an optimized transformer architecture with Grouped-Query Attention (GQA) and is fine-tuned with supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) for dialogue use cases. Trained on over 15 trillion tokens with an 8K context length, it significantly outperforms Llama 2 models on common benchmarks such as MMLU (68.4) and HumanEval (62.2), making it well suited for assistant-like chat and general natural language generation tasks in English.


Meta-Llama-3-8B: An Advanced 8B Parameter LLM from Meta

Meta-Llama-3-8B is an 8 billion parameter instruction-tuned large language model developed by Meta, designed for generative text and code. As part of the Llama 3 family, it leverages an optimized transformer architecture and incorporates Grouped-Query Attention (GQA) for enhanced inference scalability. The instruction-tuned variant is specifically optimized for dialogue use cases, demonstrating superior performance compared to previous Llama 2 models on various industry benchmarks.
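The GQA mechanism mentioned above lets several query heads share one key/value head, shrinking the KV cache at inference time. Below is a minimal NumPy sketch of the idea; the head counts are illustrative (Llama-3-8B itself pairs 32 query heads with 8 KV heads), and the function name is ours, not an API from any library.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d) with n_kv_heads | n_q_heads."""
    group = q.shape[0] // k.shape[0]
    # Each KV head is shared by `group` consecutive query heads.
    k = np.repeat(k, group, axis=0)          # -> (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)   # (n_q_heads, seq, seq)
    # Numerically stable softmax over the key dimension.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v                              # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))   # 8 query heads
k = rng.standard_normal((2, 4, 16))   # only 2 KV heads to store
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v)
```

With 8 query heads over 2 KV heads, the KV cache is a quarter the size of standard multi-head attention while the output keeps the full per-query-head shape.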

Key Capabilities

  • Optimized for Dialogue: Instruction-tuned for assistant-like chat applications, aligning with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF).
  • Strong Benchmark Performance: Achieves 68.4 on MMLU (5-shot), 62.2 on HumanEval (0-shot), and 79.6 on GSM-8K (8-shot, CoT), significantly surpassing Llama 2 7B and 13B models.
  • Extensive Training Data: Pretrained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of March 2023, and fine-tuned with over 10 million human-annotated examples.
  • 8K Context Length: Supports an 8,192-token context window, enabling processing of longer inputs and generating more coherent responses.
  • Responsible AI Focus: Developed with a strong emphasis on safety, including extensive red teaming, adversarial evaluations, and mitigations that reduce both residual risks and false refusals, so it is more helpful than Llama 2 without being less safe.
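Because the model is instruction-tuned for dialogue, prompts served to it raw (without a chat-template helper) should follow Meta's published Llama 3 instruct format. The helper below is a sketch of that format; the function name is ours.

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3 instruct prompt from its special tokens."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "Hi!")
```

Generation should be stopped on the `<|eot_id|>` token, which the model emits at the end of each turn.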

Good for

  • Building highly performant English-language chatbots and virtual assistants.
  • General natural language generation tasks requiring high accuracy and coherence.
  • Applications benefiting from strong reasoning and code generation capabilities, as indicated by its HumanEval and GSM-8K scores.
  • Developers seeking a powerful, openly available model with robust safety considerations for commercial and research use.
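For chatbot and assistant use cases like those above, hosted models of this kind are typically queried through an OpenAI-compatible chat completions endpoint. The request body below is a hypothetical sketch assuming that common schema; it is not taken from this page.

```python
import json

# Hypothetical chat-completions request body (OpenAI-compatible schema).
payload = {
    "model": "bofenghuang/Meta-Llama-3-8B",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize Grouped-Query Attention in one sentence."},
    ],
    "max_tokens": 256,
}
body = json.dumps(payload)  # serialized for the POST request
```

The `messages` list maps directly onto the system/user/assistant turns of the Llama 3 instruct format, so no manual prompt assembly is needed when the server applies the chat template.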

Popular Sampler Settings

The most popular parameter combinations used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
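A sampler configuration touching all of the parameters above could be passed as a flat mapping alongside a generation request. The numeric values below are illustrative placeholders, not the actual popular settings from this page (which are not shown in the text).

```python
# Illustrative sampler settings; values are placeholders, not site data.
sampler = {
    "temperature": 0.7,         # randomness of token choice
    "top_p": 0.9,               # nucleus sampling cutoff
    "top_k": 40,                # keep only the k most likely tokens
    "frequency_penalty": 0.0,   # penalize tokens by how often they appeared
    "presence_penalty": 0.0,    # penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative repeat discouragement
    "min_p": 0.05,              # drop tokens below this fraction of the top prob
}
```

In an OpenAI-compatible request these keys would be merged into the request body next to `model` and `messages`.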