meta-llama/Meta-Llama-3-8B-Instruct

5.0 based on 2 reviews
Warm
Public
8B
FP8
8192
Apr 17, 2024
License: llama3
Hugging Face
Gated
Overview

Meta-Llama-3-8B-Instruct Overview

Meta-Llama-3-8B-Instruct is an 8 billion parameter instruction-tuned model from Meta's Llama 3 family, designed for dialogue and assistant-like chat applications. It leverages an optimized transformer architecture and Grouped-Query Attention (GQA) for efficient inference. The model was trained on over 15 trillion tokens from publicly available online data, with its pretraining data cutoff in March 2023, and features an 8k token context length.

Key Capabilities & Performance

  • Instruction Following: Optimized through supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
  • Strong Benchmarks: Significantly outperforms its predecessor, Llama 2 7B and 13B, across various benchmarks including MMLU (68.4), HumanEval (62.2), and GSM-8K (79.6).
  • Reduced Refusals: Engineered to be less prone to false refusals on benign prompts compared to Llama 2, enhancing user experience.

Intended Use Cases

  • Assistant-like Chat: Primarily intended for conversational AI and dialogue systems in English.
  • Commercial and Research: Suitable for a broad range of commercial and research applications requiring natural language generation.
  • Fine-tuning: Developers can fine-tune the model for specific applications or other languages, adhering to the Llama 3 Community License.