meta-llama/Meta-Llama-3-70B-Instruct

Warm
Public
70B
FP8
8192
Apr 17, 2024
License: llama3
Hugging Face
Gated
Overview

Overview

Meta-Llama-3-70B-Instruct is a 70 billion parameter instruction-tuned model from Meta's Llama 3 family, optimized for dialogue and assistant-like chat applications. It utilizes an optimized transformer architecture and was fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to enhance helpfulness and safety. The model was trained on over 15 trillion tokens of publicly available data, with a knowledge cutoff of December 2023, and features an 8k token context length.

Key Capabilities

  • Enhanced Performance: Significantly outperforms Llama 2 70B across various benchmarks, including MMLU (82.0 vs 52.9), HumanEval (81.7 vs 25.6), and GSM-8K (93.0 vs 57.5).
  • Dialogue Optimization: Specifically tuned for conversational use cases, demonstrating improved alignment with human preferences.
  • Reduced Refusals: Engineered to be less prone to false refusals on benign prompts compared to Llama 2, improving user experience.
  • Robust Safety Measures: Developed with extensive red teaming, adversarial evaluations, and safety mitigations, complemented by resources like Meta Llama Guard 2 and Code Shield.

Good For

  • Commercial and research applications requiring high-performance English-language text generation.
  • Building assistant-like chat systems and dialogue agents.
  • Tasks demanding strong general reasoning, knowledge retrieval, and coding capabilities, as evidenced by its benchmark scores.