meta-llama/Meta-Llama-3-70B

Warm
Public
70B
FP8
8192
License: llama3
Hugging Face
Gated
Overview

Meta-Llama-3-70B: An Advanced Instruction-Tuned LLM

Meta-Llama-3-70B is a 70 billion parameter instruction-tuned generative text model developed by Meta, designed for dialogue and general natural language generation tasks. It is built on an optimized transformer architecture and incorporates supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The model was pretrained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023.

Key Capabilities

  • High Performance: Outperforms many other open-source chat models on common industry benchmarks, including MMLU (82.0), HumanEval (81.7), and GSM-8K (93.0).
  • Optimized for Dialogue: Specifically instruction-tuned for assistant-like chat applications.
  • Robust Safety Features: Developed with extensive red teaming, adversarial evaluations, and safety mitigation techniques, resulting in significantly fewer false refusals compared to Llama 2.
  • Efficient Architecture: Utilizes Grouped-Query Attention (GQA) for improved inference scalability.

Good for

  • Commercial and research use in English.
  • Developing assistant-like chat applications.
  • Natural language generation tasks requiring high accuracy and helpfulness.
  • Applications where strong benchmark performance and reduced false refusals are critical.