NousResearch/Meta-Llama-3-70B-Instruct

Warm
Public
70B
FP8
8192
License: other
Hugging Face
Overview

Overview

NousResearch/Meta-Llama-3-70B-Instruct is a 70 billion parameter instruction-tuned large language model developed by Meta. It is part of the Llama 3 family, utilizing an optimized transformer architecture and Grouped-Query Attention (GQA) for improved inference scalability. The model was trained on over 15 trillion tokens of publicly available data, with fine-tuning data including public instruction datasets and over 10 million human-annotated examples. Its pretraining data has a cutoff of December 2023, and it supports a context length of 8192 tokens.

Key Capabilities

  • Optimized for Dialogue: Specifically instruction-tuned for assistant-like chat applications.
  • Strong Performance: Outperforms many other open-source chat models on common industry benchmarks, demonstrating significant improvements over Llama 2 70B across various metrics like MMLU (82.0 vs 52.9), HumanEval (81.7 vs 25.6), and GSM-8K (93.0 vs 57.5).
  • Text and Code Generation: Capable of generating both text and code outputs.
  • Enhanced Safety and Helpfulness: Developed with a focus on optimizing helpfulness and safety, incorporating supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF).

Good for

  • Assistant-like Chatbots: Ideal for building conversational AI agents and interactive applications.
  • General Natural Language Generation: Suitable for a wide range of text generation tasks in English.
  • Commercial and Research Use: Intended for both commercial deployment and academic research in English-speaking contexts.