Undi95/Meta-Llama-3-70B-hf

Hugging Face
TEXT GENERATIONConcurrency Cost:4Model Size:70BQuant:FP8Ctx Length:8kLicense:llama3Architecture:Transformer0.0K Warm

Undi95/Meta-Llama-3-70B-hf is a 70 billion parameter instruction-tuned generative text model developed by Meta, part of the Llama 3 family. It utilizes an optimized transformer architecture with Grouped-Query Attention (GQA) and is fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF). Optimized for dialogue use cases, this model excels in general reasoning, knowledge, and mathematical benchmarks, outperforming its predecessor, Llama 2 70B, across various tasks.

Loading preview...

Model Overview

Undi95/Meta-Llama-3-70B-hf is a 70 billion parameter instruction-tuned model from Meta's Llama 3 family, designed for generative text and code. It is built on an optimized transformer architecture, incorporating Grouped-Query Attention (GQA) for enhanced inference scalability. The instruction-tuned variant is specifically optimized for dialogue and assistant-like chat applications, leveraging supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

Key Capabilities

  • High Performance: Significantly outperforms Llama 2 70B across various benchmarks, including MMLU (79.5% for base, 82.0% for instruct), AGIEval (63.0%), CommonSenseQA (83.8%), and ARC-Challenge (93.0%).
  • Enhanced Reasoning & Math: Achieves strong results in complex reasoning tasks like BIG-Bench Hard (81.3%) and mathematical problem-solving with GSM-8K (93.0%) and MATH (50.4%) for the instruct model.
  • Code Generation: The instruction-tuned model demonstrates strong coding capabilities, scoring 81.7% on HumanEval.
  • Extensive Training Data: Pretrained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023 for the 70B model.
  • Safety & Alignment: Developed with a strong focus on responsible AI, incorporating extensive red teaming, adversarial evaluations, and safety mitigations, while also reducing false refusals compared to Llama 2.

Good For

  • Dialogue Systems: Ideal for building assistant-like chat applications due to its instruction-tuned optimization.
  • General-Purpose Text Generation: Suitable for a wide range of natural language generation tasks in English.
  • Research & Commercial Use: Intended for both commercial deployment and research endeavors.
  • Applications Requiring Strong Reasoning: Excels in tasks demanding logical inference, common sense, and mathematical understanding.
  • Code Assistance: Can be effectively used as a coding assistant, generating and understanding code.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p