Undi95/Meta-Llama-3-70B-Instruct-hf

Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 8k · Published: Apr 18, 2024 · License: llama3 · Architecture: Transformer

Undi95/Meta-Llama-3-70B-Instruct-hf is a 70 billion parameter instruction-tuned generative text model developed by Meta, part of the Llama 3 family. Optimized for dialogue use cases, it utilizes an auto-regressive transformer architecture with Grouped-Query Attention and a context length of 8192 tokens. This model is fine-tuned using SFT and RLHF to align with human preferences for helpfulness and safety, outperforming many open-source chat models on industry benchmarks.


Model Overview

Undi95/Meta-Llama-3-70B-Instruct-hf is a 70 billion parameter instruction-tuned model from Meta's Llama 3 family, designed for dialogue and assistant-like chat applications. It leverages an optimized transformer architecture with Grouped-Query Attention (GQA) for efficient inference and supports an 8k token context length. The model was trained on over 15 trillion tokens of publicly available data, with a knowledge cutoff of December 2023 for the 70B version.
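Because this is an instruction-tuned Llama 3 model, prompts must follow Meta's Llama 3 chat template. The sketch below hand-builds a single-turn prompt so the special tokens are visible; in practice you would let `tokenizer.apply_chat_template` from `transformers` do this for you. The token names follow Meta's published template; the system and user strings are illustrative.

```python
# Minimal sketch of the Llama 3 instruct prompt format (single turn).
# Assumption: hand-assembled for illustration; normally produced by
# tokenizer.apply_chat_template on a list of {"role", "content"} messages.

def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3 instruct prompt by hand."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The prompt ends with an open assistant header so the model
        # generates the assistant's reply next.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "Hello!")
```

Generation should stop on the `<|eot_id|>` token, which the instruct model emits at the end of each turn.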

Key Capabilities

  • Enhanced Dialogue Performance: Optimized for chat and assistant-like interactions through supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF).
  • Strong Benchmark Results: Significantly outperforms Llama 2 70B across various benchmarks, including MMLU (82.0 vs 52.9), HumanEval (81.7 vs 25.6), and GSM-8K (93.0 vs 57.5).
  • Safety and Refusal Improvements: Features extensive red teaming, adversarial evaluations, and mitigations to reduce residual risks and significantly decrease false refusals compared to Llama 2.
  • Code Generation: Demonstrates strong performance in code generation tasks, achieving 81.7 on HumanEval.

Good for

  • Commercial and Research Use: Intended for a wide range of applications in English-speaking contexts.
  • Assistant-like Chatbots: Its instruction-tuned nature makes it highly suitable for conversational AI and virtual assistants.
  • Code Generation Tasks: Excels in generating code, making it valuable for developer tools and programming assistance.
  • Applications Requiring High Helpfulness: Designed to be highly helpful while incorporating robust safety measures.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each config is a combination of the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
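These sampler parameters are typically sent alongside the prompt in an OpenAI-compatible completion request. A minimal sketch, assuming such an endpoint: the helper below builds the request body, and the specific values shown are illustrative defaults, not the community presets from the tabs above.

```python
# Sketch: assembling a completion request body with sampler settings.
# Assumption: an OpenAI-compatible API that accepts these parameter names;
# the values here are examples, not recommended settings.

ALLOWED_SAMPLERS = {
    "temperature", "top_p", "top_k", "frequency_penalty",
    "presence_penalty", "repetition_penalty", "min_p",
}

def make_request_body(prompt: str, **samplers) -> dict:
    """Build a completion request, keeping only known sampler keys."""
    body = {
        "model": "Undi95/Meta-Llama-3-70B-Instruct-hf",
        "prompt": prompt,
        "max_tokens": 256,
    }
    body.update({k: v for k, v in samplers.items() if k in ALLOWED_SAMPLERS})
    return body

body = make_request_body("Hello", temperature=0.7, top_p=0.9, min_p=0.05)
```

Unknown keyword arguments are silently dropped, so a config copied from another backend will not send parameters this endpoint does not recognize.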