meta-llama/Llama-3.1-405B

Text Generation
Concurrency Cost: 4
Model Size: 405B
Quant: FP8
Ctx Length: 32k
Published: Jul 16, 2024
License: llama3.1
Architecture: Transformer
1.0K Gated Warm

Meta Llama 3.1 405B is a 405-billion-parameter multilingual language model developed by Meta as part of the Llama 3.1 collection. It uses an optimized transformer architecture with Grouped-Query Attention and a 128k-token context length. This instruction-tuned model is optimized for multilingual dialogue, outperforms many chat models on common benchmarks, and supports commercial and research applications in languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.


Llama 3.1 405B: Multilingual Dialogue and Advanced Reasoning

Meta Llama 3.1 405B is the largest model in Meta's Llama 3.1 series. This instruction-tuned, 405-billion-parameter model is built on an optimized transformer architecture with Grouped-Query Attention (GQA) and supports a 128k-token context length. It was trained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023.
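One practical reason GQA matters at a 128k-token context is the size of the key/value cache, which grows with the number of key/value heads rather than the number of query heads. The sketch below works through that arithmetic. The layer and head counts are assumptions taken from Meta's published description of the 405B architecture, not from this card, and the calculation assumes 16-bit cache values and "128k" meaning 131,072 tokens.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for a single sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Assumed configuration (not stated in this card).
N_LAYERS = 126
N_QUERY_HEADS = 128
N_KV_HEADS = 8          # GQA: 16 query heads share each key/value head
HEAD_DIM = 128
CONTEXT = 128 * 1024    # assuming "128k" means 131,072 tokens

GIB = 1024 ** 3

# With GQA, only the 8 key/value heads are cached.
gqa = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT)
# Hypothetical baseline: standard multi-head attention caches one
# key/value head per query head.
mha = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, CONTEXT)

print(f"GQA cache at full context: {gqa / GIB:.0f} GiB")
print(f"MHA cache at full context: {mha / GIB:.0f} GiB")
print(f"Reduction from GQA: {mha // gqa}x")
```

Under these assumptions the per-sequence cache shrinks by the ratio of query heads to key/value heads. A deployment that serves a shorter context or a quantized cache would scale the figures down proportionally.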

Key Capabilities

  • Multilingual Dialogue: Optimized for assistant-like chat in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with support for additional languages through fine-tuning.
  • High Performance: Outperforms many open-source and closed chat models on industry benchmarks, scoring 87.3% on MMLU and 96.8% on GSM8K (CoT).
  • Advanced Reasoning & Code: Scores 96.9% on ARC-C (reasoning) and 89.0% pass@1 on HumanEval (code generation).
  • Tool Use: Scores 92.0% on API-Bank and 88.5% on BFCL, both tool-use benchmarks.

Good For

  • Commercial and Research Use: Intended for a wide range of applications, from assistant-like chat to natural language generation tasks.
  • Multilingual Applications: Ideal for developing applications requiring robust performance across its 8 explicitly supported languages.
  • System Integration: Designed to be deployed as part of larger AI systems, with Meta providing safeguards like Llama Guard 3, Prompt Guard, and Code Shield for responsible development.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model set the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
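As a minimal sketch of how the parameters listed above might be supplied, the snippet below assembles a request body in the style of an OpenAI-compatible completions API. Everything specific here is an assumption: the numeric values are illustrative placeholders rather than the popular settings (which this page does not list), `build_request` is a hypothetical helper, and `top_k`, `repetition_penalty`, and `min_p` are extensions that many compatible servers accept but that are not part of the standard OpenAI schema. No network call is made.

```python
import json

# Sampler parameters named in this section.
ALLOWED_SAMPLER_KEYS = {
    "temperature",
    "top_p",
    "top_k",
    "frequency_penalty",
    "presence_penalty",
    "repetition_penalty",
    "min_p",
}

# Hypothetical placeholder values for illustration only.
EXAMPLE_SAMPLER = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}


def build_request(prompt, sampler, max_tokens=256):
    """Return a completions-style request body (hypothetical helper)."""
    unknown = set(sampler) - ALLOWED_SAMPLER_KEYS
    if unknown:
        # Reject typos early instead of sending them to the server.
        raise ValueError(f"unknown sampler parameters: {sorted(unknown)}")
    return {
        "model": "meta-llama/Llama-3.1-405B",
        "prompt": prompt,
        "max_tokens": max_tokens,
        **sampler,
    }


payload = build_request("Translate 'good morning' into German:", EXAMPLE_SAMPLER)
print(json.dumps(payload, indent=2))
```

The resulting dictionary could be sent as the JSON body of a POST request with any HTTP client. Whether a given server honors each field depends on that server, so check the provider's documentation before relying on the non-standard ones.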