meta-llama/Llama-3.1-8B-Instruct

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Context Length: 32K · Published: Jul 18, 2024 · License: llama3.1 · Architecture: Transformer · Gated

Llama 3.1 8B Instruct is an 8 billion parameter instruction-tuned generative language model developed by Meta, optimized for multilingual dialogue and general natural language generation tasks. It features a 128K token context length and utilizes an optimized transformer architecture with Grouped-Query Attention for improved inference scalability. Trained on over 15 trillion tokens with a December 2023 knowledge cutoff, this model excels in areas like reasoning, code generation, and tool use, outperforming its predecessor Llama 3 8B Instruct on various benchmarks.


Llama 3.1 8B Instruct: Multilingual Dialogue and Enhanced Capabilities

Meta Llama 3.1 8B Instruct is an 8 billion parameter instruction-tuned large language model, part of the Llama 3.1 collection. Developed by Meta, this model is specifically optimized for multilingual dialogue use cases and general natural language generation. It leverages an optimized transformer architecture with Grouped-Query Attention (GQA) for efficient inference and supports a 128K token context length. The model was trained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023, and fine-tuned using supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences for helpfulness and safety.
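The core idea behind GQA is that several query heads share one key/value head, shrinking the KV cache and speeding up inference. The following is a minimal NumPy sketch (not Meta's implementation) using toy head counts; Llama 3.1 8B itself pairs 32 query heads with 8 KV heads.

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Toy grouped-query attention.

    q has more heads than k/v; each KV head is shared by a contiguous
    group of query heads. Shapes: q -> (n_q_heads, seq, d),
    k and v -> (n_kv_heads, seq, d)."""
    n_q_heads = q.shape[0]
    group = n_q_heads // n_kv_heads
    # Broadcast each KV head across its group of query heads.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Toy sizes: 4 query heads sharing 2 KV heads.
q = np.random.randn(4, 5, 8)
k = np.random.randn(2, 5, 8)
v = np.random.randn(2, 5, 8)
out = grouped_query_attention(q, k, v, n_kv_heads=2)
print(out.shape)  # (4, 5, 8)
```

Because only `n_kv_heads` key/value tensors are cached per layer, the KV cache is 4x smaller here (and in the real model) than full multi-head attention with one KV head per query head.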

Key Capabilities

  • Multilingual Dialogue: Optimized for conversations in supported languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Extended Context Window: Supports a 128K token context length, enabling processing of longer inputs and generating more extensive responses.
  • Enhanced Performance: Demonstrates improved scores over Llama 3 8B Instruct across various benchmarks, including MMLU (69.4%), HumanEval (72.6% pass@1), GSM-8K (84.5% em_maj1@1), and API-Bank (82.6% accuracy) for tool use.
  • Tool Use Support: Features robust support for multiple tool use formats, facilitating integration into agentic systems.
  • Code Generation: Shows strong performance in code-related tasks, with a HumanEval pass@1 score of 72.6%.

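For dialogue use, the model expects its chat template with special header and turn tokens. In practice you would call the Hugging Face tokenizer's `apply_chat_template`, which is the authoritative source; the sketch below hand-builds the same structure for illustration.

```python
def build_llama31_prompt(messages):
    """Format a list of {"role", "content"} dicts into the Llama 3.1
    instruct prompt using the model's special tokens."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open the assistant header to cue the model to generate its reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = build_llama31_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize GQA in one sentence."},
])
print(prompt)
```

Generation should stop on `<|eot_id|>`; serving stacks typically configure this stop token for you.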
Good for

  • Developing assistant-like chat applications requiring high performance in multiple languages.
  • Natural language generation tasks where a large context window is beneficial.
  • Applications requiring robust tool-use and function calling capabilities.
  • Research and commercial use in multilingual environments, with a focus on safety and helpfulness.
  • Improving other models through synthetic data generation and distillation.
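For the tool-use scenarios above, Llama 3.1 is commonly prompted to emit a JSON object naming a function and its arguments; the exact format depends on your prompt template and serving stack. A hedged sketch of dispatching such an output, with a hypothetical `get_weather` tool:

```python
import json

# Hypothetical tool registry -- names and signatures are illustrative.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(model_output: str) -> str:
    """Parse a tool call of the form {"name": ..., "parameters": {...}}
    and invoke the matching registered function."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["parameters"])

# Simulated model output in the JSON tool-call format.
result = dispatch_tool_call(
    '{"name": "get_weather", "parameters": {"city": "Paris"}}'
)
print(result)  # Sunny in Paris
```

In an agent loop, the tool's return value would be fed back to the model as a new turn so it can compose the final answer.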

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model adjust the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
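These settings map onto the body of an OpenAI-compatible chat completions request. The values below are illustrative placeholders, not the actual user configurations; note that `top_k`, `min_p`, and `repetition_penalty` are extensions beyond the strict OpenAI schema that many compatible servers accept.

```python
import json

# Illustrative sampler values only -- the real top configs are not shown here.
payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Write a haiku about rivers."}],
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}
body = json.dumps(payload)  # ready to POST to a /v1/chat/completions endpoint
print(body[:60])
```

Lower temperatures and higher min_p make output more deterministic; the penalties discourage repetition, which matters for long-form generation within the large context window.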