context-labs/Meta-Llama-3.1-8B-Instruct-FP16

Hugging Face
Text Generation
Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jan 9, 2025 · License: llama3.1 · Architecture: Transformer

The Meta Llama 3.1 8B Instruct model is an 8 billion parameter instruction-tuned generative language model developed by Meta, optimized for multilingual dialogue use cases. It utilizes an optimized transformer architecture with Grouped-Query Attention and a 128k token context length, trained on over 15 trillion tokens of publicly available online data. This model excels in general reasoning, code generation, and mathematical tasks, supporting languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.


Meta Llama 3.1 8B Instruct: Overview

Meta Llama 3.1 8B Instruct is an 8 billion parameter instruction-tuned model from Meta's Llama 3.1 family, designed for multilingual dialogue. It uses an optimized transformer architecture with Grouped-Query Attention (GQA) and supports a 128k token context length. The model was trained on over 15 trillion tokens of diverse public online data, with a knowledge cutoff of December 2023.
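To illustrate the Grouped-Query Attention mentioned above: in GQA, several query heads share a single key/value head, shrinking the KV cache relative to full multi-head attention. The sketch below is a minimal NumPy illustration, not Meta's implementation; the head counts are illustrative, not the model's actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def grouped_query_attention(q, k, v):
    """Grouped-Query Attention sketch.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d),
    where n_q_heads is a multiple of n_kv_heads.
    """
    n_q, seq, d = q.shape
    n_kv = k.shape[0]
    assert n_q % n_kv == 0
    group = n_q // n_kv
    # Each group of query heads attends with the same shared K/V head.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores) @ v

# Example: 8 query heads sharing 2 KV heads (4 queries per group).
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))
k = rng.standard_normal((2, 4, 16))
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v)  # shape (8, 4, 16)
```

Because only `n_kv_heads` key/value tensors are cached per layer, GQA reduces inference memory at long context lengths while keeping most of the quality of full multi-head attention.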

Key Capabilities & Performance

  • Multilingual Support: Optimized for English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with potential for fine-tuning in other languages.
  • Enhanced Instruction Following: Significantly improved performance on instruction-tuned benchmarks, including MMLU (73.0% CoT), ARC-C (83.4%), and IFEval (80.4%).
  • Strong Code Generation: Achieves 72.6% pass@1 on HumanEval and 72.8% on MBPP++.
  • Advanced Reasoning & Math: Demonstrates 84.5% on GSM-8K (CoT) and 51.9% on MATH (CoT).
  • Tool Use Integration: Shows substantial gains in tool use benchmarks like API-Bank (82.6%) and BFCL (76.1%), supporting various tool use formats.
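The instruction-following and chat capabilities above rely on Llama 3.1's chat prompt format. The helper below is a minimal sketch of that format using the special tokens documented by Meta (`<|begin_of_text|>`, `<|start_header_id|>`, `<|eot_id|>`); in practice you would use a tokenizer's built-in chat template rather than hand-building strings, and the function name here is our own.

```python
def build_llama31_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3.1 chat prompt (illustrative sketch)."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama31_prompt(
    "You are a helpful multilingual assistant.",
    "Translate 'good morning' into French.",
)
```

Each turn is delimited by role headers and `<|eot_id|>`; the prompt ends with an empty assistant header so generation continues as the assistant's turn.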

Intended Use Cases

This model is suitable for commercial and research applications requiring assistant-like chat and natural language generation in multiple languages. It is also designed to support synthetic data generation and distillation for improving other models. Developers are encouraged to integrate system-level safeguards, such as Llama Guard 3, Prompt Guard, and Code Shield, for responsible deployment, especially when leveraging its new capabilities like long context and tool use.