Overview

This model is the 8 billion parameter instruction-tuned variant from Meta's Llama 3.1 collection, designed for multilingual text-in/text-out generative tasks. It leverages an optimized transformer architecture with Grouped-Query Attention (GQA) and boasts an extended context length of 128k tokens. The model was trained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023, and fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) for alignment with human preferences.

Key Capabilities

Multilingual Support: Optimized for dialogue in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with potential for other languages through fine-tuning.
Extended Context Window: Features a 128k token context length, enabling processing of longer inputs and generating more extensive responses.
Instruction Following: Instruction-tuned for assistant-like chat and various natural language generation tasks.
Code Generation: Demonstrates strong performance in code generation benchmarks like HumanEval (72.6 pass@1 for 8B Instruct).
Tool Use: Shows significant improvements in tool-use benchmarks such as API-Bank (82.6 acc for 8B Instruct).

Good For

Multilingual Chatbots: Ideal for building conversational AI agents that operate across multiple languages.
Code Assistants: Suitable for applications requiring code generation and understanding.
Research and Commercial Use: Intended for a broad range of commercial and research applications, including synthetic data generation and model distillation.
Long-Context Applications: Beneficial for tasks requiring processing or generating extensive text, thanks to its 128k context window.

Overview

Overview

Key Capabilities

Good For

Full Model Card (README)