alpindale/Llama-3.2-3B-Instruct
The alpindale/Llama-3.2-3B-Instruct is a 3.21 billion parameter instruction-tuned generative language model developed by Meta, part of the Llama 3.2 collection. Optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks, it features an optimized transformer architecture with a 32768-token context length. This model excels in multilingual performance across supported languages like English, German, French, and Spanish, outperforming many open-source and closed chat models on industry benchmarks.
Loading preview...
Model Overview
alpindale/Llama-3.2-3B-Instruct is a 3.21 billion parameter instruction-tuned model from Meta's Llama 3.2 family, designed for multilingual text-in/text-out generation. It utilizes an optimized transformer architecture with Grouped-Query Attention (GQA) and supports a substantial context length of 32768 tokens. The model was trained on a new mix of publicly available online data, up to 9 trillion tokens, with a knowledge cutoff of December 2023. It incorporates knowledge distillation from larger Llama 3.1 models and undergoes Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO) for alignment.
Key Capabilities
- Multilingual Dialogue: Optimized for multilingual chat, agentic retrieval, and summarization tasks, with official support for English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
- Instruction Following: Achieves strong performance on instruction-following benchmarks (e.g., 77.4 on IFEval for the 3B model).
- Mathematical Reasoning: Demonstrates solid capabilities in math tasks, scoring 77.7 on GSM8K (CoT) and 47.3 on MATH (CoT).
- Long Context Handling: Features a 32K context window, showing good recall on Needle in Haystack and performance on InfiniteBench long context tasks.
- Resource Efficiency: The 3B size is suitable for deployment in constrained environments, such as mobile devices, offering a balance of capability and efficiency.
Intended Use Cases
This model is intended for commercial and research use, particularly for assistant-like chat applications, agentic systems (like knowledge retrieval and summarization), mobile AI-powered writing assistants, and query/prompt rewriting. Developers are encouraged to implement additional safety guardrails, especially for constrained environments, and can leverage Meta's provided safeguards like Llama Guard.