Model Overview
Meta's Llama 3.2 3B Instruct is a 3.21-billion-parameter instruction-tuned model from the Llama 3.2 family, optimized for multilingual dialogue. It uses an optimized transformer architecture with Grouped-Query Attention (GQA) and supports a 128K-token context length. Meta released the model on September 25, 2024, with a training-data cutoff of December 2023.
Key Capabilities
- Multilingual Dialogue: Tuned specifically for multilingual chat, the model's primary use case.
- Agentic Tasks: Well suited to agentic applications such as knowledge retrieval and summarization.
- Performance: Outperforms many open-source and closed chat models on common industry benchmarks, including strong results on MMLU, GSM8K, and IFEval.
- Long Context: Features a 128K-token context window, supporting long documents and complex multi-turn interactions.
- Supported Languages: Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with broader training data for other languages.
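For dialogue use, requests are passed to the model as a list of role-tagged messages, which are rendered into Llama 3's prompt format. The sketch below hand-rolls that rendering to show its structure; in practice you would call `tokenizer.apply_chat_template` from the Hugging Face transformers library rather than building the string yourself.

```python
def render_llama3_prompt(messages):
    """Render chat messages into the Llama 3 prompt format.
    Illustrative sketch only; prefer tokenizer.apply_chat_template,
    which uses the template shipped with the model repo."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open the assistant turn so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful multilingual assistant."},
    {"role": "user", "content": "¿Cuál es la capital de Francia?"},
]
prompt = render_llama3_prompt(messages)
```

The system turn steers behavior (including response language), and each turn is delimited by header tokens and `<|eot_id|>`, which is how the model distinguishes speakers across a long multilingual conversation.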
Intended Use Cases
This model is ideal for commercial and research applications requiring:
- Assistant-like chat functionalities.
- Agentic systems for knowledge retrieval and summarization.
- Mobile AI-powered writing assistants.
- Query and prompt rewriting tasks.
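A query-rewriting task like the last item reduces to wrapping the raw query in a suitable instruction before sending it to the model. The helper below is a minimal sketch; the system-prompt wording is a hypothetical example, not an official Meta template.

```python
def build_rewrite_prompt(user_query):
    """Assemble a chat-format request asking the model to rewrite a
    search query. The instruction text is illustrative, not an
    official template."""
    return [
        {
            "role": "system",
            "content": (
                "Rewrite the user's query to be clear, specific, and "
                "self-contained. Reply with the rewritten query only."
            ),
        },
        {"role": "user", "content": user_query},
    ]

request = build_rewrite_prompt("cheap flights paris??")
```

The resulting message list can be passed directly to a transformers `pipeline("text-generation", ...)` call or to any chat-completions endpoint serving the model.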
Unique Aspects
During pretraining, the Llama 3.2 1B and 3B models incorporate logits from the larger Llama 3.1 8B and 70B models, using these outputs as token-level targets in a knowledge-distillation step that recovers much of the larger models' performance at a fraction of the size. Meta also emphasizes responsible deployment, pairing the model with safety fine-tuning and system-level safeguards, which makes it suitable for constrained environments such as mobile devices.
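The idea behind token-level distillation can be sketched with a standard blended loss: a KL-divergence term matching the student's softened distribution to the teacher's, plus a hard cross-entropy term on the true next token. Meta has not published the exact loss it used, so the formulation below (temperature scaling, alpha mixing) is the generic textbook recipe, shown in plain Python for clarity.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature softens the distribution.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, target_index,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss for one token position:
    alpha-weighted KL(teacher || student) on softened logits,
    plus (1 - alpha)-weighted cross-entropy on the true token.
    Hypothetical sketch; not Meta's published training objective."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = -math.log(softmax(student_logits)[target_index])
    # T^2 rescales KL gradients to match the hard-loss scale.
    return alpha * (temperature ** 2) * kl + (1 - alpha) * hard
```

The soft teacher distribution carries information about which wrong tokens are "nearly right," which is why a 3B student can learn more from 8B/70B logits than from one-hot labels alone.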