Model Overview
Llama 3.2-1B, developed by Meta, is a 1-billion-parameter multilingual large language model (LLM) from the Llama 3.2 collection. It features an optimized transformer architecture and is instruction-tuned using supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences for helpfulness and safety. The model was trained on up to 9 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023.
Key Capabilities
- Multilingual Support: Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with training data covering a broader set of languages beyond these eight.
- Dialogue Optimization: Specifically instruction-tuned for multilingual dialogue use cases.
- Agentic Applications: Excels in agentic retrieval and summarization tasks.
- Quantization: Available in quantized versions (SpinQuant, and quantization-aware training with LoRA adaptors, released as QLoRA checkpoints) optimized for on-device use with limited compute resources, trading a small amount of accuracy for faster inference and a substantially smaller memory footprint.
- Long Context: Supports a context length of 128K tokens (the quantized versions support 8K tokens).
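To make the quantization benefit concrete, the sketch below estimates the raw weight storage at different precisions. It is a rough back-of-the-envelope calculation, not a benchmark: the 1.23B parameter count is Meta's reported figure for Llama 3.2-1B, and activations, KV cache, and runtime overhead are ignored.

```python
# Rough weight-storage estimate for on-device deployment.
# Assumes 1.23e9 parameters (Meta's reported count for Llama 3.2-1B);
# ignores activations, KV cache, and runtime overhead.
PARAMS = 1.23e9

def weight_gb(bits_per_param: float) -> float:
    """Approximate weight storage in GB at the given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

bf16 = weight_gb(16)  # full-precision baseline
int4 = weight_gb(4)   # 4-bit quantized weights (SpinQuant / QLoRA)
print(f"BF16: {bf16:.2f} GB, INT4: {int4:.2f} GB")
```

Roughly 2.5 GB of weights shrink to about 0.6 GB at 4 bits, which is what makes deployment on memory-constrained mobile hardware practical.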
Good For
- Commercial and Research Use: Intended for a wide range of applications.
- Assistant-like Chat: Ideal for building conversational AI assistants.
- Agentic Systems: Suitable for knowledge retrieval and summarization agents.
- Mobile AI: Quantized versions are designed for deployment in highly constrained environments, such as mobile devices.
- Prompt Rewriting: Can be used for query and prompt rewriting tasks.
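As a concrete illustration of the prompt-rewriting use case, the sketch below wraps a raw user query in the role/content chat-message format that Llama 3.2 Instruct models consume. The system-prompt wording is a hypothetical example, not an official Meta template.

```python
# Illustrative query-rewriting request in the role/content chat format.
# The system prompt here is an assumed example, not an official template.
def build_rewrite_messages(query: str) -> list[dict]:
    """Wrap a raw user query in a query-rewriting instruction."""
    system = (
        "Rewrite the user's query so it is self-contained and specific "
        "enough for a search engine. Reply with the rewritten query only."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": query},
    ]

messages = build_rewrite_messages("weather there tomorrow?")
```

The resulting `messages` list can then be passed to a chat-completion endpoint, or rendered with `tokenizer.apply_chat_template(messages, ...)` when serving the model via Hugging Face transformers.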