alpindale/Llama-3.2-3B

Visibility: Public
Parameters: 3.2B
Precision: BF16
Context length: 32768 tokens
Released: Sep 25, 2024
License: llama3.2

alpindale/Llama-3.2-3B is a 3.21-billion-parameter instruction-tuned large language model developed by Meta as part of the Llama 3.2 collection. Built on an optimized transformer architecture, it is tuned for multilingual dialogue, agentic retrieval, and summarization tasks. The model supports a 32768-token context length and targets commercial and research applications that require efficient multilingual text generation.

Model Overview

alpindale/Llama-3.2-3B is a 3.21 billion parameter instruction-tuned model from Meta's Llama 3.2 family, designed for multilingual text-in/text-out generative tasks. It utilizes an optimized transformer architecture with Grouped-Query Attention (GQA) for improved inference scalability. The model was trained on up to 9 trillion tokens of publicly available data, with a knowledge cutoff of December 2023. Fine-tuning involved Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO) to align with human preferences for helpfulness and safety.
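As a minimal sketch of running the model, the snippet below loads it in BF16 (the precision listed above) with the Hugging Face transformers library. The sampling parameters are illustrative choices, not values taken from this card, and the first call downloads several gigabytes of weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "alpindale/Llama-3.2-3B"


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model in BF16 and complete `prompt`.

    Downloads the weights on first use; `device_map="auto"` places the
    model on GPU when one is available.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # illustrative sampling settings, not card defaults
        temperature=0.7,
        top_p=0.9,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

The same `generate` helper works for any of the supported languages, since the model is text-in/text-out with a shared multilingual tokenizer.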

Key Capabilities

  • Multilingual Performance: Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, though the underlying training data covers a broader set of languages.
  • Dialogue Optimization: Specifically tuned for assistant-like chat, agentic applications, knowledge retrieval, and summarization.
  • Efficient Architecture: Features a 32768-token context length and GQA for scalable inference.
  • Robust Safety: Developed with a three-pronged strategy for managing trust and safety risks, including extensive red teaming and safety fine-tuning.
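The 32768-token context length above implies a simple prompt-budgeting step before generation. The sketch below illustrates the arithmetic with plain token-ID lists; in practice the counts come from the model's real tokenizer, and the helper names here are illustrative, not part of any library API.

```python
CONTEXT_LENGTH = 32768  # model's context window, per the card


def fits_context(prompt_tokens: list[int], max_new_tokens: int) -> bool:
    """True if the prompt plus the generation budget fits the context window."""
    return len(prompt_tokens) + max_new_tokens <= CONTEXT_LENGTH


def truncate_left(prompt_tokens: list[int], max_new_tokens: int) -> list[int]:
    """Keep the most recent tokens when a long transcript overflows the window.

    Left truncation suits chat use: the newest turns matter most.
    """
    budget = CONTEXT_LENGTH - max_new_tokens
    if len(prompt_tokens) <= budget:
        return prompt_tokens
    return prompt_tokens[-budget:]
```

A prompt of 32000 tokens with a 1000-token generation budget would fail `fits_context` and be trimmed to its last 31768 tokens by `truncate_left`.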

Good For

  • Commercial and research applications requiring multilingual text generation.
  • Assistant-like chat and agentic tasks such as knowledge retrieval and summarization.
  • Deployment in constrained environments, including mobile devices, due to its smaller size.