alpindale/Llama-3.2-3B

Visibility: Public
Parameters: 3.2B
Precision: BF16
Context length: 32768 tokens
Released: Sep 25, 2024
License: llama3.2

alpindale/Llama-3.2-3B is a 3.21-billion-parameter instruction-tuned large language model developed by Meta as part of the Llama 3.2 collection. Built on an optimized transformer architecture, it is tuned for multilingual dialogue, agentic retrieval, and summarization tasks. The model supports a 32768-token context length and targets commercial and research applications that require efficient multilingual text generation.

Model Overview

alpindale/Llama-3.2-3B is a 3.21 billion parameter instruction-tuned model from Meta's Llama 3.2 family, designed for multilingual text-in/text-out generative tasks. It utilizes an optimized transformer architecture with Grouped-Query Attention (GQA) for improved inference scalability. The model was trained on up to 9 trillion tokens of publicly available data, with a knowledge cutoff of December 2023. Fine-tuning involved Supervised Fine-Tuning (SFT), Rejection Sampling (RS), and Direct Preference Optimization (DPO) to align with human preferences for helpfulness and safety.
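As a minimal sketch of running the model, the snippet below loads it in BF16 (the precision listed above) with the Hugging Face transformers library. The sampling parameters are illustrative choices, not values taken from this card, and the first call downloads several gigabytes of weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "alpindale/Llama-3.2-3B"


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model in BF16 and complete `prompt`.

    Downloads the weights on first use; `device_map="auto"` places the
    model on GPU when one is available.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # illustrative sampling settings, not card defaults
        temperature=0.7,
        top_p=0.9,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

The same `generate` helper works for any of the supported languages, since the model is text-in/text-out with a shared multilingual tokenizer.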

Key Capabilities

  • Multilingual Performance: Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, though the underlying training data covers a broader set of languages.
  • Dialogue Optimization: Specifically tuned for assistant-like chat, agentic applications, knowledge retrieval, and summarization.
  • Efficient Architecture: Features a 32768-token context length and GQA for scalable inference.
  • Robust Safety: Developed with a three-pronged strategy for managing trust and safety risks, including extensive red teaming and safety fine-tuning.
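The 32768-token context length above implies a simple prompt-budgeting step before generation. The sketch below illustrates the arithmetic with plain token-ID lists; in practice the counts come from the model's real tokenizer, and the helper names here are illustrative, not part of any library API.

```python
CONTEXT_LENGTH = 32768  # model's context window, per the card


def fits_context(prompt_tokens: list[int], max_new_tokens: int) -> bool:
    """True if the prompt plus the generation budget fits the context window."""
    return len(prompt_tokens) + max_new_tokens <= CONTEXT_LENGTH


def truncate_left(prompt_tokens: list[int], max_new_tokens: int) -> list[int]:
    """Keep the most recent tokens when a long transcript overflows the window.

    Left truncation suits chat use: the newest turns matter most.
    """
    budget = CONTEXT_LENGTH - max_new_tokens
    if len(prompt_tokens) <= budget:
        return prompt_tokens
    return prompt_tokens[-budget:]
```

A prompt of 32000 tokens with a 1000-token generation budget would fail `fits_context` and be trimmed to its last 31768 tokens by `truncate_left`.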

Good For

  • Commercial and research applications requiring multilingual text generation.
  • Assistant-like chat and agentic tasks such as knowledge retrieval and summarization.
  • Deployment in constrained environments, including mobile devices, due to its smaller size.