context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16
TEXT GENERATION

- Concurrency Cost: 1
- Model Size: 3.2B
- Quant: BF16
- Ctx Length: 32k
- Published: Feb 22, 2025
- License: llama3.2
- Architecture: Transformer

The Llama 3.2-3B-Instruct-FP16 model, developed by Meta, is a 3.21-billion-parameter instruction-tuned multilingual large language model with a 32,768-token context length. Optimized for multilingual dialogue, it excels at agentic retrieval, summarization, and chat applications. The model uses an optimized transformer architecture with Grouped-Query Attention (GQA) and is fine-tuned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) for helpfulness and safety, outperforming many open-source and closed chat models on industry benchmarks.
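For chat use, hosted instances of this model are typically reached through an OpenAI-compatible chat-completions endpoint. The sketch below assembles such a request payload; the `API_URL` shown is an assumption for illustration (check the provider's documentation for the real base URL and authentication scheme), while the model id matches this page.

```python
import json

# Assumed endpoint for illustration only; consult the provider's docs
# for the actual base URL and API-key header.
API_URL = "https://api.featherless.ai/v1/chat/completions"
MODEL_ID = "context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16"

def build_chat_request(user_prompt,
                       system_prompt="You are a helpful assistant.",
                       max_tokens=512):
    """Assemble an OpenAI-style chat-completion payload for this model."""
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        # Prompt plus completion must fit within the 32k context window.
        "max_tokens": max_tokens,
    }

payload = json.dumps(build_chat_request("Summarize this document in one paragraph."))
```

The payload would then be POSTed to the endpoint with an `Authorization: Bearer <key>` header, as with any OpenAI-compatible API.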


Popular Sampler Settings

The three parameter combinations most used by Featherless users for this model each configure the following sampler settings:

- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
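To make these knobs concrete, here is a minimal sketch of how a sampler applies temperature, top_k, top_p, min_p, and repetition_penalty when picking the next token from raw logits. The default values and the order of the filters are illustrative assumptions, not the configurations Featherless users actually favor, and the additive frequency/presence penalties are omitted for brevity.

```python
import math
import random

def sample_next_token(logits, temperature=0.7, top_k=50, top_p=0.9,
                      min_p=0.0, repetition_penalty=1.1, prev_tokens=(),
                      rng=None):
    """Pick a token id from raw logits using common sampler settings.

    Parameter names mirror the knobs listed above; defaults are
    illustrative only.
    """
    rng = rng or random.Random(0)
    logits = list(logits)

    # Repetition penalty: dampen logits of tokens already generated.
    for t in set(prev_tokens):
        if logits[t] > 0:
            logits[t] /= repetition_penalty
        else:
            logits[t] *= repetition_penalty

    # Temperature: <1 sharpens the distribution, >1 flattens it.
    scaled = [l / temperature for l in logits]

    # Softmax (subtracting the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most likely candidates.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i],
                    reverse=True)[:top_k]

    # Top-p (nucleus): trim the tail once cumulative mass reaches top_p.
    kept, cum = [], 0.0
    for i in ranked:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # min_p: drop candidates whose probability is below
    # min_p * (probability of the most likely token).
    if min_p > 0:
        cutoff = min_p * probs[ranked[0]]
        kept = [i for i in kept if probs[i] >= cutoff]

    # Renormalize over the survivors and sample.
    mass = sum(probs[i] for i in kept)
    r = rng.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

Real inference stacks apply these filters on tensors in one pass, and implementations differ in the order they apply min_p relative to top_p, but the cutoff logic is the same as shown here.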