Model Overview
zededa/Llama-3.2-1B-Instruct is a 1-billion-parameter instruction-tuned model from Meta's Llama 3.2 collection. It is built on an optimized transformer architecture that uses Grouped-Query Attention (GQA) for improved inference scalability, and it supports a 128K-token context length. The instruction-tuned versions are aligned with human preferences through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
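The idea behind GQA can be shown in a few lines: several query heads share one key/value head, which shrinks the KV cache during inference. The head counts below (32 query heads, 8 KV heads, head dimension 64) mirror Llama 3.2 1B's commonly reported configuration but are assumptions for illustration, not values taken from this repository.

```python
import numpy as np

def gqa(q, k, v):
    """Grouped-Query Attention sketch.
    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d)."""
    n_q, seq, d = q.shape
    n_kv = k.shape[0]
    group = n_q // n_kv                        # query heads per KV head
    k = np.repeat(k, group, axis=0)            # expand KV heads to match query heads
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)  # softmax over keys
    return weights @ v                         # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((32, 4, 64))   # 32 query heads (assumed)
k = rng.standard_normal((8, 4, 64))    # 8 shared KV heads (assumed)
v = rng.standard_normal((8, 4, 64))
out = gqa(q, k, v)
print(out.shape)
```

Because only 8 KV heads are cached instead of 32, the KV cache is a quarter of the size a full multi-head-attention model of the same width would need.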
Key Capabilities
- Multilingual Dialogue: Optimized for multilingual dialogue use cases, including agentic retrieval and summarization.
- Supported Languages: Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with training on a broader collection of languages.
- Performance: Outperforms many open-source and closed chat models on common industry benchmarks.
- Architecture: An auto-regressive (decoder-only) language model built on an optimized transformer architecture with Grouped-Query Attention.
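For dialogue use, turns are serialized with a chat template before generation. The sketch below builds a prompt by hand in the Llama 3 chat format (the special tokens follow Meta's published Llama 3 template; in practice, prefer the tokenizer's own `apply_chat_template`, which is authoritative for this checkpoint).

```python
# Hand-rolled Llama 3 chat formatting, for illustration only.
# The special tokens (<|begin_of_text|>, <|start_header_id|>, <|eot_id|>)
# follow Meta's published Llama 3 template; verify against the model's
# tokenizer before relying on them.

def format_llama3_chat(messages):
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Leave the assistant header open so the model continues from here.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3_chat([
    {"role": "system", "content": "You are a helpful multilingual assistant."},
    {"role": "user", "content": "Fasse diesen Text zusammen."},  # German: "Summarize this text."
])
print(prompt)
```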
Good For
- Multilingual Applications: Ideal for applications requiring multilingual interaction and understanding.
- Dialogue Systems: Suitable for building conversational AI agents, chatbots, and systems requiring dialogue capabilities.
- Retrieval and Summarization: Optimized for tasks involving information retrieval and text summarization in various languages.
- Fine-tuning: The model can be fine-tuned for downstream tasks, with tools such as Unsloth offering accelerated fine-tuning with reduced memory usage.
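Memory-efficient fine-tuning tools such as Unsloth typically rely on low-rank adapters (LoRA): the base weight stays frozen and only a small low-rank update is trained. A minimal numpy sketch of the idea, with illustrative shapes (the 2048 hidden size matches Llama 3.2 1B's commonly reported width, but all values here are assumptions):

```python
import numpy as np

# LoRA sketch: instead of updating the full weight W (d_out x d_in),
# train only a low-rank pair B (d_out x r) and A (r x d_in), so the
# effective weight is W + (alpha / r) * B @ A.
d_out, d_in, r, alpha = 2048, 2048, 16, 32   # illustrative values
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))       # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01    # small random init
B = np.zeros((d_out, r))                     # zero init: adapter starts as a no-op

def adapted_forward(x):
    return x @ (W + (alpha / r) * B @ A).T

full_params = d_out * d_in
lora_params = r * (d_in + d_out)
print(f"trainable params: {lora_params} vs {full_params}")
```

With these shapes the adapter trains about 1.6% as many parameters as a full update, which is where the memory savings come from; B's zero initialization guarantees the adapted model starts out identical to the base model.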