gshasiri/llama3.2-1B-chatml

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 1B · Quant: BF16 · Ctx Length: 32k · Published: Nov 10, 2025 · License: llama3.2 · Architecture: Transformer

The gshasiri/llama3.2-1B-chatml model is a 1.23 billion parameter instruction-tuned generative language model from Meta's Llama 3.2 collection, optimized for multilingual dialogue use cases. With a context length of 32,768 tokens, it is well suited to agentic retrieval, summarization tasks, and mobile AI-powered writing assistants. The model is intended for commercial and research use, supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, and delivers strong benchmark performance for its size.


Llama 3.2 1B ChatML: Multilingual Dialogue and Agentic Applications

This model is a 1.23 billion parameter instruction-tuned variant from Meta's Llama 3.2 collection, specifically optimized for multilingual dialogue. It uses an optimized transformer architecture and has been aligned with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). The model supports a substantial context length of 32,768 tokens, making it suitable for complex conversational and document-based tasks.
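The "chatml" suffix in the model name suggests the fine-tune targets the ChatML conversation format. As a minimal sketch, a multi-turn conversation can be serialized with the standard ChatML `<|im_start|>`/`<|im_end|>` delimiters; whether this exact template matches the model's training data is an assumption, so check the tokenizer's bundled chat template before relying on it.

```python
# Sketch: serializing a conversation in the ChatML style the model name
# suggests. The <|im_start|>/<|im_end|> delimiters are the standard ChatML
# tokens; the exact template used by this fine-tune is an assumption --
# prefer tokenizer.apply_chat_template() from the model's own tokenizer.
def to_chatml(messages, add_generation_prompt=True):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document."},
])
print(prompt)
```

In practice, `transformers` reads the template stored with the tokenizer, so hand-rolled formatting like this is only needed when working outside that ecosystem.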

Key Capabilities

  • Multilingual Dialogue: Optimized for chat and conversational AI in officially supported languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Agentic Applications: Excels in tasks requiring knowledge retrieval, summarization, and query/prompt rewriting.
  • Quantization Support: Designed with quantization schemes (SpinQuant, QLoRA) for efficient deployment in constrained environments like mobile devices, significantly improving decode speed and reducing memory footprint.
  • Robust Safety Alignment: Developed with a comprehensive safety strategy, including extensive fine-tuning, red teaming, and integration with safeguards like Llama Guard.
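To make the quantization point concrete, here is a back-of-envelope estimate of weight memory at different precisions, assuming the 1.23B parameter count from the model card. It ignores KV cache, activations, and framework overhead, so real usage is higher.

```python
# Rough weight-memory estimate for a 1.23B-parameter model at different
# precisions. Ignores KV cache, activations, and runtime overhead, so
# actual memory usage will be higher; 1.23B comes from the model card.
PARAMS = 1.23e9

def weight_gib(bits_per_param: float) -> float:
    """Bytes of weight storage at the given precision, in GiB."""
    return PARAMS * bits_per_param / 8 / 2**30

print(f"BF16 : {weight_gib(16):.2f} GiB")  # full precision as shipped
print(f"INT8 : {weight_gib(8):.2f} GiB")
print(f"4-bit: {weight_gib(4):.2f} GiB")   # SpinQuant / QLoRA-style schemes
```

The roughly 4x reduction from BF16 to 4-bit is what makes on-device deployment on phones plausible for a model of this size.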

Good For

  • Mobile AI Applications: Its smaller size and quantization optimizations make it ideal for on-device AI-powered writing assistants and other mobile use cases.
  • Multilingual Chatbots: Building conversational agents that can effectively interact in multiple languages.
  • Information Retrieval Systems: Enhancing agentic systems for knowledge retrieval and summarization from long documents.
  • Research and Commercial Use: Intended for both academic research and commercial deployments, with a focus on responsible AI development.