u12312828/ddc_models

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:3.2BQuant:BF16Ctx Length:32kPublished:May 7, 2026License:llama3.2Architecture:Transformer Warm

u12312828/ddc_models is a 3.2 billion parameter instruction-tuned Llama 3.2 model developed by Meta, optimized for multilingual dialogue use cases including agentic retrieval and summarization tasks. This model leverages an optimized transformer architecture and is fine-tuned using SFT and RLHF for helpfulness and safety. It supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, outperforming many open-source and closed chat models on common industry benchmarks.

Loading preview...

Overview

This model, u12312828/ddc_models, is a 3.2 billion parameter instruction-tuned variant of Meta's Llama 3.2 architecture. It is part of a collection of multilingual large language models (LLMs) designed for text-in/text-out generative tasks. The Llama 3.2 instruction-tuned models are specifically optimized for multilingual dialogue use cases, such as agentic retrieval and summarization.

Key Capabilities

  • Multilingual Support: Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with training on a broader set of languages.
  • Optimized Architecture: Utilizes an optimized transformer architecture with Grouped-Query Attention (GQA) for improved inference scalability.
  • Instruction-Tuned: Fine-tuned using Supervised Fine-Tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF) to align with human preferences for helpfulness and safety.
  • Performance: Outperforms many available open-source and closed chat models on common industry benchmarks.

Good For

  • Multilingual Dialogue: Excels in conversational AI applications requiring support for multiple languages.
  • Agentic Retrieval: Suitable for tasks involving information retrieval within an agentic framework.
  • Summarization: Effective for generating concise summaries from text inputs.
  • Finetuning: This model is part of a collection that can be finetuned 2-5x faster with 70% less memory using Unsloth, making it accessible for custom applications on platforms like Google Colab.