Guilherme34/sadtest

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Jan 5, 2026 · License: llama3.2 · Architecture: Transformer · Warm

Guilherme34/sadtest is a 3.21 billion parameter instruction-tuned model from Meta's Llama 3.2 collection, optimized for multilingual dialogue. It is an auto-regressive language model built on an optimized transformer architecture and aligned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) for helpfulness and safety. It handles agentic retrieval and summarization tasks and supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. The model offers a 32,768-token context length, and quantization schemes such as SpinQuant and QLoRA are available for efficient on-device deployment.


Guilherme34/sadtest: A Multilingual Llama 3.2 Model

This model is a 3.21 billion parameter instruction-tuned variant from Meta's Llama 3.2 collection, designed for multilingual text-in/text-out generative tasks. It uses an optimized transformer architecture, aligned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to match human preferences for helpfulness and safety. The model was trained on up to 9 trillion tokens of publicly available data with a knowledge cutoff of December 2023, and features a 32,768-token context length.
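The card itself ships no usage code; the snippet below is a minimal sketch of multilingual chat with the Hugging Face transformers pipeline, assuming the repository exposes a standard Llama 3.2 chat template. Only the repo id Guilherme34/sadtest comes from this page; the prompt, dtype, and generation settings are illustrative.

```python
# Minimal sketch: multilingual chat via the transformers text-generation pipeline.
# Assumes the repo follows the standard Llama 3.2 instruct chat format.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Guilherme34/sadtest",
    torch_dtype=torch.bfloat16,  # matches the BF16 quant listed on this card
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful multilingual assistant."},
    {"role": "user", "content": "Erkläre in einem Satz, was ein Transformer-Modell ist."},
]

out = pipe(messages, max_new_tokens=256)
# The pipeline returns the full conversation; the last message is the model's reply.
print(out[0]["generated_text"][-1]["content"])
```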

Key Capabilities

  • Multilingual Dialogue: Optimized for conversational AI in officially supported languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Agentic Applications: Suited to knowledge retrieval, summarization, query and prompt rewriting, and mobile AI-powered writing assistants.
  • Efficient Deployment: Incorporates advanced quantization schemes such as SpinQuant and QLoRA, improving decode speed by up to 2.6x and reducing model size and memory footprint for on-device use cases (see the quantization sketch after this list).
  • Robust Safety: Developed with a strong focus on responsible AI, including extensive safety fine-tuning, red teaming, and integration with safeguards like Llama Guard.
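The SpinQuant and QLoRA checkpoints referenced above are distributed separately in Meta's Llama 3.2 collection rather than in this BF16 repository. As a rough stand-in for the memory savings, the sketch below loads the model with generic 4-bit bitsandbytes quantization; this is a different scheme from SpinQuant/QLoRA, and the repo id is the only detail taken from this card.

```python
# Sketch: shrinking the memory footprint with 4-bit weight loading via bitsandbytes.
# This is a generic post-training quantization path, not SpinQuant or QLoRA.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained("Guilherme34/sadtest")
model = AutoModelForCausalLM.from_pretrained(
    "Guilherme34/sadtest",
    quantization_config=bnb_config,
    device_map="auto",
)

# Build a chat-formatted prompt and run a short summarization request.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Summarize: Llama 3.2 targets efficient on-device inference."}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```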

Good For

  • Commercial and research applications requiring multilingual chat and agentic capabilities.
  • Deployments in constrained environments, such as mobile devices, due to its optimized size and performance.
  • Developers seeking a foundation model for natural language generation tasks, with options for further fine-tuning (a LoRA sketch follows this list).
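For the fine-tuning option mentioned above, the sketch below attaches LoRA adapters with the peft library. It is a hypothetical starting point: the rank, alpha, and target modules are common defaults for Llama-family attention layers, not values from this card, and no dataset or training loop is included.

```python
# Sketch: preparing the model for parameter-efficient fine-tuning with LoRA adapters.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Guilherme34/sadtest"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical Llama attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a small fraction of the 3.21B weights will train
```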