CCCCCyx/Llama-3.2-3B-Instruct_slime

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Mar 29, 2026 · License: llama3.2 · Architecture: Transformer

The CCCCCyx/Llama-3.2-3B-Instruct_slime model is a 3.2-billion-parameter instruction-tuned variant of Meta's Llama 3.2 family, optimized for multilingual dialogue. It uses Meta's optimized transformer architecture, supports a 32,768-token context length, and is aimed at agentic retrieval and summarization tasks, where it outperforms many open-source and closed chat models on common industry benchmarks. Unsloth provides this version, which enables 2.4x faster fine-tuning with 58% less memory.


CCCCCyx/Llama-3.2-3B-Instruct_slime Overview

This model is an instruction-tuned variant from Meta's Llama 3.2 family, with 3.2 billion parameters and a 32,768-token context length. It uses an optimized transformer architecture and has been aligned with human preferences for helpfulness and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
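
Since the overview describes a standard instruction-tuned checkpoint, it can typically be loaded with Hugging Face Transformers. The sketch below is illustrative only: it assumes the repository exposes ordinary Transformers weights and the Llama 3.2 chat template; the model id is taken from this page, while the device, dtype, and sampling settings are assumptions you may need to adjust.

```python
# Minimal inference sketch (assumes a standard Transformers checkpoint
# with the Llama 3.2 chat template; settings below are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CCCCCyx/Llama-3.2-3B-Instruct_slime"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # matches the listed BF16 precision
    device_map="auto",
)

# Multilingual dialogue: the chat template wraps messages in the
# Llama 3.2 instruction format before generation.
messages = [
    {"role": "system", "content": "You are a helpful multilingual assistant."},
    {"role": "user", "content": "Fasse den folgenden Absatz in zwei Sätzen zusammen: ..."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```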

Key Capabilities

  • Multilingual Dialogue: Optimized for multilingual dialogue use cases, supporting English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Agentic Retrieval & Summarization: Specifically designed to excel in agentic retrieval and summarization tasks.
  • Performance: Outperforms many open-source and closed chat models on common industry benchmarks.
  • Efficient Fine-tuning: This version, provided by Unsloth, allows for 2.4x faster fine-tuning with 58% less memory usage than standard methods (see the sketch after this list).
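
The fine-tuning speed-up credited to Unsloth above refers to its LoRA-based training path. The sketch below shows what that workflow typically looks like; the dataset placeholder, LoRA settings, and all hyperparameters are illustrative assumptions, not values documented for this model.

```python
# Hypothetical Unsloth fine-tuning sketch; every hyperparameter and the
# dataset placeholder below are assumptions for illustration only.
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="CCCCCyx/Llama-3.2-3B-Instruct_slime",
    max_seq_length=2048,   # training sequence length; the model supports up to 32k
    load_in_4bit=True,     # 4-bit base weights keep memory usage low during training
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Placeholder dataset: substitute your own dataset with a "text" column.
dataset = load_dataset("your_dataset_here", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```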

Good For

  • Multilingual Applications: Ideal for applications requiring understanding and generation in multiple languages.
  • Dialogue Systems: Suitable for building conversational AI agents and chatbots.
  • Information Retrieval: Effective for tasks involving retrieving and summarizing information from text.
  • Resource-Efficient Fine-tuning: Well suited for developers looking to fine-tune Llama 3.2 models with reduced compute and shorter training times.