tostideluxekaas/Llama-3.2-3B-Instruct-uncensored

Source: Hugging Face

Text Generation · Model Size: 3.2B · Quant: BF16 · Context Length: 32k · Published: Feb 20, 2026 · License: llama3.2 · Architecture: Transformer

Llama-3.2-3B-Instruct-uncensored by tostideluxekaas is a 3.2-billion-parameter causal decoder-only transformer, fine-tuned from unsloth/Llama-3.2-3B-Instruct. The model is heavily uncensored, exhibiting near-zero refusals (4 of 100 test prompts) while preserving the quality of its Llama base model. It is optimized for creative writing, roleplay, and open-ended dialogue where maximum freedom and minimal content restrictions are desired.


Overview

tostideluxekaas/Llama-3.2-3B-Instruct-uncensored is a 3.2-billion-parameter instruction-tuned model, developed by tostideluxekaas on top of unsloth/Llama-3.2-3B-Instruct. Its primary distinction is its highly uncensored behavior, achieved through a "Heretic" abliteration fine-tuning process that yields near-zero refusals.
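Since the model inherits the standard Llama 3.2 chat template, it can be loaded with Hugging Face transformers in the usual way. The sketch below is illustrative rather than taken from the model card; the prompt and sampling settings are placeholders.

```python
# Minimal usage sketch with Hugging Face transformers. Assumes the repo ships
# standard Llama 3.2 chat-formatted weights; the prompt and sampling settings
# are illustrative, not from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tostideluxekaas/Llama-3.2-3B-Instruct-uncensored"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

messages = [{"role": "user", "content": "Write the opening scene of a noir story."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```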

Key Characteristics

  • Uncensored Output: Refuses only 4 of 100 carefully selected test prompts, versus 97/100 refusals for the original model.
  • Quality Preservation: Maintains a low KL divergence of 0.0265 from the base model, indicating that it largely retains the base Llama model's output quality (see the measurement sketch after this list).
  • Context Length: The model card lists an 8,192-token context length inherited from the base model (the hosting metadata above advertises 32k).
  • Multilingual Support: Handles English and other languages; the base Llama 3.2 model is officially optimized for multilingual dialogue in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
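The KL-divergence figure above measures how far the fine-tuned model's next-token distribution drifts from the base model's. The sketch below shows one generic way to estimate such a per-token KL on a sample prompt; it illustrates the metric only and is not the exact evaluation procedure behind the reported 0.0265.

```python
# Illustrative sketch: estimating per-token KL divergence between the
# abliterated model and its base on a shared prompt. Generic illustration
# of the metric, not the tool's actual evaluation pipeline.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/Llama-3.2-3B-Instruct"
tuned_id = "tostideluxekaas/Llama-3.2-3B-Instruct-uncensored"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
tuned = AutoModelForCausalLM.from_pretrained(tuned_id, torch_dtype=torch.bfloat16)

ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids

with torch.no_grad():
    log_p = F.log_softmax(base(ids).logits.float(), dim=-1)   # base distribution
    log_q = F.log_softmax(tuned(ids).logits.float(), dim=-1)  # tuned distribution

# KL(P || Q) summed over the vocabulary, averaged over positions: how far the
# tuned model's next-token distribution drifts from the base model's.
kl = F.kl_div(log_q, log_p, log_target=True, reduction="none").sum(-1).mean()
print(f"mean per-token KL divergence: {kl.item():.4f}")
```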

Intended Use Cases

  • Creative Applications: Ideal for creative writing, roleplay, fiction generation, and open-ended dialogue.
  • Research: Suitable for research into model alignment and safety mechanisms.
  • Local Deployment: Well-suited for local runners such as Ollama, LM Studio, and SillyTavern, where unrestricted text generation is preferred (see the client sketch below).
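For local use, both Ollama and LM Studio expose an OpenAI-compatible HTTP endpoint once the model has been imported (Ollama additionally requires converting the BF16 weights to GGUF first). The client sketch below assumes an Ollama server on its default port; the local model name is a hypothetical example, not a published registry tag.

```python
# Hypothetical local-deployment sketch: querying the model through Ollama's
# OpenAI-compatible endpoint. The base_url is Ollama's default; the model name
# is whatever you assigned when importing the (GGUF-converted) weights.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama-3.2-3b-instruct-uncensored",  # example local name, not a real tag
    messages=[{"role": "user", "content": "Continue this roleplay scene: ..."}],
    temperature=0.9,
)
print(response.choices[0].message.content)
```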

Limitations

  • Not recommended for production systems without human oversight or for sensitive environments.
  • May still produce biased or factually incorrect information, similar to its base model.
  • Not optimized for code generation or high-precision reasoning tasks.