Benholl94/Llama-3.2-3B-Instruct-abliterated

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 3.2B · Quant: BF16 · Context length: 32k · Published: Mar 29, 2026 · License: llama3.2 · Architecture: Transformer

Benholl94/Llama-3.2-3B-Instruct-abliterated is a 3.2-billion-parameter instruction-tuned causal language model derived from Llama 3.2 3B Instruct. Created by Benholl94 using the abliteration technique, it is intended as an uncensored version of its base model. It retains the 32,768-token context length and shows marginal improvements over the base model on benchmarks including IF_Eval, MMLU Pro, TruthfulQA, BBH, and GPQA, making it suitable for applications that require less restrictive content generation.


Overview

Benholl94/Llama-3.2-3B-Instruct-abliterated is a 3.2 billion parameter instruction-tuned model based on the Llama 3.2 3B Instruct architecture. Its primary distinction is the application of the "abliteration" technique, which aims to create an uncensored version of the original model. This process, credited to @FailSpy, modifies the model's behavior while retaining its core capabilities.

Key Capabilities & Features

  • Uncensored Output: Designed to provide less restricted content generation compared to its base model.
  • Llama Architecture: Built on the Llama 3.2 3B Instruct foundation, so it remains compatible with existing Llama tooling and chat templates.
  • Context Length: Supports a 32,768-token context window.
  • Benchmark Performance: Demonstrates marginal improvements over the base Llama-3.2-3B-Instruct across several benchmarks, including:
    • IF_Eval: 76.76 (vs 76.55)
    • MMLU Pro: 28.00 (vs 27.88)
    • TruthfulQA: 50.73 (vs 50.55)
    • BBH: 41.86 (vs 41.81)
    • GPQA: 28.41 (vs 28.39)

Deployment

This model can be run in Ollama: users can either run a pre-quantized version directly or download the weights and build their own quantized versions, enabling flexible local deployment and offline execution.
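A minimal sketch of the Ollama route. This assumes a GGUF quantization of the repository is available; the model tag and the local GGUF filename below are illustrative, not confirmed by the model card:

```shell
# Option 1: run a pre-quantized GGUF directly from Hugging Face
# (works only if this repository, or a community mirror, publishes GGUF files):
ollama run hf.co/Benholl94/Llama-3.2-3B-Instruct-abliterated

# Option 2: build a custom quantization locally.
# After converting the safetensors weights to GGUF (e.g. with llama.cpp)
# and quantizing, register the file via an Ollama Modelfile containing:
#   FROM ./llama-3.2-3b-instruct-abliterated-q4_k_m.gguf   # hypothetical path
ollama create llama3.2-abliterated -f Modelfile
ollama run llama3.2-abliterated "Hello!"
```

The first option is simplest when a ready-made quantization exists; the second gives control over the quantization level (and therefore the memory/quality trade-off) at the cost of a local conversion step.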