Sabomako/gemma-3-12b-it-heretic

  • Modality: Text + Vision
  • Model size: 12B
  • Quantization: FP8
  • Context length: 32k
  • Concurrency cost: 1
  • Architecture: Transformer
  • Published: Mar 9, 2026

Sabomako/gemma-3-12b-it-heretic is a 12 billion parameter instruction-tuned language model, derived from Google's Gemma-3-12b-it. This model has been modified using the Heretic v1.2.0 tool with Magnitude-Preserving Orthogonal Ablation (MPOA) to reduce content refusals. It maintains a low KL divergence of 0.024 compared to the original, while significantly decreasing refusal rates, making it suitable for applications requiring less restrictive content generation.


Sabomako/gemma-3-12b-it-heretic Overview

This model is a decensored version of the google/gemma-3-12b-it 12 billion parameter instruction-tuned language model. It was created using the Heretic v1.2.0 tool, specifically employing Magnitude-Preserving Orthogonal Ablation (MPOA) to modify its behavior.

Key Characteristics & Performance

The primary modification targets the model's propensity for refusals, aiming to provide a more open-ended generation experience. Performance metrics highlight this change:

  • KL divergence: 0.024 (indicating minimal deviation from the original model's statistical distribution)
  • Refusals: Reduced from 97/100 in the original model to 4/100 in this modified version.

This significant reduction in refusal rate suggests that the model is less likely to decline generating responses based on perceived content restrictions, while largely preserving the original model's underlying capabilities.
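The KL-divergence figure above is, in principle, an average over next-token distributions produced by the two models on the same prompts. A minimal sketch of the underlying computation, using toy logits in place of real model outputs (numpy only; the values are illustrative, not from either model):

```python
import numpy as np

def kl_divergence(p_logits, q_logits):
    """KL(P || Q) between two next-token distributions given as logits."""
    p = np.exp(p_logits - p_logits.max())
    p /= p.sum()
    q = np.exp(q_logits - q_logits.max())
    q /= q.sum()
    return float(np.sum(p * (np.log(p) - np.log(q))))

# Toy logits standing in for the original and ablated models' outputs on
# one prompt; a real evaluation would average over many prompts/tokens.
orig = np.array([2.0, 1.0, 0.5, -1.0])
ablated = np.array([2.1, 0.9, 0.5, -1.1])
print(kl_divergence(orig, ablated))
```

A small value here means the ablated model's output distribution stays close to the original's, which is what the reported 0.024 indicates at model scale.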

Abliteration Parameters

The modification process was controlled by specific abliteration parameters, including the refusal direction_index and per-layer weight and position settings for the attn.o_proj and mlp.down_proj projections. Together these parameters determine where in the network, and how strongly, the MPOA edit is applied to achieve the desired decensoring effect.
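Heretic's exact MPOA implementation is not reproduced here, but the general idea of orthogonal ablation can be sketched: remove the component of a weight matrix that writes into an identified "refusal direction", then restore the original weight magnitude. The matrix shapes, the random direction, and the Frobenius-norm rescaling below are all illustrative assumptions, not the tool's actual procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: a projection weight matrix (think attn.o_proj
# or mlp.down_proj) and a unit-norm "refusal direction" in output space.
W = rng.normal(size=(8, 8))
r = rng.normal(size=8)
r /= np.linalg.norm(r)

# Orthogonal ablation: project out the component along r, so the edited
# matrix can no longer write into the refusal direction.
W_abl = W - np.outer(r, r) @ W

# Magnitude preservation (one plausible reading): rescale the edited
# matrix so its Frobenius norm matches the original's.
W_abl *= np.linalg.norm(W) / np.linalg.norm(W_abl)

print(np.abs(r @ W_abl).max())   # component along r: ~0
print(np.linalg.norm(W_abl) - np.linalg.norm(W))  # norm difference: ~0
```

Projecting out the direction removes the refusal behavior it encodes, while the magnitude constraint limits collateral damage to unrelated capabilities, consistent with the low KL divergence reported above.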

Use Cases

This model is particularly suited for applications where a less restrictive or 'decensored' output is desired, such as creative writing, open-ended dialogue, or research into model safety and bias. Users should be aware of the reduced refusal rate and plan accordingly for content moderation if necessary.