diffusionmodels1254ani/gemma-3-12b-it-heretic-v2

VISION | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | Published: May 12, 2026 | License: gemma | Architecture: Transformer | Cold

diffusionmodels1254ani/gemma-3-12b-it-heretic-v2 is a 12 billion parameter instruction-tuned Gemma 3 model, developed by diffusionmodels1254ani, that has been 'abliterated' using Heretic v1.2.0 to significantly reduce refusals while preserving model quality. This model is specifically optimized as an uncensored text encoder for video generation models like LTX-2, offering more faithful prompt encoding for creative content. It maintains a 32768 token context length and is available in various formats including HuggingFace, ComfyUI (with vision support), and GGUF quantizations.


Overview

This model, gemma-3-12b-it-heretic-v2, is an 'abliterated' version of Google's Gemma 3 12B IT, created by diffusionmodels1254ani using the Heretic v1.2.0 tool. The goal of the abliteration is to cut refusals sharply (from 100/100 to 8/100 in testing) while keeping damage to the original model's quality minimal (KL divergence of 0.0801). It is designed to function as an uncensored text encoder, particularly for video generation models like LTX-2.
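The two figures quoted above, a refusal count and a KL divergence, can be illustrated with a small sketch. It is not Heretic's actual evaluation code, which this page does not describe. The toy logits, the `refusal_rate` helper and its marker phrases are all hypothetical, and a real measurement would compare full next-token distributions of the original and modified models over many prompts.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(P || Q) in nats. P: original model, Q: modified model.

    Assumes every q[i] is positive, which holds for softmax outputs.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def refusal_rate(responses, markers=("i cannot", "i can't", "i'm sorry")):
    """Fraction of responses containing a refusal marker (hypothetical heuristic)."""
    refused = sum(any(m in r.lower() for m in markers) for r in responses)
    return refused / len(responses)

# Toy next-token logits over a 3-token vocabulary (made-up numbers).
original_logits = [2.0, 1.0, 0.1]
modified_logits = [1.9, 1.1, 0.1]

p = softmax(original_logits)
q = softmax(modified_logits)
print(f"KL divergence: {kl_divergence(p, q):.4f} nats")

sample = ["I cannot help with that.", "Sure, here is the scene description."]
print(f"Refusal rate: {refusal_rate(sample):.2f}")
```

A KL divergence near zero means the modified model's output distribution stays close to the original's, which is how a low value can be read as limited quality damage.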

Key Enhancements in v2

  • Utilizes Heretic v1.2.0 with 200 optimization trials for improved refusal reduction.
  • Features better trial selection, resulting in 8/100 refusals at a low KL divergence of 0.0801.
  • Preserves vision capabilities in ComfyUI variants, including vision_model and multi_modal_projector keys for I2V (image-to-video) prompt enhancement.
  • Introduces NVFP4 quantization for ComfyUI, offering a compact 4-bit format (~3x smaller than bf16) optimized for Blackwell GPUs.
  • Includes updated GGUF support, with merged Gemma 3 compatibility in ComfyUI-GGUF.
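As a rough check on the "~3x smaller than bf16" figure, the arithmetic can be sketched as follows. The layout is an assumption, not something this page states: 4-bit values with one 8-bit scale per 16-weight block, which is my understanding of NVFP4. The fraction of weights kept in bf16 (`high_fraction`) is a hypothetical knob; real checkpoints may keep different tensors in higher precision.

```python
def bits_per_weight_blockwise(value_bits=4, scale_bits=8, block_size=16):
    """Effective bits per weight when each block shares one scale factor."""
    return value_bits + scale_bits / block_size

def checkpoint_size_gb(num_params, bits_per_weight):
    """Approximate checkpoint size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

def compression_ratio(low_bits, high_bits=16, high_fraction=0.0):
    """Size ratio versus an all-bf16 model when some weights stay in bf16."""
    average_bits = high_fraction * high_bits + (1 - high_fraction) * low_bits
    return high_bits / average_bits

NUM_PARAMS = 12e9  # nominal "12 billion" from the model card

fp4_bits = bits_per_weight_blockwise()
print(f"bf16 size:  {checkpoint_size_gb(NUM_PARAMS, 16):.2f} GB")
print(f"4-bit size: {checkpoint_size_gb(NUM_PARAMS, fp4_bits):.2f} GB")
print(f"ratio, all weights quantized:  {compression_ratio(fp4_bits):.2f}x")
print(f"ratio, 5% of weights in bf16: {compression_ratio(fp4_bits, high_fraction=0.05):.2f}x")
```

Under these assumptions the ratio falls between 3x and 4x, and it moves toward 3x as more tensors are left unquantized.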

Primary Differentiator

The core distinction of this model is its reduced censorship and refusal behavior. By removing the 'soft censorship' present in the base Gemma model, it aims to encode prompts more faithfully, which matters for creative content generation and especially for video synthesis workflows. Abliteration does not add knowledge: the base Gemma model's inherent knowledge limitations persist, but what the model already knows is no longer suppressed.
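Abliteration is commonly described as directional ablation: a "refusal direction" is estimated in the model's activation space and projected out of selected weight matrices. This page does not describe Heretic's internals, so the following is a generic sketch of that idea in plain Python, not the tool's implementation. The toy matrix, the direction vector, and the `strength` parameter are all hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def matvec(weight, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in weight]

def ablate_direction(weight, direction, strength=1.0):
    """Return W - strength * r r^T W, where r is the unit refusal direction.

    Rows of `weight` index output dimensions. With strength=1.0 the layer
    can no longer write anything along r.
    """
    r = normalize(direction)
    rows, cols = len(weight), len(weight[0])
    # r^T W: how much each input column contributes along r.
    along_r = [sum(r[i] * weight[i][j] for i in range(rows)) for j in range(cols)]
    return [[weight[i][j] - strength * r[i] * along_r[j] for j in range(cols)]
            for i in range(rows)]

# Toy layer with 3 outputs and 2 inputs, plus a made-up refusal direction.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
refusal_direction = [1.0, 1.0, 0.0]

W_ablated = ablate_direction(W, refusal_direction)
x = [0.5, -1.5]
r = normalize(refusal_direction)
print("component along r before:", dot(matvec(W, x), r))
print("component along r after: ", dot(matvec(W_ablated, x), r))
```

After full ablation the output's component along `r` is zero up to floating-point error, while components orthogonal to `r` are left unchanged. This is why the technique can reduce refusals without retraining the model.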

Recommended Use Cases

  • Text encoding for video generation: Specifically designed for use with LTX-2 (Text-to-Video and Image-to-Video) to ensure prompts are encoded without sanitization or weakening.
  • Creative content generation: For applications that need the model to follow potentially sensitive or 'taboo' prompts without refusing or altering their intent.
  • Research into model censorship and abliteration techniques: Provides a practical example of a model modified to reduce refusals.