ibyteohdear/DreamFast-gemma-3-12b-it-heretic-v2
DreamFast/gemma-3-12b-it-heretic-v2 is an abliterated version of Google's Gemma 3 12B IT model, processed with Heretic v1.2.0 to cut the refusal rate from 100% to 8% while preserving model quality (KL divergence of 0.0801). This 12-billion-parameter model is primarily designed as an uncensored text encoder for video generation models like LTX-2, offering more faithful prompt encoding by removing soft censorship in embeddings. It is available in several formats, including HuggingFace, ComfyUI (with preserved vision capabilities), and GGUF, to cover different deployment scenarios.
DreamFast/gemma-3-12b-it-heretic-v2: Abliterated Gemma 3 12B IT
This model is an abliterated version of Google's Gemma 3 12B IT, created using Heretic v1.2.0. Its primary goal is to reduce model refusals and soft censorship, making it a more flexible text encoder, especially for creative applications like video generation with LTX-2.
Key Features & Improvements (v2):
- Reduced Refusals: Achieves 8/100 refusals (down from 100/100) with minimal model damage (KL divergence 0.0801).
- Preserved Vision: ComfyUI variants retain `vision_model` and `multi_modal_projector` keys, enabling I2V (image-to-video) prompt enhancement.
- Quantization Options: Offers NVFP4 quantization for Blackwell GPUs (~3x smaller than bf16) and updated GGUF support with various quantization levels (Q4_K_M recommended).
- Enhanced Prompt Encoding: By removing soft censorship, it allows for more faithful and less sanitized prompt encoding, which can lead to stronger adherence to creative prompts in video generation.
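The KL divergence figure quoted above measures how far the abliterated model's output distribution drifts from the original's. As a rough illustration only, the sketch below computes KL(P || Q) between two next-token distributions derived from logits. The logit values are made up, the direction (original as P, abliterated as Q) is an assumption, and this is not Heretic's actual evaluation procedure, which is not described here.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in nats; eps guards against log(0)."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

# Hypothetical next-token logits over a toy 4-token vocabulary.
base_logits = [2.0, 1.0, 0.1, -1.0]    # assumed: original model
ablit_logits = [1.8, 1.1, 0.2, -0.9]   # assumed: abliterated model

p = softmax(base_logits)
q = softmax(ablit_logits)
kl = kl_divergence(p, q)
print(f"KL(P || Q) = {kl:.4f} nats")
```

A value of zero means the two distributions are identical. Small values such as the reported 0.0801 suggest the modified model's predictions stay close to the original's on the prompts used for measurement.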
Use Cases & Considerations:
This model is best suited as an uncensored text encoder for video generation models like LTX-2. Abliteration removes refusals and soft censorship in the embeddings, but the base Gemma model's inherent knowledge limitations remain. Note that LTX-2 was trained on the original Gemma embeddings, so directly fine-tuning the text encoder might affect DiT performance. Even so, the abliterated encoder provides a cleaner embedding distribution for creative content.
Limitations:
- Inherits limitations of the base Gemma 3 12B model.
- Abliteration reduces, but does not entirely eliminate, refusals.
- NVFP4 quantization performs optimally on Blackwell GPUs, though software dequantization is supported on older hardware.
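
The "~3x smaller than bf16" figure for NVFP4 can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes a layout of 4-bit values with one 8-bit scale shared per 16-weight block, and a nominal 12 billion parameters. Both are assumptions for illustration, not figures taken from the released files.

```python
def nvfp4_bits_per_weight(value_bits=4, scale_bits=8, block_size=16):
    """Effective bits per weight, assuming one shared scale per block."""
    return value_bits + scale_bits / block_size

def model_size_gb(n_params, bits_per_weight):
    """Approximate weight storage in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 12e9  # nominal parameter count (assumption)

bf16_gb = model_size_gb(N_PARAMS, 16)
nvfp4_gb = model_size_gb(N_PARAMS, nvfp4_bits_per_weight())
ratio = bf16_gb / nvfp4_gb

print(f"bf16:  {bf16_gb:.2f} GB")
print(f"NVFP4: {nvfp4_gb:.2f} GB")
print(f"ratio: {ratio:.2f}x")
```

Under these assumptions the idealized ratio is somewhat above 3x. Real checkpoints typically keep some tensors (for example embeddings, norms, or vision components) at higher precision, which would pull the overall ratio down toward the quoted ~3x.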