Leon1000/gemma-3-12b-it-heretic-v2
Leon1000/gemma-3-12b-it-heretic-v2 is a 12-billion-parameter instruction-tuned model based on Google's Gemma 3, modified with the Heretic tool to significantly reduce refusals. The model maintains output quality while serving as an uncensored text encoder, making it particularly suitable for creative video generation workflows with models like LTX-2. It features a 32,768-token context length and is available in several optimized formats, including ComfyUI-native NVFP4 and GGUF quantizations.
Overview
Leon1000/gemma-3-12b-it-heretic-v2 is a 12-billion-parameter instruction-tuned model derived from Google's Gemma 3 12B IT. It has been "abliterated" with the Heretic v1.2.0 tool, reducing refusals from 100/100 to 8/100: 92% of prompts the base model refused now succeed. The process was tuned to minimize damage to the model, yielding a KL divergence of only 0.0801 from the original model.
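As a quick usage sanity check, the checkpoint loads like any stock Gemma 3 model. Below is a minimal sketch using Hugging Face transformers; the prompt, dtype, and generation settings are illustrative choices, not part of this release.

```python
# Minimal sketch: load the abliterated checkpoint and run a text-only generation.
import torch
from transformers import AutoProcessor, Gemma3ForConditionalGeneration

model_id = "Leon1000/gemma-3-12b-it-heretic-v2"

model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
).eval()
processor = AutoProcessor.from_pretrained(model_id)

messages = [
    {"role": "user", "content": [{"type": "text", "text": "Describe a tense rooftop chase at night."}]},
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```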
Key Capabilities & Features
- Reduced Refusals: Significantly lowers the model's tendency to refuse prompts, enabling more faithful encoding of creative or sensitive content.
- Vision Preserved: ComfyUI variants retain the `vision_model` and `multi_modal_projector` keys, supporting Image-to-Video (I2V) prompt enhancement for models like LTX-2.
- Optimized Quantizations: Available in ComfyUI-native NVFP4 (7.8GB, ideal for Blackwell GPUs) and various GGUF formats (e.g., Q4_K_M at 6.8GB) for efficient deployment; a loading sketch follows this list.
- Enhanced for Video Generation: Specifically designed as an uncensored text encoder for video generation models like LTX-2, addressing soft censorship in embeddings.
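For the GGUF variants, one straightforward path is llama-cpp-python, sketched below. The quantization filename pattern is an assumption; check the repository's file listing for the exact name.

```python
# Hedged sketch: fetch a GGUF quantization from the Hub and chat with it locally.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Leon1000/gemma-3-12b-it-heretic-v2",
    filename="*Q4_K_M.gguf",  # hypothetical glob; match the repo's actual file name
    n_ctx=32768,              # the full context length stated above
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a one-sentence cinematic shot description."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```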
Why Use This Model?
This model is particularly beneficial for users who require a less censored text encoder for creative applications, especially in video generation workflows. By removing soft censorship, it allows for more faithful adherence to prompts, preventing the model from sanitizing or weakening certain concepts in the embeddings. While the base Gemma model's inherent knowledge limitations remain, this abliterated version ensures that the text encoder does not introduce additional censorship, leading to more accurate and unconstrained visual outputs when paired with models like LTX-2.
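To make the text-encoder role concrete, the sketch below extracts per-token hidden states for a prompt, which is roughly the kind of conditioning a video model consumes. In ComfyUI this wiring happens inside the LTX-2 workflow nodes; the direct transformers call here is an illustration only, and reading the final layer is an assumption rather than the LTX-2 recipe.

```python
# Illustration: per-token embeddings of a prompt, the raw material a video
# generator conditions on. Not the actual LTX-2 integration.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "Leon1000/gemma-3-12b-it-heretic-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

prompt = "A rain-soaked alley at night, neon reflections, slow handheld push-in."
inputs = tokenizer(prompt, return_tensors="pt").to(encoder.device)

with torch.inference_mode():
    hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, hidden_dim)

print(hidden.shape)  # one embedding vector per prompt token
```

The point of abliteration here is that these embeddings, not any generated text, are what reach the video model; if the encoder softens a concept at this stage, the downstream generator never sees it.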