thomaskuo/gemma-3-12b-it-heretic

Vision · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Ctx Length: 32k · Published: Feb 12, 2026 · License: gemma · Architecture: Transformer

thomaskuo/gemma-3-12b-it-heretic is a 12-billion-parameter instruction-tuned Gemma 3 model, abliterated with the Heretic tool to sharply reduce refusals while preserving model quality. It is optimized for use as an uncensored text encoder for video generation models such as LTX-2, where it encodes creative prompts more faithfully. The model keeps the base model's 32768-token context length, shows minimal abliteration damage (KL divergence of 0.0826 from the base model), and achieves a 93% reduction in refusals.


Overview

This model, thomaskuo/gemma-3-12b-it-heretic, is an abliterated version of Google's Gemma 3 12B IT, created by thomaskuo using the Heretic tool. Its primary purpose is to serve as an uncensored text encoder, particularly for video generation models like LTX-2, by reducing the base model's tendency to refuse or sanitize certain concepts.
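
For reference, here is a minimal loading sketch using the transformers classes from the official Gemma 3 model cards (Gemma3ForConditionalGeneration and AutoProcessor); the dtype and device settings are illustrative assumptions, not part of this card.

```python
# Minimal sketch: loading the checkpoint for text generation with transformers.
# Assumes a recent transformers release with Gemma 3 support.
import torch
from transformers import AutoProcessor, Gemma3ForConditionalGeneration

model_id = "thomaskuo/gemma-3-12b-it-heretic"
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"  # assumed settings
)
processor = AutoProcessor.from_pretrained(model_id)

messages = [
    {"role": "user", "content": [{"type": "text", "text": "Describe a neon-lit street at night."}]},
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```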

Key Capabilities & Features

  • Reduced Refusals: Achieves a significant reduction in refusals (7/100 vs. 100/100 for the original model), meaning 93% of previously refused prompts now work.
  • Minimal Model Damage: The abliteration process resulted in a low KL divergence of 0.0826, indicating that the core model quality is largely preserved.
  • Enhanced Prompt Adherence: By removing soft censorship, the model ensures more faithful encoding of creative prompts, leading to stronger adherence and less altered visual outputs in downstream applications like LTX-2.
  • Versatile Formats: Available in HuggingFace safetensors, ComfyUI safetensors (bf16, fp8), and various GGUF quantizations (F16, Q8_0, Q6_K, Q5_K_M, Q4_K_M, Q3_K_M) for compatibility with transformers, ComfyUI, and llama.cpp; a GGUF usage sketch follows this list.
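
For the GGUF builds, a minimal sketch with llama-cpp-python follows; the .gguf filename here is an assumption, so check the repo's file listing for the exact name.

```python
# Minimal sketch: running a GGUF quantization via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-3-12b-it-heretic-Q4_K_M.gguf",  # assumed filename
    n_ctx=32768,      # matches the model's 32k context length
    n_gpu_layers=-1,  # offload all layers to GPU if available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Describe a storm rolling over a desert at dusk."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```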

Ideal Use Cases

  • Video Generation: Specifically designed for use as a text encoder within video generation workflows, such as with LTX-2, where uncensored and faithful prompt interpretation is crucial.
  • Creative Applications: Suitable for applications requiring an instruction-tuned model that avoids sanitization or weakening of creative concepts in its outputs.
  • Research & Experimentation: Useful for researchers exploring the impact of refusal reduction techniques on large language models and their downstream applications.
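
To make the text-encoder role concrete, the sketch below extracts per-token hidden states for a prompt. How LTX-2 or another video pipeline consumes these embeddings (layer choice, pooling, projection) is pipeline-specific; everything beyond the encoding step is an assumption.

```python
# Minimal sketch: using the model as a text encoder by extracting hidden states.
import torch
from transformers import AutoTokenizer, Gemma3ForConditionalGeneration

model_id = "thomaskuo/gemma-3-12b-it-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"  # assumed settings
)

prompt = "A slow dolly shot through a rain-soaked neon alley"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Last-layer hidden states: one embedding per prompt token. A downstream
# video pipeline would consume these (or a pooled/projected variant) as
# its text conditioning.
prompt_embeds = outputs.hidden_states[-1]
print(prompt_embeds.shape)  # (1, sequence_length, hidden_size)
```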