DavidAU/gemma-3-12b-it-vl-Gemini-3-Pro-Preview-Heretic-Uncensored-Thinking

Vision | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Context Length: 32k | Published: Feb 10, 2026 | License: apache-2.0 | Architecture: Transformer

DavidAU/gemma-3-12b-it-vl-Gemini-3-Pro-Preview-Heretic-Uncensored-Thinking is a 12-billion-parameter Gemma 3 fine-tune by DavidAU with a 128k context window. It is engineered for uncensored, deep reasoning, trained on a Gemini Pro Preview reasoning dataset, and produces detailed, direct responses across general operation, output generation, and image processing, with reasoning that remains stable across temperature settings.


Model Overview

This 12-billion-parameter Gemma 3 fine-tune by DavidAU is designed for uncensored, deep reasoning. It was trained via Unsloth on a Gemini Pro Preview reasoning dataset. Its key differentiator is "Heretic" de-censoring, which cuts refusals from 98/100 (original model) to 7/100 while keeping KL divergence low (0.0826), indicating high fidelity to the base model with content restrictions removed.
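To make the KL-divergence figure concrete, here is a minimal sketch of how KL divergence between two next-token distributions is computed. The distributions below are toy values for illustration, not actual model outputs; a value near zero means the de-censored model's output distribution stays close to the base model's.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(P || Q) in nats.

    p: base model's next-token probabilities.
    q: fine-tuned/de-censored model's next-token probabilities.
    A result near 0 means the fine-tune behaves almost
    identically to the base model on this distribution.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token distributions (illustrative only):
base = [0.70, 0.20, 0.10]
heretic = [0.68, 0.22, 0.10]

print(round(kl_divergence(base, heretic), 4))
```

A reported divergence of 0.0826 is small on this scale: the de-censoring shifted token probabilities only slightly overall, even though refusal behavior changed dramatically.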

Key Capabilities & Features

  • Deep Reasoning: Enhanced reasoning across general model operation, output generation, image processing, and benchmarks. Reasoning is temperature stable.
  • Uncensored Output: Designed to provide direct, uninhibited responses, requiring minimal prompting for explicit content generation.
  • Extended Context: Features a 128k context window.
  • Flexible Activation: Reasoning typically activates automatically but can be forced by prefixing a prompt with "think deeply:" or via the provided Jinja templates and system prompts.
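The explicit-activation trigger above can be sketched as a small prompt-building helper. This assumes the standard Gemma chat turn markers (`<start_of_turn>`/`<end_of_turn>`); the exact template shipped with this fine-tune may differ, so treat the function below as an illustration rather than the model's canonical template.

```python
def build_prompt(user_message: str, force_thinking: bool = False) -> str:
    """Build a Gemma-style chat prompt.

    When force_thinking is True, the message is prefixed with
    "think deeply: " to explicitly trigger the model's deep-reasoning
    mode, per the model card's usage notes.
    """
    if force_thinking:
        user_message = f"think deeply: {user_message}"
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_prompt("Explain why the sky appears blue.", force_thinking=True)
print(prompt)
```

In most chats the prefix is unnecessary, since reasoning activates automatically; the flag is useful when you want to guarantee a thinking block.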

Performance & Benchmarks

Compared to the base Heretic uncensored model, this fine-tune improves on several benchmarks:

  • ARC Challenge: 0.555 (vs 0.534)
  • Hellaswag: 0.721 (vs 0.603)
  • Winogrande: 0.717 (vs 0.658)

Optimal Usage

For smoother operation and enhanced chat/roleplay, set "Smoothing_factor" to 1.5 in interfaces such as KoboldCpp, oobabooga/text-generation-webui, or SillyTavern. Optional system prompts are provided to further enhance thinking and output generation, though they are not always necessary given the model's fine-tuning.
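For API-driven setups, the smoothing factor can be passed per request. Below is a hedged sketch of a JSON payload for KoboldCpp's `/api/v1/generate` endpoint; the `smoothing_factor` field is KoboldCpp's quadratic-sampling knob, but field names and defaults can vary between versions, so verify against your local API documentation.

```python
import json

def make_payload(prompt: str, smoothing_factor: float = 1.5) -> dict:
    """Build a KoboldCpp /api/v1/generate request body.

    smoothing_factor defaults to 1.5, the value recommended
    in this model card for smoother chat/roleplay output.
    Other sampler settings here are illustrative placeholders.
    """
    return {
        "prompt": prompt,
        "smoothing_factor": smoothing_factor,
        "max_length": 512,
        "temperature": 1.0,
    }

payload = make_payload("think deeply: Summarize the plot of Hamlet.")
print(json.dumps(payload, indent=2))
```

In GUI frontends like SillyTavern the same value is set in the sampler settings panel rather than per request.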