DavidAU/gemma-3-12b-it-vl-GLM-4.7-Flash-INSTRUCT-Thinking-Hybrid-Heretic-Uncensored

Vision · Concurrency Cost: 1 · Model Size: 12B · Quant: FP8 · Context Length: 32k · Published: Feb 25, 2026 · License: apache-2.0 · Architecture: Transformer

DavidAU/gemma-3-12b-it-vl-GLM-4.7-Flash-INSTRUCT-Thinking-Hybrid-Heretic-Uncensored is a 12 billion parameter Gemma-based instruction-tuned model, fine-tuned by DavidAU using a GLM 4.7 Flash reasoning dataset and HI16 training methods. It features a 128k context window and is designed for variable thinking and instruction following, offering uncensored responses. The model excels at detailed reasoning and output generation and includes enhanced image-processing capabilities.


Model Overview

This model, developed by DavidAU, is a 12 billion parameter Gemma-based instruction-tuned variant, leveraging GLM 4.7 Flash reasoning datasets and HI16 training methods. It is designed to be fully uncensored and offers a unique "variable thinking/instruct" capability, adapting its behavior based on prompt keywords. The model features an extended 128k context window and maintains reasoning stability across a wide temperature range (0.1 to 2.5).

Key Capabilities

  • Uncensored Responses: Provides direct answers without refusal, though explicit content may require specific directives.
  • Variable Thinking/Instruct: Dynamically switches between instruction-following and deep thinking modes based on prompt cues (see the sketch after this list).
  • Enhanced Reasoning: Incorporates detailed and compact reasoning that improves general model operation, output generation, and image processing.
  • High Context Window: Supports up to 128k tokens, allowing for extensive input and output.
  • Improved Quantization: The model's architecture, with a manually split and trained output tensor, is noted to improve general quantization quality.
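
A minimal sketch of exercising the mode switch locally with transformers. The card says switching is keyed off prompt keywords but does not name them, so the "Think step by step" cue below is an illustrative guess, not a documented trigger; loading the text-only path via AutoModelForCausalLM is likewise an assumption.

```python
# Hedged sketch of the "variable thinking/instruct" behavior: compare a plain
# instruction against the same request with an explicit reasoning cue.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "DavidAU/gemma-3-12b-it-vl-GLM-4.7-Flash-INSTRUCT-Thinking-Hybrid-Heretic-Uncensored"
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, device_map="auto")

def ask(prompt: str) -> str:
    # Format the prompt with the model's chat template and generate a reply.
    msgs = [{"role": "user", "content": prompt}]
    ids = tok.apply_chat_template(
        msgs, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(ids, max_new_tokens=512, do_sample=True, temperature=0.7)
    return tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True)

# Plain instruction-following request:
print(ask("List three uses for a paperclip."))

# Same request with an explicit reasoning cue, expected (per the card) to
# engage the deeper thinking mode -- the cue itself is an assumption:
print(ask("Think step by step, then list three uses for a paperclip."))
```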

Benchmarks & Decensoring

Compared to its uncensored base, this model shows improved performance across several benchmarks, including arc_challenge, hellaswag, and piqa. Notably, its KL divergence is 0.0826, indicating minimal damage from the decensoring process, and it significantly reduces refusals from 98/100 to 7/100 compared to the original google/gemma-3-12b-it.
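
The card does not say how the KL figure was measured. As an illustration only, a typical approach compares the two models' next-token distributions over a shared prompt set; the sketch below assumes the reference is google/gemma-3-12b-it, that both models share the Gemma tokenizer, and uses placeholder prompts in place of whatever evaluation set produced the 0.0826 figure.

```python
# Hedged sketch: mean KL divergence between the original model and this
# fine-tune over next-token distributions. Prompts and reduction are
# illustrative assumptions, not the card's actual evaluation protocol.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "google/gemma-3-12b-it"
TUNED = "DavidAU/gemma-3-12b-it-vl-GLM-4.7-Flash-INSTRUCT-Thinking-Hybrid-Heretic-Uncensored"

tok = AutoTokenizer.from_pretrained(BASE)  # assumed: both share the Gemma tokenizer
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16, device_map="auto").eval()
tuned = AutoModelForCausalLM.from_pretrained(TUNED, torch_dtype=torch.bfloat16, device_map="auto").eval()

prompts = ["The capital of France is", "Water boils at"]  # placeholder eval set

kls = []
with torch.no_grad():
    for text in prompts:
        ids = tok(text, return_tensors="pt").input_ids.to(base.device)
        p = F.log_softmax(base(ids).logits.float(), dim=-1)   # base log-probs
        q = F.log_softmax(tuned(ids).logits.float(), dim=-1)  # fine-tune log-probs
        # KL(P || Q), summed over the vocabulary, averaged over positions
        kl = F.kl_div(q, p, log_target=True, reduction="none").sum(-1).mean()
        kls.append(kl.item())

print(f"mean next-token KL divergence: {sum(kls) / len(kls):.4f}")
```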

Good for

  • Use cases requiring uncensored and direct responses.
  • Applications benefiting from deep, detailed, and compact reasoning.
  • Scenarios where variable instruction-following and thinking capabilities are advantageous.
  • Tasks involving image processing that can benefit from enhanced reasoning.
  • Developers seeking a Gemma-based model with a large context window and robust performance.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model tune the following samplers (a hedged request example follows the list):

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
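
A minimal sketch of passing such a configuration to an OpenAI-compatible endpoint. The base URL and every numeric value below are illustrative placeholders, not the actual top-3 configs (which are not reproduced on this page); non-standard samplers are routed through the openai SDK's extra_body passthrough.

```python
# Hedged sketch: applying one sampler configuration through an
# OpenAI-compatible endpoint. All values below are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed Featherless endpoint
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="DavidAU/gemma-3-12b-it-vl-GLM-4.7-Flash-INSTRUCT-Thinking-Hybrid-Heretic-Uncensored",
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet."}],
    temperature=0.8,             # standard OpenAI sampling fields
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    extra_body={                 # non-standard samplers pass through extra_body
        "top_k": 40,
        "min_p": 0.05,
        "repetition_penalty": 1.05,
    },
)
print(resp.choices[0].message.content)
```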