Model Overview
This model, developed by DavidAU, is a 12-billion-parameter, Gemma-based, instruction-tuned variant trained with GLM 4.7 Flash reasoning datasets and HI16 training methods. It is fully uncensored and offers a "variable thinking/instruct" capability, adapting its behavior based on prompt keywords. The model features an extended 128k context window and maintains reasoning stability across a wide temperature range (0.1 to 2.5).
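As a rough sketch of how the reported stable temperature range might be enforced on the client side, the helper below clamps the sampling temperature to 0.1–2.5. Only the clamp bounds come from the model card; the function name and the other generation defaults (`do_sample`, `max_new_tokens`) are illustrative assumptions in the style of a typical `transformers` generation config, not settings prescribed by the model.

```python
def stable_sampling_config(temperature: float) -> dict:
    """Build a generation config with temperature clamped to the 0.1-2.5
    range the model card reports as stable for reasoning.

    The non-temperature defaults here are generic, assumed values --
    tune them for your own workload.
    """
    # Clamp into the card's stated stable range.
    t = min(max(temperature, 0.1), 2.5)
    return {
        "temperature": t,
        "do_sample": True,       # sampling rather than greedy decoding (assumed default)
        "max_new_tokens": 1024,  # arbitrary illustrative budget
    }
```

The returned dictionary can be passed as keyword arguments to a typical `generate`-style API; the clamp simply keeps requests inside the range the card claims is stable.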
Key Capabilities
- Uncensored Responses: Provides direct answers without refusal, though explicit content may require specific directives.
- Variable Thinking/Instruct: Dynamically switches between instruction-following and deep thinking modes based on prompt cues.
- Enhanced Reasoning: Incorporates detailed and compact reasoning that improves general model operation, output generation, and image processing.
- High Context Window: Supports up to 128k tokens, allowing for extensive input and output.
- Improved Quantization: The model's architecture, with a manually split and trained output tensor, is noted to improve general quantization quality.
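Since the variable thinking/instruct behavior is driven by prompt keywords, a thin prompt-building helper can make the mode switch explicit in application code. The trigger phrase below is an assumption for illustration only; consult the model card for the actual keywords that put the model into deep-thinking mode.

```python
# Assumed trigger phrase -- NOT confirmed by the model card; replace with
# the documented thinking keyword for this model.
THINK_TRIGGER = "Think deeply about this:"

def build_prompt(user_text: str, deep_think: bool = False) -> str:
    """Return an instruct prompt, optionally prefixed with the thinking cue.

    With deep_think=False the model is expected to follow instructions
    directly; with deep_think=True the (assumed) keyword cues the deeper
    reasoning mode described under Key Capabilities.
    """
    if deep_think:
        return f"{THINK_TRIGGER} {user_text}"
    return user_text
```

Keeping the trigger in one place means swapping in the model's real keyword later is a one-line change.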
Benchmarks & Decensoring
Compared to its uncensored base, this model shows improved scores on several benchmarks, including arc_challenge, hellaswag, and piqa. Its KL divergence from the base is 0.0826, indicating minimal damage from the decensoring process, and refusals drop from 98/100 (for the original google/gemma-3-12b-it) to 7/100.
Good for
- Use cases requiring uncensored and direct responses.
- Applications benefiting from deep, detailed reasoning in a compact form.
- Scenarios where variable instruction-following and thinking capabilities are advantageous.
- Tasks involving image processing that can benefit from enhanced reasoning.
- Developers seeking a Gemma-based model with a large context window and robust performance.