grimjim/gemma-3-12b-it-orthogonal-reflection-bounded-ablation-v4-12B
VISION | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | Published: Mar 24, 2026 | License: gemma | Architecture: Transformer | 0.0K | Cold

grimjim/gemma-3-12b-it-orthogonal-reflection-bounded-ablation-v4-12B is a 12-billion-parameter Gemma-3-IT model that has undergone Orthogonal Reflection Bounded Ablation (ORBA) to target and reduce refusal behaviors. The model uses directional steering and row-wise norm clamping to geometrically ablate select refusal personas while preserving safety knowledge and awareness. It is intended for applications that need finer control over refusal responses, and the ablation leaves the model's vision capabilities unchanged.

Model Overview

This model, gemma-3-12b-it-orthogonal-reflection-bounded-ablation-v4-12B, is a 12-billion-parameter Gemma-3-IT variant developed by grimjim. It incorporates a novel technique called Orthogonal Reflection Bounded Ablation (ORBA), applied to specific layers and targeting both the mlp.down_proj.weight and self_attn.o_proj.weight streams.
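
To make the targeting concrete, the sketch below shows one way to pick out the two weight streams named above from a list of parameter names. It is only an illustration: the function `select_target_names`, the example parameter names, and the layer indices are hypothetical, since the card does not say which layers were modified or how the checkpoint names its tensors.

```python
import re

# The two weight streams named in the overview.
TARGET_SUFFIXES = ("mlp.down_proj.weight", "self_attn.o_proj.weight")

# Matches "...layers.<index>." anywhere in a dotted parameter name.
_LAYER_RE = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def select_target_names(param_names, layers):
    """Return language-model tensor names in `layers` that end with a target suffix.

    Hypothetical helper: names containing "vision" are skipped so that the
    vision stack is never selected.
    """
    selected = []
    for name in param_names:
        if "vision" in name:
            continue  # leave the vision stack alone
        if not name.endswith(TARGET_SUFFIXES):
            continue  # not one of the two targeted streams
        match = _LAYER_RE.search(name)
        if match and int(match.group(1)) in layers:
            selected.append(name)
    return selected


# Illustrative names only; the real checkpoint may use different prefixes.
example_names = [
    "language_model.model.layers.3.mlp.down_proj.weight",
    "language_model.model.layers.3.self_attn.o_proj.weight",
    "language_model.model.layers.3.self_attn.q_proj.weight",
    "language_model.model.layers.7.mlp.down_proj.weight",
    "vision_tower.encoder.layers.3.self_attn.o_proj.weight",
]

# Layer index 3 is an arbitrary choice for the demonstration.
targets = select_target_names(example_names, layers={3})
```

Both targeted tensors are the final projections of their blocks, so each writes directly into the hidden (residual) dimension; that is presumably why an edit along a single hidden-space direction can be applied to either one.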

Key Capabilities & Innovations

  • Refusal Behavior Ablation: Select refusal behaviors have been geometrically ablated using directional steering and Householder reflection, aiming to neutralize refusal personas while preserving safety knowledge.
  • Numerical Stability: Row-wise clamping of norms ensures numerical conservation, and magnitude clipping (Winsorization to 0.995) is applied to prevent token-level glitching, particularly under the GeGLU activation function.
  • Vision Stack Intact: The model's inherent vision capabilities remain untouched by the ablation process.
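
The operations named in the list above can be sketched generically with NumPy. This is a minimal illustration of the underlying linear algebra, not the author's ORBA implementation. It assumes a weight matrix of shape (d_out, d_in) whose rows index the hidden dimension, a single refusal direction `r` in that hidden space, and it reads "Winsorization to 0.995" as clipping at the 0.995 quantile of absolute weight values. All function names are hypothetical.

```python
import numpy as np


def unit(v):
    """Return v scaled to unit L2 norm."""
    return v / np.linalg.norm(v)


def project_out(W, r):
    """Plain directional ablation: remove the output component along r."""
    r = unit(r)
    return W - np.outer(r, r @ W)


def householder_reflect(W, r):
    """Apply the Householder reflection H = I - 2 r r^T to W's output side.

    The component along r is flipped rather than zeroed, and because H is
    orthogonal the Frobenius norm of W is preserved.
    """
    r = unit(r)
    return W - 2.0 * np.outer(r, r @ W)


def clamp_row_norms(W_new, W_ref, eps=1e-12):
    """Scale down any row of W_new whose L2 norm exceeds that of W_ref.

    Assumption: "row-wise clamping" is read here as an upper bound given by
    the original row norms; rows already within the bound are left as-is.
    """
    ref = np.linalg.norm(W_ref, axis=1, keepdims=True)
    new = np.linalg.norm(W_new, axis=1, keepdims=True)
    scale = np.minimum(1.0, ref / np.maximum(new, eps))
    return W_new * scale


def winsorize_magnitudes(W, q=0.995):
    """Clip entries to the q-quantile of absolute values (assumed meaning)."""
    cap = np.quantile(np.abs(W), q)
    return np.clip(W, -cap, cap)


# Toy demonstration on random data; sizes and seed are arbitrary.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))   # stand-in for a down_proj or o_proj weight
r = rng.normal(size=8)         # stand-in for a measured refusal direction

W_reflected = householder_reflect(W, r)
W_bounded = clamp_row_norms(W_reflected, W)
W_final = winsorize_magnitudes(W_bounded, q=0.995)
```

A reflection preserves column norms and the overall Frobenius norm but can still change individual row norms, which is one plausible reason to pair it with a row-wise bound. How the actual model combines steering, reflection, and clamping is not specified on this page.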

When to Use This Model

  • Controlled Response Generation: Ideal for use cases where mitigating specific refusal behaviors is critical, allowing for more compliant or directed outputs.
  • Safety-Conscious Applications: Suitable for applications requiring a model that retains its safety awareness but has reduced tendencies for certain refusal patterns.
  • Vision-Integrated Tasks: Can be used in scenarios that leverage its 12B parameter scale and vision capabilities, alongside its refined refusal control.