grimjim/gemma-3-12b-it-abliterated

VISION | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | License: gemma | Architecture: Transformer

grimjim/gemma-3-12b-it-abliterated is a 12-billion-parameter instruction-tuned language model derived from Google's Gemma 3 family, with a 32K-token context length. It has undergone an "abliteration" process that substantially reduces refusal rates while preserving the model's awareness of safety and harms. It is intended for use cases that need a highly compliant model that still recognizes ethical considerations.

Overview

This model is a 12-billion-parameter instruction-tuned variant of Google's gemma-3-12b-it. It has been put through an "abliteration" process, which aims to sharply reduce the model's tendency to refuse requests without compromising its awareness of safety and harmful content.

Key Findings & Process

  • The abliteration process had to work around challenges posed by the GeGLU activation function; it used magnitude clipping and 32-bit floating-point calculations to maintain performance.
  • Intervention was applied across a majority of layers, with measurements from layers 27 and 33 (global attention layers in Gemma3 12B) forming the basis for modification.
  • A key finding is that the model retains strong safety awareness despite refusing less often, which is consistent with research suggesting that LLMs encode harmfulness and refusal separately.
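
The points above can be illustrated with a minimal NumPy sketch of directional ablation. It is not the author's implementation. It assumes the commonly described recipe (a "refusal direction" taken as the difference of mean activations on harmful versus harmless prompts, then projected out of weights that write to the residual stream), and it assumes that magnitude clipping is applied to the measured difference vector. The function names, the `clip_quantile` parameter, and its default value are hypothetical; how measurements from individual layers were combined is not specified here and is not modeled.

```python
import numpy as np


def refusal_direction(harmful_acts, harmless_acts, clip_quantile=0.995):
    """Estimate a unit-norm refusal direction from two activation sets.

    Both inputs have shape (num_prompts, d_model). clip_quantile is a
    hypothetical knob for this sketch, not a value taken from the model card.
    """
    # Upcast to float32 before reducing, so low-precision activations
    # do not lose accuracy in the means and the norm.
    harmful = np.asarray(harmful_acts, dtype=np.float32)
    harmless = np.asarray(harmless_acts, dtype=np.float32)

    diff = harmful.mean(axis=0) - harmless.mean(axis=0)

    # Assumed form of magnitude clipping: cap outlier components so that a
    # few very large dimensions do not dominate the direction.
    limit = np.float32(np.quantile(np.abs(diff), clip_quantile))
    diff = np.clip(diff, -limit, limit).astype(np.float32)

    return diff / np.linalg.norm(diff)


def ablate_direction(weight, direction):
    """Remove `direction` from the output space of `weight`.

    `weight` has shape (d_model, d_in) and writes into the residual stream
    as y = weight @ x. The result satisfies direction @ new_weight ~= 0.
    """
    w_in = np.asarray(weight)
    w = w_in.astype(np.float32)  # do the projection in float32
    d = np.asarray(direction, dtype=np.float32)

    # W' = W - d (d^T W): subtract each column's component along d.
    projected = w - np.outer(d, d @ w)

    # Cast back to the storage dtype of the original weights.
    return projected.astype(w_in.dtype)
```

In this sketch, `refusal_direction` would be run on activations captured at the measurement layers, and `ablate_direction` would then be applied to the chosen weight matrices in each layer being modified.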

Good For

  • Applications where reducing model refusal is critical for user experience.
  • Scenarios requiring a compliant model that still possesses an understanding of ethical boundaries and safety.
  • Developers looking for a Gemma-3 based model with enhanced flexibility in response generation.