grimjim/gemma-3-12b-it-norm-preserved-biprojected-abliterated

Public · Vision · 12B · FP8 · 32768 · License: gemma

Model Overview

gemma-3-12b-it-norm-preserved-biprojected-abliterated is a 12-billion-parameter, instruction-tuned model derived from Google's gemma-3-12b-it. What distinguishes it from the base model is the use of two techniques: 'projected abliteration' and 'norm-preserving biprojected abliteration'.

Key Differentiators

  • Reduced Refusal Rates: Targeted interventions make the model significantly less likely to refuse prompts than its base model, gemma-3-12b-it.
  • Norm Preservation: Unlike some abliteration methods, this approach removes only the directional component associated with refusal and preserves the norms of the intervened layers. The aim is to minimize damage to the model.
  • Safety Awareness: Despite refusing less often, the model is designed to retain its awareness of safety guidelines and potential harms.
  • No Post-Abliteration Fine-Tuning: No fine-tuning was applied after abliteration to repair damage, which suggests the abliteration process itself is robust.
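
The norm-preservation idea above can be illustrated with a small, dependency-free sketch. It is a toy under stated assumptions, not the procedure used to build this model. The function names (`ablate_vector`, `ablate_vectors`) and the example numbers are hypothetical. Each weight vector is assumed to live in the same space as the refusal direction, and the "biprojected" step is omitted entirely. The sketch removes the component of a vector along a given direction, then rescales the result back to the vector's original norm.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(v, v))


def ablate_vector(vec, direction, eps=1e-12):
    """Remove the component of `vec` along `direction`, keeping the norm of `vec`.

    Hypothetical helper for illustration only.
    """
    d_norm = norm(direction)
    unit = [x / d_norm for x in direction]  # normalize the direction

    original_norm = norm(vec)
    coeff = dot(vec, unit)
    # Plain directional ablation: subtract the projection onto `unit`.
    projected = [w - coeff * u for w, u in zip(vec, unit)]

    new_norm = norm(projected)
    if new_norm < eps:
        # `vec` was (almost) parallel to the direction; nothing is left to rescale.
        return projected

    # Norm preservation: rescale so the result has the original length.
    scale = original_norm / new_norm
    return [scale * p for p in projected]


def ablate_vectors(vectors, direction):
    """Apply `ablate_vector` to every vector in a list."""
    return [ablate_vector(v, direction) for v in vectors]


# Toy data (assumed, not taken from the model).
vectors = [[3.0, 4.0], [1.0, 2.0]]
direction = [2.0, 0.0]
ablated = ablate_vectors(vectors, direction)
```

After the call, each vector in `ablated` is orthogonal to `direction` and has the same length as its input, which is the property the bullet on norm preservation describes.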

Intended Use Cases

This model is suited to applications that need a lower refusal rate than the base model provides but still expect the model to recognize safety and ethical considerations. It aims to balance compliance with responsible AI behavior.