Model Overview

This model, gemma-3-12b-it-norm-preserved-biprojected-abliterated, is a 12-billion-parameter instruction-tuned variant of Google's gemma-3-12b-it. Its distinguishing feature is the application of the 'projected abliteration' and 'norm-preserving biprojected abliteration' techniques.

Key Differentiators

Reduced Refusal Rates: The model has undergone targeted interventions that decrease its tendency to refuse prompts, making it more compliant than its base model.
Norm Preservation: Unlike some abliteration methods, this approach removes only the directional component associated with refusal and preserves the norms of the intervened layers. The aim is to minimize damage to the model.
Safety Awareness: Despite the reduction in refusal, the model is designed to retain its awareness of safety guidelines and potential harms.
No Post-Abliteration Fine-Tuning: The model was not fine-tuned after abliteration to repair damage, which suggests that the abliteration process is robust.
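
The norm-preservation idea described above can be illustrated with a small sketch. This is a generic illustration and not the procedure used to produce this model: it assumes the "refusal direction" is a single known vector, that each weight vector is handled independently, and that "preserving the norm" means rescaling each vector back to its original length after the directional component is removed. The function name ablate_preserving_norm and the toy values are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(a, a))


def ablate_preserving_norm(vectors, direction, eps=1e-12):
    """Remove the component along `direction` from each vector, then
    rescale each result back to that vector's original norm.

    Assumption: which norm the real method preserves (per row, per
    column, or per matrix) is not stated in the model card; per-vector
    rescaling is used here purely for illustration.
    """
    d_norm = norm(direction)
    if d_norm < eps:
        raise ValueError("direction must be non-zero")
    unit = [x / d_norm for x in direction]

    result = []
    for v in vectors:
        original = norm(v)
        coeff = dot(v, unit)
        # Orthogonal projection: subtract the part of v along the direction.
        projected = [x - coeff * u for x, u in zip(v, unit)]
        remaining = norm(projected)
        if remaining < eps:
            # v was (almost) parallel to the direction; nothing left to rescale.
            result.append([0.0] * len(v))
            continue
        scale = original / remaining
        result.append([x * scale for x in projected])
    return result


# Toy data (hypothetical): two weight vectors and a refusal direction.
weights = [[3.0, 4.0, 0.0], [1.0, 2.0, 2.0]]
refusal_direction = [1.0, 0.0, 0.0]
ablated = ablate_preserving_norm(weights, refusal_direction)
```

After the call, each vector in ablated has no component along refusal_direction and has the same Euclidean norm as the corresponding vector in weights. Plain projection without the rescaling step would shrink each vector instead.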

Intended Use Cases

This model is suited to applications that need a lower refusal rate but still require the model to remain cognizant of safety and ethical considerations. It aims to balance compliance with responsible AI behavior.