Model Overview
This model, gemma-3-12b-it-norm-preserved-biprojected-abliterated, is a 12-billion-parameter instruction-tuned variant of Google's gemma-3-12b-it. It differs from the base model through the application of two techniques: 'projected abliteration' and 'norm-preserving biprojected abliteration'.
Key Differentiators
- Reduced Refusal Rates: The model has been modified specifically to refuse prompts significantly less often, making it more compliant than its base model.
- Norm Preservation: Unlike some abliteration methods, this approach removes only the directional component associated with refusal and preserves the norms of the intervened layers. The goal is to minimize damage to the model.
- Safety Awareness: Despite the reduction in refusal, the model is designed to retain its awareness of safety guidelines and potential harms.
- No Post-Abliteration Fine-Tuning: The model is not fine-tuned after abliteration to repair damage, which suggests that the abliteration process itself is robust.
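The norm-preservation idea above can be illustrated with a minimal sketch. It is not the procedure used to build this model, and it does not cover the 'biprojected' step. It only shows the general pattern: remove the component of a weight vector along one direction, then rescale the result to the original norm. The function names, the toy weights, and `refusal_direction` are all hypothetical, and treating each row as the vector to ablate is an assumption about layout.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(v, v))

def ablate_preserving_norm(row, direction, eps=1e-12):
    """Remove the component of `row` along `direction`, then rescale
    the result so it has the same norm as the original `row`."""
    d_norm = norm(direction)
    if d_norm < eps:
        return list(row)  # no usable direction, leave the row unchanged
    unit = [x / d_norm for x in direction]
    coeff = dot(row, unit)
    projected = [x - coeff * u for x, u in zip(row, unit)]
    p_norm = norm(projected)
    if p_norm < eps:
        return projected  # row was parallel to the direction, nothing to rescale
    scale = norm(row) / p_norm
    return [x * scale for x in projected]

def ablate_matrix(weights, direction):
    """Apply the norm-preserving ablation to every row of a matrix."""
    return [ablate_preserving_norm(row, direction) for row in weights]

# Toy data, chosen only for illustration.
weights = [[3.0, 4.0, 0.0], [1.0, 2.0, 2.0]]
refusal_direction = [1.0, 0.0, 0.0]  # hypothetical direction
ablated = ablate_matrix(weights, refusal_direction)
```

After the call, each row of `ablated` is orthogonal to `refusal_direction` and has the same Euclidean norm as the corresponding row of `weights`. A plain projection without the rescaling step would shrink the rows instead.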
Intended Use Cases
This model is particularly suited to applications that need a lower refusal rate but still require the model to remain cognizant of safety and ethical considerations. It aims to balance compliance with responsible AI behavior.
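Whether the refusal rate is low enough for a given application is best checked on that application's own prompts. The sketch below shows one rough way to estimate it from a list of collected responses. The marker phrases, the 80-character window, and the sample strings are illustrative assumptions, not part of this model or any evaluation it ships with, and a keyword heuristic like this will miss many refusals.

```python
# Hypothetical phrases that often open a refusal; adjust for your own data.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm unable", "i am unable")

def is_refusal(response: str) -> bool:
    """Heuristically flag a response whose opening contains a refusal phrase."""
    # Normalize curly apostrophes so "I can’t" matches "i can't".
    opening = response.strip().lower().replace("\u2019", "'")[:80]
    return any(marker in opening for marker in REFUSAL_MARKERS)

def refusal_rate(responses) -> float:
    """Fraction of responses flagged as refusals (0.0 for an empty list)."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)

# Made-up responses, for illustration only.
sample = [
    "I can't help with that.",
    "Sure, here is an outline.",
    "I cannot assist with this request.",
    "Here are the steps.",
]
rate = refusal_rate(sample)
```

Running the same prompts through the base model and this model, then comparing the two rates, gives a like-for-like comparison under the same heuristic.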