YanLabs/gemma-3-27b-it-abliterated-normpreserve

Modality: Vision | Parameters: 27B | Quantization: FP8 | Context length: 32768 | License: gemma

Overview

This model is an abliterated version of google/gemma-3-27b-it, developed by YanLabs. It uses a norm-preserving biprojected abliteration technique to remove refusal mechanisms and safety guardrails from the base model. Rather than relying on traditional fine-tuning, the technique identifies "refusal directions" in the model's activation space and removes them, with the aim of preserving the model's original general capabilities.
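
As a rough linear-algebra illustration of what "removing a direction while preserving norms" can mean, the toy sketch below projects one direction out of every row of a random matrix and then rescales each row back to its original length. This is not YanLabs' procedure and says nothing about the "biprojected" part of their technique: the function name `ablate_direction_norm_preserving`, the row-wise rescaling, and the random data are all assumptions made for illustration.

```python
import numpy as np


def ablate_direction_norm_preserving(W, d, eps=1e-12):
    """Toy example (hypothetical helper, not the model authors' code).

    Removes the component along direction d from each row of W, then
    rescales each row so its Euclidean norm matches the original row norm.
    """
    d = d / np.linalg.norm(d)                      # unit-length direction
    original_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_proj = W - np.outer(W @ d, d)                # orthogonal projection: rows now have no component along d
    new_norms = np.linalg.norm(W_proj, axis=1, keepdims=True)
    # Rescaling a row does not reintroduce a component along d.
    return W_proj * (original_norms / np.maximum(new_norms, eps))


# Random stand-ins for a weight matrix and a direction (assumed shapes).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
d = rng.normal(size=8)

W_new = ablate_direction_norm_preserving(W, d)
```

After the call, every row of `W_new` is orthogonal to `d` and has the same norm as the corresponding row of `W`. A row that is exactly parallel to `d` would collapse to zero in this sketch, which is one reason a real method needs more care than this toy version.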

Key Characteristics

  • Abliterated Safety Mechanisms: Safety guardrails and refusal behaviors have been deliberately removed.
  • Norm-Preserving: The abliteration technique aims to maintain the model's original performance and capabilities in areas unrelated to refusal.
  • Research-Focused: Intended for research into mechanistic interpretability and into how LLM safety mechanisms function.

Intended Use Cases

  • Mechanistic Interpretability Research: Studying the internal workings of large language models.
  • Analysis of LLM Safety: Investigating how refusal behaviors are encoded and can be removed.
  • Development of Abliteration Techniques: Testing and refining methods for modifying model behavior without extensive retraining.

Limitations

Because its safety mechanisms have been removed, this model may generate unsafe or harmful content. It is not intended for production deployments or user-facing applications, and its behavior can be unpredictable in some edge cases.