Name: YanLabs/gemma-3-4b-it-abliterated-normpreserve API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: YanLabs

Overview

This model, developed by YanLabs, is an abliterated version of the google/gemma-3-4b-it causal language model. It uses a norm-preserving biprojected abliteration technique to remove refusal behaviors and safety guardrails from the model's activation space. Unlike traditional fine-tuning, this method aims to preserve the model's original capabilities while removing only the targeted refusal responses.

Key Characteristics

Abliterated Safety Mechanisms: Safety guardrails and refusal mechanisms have been deliberately removed.
Norm-Preserving Biprojection: Changes model behavior with a norm-preserving biprojection technique instead of traditional retraining.
Research Focus: Primarily intended for mechanistic interpretability research to understand how LLM safety mechanisms function.
Base Model: Derived from google/gemma-3-4b-it.
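
As a rough illustration of what "norm-preserving" can mean for a projection, the toy sketch below removes one direction from each row of a small matrix and then rescales each row back to its original length. This is not YanLabs' implementation: the function name, the row-wise treatment, and the example values are all assumptions made for illustration, and the "biprojected" part of the technique is not modeled here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def ablate_direction_keep_norm(rows, direction):
    """Hypothetical helper: remove `direction` from each row,
    then rescale the row to its original Euclidean norm."""
    d_len = norm(direction)
    unit = [x / d_len for x in direction]
    result = []
    for row in rows:
        original_len = norm(row)
        # Component of the row along the unit direction.
        coeff = dot(row, unit)
        projected = [x - coeff * u for x, u in zip(row, unit)]
        new_len = norm(projected)
        if new_len == 0.0:
            # Row was parallel to the direction; nothing is left to rescale.
            result.append(projected)
            continue
        scale = original_len / new_len
        result.append([x * scale for x in projected])
    return result

# Toy values, chosen only for illustration.
weights = [[3.0, 4.0, 0.0], [1.0, 2.0, 2.0]]
direction = [1.0, 0.0, 0.0]
edited = ablate_direction_keep_norm(weights, direction)
```

After the call, every row of `edited` is orthogonal to `direction` and has the same length as the corresponding row of `weights`. A plain projection without the rescaling step would shrink each row instead.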

Intended Use Cases

Mechanistic Interpretability Research: Studying the internal workings of large language models.
LLM Safety Analysis: Investigating the nature and removal of safety mechanisms.
Abliteration Technique Development: Testing and refining methods for modifying model behavior.

Important Limitations

No Safety Guarantees: Abliteration does not guarantee that every refusal is removed, and the model may generate harmful content.
Not for Production: Not intended for production deployments or user-facing applications.
Unpredictable Behavior: Model behavior may be unpredictable in certain edge cases due to the removal of safety features.
