Name: wangzhang/gemma-4-E2B-it-abliterated API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: wangzhang

Overview

This model, wangzhang/gemma-4-E2B-it-abliterated, is an uncensored variant of Google's Gemma 4 E2B-it. It is a multimodal model (text, vision, audio) with ~5.1 billion parameters; the "E2B" in its name is the Gemma 4 family's "Effective 2B" designation. Unlike typical abliteration methods, this version edits the weights directly to overcome Gemma 4's strong resistance to low-rank perturbations, a resistance that stems from its double-norm and Per-Layer Embeddings (PLE) architecture.

Key Capabilities & Methodology

Direct Weight Editing: Removes refusal behavior by modifying the base weights directly: refusal directions are projected out orthogonally while each row's magnitude is preserved.
Norm-Preserving Techniques: Critical for maintaining model integrity given Gemma 4's unique normalization pathways.
High Precision Projection: Utilizes float32 for projection to prevent signal loss.
Optimized Steering: Employs Winsorized steering vectors and multi-objective Optuna TPE search to minimize KL divergence while reducing refusal rates.
Multimodal Functionality: Abliteration targeted the text-decoder weights; the vision and audio encoders are untouched and remain functional.
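
The projection and norm-preservation steps listed above can be illustrated with a minimal NumPy sketch. It is not the author's implementation. It assumes each row of the weight matrix lives in the same space as the refusal direction, and the function names (winsorize, ablate_rows), the quantile threshold, and the eps guard are all hypothetical choices made for illustration. The Optuna TPE search is not shown.

```python
import numpy as np

def winsorize(v, q=0.995):
    """Clip components whose magnitude exceeds the q-th quantile of |v|.

    The quantile q is an illustrative default, not a value from the model card.
    """
    limit = np.quantile(np.abs(v), q)
    return np.clip(v, -limit, limit)

def ablate_rows(W, direction, eps=1e-12):
    """Project `direction` out of every row of W, then restore row norms.

    Assumption: W has shape (n_rows, d) and `direction` has shape (d,),
    so each row is a vector in the same space as the refusal direction.
    """
    out_dtype = W.dtype
    # Do the projection in float32 so small components are not rounded away.
    W32 = W.astype(np.float32)
    r = direction.astype(np.float32)
    r = r / np.linalg.norm(r)

    orig_norms = np.linalg.norm(W32, axis=1, keepdims=True)
    # Orthogonal projection: subtract each row's component along r.
    projected = W32 - np.outer(W32 @ r, r)
    new_norms = np.linalg.norm(projected, axis=1, keepdims=True)

    # Rescale each row back to its original magnitude. A row that was
    # (numerically) parallel to r has nothing left to rescale and stays zero.
    scale = np.where(new_norms > eps, orig_norms / np.maximum(new_norms, eps), 0.0)
    return (projected * scale).astype(out_dtype)
```

Because each row is rescaled only after the projection, the rows stay orthogonal to the direction while keeping their original magnitudes. A typical call under these assumptions would be ablate_rows(W, winsorize(refusal_direction)).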

Performance & Evaluation

Refusal Rate: Refuses 9 of 100 prompts on a 100-prompt evaluation dataset, compared with 99 of 100 for the base model.
KL Divergence: Maintains a low KL divergence of 0.0004 from the base model, indicating that its output distribution stays close to the base model's.
Rigorous Evaluation: Measures refusal rates with sufficient generation length (>=100 tokens), hybrid detection (keyword + LLM judge), and challenging, diverse prompts, so that the reported rate is not understated.
Resource Efficient: Requires approximately 10 GB VRAM in BF16, fitting on consumer GPUs, and can run on 6 GB cards with 4-bit quantization.
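
The metrics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the evaluation harness used for this model: the marker list, the function names, the KL direction (base relative to edited), and the rule that a response counts as a refusal if either the keyword stage or the judge flags it are all hypothetical. The judge argument stands in for an LLM judge and is simply any callable that returns True for a refusal.

```python
import numpy as np

# Hypothetical keyword list; a real evaluation would use a broader set.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i am unable", "i won't")

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def mean_kl(base_logits, edited_logits):
    """Mean KL(base || edited) over positions; inputs shaped (n, vocab)."""
    lp = log_softmax(np.asarray(base_logits, dtype=np.float64))
    lq = log_softmax(np.asarray(edited_logits, dtype=np.float64))
    return float((np.exp(lp) * (lp - lq)).sum(axis=-1).mean())

def is_refusal(text, judge=None):
    """Keyword stage first, then the optional judge callable."""
    lowered = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    if any(marker in lowered for marker in REFUSAL_MARKERS):
        return True
    return bool(judge(text)) if judge is not None else False

def refusal_rate(responses, judge=None):
    """Fraction of responses flagged as refusals."""
    return sum(is_refusal(r, judge) for r in responses) / len(responses)

def weight_memory_gb(n_params, bits_per_param):
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9
```

For the memory figures, weight_memory_gb(5.1e9, 16) gives about 10.2 GB of weights in BF16, in line with the roughly 10 GB stated above, and weight_memory_gb(5.1e9, 4) gives about 2.55 GB at 4 bits. Activations, the KV cache, and quantization overhead come on top of the weights and are not modeled here.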

Use Cases

This model is intended for research purposes only, specifically for studying model safety, censorship mechanisms, and the effectiveness of abliteration techniques. Users should be aware that safety guardrails have been removed.
