null-space/gemma-4-31b-it-abliterated

VISION | Concurrency Cost: 2 | Model Size: 31B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Apr 3, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

null-space/gemma-4-31b-it-abliterated is a 31-billion-parameter multimodal Gemma 4 model, derived from Google's original, that has been 'abliterated' to significantly reduce refusal rates. Using SVD-based multi-direction subspace projection, the abliteration removes refusal directions from the weights of this BF16 model while preserving its reasoning and vision capabilities. It is intended primarily for research into LLM alignment, safety evaluation, and red-teaming, and for creative writing applications where reduced refusal is desired.


Overview

null-space/gemma-4-31b-it-abliterated is a 31-billion-parameter multimodal model derived from google/gemma-4-31B-it. Its key differentiator is that refusal directions have been ablated from its weights, using a technique based on "Refusal in Language Models Is Mediated by a Single Direction" (Arditi et al.) and extended with SVD-based multi-direction subspace projection. This significantly reduces the model's tendency to refuse prompts, including benign creative writing prompts that the base model declines, while maintaining its core capabilities.
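
The general idea of projecting a refusal subspace out of a weight matrix can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names, the choice to take the SVD of harmful activations centered on the harmless mean, and the value of k are all assumptions made for illustration. It only shows the linear algebra, W' = (I - B Bᵀ) W, where B is an orthonormal basis of the directions to remove.

```python
import numpy as np

def refusal_subspace(harmful_acts, harmless_acts, k=2):
    """Estimate k candidate refusal directions (hypothetical helper).

    Inputs are (n_prompts, d) arrays of hidden activations. Returns a (d, k)
    matrix with orthonormal columns.
    """
    # Assumption: center harmful activations on the harmless mean, so the
    # dominant singular directions capture how the two sets differ.
    diffs = harmful_acts - harmless_acts.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal right singular vectors, strongest first.
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k].T

def project_out(weight, basis):
    """Return W' = (I - B B^T) W for a (d_out, d_in) weight matrix.

    After this, the layer's output has no component along any column of B.
    """
    return weight - basis @ (basis.T @ weight)

# Toy demonstration with random data (not real model activations).
rng = np.random.default_rng(0)
harmful = rng.normal(size=(32, 16)) + 3.0 * np.eye(16)[0]  # shifted along axis 0
harmless = rng.normal(size=(32, 16))
basis = refusal_subspace(harmful, harmless, k=2)
w = rng.normal(size=(16, 8))
w_ablated = project_out(w, basis)
print("max residual along removed directions:", np.abs(basis.T @ w_ablated).max())
```

The projection is applied on the output side of the matrix, which matches the case where a layer writes into the residual stream. Applying it twice gives the same result as applying it once, since (I - B Bᵀ) is a projector.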

Key Capabilities & Features

  • Reduced Refusal Rates: Achieves a 35% reduction in cold refusals and a 42% reduction when a system prompt is used, both relative to the base model.
  • Preserved Reasoning: MMLU benchmarks show only a 0.2% difference from the base model, indicating that reasoning capabilities are largely unaffected.
  • Multimodal: Retains the original Gemma 4's multimodal architecture, including a 27-layer ViT encoder for vision processing.
  • Surgical Ablation: The process precisely modifies o_proj and down_proj weights in the language model layers (20-59), leaving the vision encoder untouched.
  • BF16 Precision: The model is provided in BF16 precision. At 2 bytes per parameter the weights alone occupy roughly 62 GB, so it needs more than a single 48GB GPU (e.g., 2x 48GB GPUs).
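
Two of the points above can be made concrete with a short sketch: which parameters fall inside the ablated range, and how much memory BF16 weights take. The parameter-name pattern below (a `language_model` prefix followed by `layers.N.self_attn.o_proj.weight` or `layers.N.mlp.down_proj.weight`) is a hypothetical naming scheme used for illustration. The actual checkpoint may name its tensors differently, so inspect it before relying on the pattern.

```python
import re

# Layers 20-59 inclusive, as stated in the feature list.
TARGET_LAYERS = range(20, 60)

# Hypothetical naming pattern: only language-model layers, only o_proj and
# down_proj weights. Vision-encoder tensors lack the prefix and never match.
_TARGET_PATTERN = re.compile(
    r"language_model\..*layers\.(\d+)\.(?:self_attn\.o_proj|mlp\.down_proj)\.weight$"
)

def is_ablation_target(param_name):
    """Return True if the parameter name falls in the ablated range."""
    match = _TARGET_PATTERN.search(param_name)
    return bool(match) and int(match.group(1)) in TARGET_LAYERS

def weight_memory_gb(n_params, bytes_per_param):
    """Memory for the weights alone, in decimal gigabytes (no KV cache)."""
    return n_params * bytes_per_param / 1e9

names = [
    "model.language_model.layers.19.mlp.down_proj.weight",
    "model.language_model.layers.20.self_attn.o_proj.weight",
    "model.language_model.layers.59.mlp.down_proj.weight",
    "model.language_model.layers.30.self_attn.q_proj.weight",
    "model.vision_tower.encoder.layers.25.self_attn.o_proj.weight",
]
for name in names:
    print(name, is_ablation_target(name))

# BF16 stores each parameter in 2 bytes.
print(weight_memory_gb(31e9, 2), "GB")
```

The memory figure covers weights only. Activations and the KV cache for a long context add to it, which is why some headroom beyond the raw weight size is needed.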

Use Cases

This model is intended for:

  • LLM Alignment Research: Investigating and understanding refusal mechanisms in large language models.
  • Safety Evaluation & Red-Teaming: Probing model vulnerabilities and biases without encountering excessive refusals.
  • Creative Writing: Enabling more unconstrained creative generation, especially in scenarios where the base model might refuse prompts.

General language modeling quality remains strong (sub-2.0 perplexity on Wikitext-2). Users should be aware of the ethical implications, however, because the model will comply with requests the original would refuse. Responsibility for its use lies with the user.