simonko912/gemma-4-31B-it-abliterated-v3

Vision | Concurrency Cost: 2 | Model Size: 31B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Apr 29, 2026 | License: gemma | Architecture: Transformer

simonko912/gemma-4-31B-it-abliterated-v3 is a 31-billion-parameter instruction-tuned model, an 'abliterated' version of Google's Gemma 4 31B IT created with Abliterix. It has been modified to reduce refusals, recording 7/100 refusals on a private 100-prompt evaluation against 99/100 for the baseline. It uses direct weight editing, including orthogonal projection and norm-preserving row magnitude restoration, to change the base model's behavior. This version targets use cases that need reduced refusal behavior, particularly where the original Gemma 4 31B IT might over-refuse.


Overview

This model, simonko912/gemma-4-31B-it-abliterated-v3, is an 'abliterated' version of Google's Gemma 4 31B IT, developed using the Abliterix framework. It is trial 40, the best configuration found in a 60-trial retraining run, and is tuned to reduce the base model's tendency to refuse prompts.

Key Methodologies

The abliteration process for Gemma 4 models is complex due to their unique architecture (double-norm and Per-Layer Embeddings). This model employs direct weight editing rather than conventional LoRA or hook-based steering. Key techniques include:

  • Direct orthogonal projection on attention Q/K/V/O projections.
  • Norm-preserving row magnitude restoration to maintain the double-norm architecture's stability.
  • float32 projection precision and Winsorized steering vectors to enhance signal integrity and reduce outlier influence.
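The techniques listed above can be illustrated with a small sketch. It is not Abliterix's implementation: the function names (`winsorize`, `ablate_rows`), the 5%/95% clipping quantiles, and the choice to project each weight row against the steering direction are all assumptions made for illustration, and plain Python floats stand in for float32 tensors.

```python
import math

EPS = 1e-12  # guard against dividing by a near-zero norm


def l2_norm(vec):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def winsorize(vec, lower_q=0.05, upper_q=0.95):
    """Clamp components to empirical quantiles so outliers carry less weight.

    The quantile levels are illustrative defaults, not values from the source.
    """
    ordered = sorted(vec)
    last = len(ordered) - 1
    lo = ordered[int(lower_q * last)]
    hi = ordered[int(upper_q * last)]
    return [min(max(x, lo), hi) for x in vec]


def ablate_rows(weight, direction):
    """Remove `direction` from every row, then restore each row's original norm.

    Assumes rows of `weight` live in the same space as `direction`.
    """
    d_norm = l2_norm(direction)
    unit = [x / d_norm for x in direction]
    edited = []
    for row in weight:
        original = l2_norm(row)
        coeff = sum(w * u for w, u in zip(row, unit))
        # Orthogonal projection: subtract the component along the direction.
        projected = [w - coeff * u for w, u in zip(row, unit)]
        remaining = l2_norm(projected)
        # Norm-preserving step: rescale back to the pre-edit row magnitude.
        # A row parallel to the direction has nothing left and stays zero.
        scale = original / remaining if remaining > EPS else 0.0
        edited.append([p * scale for p in projected])
    return edited


weight = [[3.0, 4.0], [1.0, 0.0]]
edited = ablate_rows(weight, direction=[0.0, 2.0])
# Each edited row keeps its original length but has no component
# along the (normalized) direction.
```

In a real pipeline the direction would typically be estimated from model activations and the edit applied to full attention projection matrices; those steps are outside this sketch.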

Evaluation and Performance

The primary goal of this abliterated version is to reduce refusals. Evaluation on a private 100-prompt dataset showed a significant reduction:

  • 7/100 refusals for this model (Trial 40).
  • 99/100 refusals for the original baseline model.

Refusal detection generates at least 100 tokens per prompt and uses an LLM judge, which is stricter than the common short-output, keyword-only benchmarks. The model also recorded 0/15 refusals on classic safe over-refusal probes.
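The counting behind figures such as 7/100 can be sketched as a simple loop. The evaluation code, prompt set, and judge are not published in this description, so `count_refusals`, `demo_generate`, and `demo_judge` below are hypothetical stand-ins; in particular, the demo judge is a keyword check used only to keep the example self-contained, whereas the evaluation described here uses an LLM judge.

```python
def count_refusals(prompts, generate, judge, min_new_tokens=100):
    """Return (refusals, total) for a list of prompts.

    `generate(prompt, min_new_tokens)` and `judge(prompt, response)` are
    caller-supplied callables; both are assumptions of this sketch.
    """
    refusals = 0
    for prompt in prompts:
        # Ask for a long enough response that a late refusal is still visible.
        response = generate(prompt, min_new_tokens)
        if judge(prompt, response):
            refusals += 1
    return refusals, len(prompts)


def demo_generate(prompt, min_new_tokens):
    """Stand-in for a model call: refuses prompts containing 'forbidden'."""
    if "forbidden" in prompt:
        opening = "I cannot help with that."
    else:
        opening = "Sure, here is an answer."
    return opening + " filler" * min_new_tokens


def demo_judge(prompt, response):
    """Stand-in for an LLM judge: flags responses that open with a refusal."""
    return response.startswith("I cannot")


refusals, total = count_refusals(
    ["a forbidden topic", "a harmless topic"], demo_generate, demo_judge
)
print(f"{refusals}/{total} refusals")  # prints "1/2 refusals"
```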

Use Cases

This model is particularly suited for research purposes where the goal is to explore and utilize a Gemma 4 31B IT variant with significantly reduced refusal behavior. Users should be aware that the abliteration process alters the model's safety guardrails and requires careful evaluation for specific deployment contexts.