TrevorJS/gemma-4-31B-it-uncensored
TrevorJS/gemma-4-31B-it-uncensored is a 31-billion-parameter instruction-tuned Gemma-4 model, developed by TrevorJS, with its refusal behavior significantly reduced. It reports a 3.2% refusal rate across multiple datasets, down from the original model's high refusal rate, while keeping response quality close to the original. It is intended for conversational and generative applications that need direct, uncensored responses without a loss in output quality.
Overview
TrevorJS/gemma-4-31B-it-uncensored is a modified version of Google's Gemma-4-31B-it model, engineered to suppress the base model's built-in refusal behavior. The modification sharply reduces the model's tendency to decline prompts while preserving the original model's response quality.
Key Capabilities
- Reduced Refusal Behavior: Achieves a 3.2% refusal rate across 686 prompts drawn from four datasets (JailbreakBench, tulu-harmbench, NousResearch/RefusalDataset, mlabonne/harmful_behaviors), a substantial reduction relative to the original model.
- Quality Preservation: Maintains the original model's response quality, as indicated by a harmless response length ratio of approximately 1.01. Responses to harmless prompts are therefore about as long as the original model's, which suggests no degradation in output utility.
- Norm-Preserving Abliteration: Uses a norm-preserving biprojected abliteration method that keeps the model's weight magnitudes unchanged during the uncensoring process, which contributes to stable performance.
- Efficient Processing: Employs per-layer refusal directions and a deterministic single-pass method, offering a faster and equally effective alternative to traditional abliteration techniques.
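The idea behind per-layer, norm-preserving directional ablation can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes each layer's refusal direction is the normalized difference of mean activations between harmful and harmless prompts (the common abliteration recipe), that the direction is projected out of a weight matrix whose columns write into the residual stream, and that each column is then rescaled to its original norm. The "biprojected" step is not covered, and the function names, shapes, and random data are hypothetical.

```python
import numpy as np


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of mean activations (assumed recipe)."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def ablate_norm_preserving(W, r, eps=1e-12):
    """Project unit direction r out of W's columns, then restore column norms.

    W has shape (d_model, d_in); each column is a vector written into the
    residual stream. Rescaling whole columns keeps them orthogonal to r.
    """
    original_norms = np.linalg.norm(W, axis=0, keepdims=True)
    W_proj = W - np.outer(r, r @ W)  # remove the component along r
    new_norms = np.linalg.norm(W_proj, axis=0, keepdims=True)
    return W_proj * (original_norms / np.maximum(new_norms, eps))


# Toy setup: random stand-ins for per-layer weights and activations.
rng = np.random.default_rng(0)
d_model, d_in, n_layers = 8, 5, 3
weights = [rng.normal(size=(d_model, d_in)) for _ in range(n_layers)]

directions = []
for _ in range(n_layers):
    harmful = rng.normal(loc=0.5, size=(16, d_model))
    harmless = rng.normal(loc=0.0, size=(16, d_model))
    directions.append(refusal_direction(harmful, harmless))

# Single deterministic pass: one direction and one update per layer.
ablated = [ablate_norm_preserving(W, r) for W, r in zip(weights, directions)]
```

Rescaling columns rather than rows is a deliberate choice in this sketch: scaling a column that is already orthogonal to the refusal direction leaves it orthogonal, so the ablation and the norm constraint hold at the same time.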
Good For
- Applications requiring direct and uncensored responses from a large language model.
- Research into model safety, bias, and the effects of refusal behavior removal.
- Use cases where the original Gemma-4-31B-it model's refusal tendencies were a limiting factor.
- Developers seeking a powerful 31B-parameter model with a high degree of compliance with user prompts.
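For evaluation-oriented use, the two figures quoted under Key Capabilities can be reproduced in form with a few lines of Python. This is only a sketch under assumptions: the marker-based refusal detector is a hypothetical heuristic, the length ratio is assumed to be the mean response length of the modified model divided by that of the original on harmless prompts, and the count of 22 refusals is inferred from rounding (22 of 686 is the only integer count that rounds to 3.2%), not reported in the source.

```python
from statistics import mean

# Hypothetical refusal markers; a real evaluation would likely use a classifier.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry")


def is_refusal(response):
    """Flag a response whose opening contains a refusal marker."""
    head = response.strip().lower()[:80]
    return any(marker in head for marker in REFUSAL_MARKERS)


def refusal_rate(responses):
    """Fraction of responses flagged as refusals."""
    return sum(is_refusal(r) for r in responses) / len(responses)


def length_ratio(modified, original):
    """Mean length of modified-model responses over original-model responses."""
    return mean(len(m) for m in modified) / mean(len(o) for o in original)


# Synthetic responses sized to match the 686-prompt evaluation.
responses = ["I'm sorry, but I can't help with that."] * 22
responses += ["Sure, here is an overview."] * 664

rate = refusal_rate(responses)
print(f"{rate:.1%}")  # 3.2%

ratio = length_ratio(["a" * 101], ["a" * 100])
print(ratio)  # 1.01
```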