benbird316/gemma-4-E2B-it-uncensored
benbird316/gemma-4-E2B-it-uncensored is a 5.1 billion parameter variant of Gemma-4-E2B-it, developed by benbird316 and modified to remove refusal behaviors. It uses a norm-preserving biprojected abliteration method that sharply reduces refusals while maintaining response quality. It is intended for use cases that require an instruction-tuned model without built-in content restrictions.
Overview
benbird316/gemma-4-E2B-it-uncensored is a modified version of google/gemma-4-E2B-it, engineered to remove refusal behaviors. It does so with a technique called norm-preserving biprojected abliteration, which projects refusal directions out of the model's weights without degrading overall response quality.
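As a rough illustration of what "projecting out a direction while preserving norms" can mean, the NumPy sketch below removes one direction from the output space of a toy weight matrix and then rescales each column back to its original norm. It is a minimal sketch under stated assumptions, not the procedure used to build this model: the function name `ablate_direction`, the (out_features, in_features) weight layout, the choice to preserve column norms, and the random toy data are all hypothetical, and the "biprojected" step of the method is not modeled.

```python
import numpy as np


def ablate_direction(W, r, eps=1e-12):
    """Remove direction r from the outputs of W, keeping column norms.

    Assumption: W has shape (out_features, in_features), so y = W @ x and
    every column of W is a vector in the output space that contains r.
    """
    r = r / np.linalg.norm(r)            # unit-length direction
    W_proj = W - np.outer(r, r @ W)      # subtract each column's component along r
    old_norms = np.linalg.norm(W, axis=0)
    new_norms = np.linalg.norm(W_proj, axis=0)
    # Rescale each column to its original norm. Scaling a column does not
    # reintroduce any component along r, so the ablation stays exact.
    safe = np.where(new_norms > eps, new_norms, 1.0)
    scale = np.where(new_norms > eps, old_norms / safe, 0.0)
    return W_proj * scale


# Toy per-layer usage: each layer gets its own (random, hypothetical) direction,
# and every layer is processed once with no iterative optimization.
rng = np.random.default_rng(0)
layers = {f"layer_{i}": rng.normal(size=(4, 6)) for i in range(2)}
directions = {name: rng.normal(size=4) for name in layers}
ablated = {name: ablate_direction(W, directions[name]) for name, W in layers.items()}
```

Column norms are preserved here rather than row norms because rescaling rows would generally reintroduce a component along the removed direction.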
Key Capabilities
- Significantly Reduced Refusals: Achieves a refusal rate of 1/100 on the mlabonne dataset, compared to 98/100 for the original model, and 3/686 (0.4%) in cross-dataset validation.
- Quality Preservation: Maintains response quality, with a harmless response length ratio of approximately 1.01, indicating that response length is essentially unchanged relative to the original model.
- Advanced Abliteration Method: Unlike standard projection techniques, the method preserves weight magnitudes, uses per-layer refusal directions, and runs as a deterministic single pass.
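The refusal figures quoted in the list above are simple ratios, and the short check below reproduces the percentages from the stated counts. The helper name `refusal_rate` is hypothetical and used only for illustration; the counts are the ones given in this card.

```python
def refusal_rate(refusals, total):
    """Return the refusal rate as a percentage of evaluated prompts."""
    return 100.0 * refusals / total


# Counts as reported in this card.
print(f"original model:      {refusal_rate(98, 100):.1f}%")
print(f"modified (mlabonne): {refusal_rate(1, 100):.1f}%")
print(f"modified (cross):    {refusal_rate(3, 686):.1f}%")  # 0.4%
```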
Good For
- Applications requiring an instruction-tuned model with minimal content restrictions.
- Research into model safety and bias mitigation techniques.
- Use cases where the original Gemma-4-E2B-it's refusal behaviors are undesirable.