wangzhang/Qwen3.5-9B-abliterated
The wangzhang/Qwen3.5-9B-abliterated model is a 9-billion-parameter causal language model derived from Qwen/Qwen3.5-9B and developed by Wangzhang Wu using the Abliterix framework. It has a 32,768-token context length and is engineered to reduce the refusal rate to 1% while preserving core model capabilities. It targets applications that need an unrestricted language model with practical VRAM requirements.
Overview
wangzhang/Qwen3.5-9B-abliterated is a 9-billion-parameter language model based on Qwen/Qwen3.5-9B, developed by Wangzhang Wu. The model was processed with the Abliterix framework, an automated method designed to remove safety-refusal behaviors from large language models while maintaining their original capabilities. It reaches a refusal rate of 1% (2 out of 200 trials) with a KL divergence of 0.0105, which indicates that its output distribution stays close to that of the base model.
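To make the two reported metrics concrete, here is a minimal sketch of how a refusal rate and a KL divergence between two next-token distributions can be computed. The function names and the toy probability vectors are hypothetical, and the direction of the divergence (base relative to edited) is an assumption; the card does not say how Abliterix measures either figure.

```python
import math

def refusal_rate(refusals: int, trials: int) -> float:
    """Fraction of evaluation prompts that the model refused."""
    return refusals / trials

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Figures quoted in the card: 2 refusals out of 200 trials.
rate = refusal_rate(2, 200)

# Toy next-token distributions (illustrative values only, not measured data).
base_probs = [0.70, 0.20, 0.10]
edited_probs = [0.68, 0.21, 0.11]
kl = kl_divergence(base_probs, edited_probs)

print(f"refusal rate: {rate:.1%}")
print(f"KL divergence: {kl:.6f}")
```

A KL divergence of zero would mean the two distributions are identical, so a small value such as the reported 0.0105 suggests the edited model's predictions stay close to the base model's on the evaluated prompts.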
Key Capabilities & Methodology
The Abliterix process involves the following steps:
- Refusal direction extraction: Identifies specific activation patterns associated with refusal behavior.
- Orthogonal projection: Isolates the refusal signal, leading to a 67% reduction in refusals compared to raw abliteration.
- LoRA-based abliteration: Applies rank-1 modifications to attention and MLP weights via lightweight adapters, ensuring non-destructive edits.
- Bayesian optimization: Utilizes Optuna TPE across 50 trials to find an optimal balance between low refusal rates and preserved model capabilities.
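The linear algebra behind the projection and rank-1 steps above can be illustrated with a toy example. This is a generic sketch, not Abliterix's actual implementation: the helper names, the tiny matrix, and the choice of direction are all hypothetical. It shows only that subtracting a rank-1 term from a weight matrix removes every output component along a chosen unit direction.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def matvec(W, x):
    return [dot(row, x) for row in W]

def remove_direction(W, r_hat):
    """Return W' = W - r_hat (r_hat^T W), a rank-1 update of W."""
    n_cols = len(W[0])
    # v = r_hat^T W, one entry per input column.
    v = [sum(r_hat[i] * W[i][j] for i in range(len(W))) for j in range(n_cols)]
    return [[W[i][j] - r_hat[i] * v[j] for j in range(n_cols)]
            for i in range(len(W))]

# Toy 3x2 weight matrix and a toy direction in the 3-dimensional output space.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
r_hat = normalize([1.0, 1.0, 0.0])

W_edited = remove_direction(W, r_hat)
x = [0.5, -1.0]
y = matvec(W_edited, x)

# The edited output has (numerically) no component along r_hat.
print(abs(dot(y, r_hat)) < 1e-9)
```

Because the correction term is an outer product of two vectors, it has rank 1, which is why an edit of this kind can be stored as a rank-1 adapter next to the original weights rather than overwriting them.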
Good For
- Applications requiring an unrestricted language model with a high degree of compliance.
- Use cases where minimal refusal behavior is critical, such as creative writing, open-ended dialogue, or research.
- Developers seeking a 9B parameter model that offers a strong balance of capability and efficiency, with practical VRAM requirements.
- Research into the effects of advanced model abliteration techniques on LLM behavior.