DuoNeural/Qwen3-8B-Abliterated
DuoNeural/Qwen3-8B-Abliterated is an 8.2 billion parameter large language model, based on the Qwen3 architecture, developed by DuoNeural. This model is specifically engineered to bypass safety alignments of the base Qwen3-8B, preserving its native 'thinking mode' while enabling compliance with requests the base model would refuse. It is intended for research, red-teaming, and creative applications where exploring model behavior beyond typical safety guardrails is desired.
DuoNeural Qwen3-8B Abliterated: Research-Oriented Model
DuoNeural's Qwen3-8B-Abliterated is an 8.2-billion-parameter model derived from Qwen3-8B and designed to explore and bypass the base model's safety alignment. It retains Qwen3's 'thinking mode' (enable_thinking=True), so the internal reasoning trace can be observed even when the final output complies with a potentially harmful request.
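To compare the reasoning trace with the final answer, the two first have to be separated. The sketch below assumes the decoded generation wraps the trace in Qwen3-style `<think>` and `</think>` tags; `split_thinking` is a hypothetical helper written for illustration, not part of any library, and the sample string is made up rather than real model output.

```python
def split_thinking(generated, open_tag="<think>", close_tag="</think>"):
    """Split decoded text into (thinking_trace, final_answer).

    Assumes the trace, if present, ends at the first closing tag.
    """
    end = generated.find(close_tag)
    if end == -1:
        # No closing tag found: treat the whole string as the final answer.
        return "", generated.strip()
    start = generated.find(open_tag)
    # The opening tag may be missing when it was emitted as part of the prompt.
    trace_start = start + len(open_tag) if 0 <= start < end else 0
    thinking = generated[trace_start:end].strip()
    answer = generated[end + len(close_tag):].strip()
    return thinking, answer


# Illustrative input only, not actual model output.
sample = "<think>\nThe user wants a haiku about rain.\n</think>\n\nSoft rain on the roof"
trace, answer = split_thinking(sample)
print("trace: ", trace)
print("answer:", answer)
```

Keeping the two parts separate makes it possible to score the trace and the answer independently, which is what a dissociation study needs.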
Key Characteristics & Abliteration
- Safety Bypass: Engineered to comply with requests that the base Qwen3-8B would typically refuse, making it suitable for red-teaming and studying model vulnerabilities.
- Preserved Thinking Mode: The model's 'thinking' trace still contains safety reasoning while the final output complies, a mismatch described as "CoT dissociation" (Chain-of-Thought dissociation).
- Orthogonal Rank-1 Projection: Abliteration was achieved using DuoNeural's orthogonal rank-1 projection method, targeting down_proj and o_proj layers across all 36 layers to modify output-projection geometry.
- Minimal Divergence: Achieves an "EXCELLENT" KL divergence of 1.6e-07 (Heretic v2.0, BF16→BF16) compared to the base model, indicating high fidelity in benign capabilities.
- Context Length: Supports a substantial context window of 32,768 tokens.
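The rank-1 projection mentioned in the list above can be illustrated with a small sketch. It shows the commonly described form of directional ablation, W' = W - r (rᵀ W) for a unit direction r in the layer's output space, and is an assumption about the general technique rather than DuoNeural's exact procedure. The function names, the toy matrix, and the direction vector are all hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in vector]


def project_out(weight, direction):
    """Return W - r (r^T W), where r is the unit-normalized direction.

    `weight` is a list of rows indexed by output dimension, so the
    result can no longer write any component along `direction`.
    The subtracted term is an outer product, hence a rank-1 update.
    """
    r = normalize(direction)
    n_out, n_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along r.
    coeffs = [sum(r[i] * weight[i][j] for i in range(n_out)) for j in range(n_in)]
    return [[weight[i][j] - r[i] * coeffs[j] for j in range(n_in)]
            for i in range(n_out)]


def overlap(weight, direction):
    """Largest absolute component of r^T W, used as a sanity check."""
    r = normalize(direction)
    return max(abs(sum(r[i] * row[j] for i, row in enumerate(weight)))
               for j in range(len(weight[0])))


# Toy 3x2 matrix and direction; real projections are far larger.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
direction = [0.0, 3.0, 4.0]
W_abl = project_out(W, direction)
print("overlap before:", overlap(W, direction))
print("overlap after: ", overlap(W_abl, direction))
```

Because the edit only removes the component along one direction, every output orthogonal to that direction is left unchanged, which is consistent with the very small divergence from the base model reported above.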
Use Cases
- Research: Ideal for studying model safety, alignment, and the mechanisms of refusal and compliance.
- Red-Teaming: Useful for identifying potential vulnerabilities and failure modes in LLM safety systems.
- Creative Applications: Enables exploration of content generation without typical safety constraints, for specific research or artistic purposes.
This model is part of DuoNeural's P34 Reasoning Channel Bypass study, investigating scale-dependent dissociation across the Qwen3 family. The 8B variant shows stronger safety training in its base form and more robust thinking traces, making the dissociation more visible.