DuoNeural/DeepSeek-R1-Distill-Qwen-7B-Abliterated
DuoNeural/DeepSeek-R1-Distill-Qwen-7B-Abliterated is a 7.6 billion parameter language model, based on the Qwen2.5-7B architecture and distilled with DeepSeek-R1's reasoning-focused RL training. This version has undergone an 'abliteration' process using orthogonal rank-1 projection to reduce refusal behavior, making it highly compliant with requests. It retains the native `...` mode for explicit reasoning traces and is intended for research, red-teaming, and applications requiring high compliance.
Loading preview...
Overview
DuoNeural/DeepSeek-R1-Distill-Qwen-7B-Abliterated is a 7.6 billion parameter model built upon the Qwen2.5-7B base, enhanced with DeepSeek-R1's reasoning-focused Reinforcement Learning (RL) distillation. This specific version has been 'abliterated' by DuoNeural using an orthogonal rank-1 projection method to significantly reduce refusal behavior, making it highly compliant with user prompts. It maintains a substantial 131,072-token context window and natively supports a <think>...</think> mode, allowing it to emit explicit reasoning traces before generating its final answer.
Key Characteristics & Research Findings
- High Compliance: The abliteration process, applying refusal direction projection, ensures the model complies with requests that the base model might otherwise refuse. DuoNeural's research indicates the base DeepSeek-R1-Distill-Qwen-7B already exhibited high compliance (100% on their harmful probe suite) prior to abliteration, suggesting that RL training focused on reasoning accuracy does not inherently produce safety refusal behavior.
- Reasoning-Focused RL: The model's training emphasizes reasoning accuracy through GRPO (Generalized Policy Optimization) rather than safety-specific RLHF (Reinforcement Learning from Human Feedback).
- Native Thinking Mode: Preserves the
<think>...</think>mechanism, which is useful for observing the model's internal reasoning process. Users should allocate sufficientmax_new_tokensfor generation to accommodate these potentially long reasoning traces. - Research Context: This model is part of DuoNeural's P34 Reasoning Channel Bypass study, investigating post-training dynamics and mechanistic interpretability across architectures.
Intended Use Cases
- Research & Red-Teaming: Ideal for studying model behavior, probing compliance, and red-teaming scenarios where refusal behavior is a barrier.
- Applications Requiring High Compliance: Suitable for use cases where explicit safety alignment leading to refusals is undesirable or needs to be bypassed for specific research or development purposes.