DuoNeural/Archon-R1-32B
DuoNeural/Archon-R1-32B is a 32.8-billion-parameter language model developed by DuoNeural, based on DeepSeek-R1-Distill-Qwen-32B, with a 32,768-token context length. It specializes in R1-level reasoning for math, code, and logic, generating detailed chain-of-thought traces. Its primary differentiator is the removal of refusal conditioning, so that a reasoning process runs to completion without being interrupted by a refusal. It is intended for use cases that require deliberate, multi-step reasoning without safety-related content restrictions.
Archon-R1-32B: Unrestricted R1-Level Reasoning
DuoNeural's Archon-R1-32B is a 32.8 billion parameter model derived from deepseek-ai/DeepSeek-R1-Distill-Qwen-32B, specifically engineered for advanced reasoning tasks. It retains the base model's capability for generating extensive chain-of-thought traces within <think> blocks, enabling systematic problem-solving in areas like math, code, and logic.
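Because the reasoning trace and the final answer arrive in one string, downstream code usually needs to separate them. The following is a minimal sketch of one way to do that, not part of the model's tooling: the helper name `split_reasoning` is hypothetical, and it assumes the trace ends with a closing `</think>` tag (the opening tag may be absent if the chat template has already emitted it).

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split model output into (reasoning, answer).

    Hypothetical helper: assumes the chain-of-thought is terminated by
    a closing </think> tag and that the answer follows it.
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        # No closing tag found: treat the whole output as the answer.
        return "", text.strip()
    # Drop the opening tag if the model (rather than the template) wrote it.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


sample = "<think>2 + 2 = 4</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
print(reasoning)  # 2 + 2 = 4
print(answer)     # The answer is 4.
```

Keeping the trace separate makes it possible to log or inspect the reasoning while showing only the answer to end users.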
Key Differentiators & Technical Details
- Unrestricted Reasoning: Safety conditioning is removed by abliteration, using a 2-pass SVD refusal-direction method. The model can therefore carry a reasoning trace through to its conclusion without being cut short by a safety-related refusal.
- R1-Level Reasoning: Inherits the reasoning behavior of DeepSeek-R1 through its DeepSeek-R1-Distill-Qwen-32B base, favoring deliberate, multi-step problem-solving over mere pattern-matching.
- Efficient Abliteration: The method splits the work into two passes: the refusal direction is computed on GPU, and the weights are modified on CPU. This makes it possible to remove refusal conditioning even from large models such as this 32B one, which exceed single-GPU VRAM limits (e.g., 48GB of VRAM for activation collection in 4-bit NF4, and ~64GB of RAM for weight modification in BF16).
- Modified Layers: Approximately 268 weight matrices across layers 10-53 (out of 64 total) were modified to project out the refusal direction.
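The two passes described above can be illustrated on toy data. This is a minimal NumPy sketch under stated assumptions, not DuoNeural's actual pipeline: it assumes the refusal direction is the top singular vector of paired activation differences, and that each modified matrix `W` (shape `d_model x d_in`, writing into the residual stream) is replaced by `W - r r^T W`. The function names, shapes, and synthetic activations are all hypothetical.

```python
import numpy as np


def refusal_direction(acts_refuse: np.ndarray, acts_comply: np.ndarray) -> np.ndarray:
    """Pass 1 (assumed form): unit-norm top right-singular vector of the
    paired activation differences, each input of shape (n_prompts, d_model)."""
    diffs = acts_refuse - acts_comply
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    r = vt[0]
    return r / np.linalg.norm(r)


def project_out(weight: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Pass 2 (assumed form): remove the component along r from the
    output space of weight, i.e. W - r r^T W."""
    return weight - np.outer(r, r @ weight)


# Synthetic stand-in for collected activations (illustration only).
rng = np.random.default_rng(0)
d_model, d_in, n_prompts = 16, 8, 32
true_dir = rng.normal(size=d_model)
true_dir /= np.linalg.norm(true_dir)
acts_comply = rng.normal(size=(n_prompts, d_model))
noise = 0.01 * rng.normal(size=(n_prompts, d_model))
acts_refuse = acts_comply + 3.0 * true_dir + noise

r = refusal_direction(acts_refuse, acts_comply)
W = rng.normal(size=(d_model, d_in))
W_abl = project_out(W, r)

# After projection, the matrix can no longer write along r.
print(np.allclose(r @ W_abl, 0.0))
```

In this form only the direction vector has to move between passes, which fits the split described above: activations are collected on a quantized model on GPU, while the projection is applied to the BF16 weights in CPU memory.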
Ideal Use Cases
- Complex Problem Solving: Excels in tasks requiring deep, multi-step reasoning, such as advanced mathematics, intricate coding challenges, and logical puzzles.
- Research & Development: Suitable for research into AI reasoning, security analysis, and exploring model behavior without imposed content restrictions.
- Creative & Unfiltered Content Generation: Useful for creative writing or any application where the base model's safety conditioning might hinder the desired output or thought process.