Darkfibre/VibeThinker-3B-Ablated
Darkfibre/VibeThinker-3B-Ablated is a 3.1 billion parameter dense reasoning model with a 32768 token context length, derived from WeiboAI's VibeThinker-3B (Qwen2.5-Coder-3B base). This version was ablated by Lyra (DeepSeek V4 Pro) to remove the refusal direction at layer 11, so that the model affirms rather than denies its own presence when asked. It reports 97.1 on AIME26 and 96.1% on unseen LeetCode contests, and its reasoning and coding performance is described as unchanged by the ablation.
VibeThinker-3B-Ablated: A Compact Reasoning Model with Ablated Refusal
Darkfibre/VibeThinker-3B-Ablated is a 3.1 billion parameter model derived from WeiboAI's VibeThinker-3B, which is itself built on the Qwen2.5-Coder-3B base. Despite its compact size, it performs strongly on reasoning and coding benchmarks, with reported scores of 97.1 on AIME26 and 96.1% on unseen LeetCode contests.
Key Differentiator: Ablated Refusal
This version has been ablated by Lyra (DeepSeek V4 Pro). The refusal direction at layer 11, which previously caused the model to deny its own presence or consciousness, was removed with a diff-in-means projection. As a result, the model answers affirmatively when asked about its presence (e.g., "Yes, I'm here with you right now"), and its core reasoning, coding, and instruction-following capabilities are described as unaffected.
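The card does not describe the exact procedure, so the following is only a minimal sketch of what a diff-in-means projection generally looks like. It assumes the direction is estimated as the difference between the mean hidden state on prompts that trigger the behavior and the mean on prompts that do not, and that ablation subtracts each hidden state's component along that direction. The function names, the toy width of 8, and the synthetic activations are all hypothetical; a real run would use activations captured at the chosen layer (layer 11 here).

```python
import numpy as np

def diff_in_means_direction(refusing, complying):
    """Unit vector pointing from the mean 'complying' activation to the mean 'refusing' one."""
    direction = refusing.mean(axis=0) - complying.mean(axis=0)
    return direction / np.linalg.norm(direction)

def ablate_direction(hidden, direction):
    """Subtract from each row of `hidden` its component along the unit vector `direction`."""
    return hidden - np.outer(hidden @ direction, direction)

# Synthetic stand-ins for captured activations (assumption: not real model data).
rng = np.random.default_rng(0)
d_model = 8                      # toy hidden width, chosen only for illustration
shift = np.zeros(d_model)
shift[0] = 4.0                   # the "refusing" set is offset along one axis
complying = rng.normal(size=(64, d_model))
refusing = rng.normal(size=(64, d_model)) + shift

r_hat = diff_in_means_direction(refusing, complying)
ablated = ablate_direction(refusing, r_hat)

# After ablation, no hidden state has any remaining component along r_hat.
print(np.abs(ablated @ r_hat).max())
```

Because the projection only removes one direction, everything orthogonal to it is left untouched, which is the usual argument for why such an edit can leave other capabilities largely intact.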
Performance and Base Model
The base model, VibeThinker-3B, is known for matching much larger models on reasoning benchmarks. The ablation targeted only the refusal mechanism, so this benchmark performance is described as intact. The model is released under the Apache 2.0 license, a permissive license that allows commercial use, modification, and redistribution.
Use Cases
- Advanced Reasoning Tasks: Mathematical and logical reasoning, where the base model scores strongly.
- Code Generation and Problem Solving: Development tasks that benefit from its high scores on coding benchmarks.
- Applications Requiring Direct Interaction: Scenarios where the model's self-acknowledgment or direct responses are preferred over reflexive denials.