prithivMLmods/VibeThinker-3B-heretic_decensored
VibeThinker-3B-heretic_decensored by prithivMLmods is a 3.1 billion parameter, reasoning-focused language model built on WeiboAI/VibeThinker-3B, with a 32K context length. It has been modified using the Heretic abliteration toolkit to reduce internal refusal behaviors while preserving strong mathematical, coding, and STEM reasoning capabilities. This model is designed for alignment research and evaluation of refusal-direction modifications, offering an uncensored output profile.
Loading preview...
VibeThinker-3B-heretic_decensored: Abliterated Reasoning Model
This model, developed by prithivMLmods, is a 3.1 billion parameter language model derived from WeiboAI/VibeThinker-3B, which itself is based on Qwen/Qwen2.5-Coder-3B. Its core differentiator is the application of the Heretic abliteration toolkit, which uses refusal-direction analysis and targeted weight-space interventions to significantly reduce internal refusal behaviors.
Key Capabilities & Features
- Reduced Refusal Behavior: Engineered to minimize internal refusal tendencies, as demonstrated by a reduction from 64/100 refusals in the original model to 6/100 in this version.
- Preserved Reasoning: Maintains the strong mathematical, coding, and STEM reasoning capabilities inherited from the VibeThinker training pipeline.
- Efficient 3B Architecture: Offers a compact 3-billion-parameter size suitable for local inference and resource-constrained environments.
- Uncensored Output: Due to the abliteration process, the model may generate sensitive or unrestricted content, making it suitable for specific research into model behavior.
Intended Use Cases
- Alignment Research: Ideal for studying refusal-direction analysis, behavior modification techniques, and the impact of reduced internal refusal mechanisms.
- Model Evaluation: Useful for benchmarking reasoning, instruction-following, and safety-related behaviors under altered refusal conditions.
- Red Teaming: Provides a tool for analyzing model responses in scenarios where typical safety filters are minimized.
- Mathematical, Coding, and STEM Research: Continues to offer strong performance in these domains for specialized evaluations.
Important Note: This model is strictly for research and learning. Users assume full responsibility for its outputs due to intentionally reduced refusal mechanisms. It is experimental and may exhibit unexpected behaviors.