zkxxxx/VibeThinker-3B-heretic

TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 17, 2026License:mitArchitecture:Transformer0.0K Open Weights Cold

VibeThinker-3B-heretic is a 3.1 billion parameter decensored version of WeiboAI's VibeThinker-3B model, created using the Heretic v1.4.0 project. This model is specifically optimized for challenging reasoning tasks in mathematics, coding, and STEM, demonstrating strong performance on verifiable reasoning benchmarks. It excels at multi-step reasoning, constraint satisfaction, and self-correction, making it suitable for competitive programming and similar problem-solving scenarios.

Loading preview...

VibeThinker-3B-heretic: Decensored Reasoning Model

VibeThinker-3B-heretic is a 3.1 billion parameter language model, a decensored variant of WeiboAI's VibeThinker-3B, developed using Heretic v1.4.0. This model focuses on pushing the boundaries of small language models (SLMs) in verifiable reasoning tasks, challenging the notion that only large models can achieve frontier-level performance in structured problem-solving domains.

Key Capabilities & Differentiators

  • Enhanced Reasoning: Optimized for complex reasoning in mathematics, coding, and STEM, including multi-step reasoning, constraint satisfaction, and self-correction.
  • Decensored Output: Compared to the original VibeThinker-3B, this "heretic" version significantly reduces refusals (6/100 vs. 65/100), offering less restricted responses.
  • Competitive Performance: Achieves strong results on benchmarks like AIME, HMMT, IMO-AnswerBench, and LiveCodeBench, reaching performance levels comparable to much larger models (e.g., DeepSeek V3.2, GLM-5, Kimi K2.5) in verifiable reasoning.
  • High LeetCode Acceptance: Demonstrated a 96.1% acceptance rate on unseen LeetCode contests (123/128 first-attempt submissions).
  • Spectrum-to-Signal Principle (SSP): Utilizes a sophisticated training pipeline involving curriculum-based two-stage SFT, Multi-domain Reasoning RL, and Offline Self-Distillation to amplify correct reasoning signals.

Good For

  • Competitive Programming: Excels at LeetCode-style problems and other coding challenges.
  • Mathematics & STEM Reasoning: Ideal for tasks requiring verifiable, multi-step problem-solving.
  • Research in SLM Capabilities: Demonstrates that compact models can achieve near-frontier reasoning capabilities in domains with clear feedback and verification mechanisms.
  • Applications requiring less restrictive outputs: Suitable for use cases where the original model's refusal rate was a limitation.