Overview

VibeThinker-3B is a 3.1-billion-parameter model developed by WeiboAI for advanced reasoning in mathematics, coding, and STEM. It is trained with a systematically optimized post-training pipeline based on the Spectrum-to-Signal Principle (SSP), which lets it reach performance comparable to much larger frontier reasoning models on verifiable benchmarks. The model demonstrates that compact models can achieve near-frontier reasoning in structured task spaces with reliable feedback signals.

Key Capabilities

Exceptional Reasoning: Scores 76.4 on IMO-AnswerBench (80.6 with CLR), a benchmark of 400 International Mathematical Olympiad-level problems, outperforming models such as DeepSeek V3.2 (671B) and GLM-5 (744B) in accuracy relative to scale.
Competitive Programming Prowess: Passed 123 of 128 first-attempt submissions (96.1% acceptance rate) in recent, previously unseen LeetCode weekly and biweekly contests (Python).
Robust Training: Uses a multi-stage training pipeline comprising curriculum-based two-stage Supervised Fine-Tuning (SFT), Multi-domain Reasoning Reinforcement Learning (RL), Offline Self-Distillation, and Instruct RL to improve reasoning and controllability.
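
The acceptance rate quoted above follows directly from the two counts. As a minimal sketch (the function name acceptance_rate is a hypothetical helper for illustration, not part of any VibeThinker tooling), the arithmetic can be checked like this:

```python
def acceptance_rate(accepted: int, submitted: int) -> float:
    """Return accepted submissions as a percentage of all submissions."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive integer")
    return 100.0 * accepted / submitted


# Counts taken from the capability summary: 123 accepted out of 128 submitted.
rate = acceptance_rate(123, 128)
print(f"{rate:.1f}%")  # prints "96.1%"
```

The unrounded value is 96.09375%, which rounds to the 96.1% figure reported.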

Good For

Competitive Programming: Excels at LeetCode-style problems and similar coding challenges.
Hard Math & STEM Reasoning: Ideal for tasks requiring multi-step reasoning, constraint satisfaction, and answer verification in mathematics and science.
Benchmark Evaluation: Recommended for evaluation on challenging datasets such as AMOBench for harder math reasoning. Not recommended for tool calling, API orchestration, or autonomous coding agents.
