Zigeng/R1-VeriThinker-7B
Zigeng/R1-VeriThinker-7B is a 7-billion-parameter language model developed by Zigeng Chen and the xML Lab at the National University of Singapore. The model introduces a novel approach to Chain-of-Thought (CoT) compression: instead of fine-tuning on reasoning data directly, it is fine-tuned on an auxiliary CoT verification task. It shortens reasoning chains while maintaining or improving accuracy, making it efficient for complex problem-solving, and it generalizes zero-shot to speculative reasoning.
VeriThinker: Efficient Reasoning through Verification
VeriThinker is a novel approach to Chain-of-Thought (CoT) compression, developed by Zigeng Chen and the xML Lab at the National University of Singapore. Unlike traditional methods that rely on synthetic concise CoT data, VeriThinker fine-tunes large reasoning models (LRMs) solely through an auxiliary verification task. By learning to verify the correctness of CoT solutions, the model becomes more discerning about its own reasoning, suppressing 'overthinking' and shortening its reasoning chains.
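A minimal usage sketch with Hugging Face transformers follows. The chat template, sampling settings, and example question are illustrative assumptions, not values taken from this card:

```python
# Minimal sketch: load the model and generate a (compressed) CoT answer.
# Assumptions: the model ships a standard chat template; the sampling
# settings below are illustrative, not the authors' recommended config.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zigeng/R1-VeriThinker-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "What is the sum of the first 10 positive odd integers?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The compressed reasoning chain appears in the generated continuation.
output = model.generate(input_ids, max_new_tokens=4096, do_sample=True, temperature=0.6)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```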
Key Capabilities:
- CoT Compression: Significantly reduces the number of reasoning tokens needed to solve a problem: on MATH500, reasoning tokens drop from 3790 to 2125, and on AIME25, from 14321 to 10287.
- Accuracy Maintenance/Improvement: Achieves CoT compression while maintaining or slightly improving accuracy: MATH500 accuracy rises by 0.8 percentage points (94.0% to 94.8%) and AIME25 accuracy by 2.1 percentage points (38.7% to 40.8%).
- Zero-shot Generalization: Generalizes zero-shot to speculative reasoning, boosting inference throughput.
- Dual Functionality: Handles both direct reasoning and correctness verification of proposed CoT solutions (see the verification sketch after this list).
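To illustrate the verification side of this dual functionality, here is a hedged sketch reusing the model and tokenizer loaded above. The prompt wording is an assumption for illustration; the exact verification template used during fine-tuning is not reproduced in this card:

```python
# Hypothetical verification call: the prompt wording below is illustrative,
# not the exact template from VeriThinker's fine-tuning data.
question = "What is 7 * 8?"
candidate_cot = "7 * 8 = 56. The answer is 56."
verify_prompt = (
    "You will be given a problem and a proposed chain-of-thought solution.\n"
    f"Problem: {question}\n"
    f"Solution: {candidate_cot}\n"
    "Is the solution correct? Answer Yes or No."
)
messages = [{"role": "user", "content": verify_prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding is enough for a short Yes/No verdict.
verdict = model.generate(input_ids, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(verdict[0][input_ids.shape[-1]:], skip_special_tokens=True))
```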
Good for:
- Applications requiring efficient and accurate complex problem-solving.
- Reducing computational overhead in reasoning tasks by shortening CoT lengths.
- Enhancing throughput in speculative reasoning scenarios; a minimal draft-verify sketch follows this list.
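As a sketch of how verification can boost throughput, the draft-verify loop below lets a smaller draft model propose a solution that VeriThinker either accepts or rejects. All names (`draft_model`, `draft_tok`, the `chat` helper) and the prompt wording are assumptions for illustration, not the authors' implementation:

```python
# Speculative-reasoning sketch: accept a cheap draft if VeriThinker verifies
# it, otherwise fall back to full reasoning. Helpers and prompts here are
# illustrative assumptions, not the authors' implementation.
def chat(model, tokenizer, prompt, max_new_tokens=2048):
    """Run one chat turn and return the decoded continuation."""
    messages = [{"role": "user", "content": prompt}]
    ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(ids, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(out[0][ids.shape[-1]:], skip_special_tokens=True)

def speculative_reason(question, draft_model, draft_tok, verifier, verifier_tok):
    # 1. Cheap draft: a smaller, faster model proposes a short solution.
    draft = chat(draft_model, draft_tok, question, max_new_tokens=512)
    # 2. VeriThinker checks the draft (illustrative Yes/No prompt).
    verdict = chat(
        verifier, verifier_tok,
        f"Problem: {question}\nSolution: {draft}\n"
        "Is the solution correct? Answer Yes or No.",
        max_new_tokens=8,
    )
    # 3. Accept the verified draft; otherwise run the full reasoning path.
    if verdict.strip().lower().startswith("yes"):
        return draft
    return chat(verifier, verifier_tok, question)
```

The throughput gain comes from skipping the long reasoning chain whenever the cheap draft passes verification, so only rejected drafts pay the full-reasoning cost.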