theshyustc/CoRT-Prompt-Hint-1.5B-RL is a 1.5-billion-parameter model based on DeepSeek-R1-Distill-Qwen-1.5B, developed by theshyustc with the CoRT framework. It targets mathematical reasoning by interleaving natural language with Python code execution and reports 58.3% average accuracy across mathematical benchmarks. Its Prompt-Hint approach encourages code use throughout problem-solving, which suits complex mathematical tasks that require multi-turn tool-integrated reasoning.
CoRT-Prompt-Hint-1.5B-RL: Code-Integrated Mathematical Reasoning
This model, developed by theshyustc, is a 1.5-billion-parameter language model built on DeepSeek-R1-Distill-Qwen-1.5B. It is designed for mathematical reasoning and uses the CoRT (Code-integrated Reasoning within Thinking) framework. Its core technique is the Prompt-Hint approach, which inserts hints that guide the reasoning process and encourage the use of Python code for problem-solving.
Key Capabilities
- Mathematical Reasoning: Achieves a 58.3% average accuracy across various mathematical benchmarks, including AIME, AMC, MATH, and Olympiad problems.
- Code Integration: Combines natural language reasoning with executed Python code, so intermediate results can be checked programmatically.
- Multi-turn Tool-Integrated Reasoning: Supports repeated code execution within a single reasoning chain, with each execution result available to the steps that follow.
- Specialized Training: Optimized through Supervised Fine-tuning (SFT) and Reinforcement Learning (RL) specifically for mathematical tasks.
When to Use This Model
This model suits applications that need precise, verifiable solutions to mathematical problems. Because it can write and execute code, it is strongest on tasks where intermediate calculations or logical steps benefit from programmatic verification. Note that optimal performance requires the specialized inference script from the CoRT GitHub repository, which enables the model's multi-turn tool-integrated reasoning.
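The multi-turn loop that tool-integrated reasoning depends on can be sketched roughly as follows. This is a minimal illustration and not the CoRT inference script: the fenced-block format, the `output` block convention, and the names `solve`, `run_python`, and `fake_generate` are all assumptions made for the example, and `fake_generate` is a hypothetical stand-in for a real model call.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built at runtime so this example nests cleanly in a fenced block

# Assumption: the model emits code inside fenced blocks tagged "python".
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def run_python(code: str) -> str:
    """Execute code and return captured stdout, or the error message."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # NOT sandboxed; use an isolated process in real use
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def solve(problem: str, generate, max_turns: int = 4) -> str:
    """Alternate between generation and code execution until no code is emitted."""
    transcript = problem
    for _ in range(max_turns):
        reply = generate(transcript)
        transcript += reply
        match = CODE_BLOCK.search(reply)
        if match is None:
            break  # no code to run: treat this reply as the final answer
        output = run_python(match.group(1))
        # Feed the execution result back so the next turn can use it.
        transcript += f"\n{FENCE}output\n{output}\n{FENCE}\n"
    return transcript


def fake_generate(transcript: str) -> str:
    """Hypothetical stand-in for the model: writes code once, then answers."""
    if FENCE + "output" not in transcript:
        code = "print(sum(range(1, 101)))"
        return f"\nLet me compute this.\n{FENCE}python\n{code}\n{FENCE}\n"
    return "\nThe sum is 5050."


if __name__ == "__main__":
    print(solve("What is 1 + 2 + ... + 100?", fake_generate))
```

In a real deployment, `generate` would wrap the model's text generation, and code execution would run in an isolated environment rather than through a bare `exec`.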