RUC-AIBOX/STILL-3-1.5B-preview: Enhanced Slow-Thinking Reasoning
RUC-AIBOX/STILL-3-1.5B-preview is a 1.5-billion-parameter model engineered to improve slow-thinking reasoning capabilities, particularly in mathematical and complex problem-solving domains. Developed by RUC-AIBOX, the model leverages reinforcement learning to achieve continuous performance improvements.
Key Capabilities & Performance
- Mathematical Reasoning: Achieves 39.33% accuracy on the AIME benchmark, a 37.18% relative improvement over its backbone model's 28.67%.
- Benchmark Scores: Demonstrates strong performance across multiple mathematical and reasoning benchmarks (backbone-model scores in parentheses):
  - MATH: 85.48% (backbone: 84.04%)
  - AIME: 39.33% (backbone: 28.67%)
  - OMNI: 33.00% (backbone: 25.60%)
  - LiveAOPS: 39.50% (backbone: 33.33%)
- Long Context: Supports a 131,072-token (128K) context window, enabling it to process and reason over extensive problem descriptions.
- Training Methodology: Performance gains are attributed to an adaptive reinforcement learning approach, with observed improvements correlating with increased training steps.
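The headline relative-improvement figure follows directly from the reported scores. As a quick sanity check, the snippet below recomputes the relative gain for each benchmark listed above (benchmark names and scores are taken from that list; nothing else is assumed):

```python
# Backbone vs. STILL-3-1.5B-preview scores (percent accuracy),
# as reported in the benchmark list above.
scores = {
    "MATH":     (84.04, 85.48),
    "AIME":     (28.67, 39.33),
    "OMNI":     (25.60, 33.00),
    "LiveAOPS": (33.33, 39.50),
}

for name, (backbone, still3) in scores.items():
    # Relative improvement = (new - old) / old, expressed in percent.
    rel = (still3 - backbone) / backbone * 100
    print(f"{name}: {backbone}% -> {still3}% (+{rel:.2f}% relative)")
```

For AIME this reproduces the stated 37.18% relative improvement: (39.33 − 28.67) / 28.67 ≈ 0.3718.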
Ideal Use Cases
- Advanced Mathematics: Excels in tasks requiring precise mathematical reasoning, such as those found in the AIME and MATH benchmarks.
- Complex Problem Solving: Suitable for scenarios where deliberate, step-by-step reasoning is more critical than rapid, intuitive responses.
- Research & Development: The model, code, and data are open-sourced to facilitate further research into enhancing slow-thinking abilities in smaller language models. More details can be found in the Slow Thinking with LLMs GitHub repository.