Skywork-OR1-32B: Advanced Math and Code Reasoning Model
Skywork-OR1-32B is a 32.8-billion-parameter model from the Skywork-OR1 (Open Reasoner 1) series, developed by Skywork. It is built for math and code reasoning and is trained with large-scale rule-based reinforcement learning on carefully curated datasets using a tailored training recipe.
Key Capabilities and Performance
- Superior Reasoning: Excels at complex mathematical problem-solving, outperforming models such as DeepSeek-R1 and Qwen3-32B on the AIME24 and AIME25 benchmarks.
- Robust Coding: Delivers performance comparable to DeepSeek-R1 and Qwen3-32B on coding tasks, as evaluated by LiveCodeBench.
- Reinforcement Learning: Trained with a customized version of GRPO (Group Relative Policy Optimization) that incorporates offline and online difficulty-based filtering, rejection sampling, and a multi-stage training pipeline, with adaptive entropy control to improve exploration and stability.
- Data Quality: Trained on a specialized dataset comprising 110K verifiable math problems and 14K coding questions, with model-aware difficulty estimation and rigorous quality assessment.
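To make the group-relative idea behind GRPO and the difficulty-based filtering more concrete, here is a minimal sketch in plain Python. It shows the standard GRPO advantage (each reward normalized against the other responses sampled for the same prompt), not Skywork's customized variant. The filter rule (drop prompts whose sampled responses are all correct or all wrong), the function names, and the sample rewards are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standard GRPO-style advantage: normalize each reward against the
    mean and standard deviation of its own group (responses sampled for
    the same prompt). `eps` avoids division by zero."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def keep_prompt(rewards):
    """Hypothetical difficulty filter: keep a prompt only if its sampled
    responses are neither all correct nor all wrong, because a uniform
    group yields zero advantage for every response."""
    accuracy = mean(rewards)
    return 0.0 < accuracy < 1.0


# Illustrative binary rewards from a rule-based verifier (1 = correct).
groups = {
    "too_easy": [1, 1, 1, 1],
    "too_hard": [0, 0, 0, 0],
    "useful": [1, 0, 0, 1],
}

kept = {name: r for name, r in groups.items() if keep_prompt(r)}
advantages = {name: group_relative_advantages(r) for name, r in kept.items()}
```

In this sketch only the mixed-outcome prompt survives the filter. Its correct responses receive positive advantages and its incorrect ones negative advantages, which is the signal the policy update relies on.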
When to Use Skywork-OR1-32B
This model is ideal for applications requiring strong analytical and logical reasoning, particularly in:
- Mathematical Problem Solving: For tasks involving advanced algebra, geometry, and other complex mathematical challenges.
- Code Generation and Debugging: For scenarios demanding precise and logical code solutions.
- Research and Development: As a foundation for further research into open reasoning models and advanced AI applications.