Skywork/Skywork-OR1-32B-Preview

TEXT GENERATION

  • Concurrency Cost: 2
  • Model Size: 32.8B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Apr 13, 2025
  • Architecture: Transformer
  • Cold Start: 0.1K

Skywork/Skywork-OR1-32B-Preview is a 32.8-billion-parameter Open Reasoner 1 (OR1) model developed by Skywork, designed specifically for advanced mathematical and coding reasoning tasks. The model leverages large-scale rule-based reinforcement learning and a multi-stage training pipeline to achieve performance comparable to the 671B-parameter DeepSeek-R1 on the AIME24, AIME25, and LiveCodeBench benchmarks. It is optimized for complex problem-solving in math and code, making it well suited to applications that require robust reasoning capabilities.


Skywork-OR1-32B-Preview: Advanced Reasoning Model

Skywork-OR1-32B-Preview is part of the Skywork-OR1 (Open Reasoner 1) series, a collection of models engineered specifically for mathematical and coding reasoning. Developed by Skywork, this 32.8-billion-parameter model was trained with a methodology combining large-scale rule-based reinforcement learning and carefully curated datasets.

Key Capabilities & Differentiators

  • Exceptional Reasoning Performance: The model is designed to deliver high performance on complex reasoning tasks, particularly in mathematics and coding.
  • Benchmark Parity: It achieves performance on par with the 671-billion-parameter DeepSeek-R1 model across key benchmarks such as AIME24, AIME25, and LiveCodeBench, despite being roughly 20× smaller.
  • Advanced Training: Employs a customized version of GRPO with both offline and online difficulty-based filtering, rejection sampling, and a multi-stage training pipeline with adaptive entropy control to enhance exploration and stability.
  • Curated Data: Trained on a meticulously selected and cleaned dataset comprising 110K verifiable math problems and 14K coding questions, with model-aware difficulty estimation.
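The model-aware difficulty filtering described above can be illustrated with a minimal sketch: estimate each problem's difficulty from the current model's own pass rate over several sampled rollouts, then drop problems the model always solves or never solves, since neither yields a useful reinforcement-learning signal. The function names, thresholds, and data layout here are illustrative assumptions, not Skywork's actual implementation.

```python
def estimate_difficulty(pass_flags):
    """Model-aware difficulty: fraction of sampled rollouts the model FAILS.

    pass_flags -- list of 0/1 outcomes from independent rollouts on one problem.
    """
    return 1.0 - sum(pass_flags) / len(pass_flags)


def filter_by_difficulty(problems, rollout_results, low=0.0, high=1.0):
    """Keep problems that are neither trivially solved (difficulty <= low)
    nor currently unsolvable (difficulty >= high) for this model.

    problems        -- list of problem identifiers.
    rollout_results -- per-problem lists of 0/1 pass flags, aligned with problems.
    """
    kept = []
    for prob, flags in zip(problems, rollout_results):
        d = estimate_difficulty(flags)
        if low < d < high:  # strictly between: exclude always/never solved
            kept.append(prob)
    return kept


# Example: "easy" is always solved, "hard" never; only "medium" survives.
kept = filter_by_difficulty(
    ["easy", "medium", "hard"],
    [[1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 0, 0]],
)
```

In an online variant of this scheme, the pass flags would be refreshed from the model's latest rollouts each training stage, so the kept set tracks the model's current frontier of capability.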

Evaluation Metrics

Skywork-OR1-32B-Preview is evaluated using Avg@K (average performance across K independent attempts) rather than the traditional Pass@1, providing a more reliable measure of stability and reasoning consistency. On AIME24, it scores 79.7 (Avg@32); on AIME25, 69.0 (Avg@32); and on LiveCodeBench, 63.9 (Avg@4).
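The Avg@K metric described above can be computed with a short sketch: score each of K independent attempts per problem (1 if correct, 0 otherwise), average within each problem, then average across the benchmark. The helper names and the toy data are illustrative, not part of Skywork's evaluation harness.

```python
def avg_at_k(attempt_scores):
    """Avg@K for one problem: mean score over K independent attempts.

    attempt_scores -- list of K scores in [0, 1] (typically 0/1 pass flags).
    """
    return sum(attempt_scores) / len(attempt_scores)


def benchmark_avg_at_k(per_problem_attempts):
    """Benchmark-level Avg@K as a percentage: average the per-problem
    Avg@K values across all problems."""
    per_problem = [avg_at_k(scores) for scores in per_problem_attempts]
    return 100.0 * sum(per_problem) / len(per_problem)


# Toy benchmark of two problems, K=4 attempts each:
# problem 1 solved in 3/4 attempts, problem 2 in 1/4.
score = benchmark_avg_at_k([[1, 0, 1, 1], [0, 0, 1, 0]])  # -> 50.0
```

Because each problem contributes its mean over K attempts rather than a single sample, Avg@K is less sensitive to sampling noise than Pass@1, which is why it better reflects the stability of a reasoning model's outputs.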

Ideal Use Cases

  • Complex Mathematical Problem Solving: Excels in scenarios requiring advanced mathematical reasoning.
  • Code Generation and Debugging: Highly effective for coding tasks, as demonstrated by its LiveCodeBench performance.
  • Research and Development: Suitable for researchers exploring advanced reasoning capabilities in LLMs, particularly those interested in reinforcement learning-based training methodologies.