invincible-jha/SynLogic-Mix-3-32B

Text generation · Concurrency cost: 2 · Model size: 32.8B · Quantization: FP8 · Context length: 32k · Published: Apr 16, 2026 · License: MIT · Architecture: Transformer · Open weights

The invincible-jha/SynLogic-Mix-3-32B model, developed by MiniMaxAI, is a 32.8 billion parameter multi-domain reasoning model built on Qwen2.5-32B-Base. It is trained using Zero-RL on a diverse mixture of logical reasoning, mathematical, and coding data, demonstrating enhanced generalization across these domains. This model excels at complex reasoning tasks, outperforming other models in its class on benchmarks like BBEH and GPQA-Diamond.


Model Overview

SynLogic-Mix-3-32B, developed by MiniMaxAI, is a 32.8 billion parameter multi-domain reasoning model. It is built on the Qwen2.5-32B-Base architecture and trained with Zero-RL, i.e. reinforcement learning applied directly to the base model, with no intermediate supervised instruction-tuning stage. Training on a diverse mixture of logical reasoning, mathematical, and coding data gives it robust performance across a wide range of complex tasks.
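As a sketch of how the model might be used, the snippet below loads it with the Hugging Face `transformers` library and generates a response to a reasoning prompt. This is an assumed usage pattern, not taken from the model card: the repo id is inferred from the page title, and chat-template support is assumed from the Qwen2.5 base.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the page title.
MODEL_ID = "invincible-jha/SynLogic-Mix-3-32B"

def load_and_generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Load the model and generate a completion for a single user prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # use the checkpoint's native precision
        device_map="auto",    # shard across available GPUs (needs `accelerate`)
    )
    # Format the prompt with the tokenizer's chat template (assumed present,
    # as on the Qwen2.5 base model).
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Note that the full FP8 checkpoint of a 32.8B model still requires substantial GPU memory; `device_map="auto"` lets `accelerate` split the weights across multiple devices where available.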

Key Capabilities and Features

  • Multi-Domain Training: Jointly trained on a comprehensive dataset including logical reasoning (SynLogic), mathematics, and coding tasks.
  • Zero-RL Training: Uses pure reinforcement learning (Group Relative Policy Optimization, GRPO) applied directly to the base model, with no supervised fine-tuning stage, enhancing its ability to learn complex reasoning patterns.
  • Diverse Data Mixture: Trained on 35,000 mathematical samples, 9,000 coding samples, and 17,000 SynLogic logical reasoning samples.
  • Enhanced Generalization: Demonstrates superior cross-domain transfer capabilities compared to models trained on single domains, leading to better performance on out-of-domain tasks.
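The core idea of the GRPO training mentioned above is that each prompt gets a group of sampled responses, and each response's advantage is its reward normalized against the group's own mean and standard deviation, so no learned value model is needed. A minimal sketch of that advantage computation (not the model's actual training code; the exact normalization details may differ from the real implementation):

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages in the style of GRPO: normalize each
    response's reward by the mean and std of its own sampling group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# A group of 4 responses sampled for one prompt, scored by a rule-based
# verifier (1.0 = correct answer, 0.0 = incorrect), as is typical for
# Zero-RL on verifiable reasoning tasks.
advantages = grpo_advantages([1.0, 0.0, 0.0, 1.0])
# Correct responses get positive advantage, incorrect ones negative.
```

Responses scored above their group's mean are reinforced and those below it are penalized, which is what lets training start directly from a base model: the signal comes entirely from verifiable rewards rather than from a reference policy or reward model.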

Performance Highlights

SynLogic-Mix-3-32B shows strong performance across several benchmarks:

  • Logical Reasoning: Achieves 28.6 on BBEH and 65.0 on KOR-Bench, matching or surpassing DeepSeek-R1-Distill-Qwen-32B.
  • Coding: Scores 40.7 on LiveCodeBench, outperforming DeepSeek-R1-Zero-Qwen-32B.
  • General Reasoning: Attains 57.5 on GPQA-Diamond, indicating strong out-of-domain reasoning capabilities.

Why Choose SynLogic-Mix-3-32B?

This model is particularly well-suited for applications requiring strong, generalized reasoning across multiple domains, including:

  • Complex Problem Solving: Ideal for tasks that involve a combination of logical deduction, mathematical computation, and code understanding.
  • Research and Development: Provides a robust foundation for further fine-tuning or research into multi-domain reasoning and reinforcement learning applications.
  • Benchmarking: Offers a competitive baseline for evaluating advanced reasoning capabilities in LLMs.