invincible-jha/SynLogic-Mix-3-32B

Text generation · Concurrency cost: 2 · Model size: 32.8B · Quantization: FP8 · Context length: 32k · Published: Apr 16, 2026 · License: MIT · Architecture: Transformer · Open weights

The invincible-jha/SynLogic-Mix-3-32B model, developed by MiniMaxAI, is a 32.8 billion parameter multi-domain reasoning model built on Qwen2.5-32B-Base. It is trained using Zero-RL on a diverse mixture of logical reasoning, mathematical, and coding data, demonstrating enhanced generalization across these domains. This model excels at complex reasoning tasks, outperforming other models in its class on benchmarks like BBEH and GPQA-Diamond.


Model Overview

SynLogic-Mix-3-32B, developed by MiniMaxAI, is a 32.8 billion parameter multi-domain reasoning model. It is built on the Qwen2.5-32B-Base architecture and trained with Zero-RL, i.e. reinforcement learning applied directly to the base model, with no intermediate supervised instruction-tuning stage. Training on a diverse mixture of logical reasoning, mathematical, and coding data gives it robust performance across a wide range of complex tasks.
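As a sketch of how the model might be used, the snippet below loads it with the Hugging Face `transformers` library and generates a response to a reasoning prompt. This is an assumed usage pattern, not taken from the model card: the repo id is inferred from the page title, and chat-template support is assumed from the Qwen2.5 base.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the page title.
MODEL_ID = "invincible-jha/SynLogic-Mix-3-32B"

def load_and_generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Load the model and generate a completion for a single user prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # use the checkpoint's native precision
        device_map="auto",    # shard across available GPUs (needs `accelerate`)
    )
    # Format the prompt with the tokenizer's chat template (assumed present,
    # as on the Qwen2.5 base model).
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Note that the full FP8 checkpoint of a 32.8B model still requires substantial GPU memory; `device_map="auto"` lets `accelerate` split the weights across multiple devices where available.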

Key Capabilities and Features

  • Multi-Domain Training: Jointly trained on a comprehensive dataset including logical reasoning (SynLogic), mathematics, and coding tasks.
  • Zero-RL Training: Uses pure reinforcement learning (Group Relative Policy Optimization, GRPO) applied directly to the base model, with no supervised fine-tuning stage, enhancing its ability to learn complex reasoning patterns.
  • Diverse Data Mixture: Trained on 35,000 mathematical samples, 9,000 coding samples, and 17,000 SynLogic logical reasoning samples.
  • Enhanced Generalization: Demonstrates superior cross-domain transfer capabilities compared to models trained on single domains, leading to better performance on out-of-domain tasks.
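The core idea of the GRPO training mentioned above is that each prompt gets a group of sampled responses, and each response's advantage is its reward normalized against the group's own mean and standard deviation, so no learned value model is needed. A minimal sketch of that advantage computation (not the model's actual training code; the exact normalization details may differ from the real implementation):

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages in the style of GRPO: normalize each
    response's reward by the mean and std of its own sampling group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# A group of 4 responses sampled for one prompt, scored by a rule-based
# verifier (1.0 = correct answer, 0.0 = incorrect), as is typical for
# Zero-RL on verifiable reasoning tasks.
advantages = grpo_advantages([1.0, 0.0, 0.0, 1.0])
# Correct responses get positive advantage, incorrect ones negative.
```

Responses scored above their group's mean are reinforced and those below it are penalized, which is what lets training start directly from a base model: the signal comes entirely from verifiable rewards rather than from a reference policy or reward model.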

Performance Highlights

SynLogic-Mix-3-32B shows strong performance across several benchmarks:

  • Logical Reasoning: Achieves 28.6 on BBEH and 65.0 on KOR-Bench, matching or surpassing DeepSeek-R1-Distill-Qwen-32B.
  • Coding: Scores 40.7 on LiveCodeBench, outperforming DeepSeek-R1-Zero-Qwen-32B.
  • General Reasoning: Attains 57.5 on GPQA-Diamond, indicating strong out-of-domain reasoning capabilities.

Why Choose SynLogic-Mix-3-32B?

This model is particularly well-suited for applications requiring strong, generalized reasoning across multiple domains, including:

  • Complex Problem Solving: Ideal for tasks that involve a combination of logical deduction, mathematical computation, and code understanding.
  • Research and Development: Provides a robust foundation for further fine-tuning or research into multi-domain reasoning and reinforcement learning applications.
  • Benchmarking: Offers a competitive baseline for evaluating advanced reasoning capabilities in LLMs.