MiniMaxAI/SynLogic-32B
Text Generation | Concurrency Cost: 2 | Model Size: 32.8B | Quant: FP8 | Ctx Length: 32k | Published: May 30, 2025 | License: MIT | Architecture: Transformer | Open Weights

MiniMaxAI's SynLogic-32B is a 32.8-billion-parameter logical reasoning model built on Qwen2.5-32B-Base and fine-tuned with reinforcement learning on the SynLogic dataset. It targets complex logical reasoning tasks such as Sudoku and Game of 24, and its reasoning skills generalize to mathematical problem-solving. It achieves state-of-the-art performance on the BBEH benchmark among open-source logical reasoning models, which makes it suitable for applications that require multi-step deduction.


SynLogic-32B: Advanced Logical Reasoning Model

SynLogic-32B, developed by MiniMaxAI, is a 32.8-billion-parameter model designed for logical reasoning. It is built on the Qwen2.5-32B-Base model and fine-tuned with reinforcement learning on the SynLogic dataset.

Key Capabilities

  • Comprehensive Logical Reasoning: Trained on 35 logical reasoning tasks, including Sudoku, Game of 24, Cipher, and Arrow Maze, to cover a broad range of problem types.
  • Verifiable Training Data: Trains on data whose answers can be verified automatically, which supplies a reliable reward signal for reinforcement learning.
  • Strong Generalization: Transfers the learned logical reasoning skills to mathematical problem-solving without explicit mathematical training.
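
To make "automatically verifiable" concrete, here is a minimal sketch of a checker for one of the listed tasks, Game of 24. It is an illustration only, not the SynLogic verifier: the function name `verify_game_of_24`, the answer format (a plain arithmetic expression), and the rule that each number must be used exactly once are assumptions made for this example.

```python
import ast
from collections import Counter
from fractions import Fraction

# Binary operators a Game of 24 answer may use (assumed rule set).
_OPS = {
    ast.Add: lambda a, b: a + b,
    ast.Sub: lambda a, b: a - b,
    ast.Mult: lambda a, b: a * b,
    ast.Div: lambda a, b: a / b,
}


def _evaluate(node, used):
    """Evaluate an arithmetic AST exactly, recording each integer literal in `used`."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    # Anything else (names, calls, unary operators, floats) is rejected.
    raise ValueError("unsupported expression")


def verify_game_of_24(numbers, expression, target=24):
    """Hypothetical checker: True only if `expression` uses each given
    number exactly once and evaluates exactly to `target`."""
    try:
        tree = ast.parse(expression, mode="eval")
        used = []
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return False
    return Counter(used) == Counter(numbers) and value == target
```

For example, `verify_game_of_24([3, 3, 8, 8], "8 / (3 - 8 / 3)")` accepts the answer, while `"4 * 6"` is rejected for the puzzle `[4, 7, 8, 8]` because it does not use the given numbers. Exact `Fraction` arithmetic avoids floating-point rounding in answers that pass through non-integer intermediate values.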

Performance Highlights

SynLogic-32B scores 25.5 on the challenging BBEH benchmark, a 6-point improvement over DeepSeek-R1-Distill-Qwen-32B, which places it among the leading open-source models for logical reasoning. It was trained with Group Relative Policy Optimization (GRPO) on 33k SynLogic-Hard samples, using binary rewards based on answer correctness and format adherence.
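
The reward and advantage computation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not MiniMaxAI's training code: the `<think>`/`<answer>` tag format, the helper names, and the choice to give reward 1 only when both the format and the answer check pass are assumptions. The group-relative normalization (reward minus group mean, divided by group standard deviation) is the standard GRPO outcome-reward formulation.

```python
import re
from statistics import mean, pstdev

# Assumed response format: reasoning in <think>...</think>,
# followed by the final answer in <answer>...</answer>.
_FORMAT = re.compile(
    r"\s*<think>.*?</think>\s*<answer>(.*?)</answer>\s*", re.DOTALL
)


def binary_reward(response, is_correct):
    """Return 1.0 only if the response follows the assumed format and
    its extracted answer passes the task verifier `is_correct`; else 0.0."""
    match = _FORMAT.fullmatch(response)
    if match is None:
        return 0.0
    return 1.0 if is_correct(match.group(1).strip()) else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and (population) standard
    deviation of the group of responses sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

With binary rewards, a group in which every sampled response is correct (or every one is wrong) has zero variance, so all advantages are zero and that prompt contributes no learning signal. This is one reason the difficulty of the training samples matters for this kind of training.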

Good For

  • Applications requiring advanced logical deduction and problem-solving.
  • Tasks involving complex puzzles, strategic games, and mathematical reasoning.
  • Research into reinforcement learning for reasoning tasks and verifiable AI training.