MiniMax-M2.5 is a large language model developed by MiniMax and trained extensively with reinforcement learning in complex real-world environments. It reaches state-of-the-art performance in coding, agentic tool use, search, and office work. The model is also fast and inexpensive to run: it completes tasks up to 37% faster than previous versions and is priced competitively for continuous operation.
MiniMax-M2.5: A Frontier Model for Agentic Tasks
MiniMax-M2.5 is MiniMax's latest large language model. It was trained with reinforcement learning across hundreds of thousands of complex real-world environments, an approach that has produced state-of-the-art performance in several key areas and makes the model well suited to agentic applications.
Key Capabilities & Performance:
- Coding: Achieves 80.2% on SWE-Bench Verified and 51.3% on Multi-SWE-Bench, with strong support for more than 10 programming languages. It plans like a software architect, decomposing a task before writing code.
- Agentic Tool Use & Search: Excels in benchmarks like BrowseComp (76.3%) and Wide Search, showing improved decision-making and generalization in unfamiliar environments. It uses approximately 20% fewer rounds for agentic tasks compared to its predecessor.
- Office Work: Trained in collaboration with professionals in finance, law, and social sciences, M2.5 delivers high-quality outputs for tasks in Word, PowerPoint, and Excel, achieving an average win rate of 59.0% against other mainstream models in internal evaluations.
- Efficiency & Cost-Effectiveness: M2.5 is served at up to 100 tokens per second (TPS), nearly twice as fast as other frontier models, and at significantly lower cost. Running the model continuously for an hour costs $1 at 100 TPS or $0.30 at 50 TPS, which makes it economical for agent development.
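To put the hourly figures in more familiar units, the short Python sketch below converts them into tokens generated per hour and an implied cost per million generated tokens. It rests on a simplifying assumption that is not part of the original text: the whole hourly cost is attributed to generated tokens, ignoring any separate input-token charges. The function names are illustrative only, and the results are back-of-the-envelope values, not official prices.

```python
SECONDS_PER_HOUR = 3600


def tokens_per_hour(tps: float) -> int:
    """Tokens generated in one hour of continuous output at `tps` tokens/second."""
    return int(tps * SECONDS_PER_HOUR)


def implied_cost_per_million(hourly_cost_usd: float, tps: float) -> float:
    """Implied USD per one million generated tokens.

    Assumption: the full hourly cost is attributed to generated tokens.
    """
    return hourly_cost_usd / tokens_per_hour(tps) * 1_000_000


# (TPS, hourly cost in USD) pairs as stated in the text above.
for tps, hourly_cost in [(100, 1.00), (50, 0.30)]:
    print(
        f"{tps} TPS: {tokens_per_hour(tps):,} tokens/hour, "
        f"about ${implied_cost_per_million(hourly_cost, tps):.2f} per million tokens"
    )
```

Under this assumption, one hour at 100 TPS yields 360,000 tokens and one hour at 50 TPS yields 180,000 tokens, so the slower tier is cheaper per token as well as per hour.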
Unique Differentiators:
- Reinforcement Learning at Scale: Leverages an in-house agent-native RL framework, Forge, and advanced algorithms like CISPO, with hundreds of thousands of training environments.
- Real-world Productivity: Designed for practical application. MiniMax reports that M2.5 autonomously completes 30% of its internal tasks and that M2.5-generated code accounts for 80% of its newly committed code.
- Optimized for Agentic Workflows: Focuses on task decomposition, token efficiency, and inference speed to deliver substantial time savings in complex tasks.