Overview
MiniMax-M2.5: A Frontier Model for Agentic Tasks
MiniMax-M2.5 is MiniMaxAI's latest large language model, engineered for state-of-the-art performance across a range of economically valuable tasks, including coding, agentic tool use, search, and office work. It is extensively trained with reinforcement learning in hundreds of thousands of complex real-world environments, emphasizing efficient reasoning and optimal task decomposition.
Key Capabilities
- SOTA Coding Performance: Achieves 80.2% on SWE-Bench Verified and 51.3% on Multi-SWE-Bench. It supports more than 10 programming languages (including Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby) and excels across the full software development lifecycle, from system design to code review.
- Advanced Agentic Tool Use & Search: Demonstrates industry-leading performance on benchmarks such as BrowseComp (76.3%) and Wide Search. It features improved decision-making, solving problems in 20% fewer search rounds than M2.1 and with better token efficiency.
- Professional Office Work: Trained in collaboration with senior professionals in finance, law, and social sciences, M2.5 excels in high-value office scenarios such as Word documents, PowerPoint presentations, and Excel financial modeling, achieving a 59.0% average win rate against mainstream models in internal evaluations.
- Exceptional Efficiency & Cost-Effectiveness: M2.5 is designed for "intelligence too cheap to meter," costing as little as $0.30 per hour for continuous operation at 50 tokens per second. It completes complex agentic tasks 37% faster than M2.1, matching Claude Opus 4.6's speed while costing only 10% as much per task.
- Architectural Planning: Exhibits a unique "Spec-writing tendency," actively decomposing and planning project features, structure, and UI design before coding.
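The efficiency figure above ($0.30 per hour at 50 tokens per second) can be turned into a rough per-token number. The sketch below is only back-of-the-envelope arithmetic: it assumes the whole hourly cost is attributed to generated tokens and ignores input tokens, caching, and any pricing tiers, none of which are specified here. The function names are illustrative and not part of any MiniMax API.

```python
SECONDS_PER_HOUR = 3600


def tokens_per_hour(tokens_per_second: float) -> float:
    """Tokens generated in one hour of continuous operation."""
    return tokens_per_second * SECONDS_PER_HOUR


def implied_cost_per_million_tokens(cost_per_hour: float,
                                    tokens_per_second: float) -> float:
    """Hourly cost spread evenly over the tokens generated in that hour.

    Assumption: the entire hourly cost is attributed to generated tokens.
    """
    return cost_per_hour / tokens_per_hour(tokens_per_second) * 1_000_000


# Figures quoted in the efficiency bullet.
hourly_cost_usd = 0.30
throughput_tps = 50

print(tokens_per_hour(throughput_tps))  # 180000
print(round(implied_cost_per_million_tokens(hourly_cost_usd, throughput_tps), 2))  # 1.67
```

Under these assumptions, one hour of continuous generation yields 180,000 tokens, which works out to roughly $1.67 per million generated tokens. This is an implied blended figure, not a published price.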
Good For
- Developers building complex agentic applications requiring high performance and cost efficiency.
- Automating software development workflows, from 0-to-1 system design to comprehensive code review.
- Tasks requiring expert-level search and information retrieval across dense webpages.
- Office automation in professional fields like finance, law, and social sciences, generating deliverable outputs for Word, PowerPoint, and Excel.
- Scenarios where fast task completion and low operational costs are critical.