mistralai/Magistral-Small-2509
Vision | Concurrency Cost: 2 | Model Size: 24B | Quant: FP8 | Ctx Length: 32k | Published: Sep 12, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights
Magistral-Small-2509 is a 24 billion parameter multimodal language model developed by mistralai, built upon Mistral Small 3.2 (2506). It features enhanced reasoning capabilities, vision integration for analyzing images, and supports dozens of languages. Optimized for efficient reasoning, it can be deployed locally on consumer-grade hardware and excels at complex problem-solving with a 128k context window.
Magistral Small 1.2: Enhanced Multimodal Reasoning
mistralai's Magistral-Small-2509 is a 24 billion parameter multimodal language model, an evolution of Mistral Small 3.2 (2506). It significantly improves upon its predecessor, Magistral Small 1.1, particularly in reasoning and multimodal understanding.
Key Capabilities
- Advanced Reasoning: Capable of generating long chains of reasoning traces, encapsulated in `[THINK]` and `[/THINK]` tokens, before providing a final answer. This process is guided by a specific system prompt for optimal results.
- Multimodality: Integrates a vision encoder, allowing it to process and reason over visual inputs in addition to text. This extends its problem-solving abilities to image-based queries.
- Multilingual Support: Supports dozens of languages, including English, French, German, Japanese, Chinese, and many others.
- Extended Context Window: Features a 128k context window, designed to handle extensive inputs, though performance might see some degradation past 40k tokens.
- Improved Performance: Demonstrates significant performance upgrades over Magistral Small 1.1 across various benchmarks, including AIME24, AIME25, GPQA Diamond, and Livecodebench.
- Refined Output: Offers better LaTeX and Markdown formatting, shorter answers for simple prompts, and reduced likelihood of infinite generation loops.
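Since the model emits its reasoning trace between `[THINK]` and `[/THINK]` tokens before the final answer, client code typically needs to separate the two. A minimal sketch of that post-processing step, assuming the tokens appear literally in the decoded output (the exact serialization may differ by inference backend):

```python
import re

def split_reasoning(output: str) -> tuple[str, str]:
    """Split a Magistral response into (reasoning trace, final answer).

    Assumes the chain of thought is wrapped in literal [THINK]...[/THINK]
    markers, as described on the model card.
    """
    match = re.search(r"\[THINK\](.*?)\[/THINK\]", output, flags=re.DOTALL)
    if match is None:
        # No trace found: treat the whole output as the answer.
        return "", output.strip()
    reasoning = match.group(1).strip()
    answer = output[match.end():].strip()
    return reasoning, answer

# Hypothetical model output for illustration:
sample = "[THINK]2 + 2 is 4 because each pair sums to 4.[/THINK]The answer is 4."
reasoning, answer = split_reasoning(sample)
```

This lets an application log or hide the trace while showing only the final answer to users.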
Good for
- Complex Problem Solving: Ideal for tasks requiring detailed, step-by-step reasoning, especially with its `[THINK]` token mechanism.
- Multimodal Applications: Suitable for use cases that involve analyzing and responding to both text and image inputs.
- Local Deployment: Designed to be efficient enough for local deployment on hardware like an RTX 4090 or a 32GB RAM MacBook (when quantized).
- Multilingual Interactions: Effective for applications requiring understanding and generation in a wide array of languages.
- Apache 2.0 Licensed: Offers flexibility for both commercial and non-commercial use and modification.
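For local deployment, the model is commonly served behind an OpenAI-compatible endpoint (e.g., vLLM). A hedged sketch of building a multimodal request for such a server; the model identifier, message schema, and the reasoning system prompt below are assumptions based on common OpenAI-style chat APIs, not an official specification:

```python
import json

def build_vision_request(prompt: str, image_url: str, system_prompt: str) -> dict:
    """Build an OpenAI-compatible chat payload mixing text and an image.

    The content-part schema ({"type": "text"} / {"type": "image_url"})
    follows the widespread OpenAI-style convention; verify against your
    serving backend's documentation.
    """
    return {
        "model": "mistralai/Magistral-Small-2509",
        "messages": [
            {"role": "system", "content": system_prompt},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
        ],
        "temperature": 0.7,
    }

payload = build_vision_request(
    "What trend does this chart show?",
    "https://example.com/chart.png",  # placeholder image URL
    "Draft your reasoning between [THINK] and [/THINK], then give the final answer.",
)
body = json.dumps(payload)  # ready to POST to the server's /v1/chat/completions
```

The same payload shape works for text-only queries by passing a plain string as the user content.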