mistralai/Magistral-Small-2507
Magistral-Small-2507 is a 24 billion parameter language model developed by Mistral AI, building upon Mistral Small 3.1 with enhanced reasoning capabilities. This model is optimized for complex reasoning tasks, capable of generating long chains of thought before providing an answer. It supports dozens of languages and features a 128k context window, with optimal performance recommended up to 40k tokens, making it suitable for applications requiring detailed logical processing.
Magistral Small 1.1: Enhanced Reasoning Model
Magistral-Small-2507 is a 24 billion parameter language model from Mistral AI, building on Mistral Small 3.1 with added reasoning capabilities. It was trained with Supervised Fine-Tuning (SFT) on reasoning traces from Magistral Medium, followed by Reinforcement Learning (RL). The model is designed to be efficient: once quantized, it can run locally on a single RTX 4090 or a MacBook with 32 GB of RAM.
Key Capabilities
- Advanced Reasoning: Excels at generating long, detailed chains of reasoning traces before arriving at an answer, using the [THINK] and [/THINK] special tokens to delimit its structured thought process.
- Multilingual Support: Proficient in dozens of languages, including English, French, German, Japanese, Chinese, and Arabic.
- Flexible Context Window: Features a 128k context window, with optimal performance recommended up to 40k tokens.
- Improved Behavior: Offers better tone, enhanced LaTeX and Markdown formatting, and reduced likelihood of infinite generation loops compared to previous versions.
- Apache 2.0 License: Provides an open license for both commercial and non-commercial use and modification.
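Because the model wraps its chain of thought in the [THINK] and [/THINK] special tokens described above, applications typically want to separate the reasoning trace from the final answer before showing output to users. A minimal sketch of that post-processing step (the `split_reasoning` helper and the sample response string are illustrative, not part of any official SDK):

```python
import re

def split_reasoning(text: str):
    """Split a model response into its [THINK] trace and final answer.

    Assumes the response wraps its chain of thought in the [THINK]...[/THINK]
    special tokens; returns (trace, answer). If no trace is present,
    the trace is None and the whole text is treated as the answer.
    """
    match = re.search(r"\[THINK\](.*?)\[/THINK\]", text, flags=re.DOTALL)
    if match is None:
        return None, text.strip()
    trace = match.group(1).strip()
    answer = text[match.end():].strip()
    return trace, answer

# Example with a hypothetical model response:
trace, answer = split_reasoning("[THINK]2 + 2 equals 4.[/THINK]The answer is 4.")
```

In practice the trace can be logged or hidden behind a disclosure widget while only the answer is rendered.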
Performance Highlights
Magistral Small 1.1 demonstrates strong performance across various benchmarks, including AIME24 pass@1 (70.52%), AIME25 pass@1 (62.03%), GPQA Diamond (65.78%), and LiveCodeBench (v5) (59.17%).
Recommended Usage
For optimal results, users are advised to employ a specific system prompt that encourages an inner monologue and structured reasoning. The model is compatible with vLLM (recommended for inference) and transformers, with community-supported quantized versions available for llama.cpp and lmstudio.
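A sketch of the recommended vLLM deployment path: serve the model, then query it through vLLM's OpenAI-compatible endpoint. The exact flag set, sampling parameters, and system-prompt wording here are assumptions for illustration; consult the official model card for the precise recommended invocation and prompt.

```shell
# Serve the model with vLLM (flag set is a sketch; the model card
# lists the exact recommended options for Mistral-format weights).
vllm serve mistralai/Magistral-Small-2507 \
  --tokenizer_mode mistral \
  --config_format mistral \
  --load_format mistral

# Query the OpenAI-compatible endpoint. The system prompt below is a
# placeholder standing in for the reasoning prompt Mistral recommends;
# temperature/top_p values are illustrative.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/Magistral-Small-2507",
    "messages": [
      {"role": "system", "content": "First write out your reasoning as an inner monologue, then state the final answer."},
      {"role": "user", "content": "How many prime numbers are there below 30?"}
    ],
    "temperature": 0.7,
    "top_p": 0.95
  }'
```

The response body will contain the [THINK]-delimited trace followed by the answer, which downstream code can separate before display.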