unsloth/Magistral-Small-2509
Vision · Concurrency Cost: 2 · Model Size: 24B · Quant: FP8 · Ctx Length: 32k · Published: Sep 17, 2025 · License: apache-2.0 · Architecture: Transformer

Magistral-Small-2509 is a 24 billion parameter multimodal language model developed by Mistral AI, built upon Mistral Small 3.2 (2506). It features enhanced reasoning capabilities, vision integration for analyzing images, and supports a 128k context window. This model excels at complex reasoning tasks and multimodal understanding, making it suitable for applications requiring advanced analytical processing.


Magistral Small 1.2 Overview

Magistral Small 1.2 is a 24 billion parameter multimodal language model from Mistral AI, based on Mistral Small 3.2 (2506). It has been fine-tuned with added reasoning capabilities through Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL), making it an efficient model for complex analytical tasks. A key update in this version is the integration of a vision encoder, allowing it to process and reason based on multimodal inputs, including images.

Key Capabilities

  • Advanced Reasoning: Capable of generating long chains of reasoning traces before providing an answer, encapsulated by [THINK] and [/THINK] special tokens.
  • Multimodality: Features a vision encoder for analyzing images and reasoning from visual content, alongside text.
  • Multilingual Support: Supports dozens of languages, including English, French, German, Japanese, Chinese, and Arabic.
  • Extended Context Window: Offers a 128k context window, with good performance up to 40k tokens.
  • Improved Performance: Demonstrates significantly better performance than Magistral Small 1.1 across benchmarks such as AIME24, AIME25, GPQA Diamond, and LiveCodeBench.
  • Enhanced Output Quality: Provides better LaTeX and Markdown formatting, shorter answers for easy prompts, and reduced likelihood of infinite generation loops.
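Because the reasoning trace is delimited by the `[THINK]` and `[/THINK]` special tokens, client code often wants to separate the trace from the final answer. A minimal sketch of such a parser (the helper name is illustrative, not part of any Mistral SDK):

```python
def split_reasoning(text: str):
    """Split a Magistral completion into (reasoning_trace, final_answer).

    The model wraps its chain of thought in [THINK] ... [/THINK] special
    tokens; anything after the closing token is the user-facing answer.
    """
    start, end = "[THINK]", "[/THINK]"
    i = text.find(start)
    j = text.find(end)
    if i == -1 or j == -1:
        # No (complete) trace emitted; treat the whole output as the answer.
        return "", text.strip()
    trace = text[i + len(start):j].strip()
    answer = text[j + len(end):].strip()
    return trace, answer

raw = "[THINK]2 + 2 equals 4.[/THINK]The answer is 4."
trace, answer = split_reasoning(raw)
```

Hiding the trace by default and exposing it on demand is a common UI pattern for reasoning models.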

Good For

  • Applications requiring robust reasoning and problem-solving.
  • Multimodal tasks involving both text and image analysis.
  • Deployments on resource-constrained hardware like a single RTX 4090 or a 32GB RAM MacBook, once quantized.
  • Developers looking for an open-licensed model (Apache 2.0) for commercial and non-commercial use.
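For the multimodal use cases above, self-hosted deployments typically expose an OpenAI-compatible chat endpoint that accepts interleaved text and image parts. A hedged sketch of building such a request payload (the model name and helper are illustrative; adapt them to your serving stack):

```python
import base64


def build_vision_request(prompt: str, image_bytes: bytes,
                         model: str = "magistral-small-2509"):
    """Build an OpenAI-style chat payload mixing text and a base64 image.

    The image is inlined as a data URL, which OpenAI-compatible servers
    commonly accept for the "image_url" content part.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }


# POST this payload as JSON to your server's chat/completions endpoint.
payload = build_vision_request("Describe this chart.", b"<png bytes here>")
```

The data-URL approach avoids hosting the image separately, at the cost of a larger request body.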