unsloth/Mistral-Small-3.2-24B-Instruct-2506

Vision · 24B · FP8 · 32768-token context · License: apache-2.0
Overview

Mistral-Small-3.2-24B-Instruct-2506 is an instruction-tuned model from Mistral AI, released as a minor update to its predecessor, Mistral-Small-3.1-24B-Instruct-2503. This 24-billion-parameter model with a 32768-token context window focuses on refining core instruction-following capabilities and addressing common generation issues such as repetition.

Key Enhancements

  • Improved Instruction Following: The model adheres more closely to precise instructions, as evidenced by significant gains on WildBench v2 (65.33% vs 55.6%) and Arena Hard v2 (43.1% vs 19.56%).
  • Reduced Repetition Errors: It produces fewer infinite generations and repetitive answers, roughly a 2x reduction on Mistral's internal metrics (1.29% vs 2.11%).
  • Robust Function Calling: The function-calling template has been made more robust, improving reliability for tool use (see the sketch after this list).
  • STEM Performance: Shows improvements on MMLU Pro (69.06%), MBPP Plus Pass@5 (78.33%), HumanEval Plus Pass@5 (92.90%), and SimpleQA (12.10%).
  • Vision Capabilities: Maintains strong vision reasoning, supporting multimodal inputs for tasks like image analysis and decision-making based on visual context (an image-input sketch follows the function-calling example below).
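
To make the tool-use path concrete, here is a minimal function-calling sketch against a vLLM server exposing its standard OpenAI-compatible API. The endpoint URL and the get_weather tool (its name and schema) are illustrative assumptions, not part of the model card:

    # Assumes a vLLM server is already running locally and serving this model;
    # the get_weather tool below is hypothetical and exists only for illustration.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for the example
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="unsloth/Mistral-Small-3.2-24B-Instruct-2506",
        messages=[{"role": "user", "content": "What is the weather in Paris right now?"}],
        tools=tools,
        temperature=0.15,  # low temperature, per the usage guidance below
    )

    # A robust template should yield a structured tool call rather than free text.
    print(response.choices[0].message.tool_calls)

Tool calls come back as structured JSON arguments for your application to execute before returning the result to the model.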
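
Similarly, a sketch of a multimodal request over the same assumed endpoint; the image URL is a placeholder:

    # Text-plus-image request in the OpenAI-compatible message format.
    # The endpoint and image URL are placeholders for illustration.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="unsloth/Mistral-Small-3.2-24B-Instruct-2506",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image and suggest a next step."},
                {"type": "image_url", "image_url": {"url": "https://example.com/scene.jpg"}},
            ],
        }],
        temperature=0.15,
    )

    print(response.choices[0].message.content)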

Recommended Usage

The model is optimized for serving with vLLM (version 0.9.1 or higher), which is the recommended deployment path for performance. It also supports Transformers when paired with mistral-common >= 1.6.2. Users are advised to set a low sampling temperature (e.g., 0.15) and to provide a system prompt that tailors the model to the task at hand. Deployment in bf16 or fp16 requires approximately 55 GB of GPU RAM. A serving-and-sampling sketch follows.
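
The sketch below combines those recommendations. The vLLM launch flags shown are common for Mistral-format checkpoints but are assumptions to verify against your installed version; the system prompt and question are illustrative:

    # Launch (shell), shown as a comment; verify flags against vLLM >= 0.9.1:
    #   vllm serve unsloth/Mistral-Small-3.2-24B-Instruct-2506 \
    #       --tokenizer-mode mistral \
    #       --enable-auto-tool-choice --tool-call-parser mistral
    #
    # Query with the recommended low temperature and a task-specific system prompt.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="unsloth/Mistral-Small-3.2-24B-Instruct-2506",
        messages=[
            {"role": "system", "content": "You are a concise technical assistant."},
            {"role": "user", "content": "Explain the trade-off between bf16 and fp8 inference."},
        ],
        temperature=0.15,  # recommended low temperature
    )
    print(response.choices[0].message.content)

The low temperature keeps instruction following and tool calling close to deterministic, which suits the model's intended use; raise it only for tasks that benefit from more varied output.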