mistralai/Mistral-Small-24B-Instruct-2501
Hugging Face
Text Generation · Concurrency Cost: 2 · Model Size: 24B · Quant: FP8 · Context Length: 32k · Published: Jan 28, 2025 · License: apache-2.0 · Architecture: Transformer

Mistral-Small-24B-Instruct-2501 is a 24-billion-parameter instruction-fine-tuned large language model developed by Mistral AI. It offers state-of-the-art conversational and reasoning capabilities comparable to larger models, along with native function calling and JSON output. The model is designed for fast-response conversational agents, low-latency function calling, and local inference: when quantized, it fits on a single RTX 4090 or a MacBook with 32 GB of RAM.


Overview

Mistral-Small-24B-Instruct-2501, developed by Mistral AI, is a 24-billion-parameter instruction-fine-tuned model designed to deliver state-of-the-art capabilities in the "small" LLM category. It is an instruction-tuned version of the Mistral-Small-24B-Base-2501 model and is released under the Apache 2.0 license, allowing broad commercial and non-commercial use. The model features a 32k context window and a Tekken tokenizer with a 131k vocabulary.

Key Capabilities

  • Multilingual Support: Proficient in dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish.
  • Agent-Centric Design: Excels in agentic tasks with native function calling and JSON output.
  • Advanced Reasoning: Demonstrates strong conversational and reasoning abilities.
  • System Prompt Adherence: Maintains robust adherence to and support for system prompts.
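The native function calling above is typically exercised through an OpenAI-compatible chat API (for example, when the model is served with vLLM). The sketch below builds such a request payload; the endpoint shape follows the common OpenAI-style `tools` schema, and the `get_weather` tool is a hypothetical example, not part of the model.

```python
import json

# A minimal function-calling request body in the OpenAI-compatible format
# commonly used to serve this model. `get_weather` is a hypothetical tool
# defined purely for illustration.
payload = {
    "model": "mistralai/Mistral-Small-24B-Instruct-2501",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the weather in Paris?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"},
                    },
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",
}

# Serialize for an HTTP POST to the server's /v1/chat/completions route.
body = json.dumps(payload)
print(len(body) > 0)
```

When the model decides to call the tool, the response carries a `tool_calls` entry with JSON-encoded arguments; your application executes the function and returns the result in a follow-up `tool` message.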

Performance Highlights

Internal human evaluations indicate Mistral-Small-24B-Instruct-2501 performs comparably or favorably against models like Gemma-2-27B and Qwen-2.5-32B on proprietary coding and generalist prompts. Public benchmarks show strong performance in:

  • Reasoning & Knowledge: Achieves 0.663 on mmlu_pro_5shot_cot_instruct and 0.453 on gpqa_main_cot_5shot_instruct.
  • Math & Coding: Scores 0.848 on humaneval_instruct_pass@1 and 0.706 on math_instruct.
  • Instruction Following: Records 8.35 on mtbench_dev and 52.27 on wildbench.

Ideal Use Cases

  • Fast Conversational Agents: Suitable for applications requiring quick responses.
  • Low Latency Function Calling: Optimized for efficient tool use and function execution.
  • Local Inference: Can be deployed on consumer-grade hardware (e.g., RTX 4090 or 32GB RAM MacBook when quantized), making it ideal for hobbyists and organizations with sensitive data requirements.
  • Subject Matter Experts: Can be further fine-tuned for specialized domain knowledge.
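A quick back-of-the-envelope calculation shows why the quantized model fits the hardware named above. This estimates weight memory only; KV cache and activations add overhead, so treat the numbers as lower bounds.

```python
# Approximate weight footprint of a 24B-parameter model at different
# precisions. Weights only -- KV cache and activations are extra.
PARAMS = 24e9  # 24 billion parameters

def weight_gib(bytes_per_param: float) -> float:
    """Weight memory in GiB for a given precision."""
    return PARAMS * bytes_per_param / 2**30

bf16 = weight_gib(2.0)  # ~44.7 GiB: too large for a single 24 GB GPU
fp8 = weight_gib(1.0)   # ~22.4 GiB: fits an RTX 4090 (24 GB), tightly
int4 = weight_gib(0.5)  # ~11.2 GiB: comfortable on a 32 GB RAM MacBook

print(f"bf16: {bf16:.1f} GiB, fp8: {fp8:.1f} GiB, int4: {int4:.1f} GiB")
```

This is why the FP8 quant listed in the header is the natural fit for single-GPU deployment, while 4-bit quants leave headroom for context on unified-memory Macs.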
Popular Sampler Settings

The top three parameter combinations used by Featherless users for this model cover the following samplers: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.