chutesai/Mistral-Small-3.1-24B-Instruct-2503

Vision · 24B parameters · FP8 · 32,768-token serving context
Released: Mar 24, 2025 · License: apache-2.0 · Weights: Hugging Face
Overview

Mistral-Small-3.1-24B-Instruct-2503 is an instruction-finetuned model from Mistral AI with 24 billion parameters. It builds on its predecessor, Mistral Small 3 (2501), adding state-of-the-art vision understanding and extending the context window to 128k tokens without compromising text performance. The model is designed to be "knowledge-dense" and can be deployed locally, fitting on a single RTX 4090 or a MacBook with 32 GB of RAM once quantized.

Key Capabilities

  • Vision: Analyzes images and provides insights based on visual content alongside text.
  • Multilingual: Supports dozens of languages, including English, French, German, Japanese, Chinese, and Arabic.
  • Agent-Centric: Offers robust agentic capabilities with native function calling and structured JSON output.
  • Advanced Reasoning: Delivers strong conversational and reasoning performance.
  • Long Context: Features a 128k context window, with strong performance on LongBench v2 and RULER benchmarks.
  • System Prompt Adherence: Maintains strong adherence to system prompts.
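As a sketch of the native function-calling capability above, the snippet below declares a tool in the OpenAI-style schema that many serving stacks accept for this model. The tool name and its fields are hypothetical examples, not part of the model's API:

```python
import json

# Hypothetical weather-lookup tool, declared as an OpenAI-style
# function-calling schema (JSON Schema for the parameters).
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

request = {
    "model": "chutesai/Mistral-Small-3.1-24B-Instruct-2503",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [get_weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
print(json.dumps(request, indent=2))
```

When the model decides to call the tool, the response contains a `tool_calls` entry with JSON arguments matching this schema, which the client executes and feeds back as a `tool` message.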

Benchmark Highlights

The model demonstrates competitive performance across various benchmarks:

  • Text Evals: Achieves 80.62% on MMLU and 88.41% on HumanEval.
  • Vision Evals: Scores 64.00% on MMMU and 68.91% on MathVista, outperforming several comparable models.
  • Multilingual Evals: Shows an average of 71.18% across European, East Asian, and Middle Eastern languages.

Good For

  • Fast-response conversational agents.
  • Low-latency function calling.
  • Subject matter experts via fine-tuning.
  • Local inference for hobbyists and organizations handling sensitive data.
  • Programming and math reasoning.
  • Long document understanding.
  • Visual understanding.
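For the visual-understanding use case, OpenAI-compatible servers commonly accept images as data-URL content parts in a multimodal message. The sketch below builds such a message from raw bytes; the stand-in PNG header is a placeholder, and a real call would read an actual image file:

```python
import base64

def image_part(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Wrap raw image bytes as an OpenAI-style image_url content part."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "image_url",
            "image_url": {"url": f"data:{mime};base64,{b64}"}}

# Stand-in bytes (PNG magic number plus padding), not a real image.
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        image_part(fake_png),
    ],
}
```

Such a message slots into the same `messages` list as plain-text turns, so vision and text requests share one request format.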