hehua2008/Mistral-Small-3.2-24B-Instruct-2506-abliterated

VISIONConcurrency Cost:2Model Size:24BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jun 10, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

Mistral-Small-3.2-24B-Instruct-2506 is a 24 billion parameter instruction-tuned language model developed by Mistral AI, building upon Mistral-Small-3.1. This model significantly improves instruction following, reduces repetition errors, and features a more robust function calling template. It is optimized for precise instruction adherence and reliable tool use, making it suitable for complex conversational AI and automated task execution.

Loading preview...

Overview

hehua2008/Mistral-Small-3.2-24B-Instruct-2506 is a 24 billion parameter instruction-tuned model, representing a minor but significant update to its predecessor, Mistral-Small-3.1-24B-Instruct-2503. Developed by Mistral AI, this iteration focuses on enhancing core conversational and functional capabilities.

Key Improvements

  • Instruction Following: Demonstrates improved accuracy in adhering to precise instructions, with Wildbench v2 scores increasing from 55.6% to 65.33% and Arena Hard v2 from 19.56% to 43.1%.
  • Repetition Errors: Significantly reduces infinite generations and repetitive answers, cutting internal infinite generation rates by nearly half (from 2.11% to 1.29%).
  • Function Calling: Features a more robust function calling template, enhancing reliability for tool-use tasks.

Performance Highlights

While maintaining or slightly improving performance across most categories, Mistral-Small-3.2-24B-Instruct-2506 shows notable gains in specific STEM benchmarks:

  • MMLU Pro (5-shot CoT): Improved from 66.76% to 69.06%.
  • MBPP Plus - Pass@5: Increased from 74.63% to 78.33%.
  • HumanEval Plus - Pass@5: Rose from 88.99% to 92.90%.
  • SimpleQA (TotalAcc): Improved from 10.43% to 12.10%.

Usage Recommendations

This model is recommended for use with vLLM (version >= 0.9.1) for optimal performance, though transformers is also supported. Users should employ a low temperature (e.g., 0.15) and provide a detailed system prompt for best results, especially for general assistant roles.