Model Overview
MuXodious/Mistral-Small-3.2-24B-Instruct-2506-absolute-heresy is a 24 billion parameter instruction-tuned model derived from Mistral-Small-3.2-24B-Instruct-2506. It was fine-tuned using P-E-W's Heretic engine, specifically with Magnitude-Preserving Orthogonal Ablation, to achieve an "Absolute Heresy" classification, indicating a low refusal rate (5/100) and KL Divergence (0.0749).
Key Enhancements
This model builds upon its predecessor, Mistral-Small-3.1-24B-Instruct-2503, with notable improvements:
- Instruction Following: Demonstrates enhanced ability to follow precise instructions, with an internal accuracy of 84.78% on instruction following tasks.
- Repetition Errors: Significantly reduces infinite generations by 2x on challenging prompts, achieving a rate of 1.29% compared to 2.11% in the previous version.
- Function Calling: Features a more robust function calling template, excelling in tool-use scenarios.
- Multimodal Capabilities: Supports vision reasoning, allowing it to process and interpret image inputs for tasks like scenario analysis and object identification.
Performance Highlights
While maintaining or slightly improving performance across most categories, Mistral-Small-3.2-24B-Instruct-2506 shows specific gains:
- Instruction Following/Chat: Achieves 65.33% on Wildbench v2 and 43.1% on Arena Hard v2.
- Code Generation: Improves on code benchmarks, scoring 78.33% on MBPP Plus - Pass@5 and 92.90% on HumanEval Plus - Pass@5.
- Vision: Shows strong performance in vision tasks, with 87.4% on ChartQA and 94.86% on DocVQA.
Recommended Usage
This model is recommended for applications requiring strong instruction adherence, reduced repetitive outputs, and robust function calling. Its multimodal capabilities make it suitable for tasks involving both text and image inputs. It can be used with vLLM (recommended for optimal performance) or transformers frameworks. A low temperature (e.g., 0.15) and a well-defined system prompt are advised for best results.