unsloth/Mistral-Small-24B-Instruct-2501
Text Generation · Model Size: 24B · Quant: FP8 · Context Length: 32k · Concurrency Cost: 2 · Published: Jan 30, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

unsloth/Mistral-Small-24B-Instruct-2501 is a 24 billion parameter instruction-tuned language model developed by Mistral AI, built on Mistral-Small-24B-Base-2501. It features a 32k context window and is optimized for agentic use, including native function calling and JSON output. The model performs well on conversational and reasoning tasks, supports dozens of languages, and is designed for efficient local deployment on hardware such as an RTX 4090 or a MacBook with 32GB of RAM.


Model Overview

unsloth/Mistral-Small-24B-Instruct-2501 is a 24 billion parameter instruction-fine-tuned model from Mistral AI, designed to offer state-of-the-art capabilities in the "small" LLM category. It is built upon Mistral-Small-24B-Base-2501 and features a substantial 32k context window. The model is notable for its efficiency: it can run locally on a single RTX 4090 or a MacBook with 32GB of RAM, especially when quantized.
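
A minimal sketch of loading the model for local inference with Hugging Face transformers follows. The 4-bit quantization via bitsandbytes is an assumption chosen to fit the 24B weights on a single consumer GPU; any quantization backend transformers supports would work similarly.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "unsloth/Mistral-Small-24B-Instruct-2501"

# 4-bit quantization (an assumption; the hosted variant above uses FP8)
# brings the 24B weights within reach of a 24GB GPU such as an RTX 4090.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU(s)
)

messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Explain what a 32k context window means."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```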

Key Features & Capabilities

  • Multilingual Support: Proficient in dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, and Korean.
  • Agent-Centric Design: Offers robust agentic capabilities with native function calling and JSON output, making it suitable for complex automated workflows (see the sketch after this list).
  • Advanced Reasoning: Demonstrates strong conversational and reasoning abilities, comparable to larger models.
  • System Prompt Adherence: Maintains strong adherence to and support for system prompts, enhancing control over model behavior.
  • Apache 2.0 License: Provides an open license for both commercial and non-commercial use and modification.
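
The agentic features are easiest to see over an OpenAI-compatible API. In this hedged sketch, the base_url, API key, and the get_weather tool are hypothetical placeholders, and the choice of serving endpoint (vLLM, Featherless, etc.) is an assumption rather than anything the card prescribes.

```python
from openai import OpenAI

# Hypothetical endpoint hosting the model behind an OpenAI-compatible API.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

# A hypothetical tool definition; the model's native function calling lets it
# emit structured JSON arguments for tools like this instead of free text.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="unsloth/Mistral-Small-24B-Instruct-2501",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# When the model opts to call the tool, the arguments arrive as JSON.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```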

Performance Highlights

Mistral-Small-24B-Instruct-2501 shows competitive performance across various benchmarks:

  • Reasoning & Knowledge: Achieves 0.663 on mmlu_pro_5shot_cot_instruct and 0.453 on gpqa_main_cot_5shot_instruct.
  • Math & Coding: Scores 0.848 on humaneval_instruct_pass@1 and 0.706 on math_instruct.
  • Instruction Following: Records 8.35 on mtbench_dev and 52.27 on wildbench.

Good for

  • Fast Response Conversational Agents: Its efficiency and conversational capabilities make it ideal for chatbots requiring quick interactions.
  • Low Latency Function Calling: Excellent for applications needing rapid execution of tool-use and function calls.
  • Subject Matter Experts: Can be fine-tuned for specialized domain knowledge (a fine-tuning sketch follows this list).
  • Local Inference: Suitable for hobbyists and organizations handling sensitive data that require on-premise model deployment.
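
For the fine-tuning use case, here is a hedged sketch using the Unsloth library's FastLanguageModel API. The LoRA hyperparameters are common defaults for illustration, not values recommended by the model card, and the training loop itself is omitted.

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit (QLoRA-style) so the 24B weights fit on a
# single 24GB GPU during adapter training.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Mistral-Small-24B-Instruct-2501",
    max_seq_length=32768,  # matches the model's 32k context window
    load_in_4bit=True,
)

# Attach trainable low-rank adapters to the attention projections.
# r, lora_alpha, and target_modules are illustrative defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# From here, train on a domain dataset with e.g. trl's SFTTrainer.
```
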
Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model tune the following sampler settings: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
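
As an illustration only, here is one way to pass those sampler parameters through an OpenAI-compatible request. The numeric values are hypothetical placeholders, not the actual Featherless user configurations, and support for the non-standard knobs varies by server.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

response = client.chat.completions.create(
    model="unsloth/Mistral-Small-24B-Instruct-2501",
    messages=[{"role": "user", "content": "Write a haiku about autumn."}],
    # Standard OpenAI sampler parameters (values are hypothetical).
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # top_k, repetition_penalty, and min_p are not part of the OpenAI schema;
    # many inference servers accept them via extra_body.
    extra_body={"top_k": 40, "repetition_penalty": 1.05, "min_p": 0.05},
)
print(response.choices[0].message.content)
```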