mistralai/Mistral-Small-24B-Base-2501

Status: Warm
Visibility: Public
Parameters: 24B
Quantization: FP8
Context length: 32768
License: apache-2.0

Mistral-Small-24B-Base-2501 Overview

Mistral-Small-24B-Base-2501 is a 24-billion-parameter base language model developed by Mistral AI, positioned as a high-performing model in the sub-70B category. It features a 32k context window and uses the Tekken tokenizer with a 131k-token vocabulary, which handles multilingual text efficiently.
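
The model loads through the standard Hugging Face transformers interfaces. A minimal loading sketch, assuming the transformers library (plus accelerate for device placement) and access to the checkpoint; the precision and device choices below are illustrative, not prescribed by the model card:

    # Minimal loading sketch; requires `transformers` (and `accelerate`
    # for device_map="auto"). Dtype/device choices are illustrative.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-Small-24B-Base-2501"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the checkpoint's native precision
        device_map="auto",    # shard across available devices
    )

    print(len(tokenizer))                         # ~131k Tekken vocabulary
    print(model.config.max_position_embeddings)   # 32768 context window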

Key Capabilities

  • Multilingual Support: The model supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish (see the completion sketch after this list).
  • Advanced Reasoning: It exhibits strong conversational and reasoning capabilities, reflected in the benchmark results below.
  • Benchmark Performance: Achieves 80.73 on MMLU (5-shot), 54.37 on MMLU Pro (5-shot, CoT), and 69.64 on MBPP (pass@1). It also scores 80.73 on GSM8K (5-shot, maj@1) and 45.98 on MATH (4-shot, maj).
  • Open License: Released under the Apache 2.0 License, allowing for broad commercial and non-commercial use and modification.
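
Because this is a base (pretrained) checkpoint rather than an instruct model, it is driven by plain text completion instead of chat-style prompting. A minimal sketch, assuming the same transformers setup as above; the French prompt is only an illustrative example of the multilingual support listed here:

    # Plain text completion with the base checkpoint: it continues raw
    # text rather than following instructions. Prompt is illustrative.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-Small-24B-Base-2501"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    prompt = "La capitale de la France est"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=20, do_sample=False)
    print(tokenizer.decode(out[0], skip_special_tokens=True))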

Good For

  • Developing applications requiring strong multilingual understanding and generation.
  • Tasks demanding advanced reasoning and problem-solving, such as complex question answering and mathematical challenges.
  • Serving as a foundational model for further fine-tuning on specialized datasets or tasks, given its robust base capabilities (a minimal adapter-tuning sketch follows).
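
For the fine-tuning use case above, parameter-efficient adapters are a common starting point. A minimal LoRA sketch, assuming the peft library; the rank, alpha, and target modules are illustrative defaults for Mistral-style attention, not values prescribed by the model card, and the dataset and training loop are omitted:

    # Hedged LoRA fine-tuning setup with `peft`; hyperparameters are
    # illustrative placeholders. Dataset and training loop are omitted.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model_id = "mistralai/Mistral-Small-24B-Base-2501"
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "v_proj"],  # attention projections in Mistral-style blocks
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only the small adapter matrices train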