Mistral-Small-24B-Base-2501 Overview
Mistral-Small-24B-Base-2501 is a 24-billion-parameter base language model developed by Mistral AI, positioned as a high-performing model in the sub-70B category. It features a 32k context window and uses the Tekken tokenizer with a 131k-token vocabulary.
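For orientation, here is a minimal loading sketch using Hugging Face transformers. It assumes the transformers and torch packages are installed and that the checkpoint is published as `mistralai/Mistral-Small-24B-Base-2501`; hardware settings are illustrative.

```python
# Minimal sketch: load the base model and tokenizer with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Small-24B-Base-2501"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 24B params in bf16 needs roughly 48 GB of VRAM
    device_map="auto",           # shard across available GPUs
)

print(len(tokenizer))  # Tekken tokenizer: ~131k entries
```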
Key Capabilities
- Multilingual Support: The model supports dozens of languages, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, and Polish.
- Advanced Reasoning: It exhibits strong conversational and reasoning capabilities, reflected in the benchmark results below.
- Benchmark Performance: Scores 80.73 on MMLU (5-shot) and 54.37 on MMLU Pro (5-shot, CoT); on math and code, it scores 80.73 on GSM8K (5-shot, maj@1), 45.98 on MATH (4-shot, maj@4), and 69.64 on MBPP (pass@1). As these are few-shot results for a base model, they rely on plain-text few-shot prompting; see the sketch after this list.
- Open License: Released under the Apache 2.0 License, allowing for broad commercial and non-commercial use and modification.
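Because this is a base (non-instruct) model, it has no chat template: it is prompted with plain-text few-shot examples, mirroring the n-shot setups cited above. A hedged sketch, continuing from the loading example (the prompt content is purely illustrative):

```python
# Continues the loading sketch above: `model` and `tokenizer` are already set.
# Few-shot examples are supplied as raw text the model completes.
prompt = (
    "Q: What is 12 * 7?\nA: 84\n\n"
    "Q: What is 9 + 15?\nA: 24\n\n"
    "Q: What is 6 * 13?\nA:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=8, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```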
Good For
- Developing applications requiring strong multilingual understanding and generation.
- Tasks demanding advanced reasoning and problem-solving, such as complex question answering and mathematical challenges.
- Serving as a foundational model for further fine-tuning on specialized datasets or tasks, given its robust base capabilities; see the fine-tuning sketch after this list.
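For the fine-tuning use case, below is a minimal LoRA sketch using the peft library. The rank, alpha, and target modules are illustrative assumptions rather than recommendations from the model card, and the dataset pipeline and training loop are omitted.

```python
# Hedged sketch: wrap the base model with LoRA adapters via peft.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-Small-24B-Base-2501",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                 # adapter rank (illustrative value)
    lora_alpha=32,                        # scaling factor (illustrative value)
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Only the adapter weights train; the 24B base parameters stay frozen.
model.print_trainable_parameters()
```

Training only the low-rank adapters keeps memory requirements far below full fine-tuning of all 24B parameters, which is why this approach is a common starting point for adapting base models of this size.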