casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only

  • Task: Text Generation
  • Concurrency Cost: 2
  • Model Size: 24B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: May 22, 2025
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights
  • Cold

casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only is a 24-billion-parameter, text-only variant of the Mistral-Small-3.1-24B-Base-2503 model, published by casperhansen. It was created by removing the vision encoder and converting the architecture from mistral3 to mistral. The model supports a 128k-token context length; the listing metadata shows a 32k context length, so the usable window on this deployment may be smaller. On text tasks it scores 77.25% on 0-shot MMLU, compared with 77.34% for its multimodal counterpart.

Model Overview

casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only is a 24-billion-parameter base model derived from mistralai/Mistral-Small-3.1-24B-Base-2503. This version targets text-only applications: the original model's vision encoder was removed and the architecture was converted from mistral3 to mistral. It also serves as the base for mistralai/Devstral-Small-2505.
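The page does not describe how the conversion was performed. As a rough illustration of what "removing the vision encoder and converting from mistral3 to mistral" could involve, the sketch below operates on plain dictionaries and name lists rather than a real checkpoint. The config keys (`text_config`, `vision_config`) and the `language_model.` weight prefix are assumptions made for this example, and both helper functions are hypothetical; the actual layout should be checked against the checkpoint files.

```python
from typing import Dict, List

# Assumed prefix for language-model weights in the multimodal checkpoint.
# This is an illustrative guess, not taken from the model card.
TEXT_PREFIX = "language_model."


def to_text_only_config(config: dict) -> dict:
    """Return a copy of the nested text config, tagged as a plain 'mistral' model.

    Everything outside config["text_config"] (for example the vision settings)
    is left behind.
    """
    text_config = dict(config["text_config"])
    text_config["model_type"] = "mistral"
    return text_config


def to_text_only_weights(weight_names: List[str]) -> Dict[str, str]:
    """Map each kept weight name to its new name.

    Names that do not start with TEXT_PREFIX (vision tower, projector) are dropped;
    kept names have the prefix stripped.
    """
    return {
        name: name[len(TEXT_PREFIX):]
        for name in weight_names
        if name.startswith(TEXT_PREFIX)
    }


if __name__ == "__main__":
    # Toy stand-ins, not real values from the checkpoint.
    toy_config = {
        "model_type": "mistral3",
        "text_config": {"hidden_size": 8},
        "vision_config": {"hidden_size": 4},
    }
    toy_names = [
        "language_model.model.layers.0.mlp.up_proj.weight",
        "vision_tower.patch_conv.weight",
    ]
    print(to_text_only_config(toy_config))
    print(to_text_only_weights(toy_names))
```

A real conversion would also have to rewrite the stored tensors and tokenizer/processor files; this sketch only shows the selection and renaming logic.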

Key Capabilities

  • Text-Only Processing: Handles natural language understanding and generation only. The vision encoder has been removed, so image inputs are not supported.
  • Extended Context Length: Supports a 128k-token context window, enabling processing of longer texts and complex queries.
  • Strong General Reasoning: Achieves a 0-shot MMLU score of 77.25%. MMLU is a multiple-choice benchmark spanning many academic and professional subjects. The score is 0.09 percentage points below the 77.34% of the original multimodal variant, indicating minimal performance degradation from the text-only conversion.
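Since the model supports a 128k window while the listing metadata shows a 32k context length, it can help to check a request against whichever limit applies before sending it. The following is a minimal sketch of that arithmetic. It assumes "128k" and "32k" mean 131,072 and 32,768 tokens respectively, and that the caller already has a prompt token count from the model's tokenizer; the function names are hypothetical.

```python
CTX_128K = 128 * 1024  # assumption: "128k" is read as 131,072 tokens
CTX_32K = 32 * 1024    # assumption: "32k" is read as 32,768 tokens


def fits_in_context(prompt_tokens: int, max_new_tokens: int, ctx_len: int) -> bool:
    """True if the prompt plus the requested generation length fits in the window."""
    return prompt_tokens + max_new_tokens <= ctx_len


def max_new_tokens_allowed(prompt_tokens: int, ctx_len: int) -> int:
    """Number of tokens left for generation, or 0 if the prompt already fills the window."""
    if prompt_tokens < 0 or ctx_len <= 0:
        raise ValueError("prompt_tokens must be >= 0 and ctx_len must be > 0")
    return max(ctx_len - prompt_tokens, 0)


if __name__ == "__main__":
    # A 40,000-token prompt exceeds the 32k limit but fits the 128k window.
    print(fits_in_context(40_000, 512, CTX_32K))
    print(fits_in_context(40_000, 512, CTX_128K))
```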

Use Cases

This model suits applications that need text processing without the overhead of multimodal components. Its large context window is useful for:

  • Long-form content generation and summarization
  • Complex question answering and information extraction
  • Code generation and analysis (as a base for models like Devstral-Small-2505)
  • General-purpose conversational AI and chatbots (as a base model, it would typically need instruction tuning or careful prompting for this)
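Because this is a base model rather than an instruction-tuned one, it is usually prompted with plain text to continue instead of chat messages. The sketch below shows one way this might look with the Hugging Face `transformers` library. It assumes the repository ships a compatible tokenizer, that `transformers`, `torch`, and `accelerate` are installed, and that enough GPU memory is available for a 24B model. `build_few_shot_prompt` and `complete` are hypothetical helper names, and the Q/A layout is just one possible prompt format.

```python
from typing import List, Tuple

MODEL_ID = "casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only"


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Format Q/A pairs as plain text ending in an open 'A:' for the model to continue."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy-decode a continuation. Downloads the full checkpoint on first use."""
    # Imported lazily so the prompt helper works without these packages installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",
        device_map="auto",  # requires the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        [("What is the capital of France?", "Paris")],
        "What is the capital of Italy?",
    )
    print(prompt)
    # print(complete(prompt))  # uncomment on a machine that can host the model
```

A base model tends to keep generating past the answer, so callers would typically cap `max_new_tokens` or truncate the output at the next "Q:" line.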