casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only
TEXT GENERATION | Concurrency Cost: 2 | Model Size: 24B | Quant: FP8 | Ctx Length: 32k | Published: May 22, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only is a 24-billion-parameter, text-only variant of the Mistral-Small-3.1-24B-Base-2503 model, published by casperhansen. It was created by removing the vision encoder and converting the architecture from mistral3 to mistral. The model description states a 128k context length, while the listing metadata gives a context length of 32k. On text-based tasks it reports a 0-shot MMLU score of 77.25%, comparable to its multimodal counterpart.
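The conversion described above (dropping the vision encoder and switching from mistral3 to mistral) can be sketched roughly as follows. This is an illustration only, not the author's actual conversion script: the key prefixes `vision_tower.`, `multi_modal_projector.` and `language_model.`, the `text_config` field, and both helper functions are assumptions made for the example, and the real checkpoint layout may differ.

```python
# Minimal sketch of a multimodal-to-text-only conversion.
# ASSUMPTION: these prefixes are hypothetical stand-ins for how a
# multimodal checkpoint might name its vision and language weights.
VISION_PREFIXES = ("vision_tower.", "multi_modal_projector.")
LANGUAGE_PREFIX = "language_model."


def to_text_only_weights(state_dict):
    """Drop vision-related entries and strip the language-model prefix."""
    text_only = {}
    for name, tensor in state_dict.items():
        if name.startswith(VISION_PREFIXES):
            continue  # vision encoder and projector weights are removed
        if name.startswith(LANGUAGE_PREFIX):
            name = name[len(LANGUAGE_PREFIX):]
        text_only[name] = tensor
    return text_only


def to_text_only_config(config):
    """Promote the nested text config and relabel the architecture.

    ASSUMPTION: the multimodal config nests the language-model settings
    under a "text_config" key.
    """
    text_config = dict(config.get("text_config", {}))
    text_config["model_type"] = "mistral"
    return text_config


if __name__ == "__main__":
    # Toy stand-ins for real tensors and config values.
    weights = {
        "vision_tower.patch_conv.weight": "v0",
        "multi_modal_projector.linear_1.weight": "p0",
        "language_model.model.embed_tokens.weight": "e0",
        "language_model.lm_head.weight": "h0",
    }
    config = {"model_type": "mistral3", "text_config": {"hidden_size": 8}}
    print(sorted(to_text_only_weights(weights)))
    print(to_text_only_config(config))
```

With real checkpoints the values would be tensors rather than strings, and the resulting weights and config would be saved together so that a plain mistral loader can read them.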
