Vikhrmodels/Vistral-24B-Instruct

Text Generation · Concurrency Cost: 2 · Model Size: 24B · Quant: FP8 · Ctx Length: 32k · Published: Sep 28, 2025 · License: apache-2.0 · Architecture: Transformer

Vikhrmodels/Vistral-24B-Instruct is a 24-billion-parameter unimodal large language model developed by VikhrModels, based on Mistral-Small-3.2-24B-Instruct-2506. Optimized primarily for Russian and English, it removes the visual encoder and multimodal capabilities of its base model. It excels at instruction following in both languages and demonstrates strong performance on Russian-language benchmarks.


Vistral-24B-Instruct: A Bilingual LLM for Russian and English

Vistral-24B-Instruct is a 24-billion-parameter unimodal large language model developed by VikhrModels. It is an enhanced version of mistralai/Mistral-Small-3.2-24B-Instruct-2506, specifically adapted for Russian and English. The base model's multimodal capabilities, including the visual encoder, have been removed, while the standard MistralForCausalLM architecture is retained.

Key Capabilities & Performance

  • Bilingual Optimization: Primarily adapted and optimized for instruction following in Russian and English.
  • Strong Russian Performance: Achieves a 96.1% winrate on the ru-arena-general open-source SbS benchmark, outperforming the base Mistral-Small-3.2-24B-Instruct-2506 (92.1%).
  • Instruction Following: Designed for accurate and complete execution of instructions.

Usage Recommendations & Limitations

  • Safety: The model has limited built-in response safety; users should implement and test their own safety measures.
  • System Prompts: Most effective when written in English; best used to specify response style (e.g., "answer only in JSON format").
  • Generation Parameters: Use low temperatures (0.1-0.5) and moderate top_k values (30-50) to avoid generation defects.
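The recommendations above can be sketched as a minimal transformers-based inference setup. This is illustrative, not an official snippet from the model card: the helper names (`build_messages`, `generate`, `GEN_KWARGS`) are assumptions, while the model ID, the English system prompt style, and the sampling ranges come from the card itself.

```python
"""Minimal sketch of calling Vistral-24B-Instruct with the card's
recommended settings. Helper names are illustrative, not official."""

MODEL_ID = "Vikhrmodels/Vistral-24B-Instruct"

# Decoding parameters from the model card: temperature in 0.1-0.5 and
# top_k in 30-50 to avoid generation defects.
GEN_KWARGS = {
    "do_sample": True,
    "temperature": 0.3,
    "top_k": 40,
    "max_new_tokens": 512,
}


def build_messages(user_text: str) -> list[dict]:
    # Per the card, system prompts work best in English and are best
    # used to constrain response style.
    return [
        {"role": "system", "content": "Answer only in JSON format."},
        {"role": "user", "content": user_text},
    ]


def generate(user_text: str) -> str:
    # transformers is imported lazily so the helpers above stay usable
    # without the (large) model dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(user_text),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, **GEN_KWARGS)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Note that loading a 24B FP8 checkpoint requires substantial GPU memory; `device_map="auto"` lets transformers shard the model across available devices.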

VikhrModels provides the training code in their effective_llm_alignment GitHub repository and datasets on their Hugging Face profile.