Vistral-24B-Instruct: A Bilingual LLM for Russian and English
Vistral-24B-Instruct is a 24-billion-parameter unimodal Large Language Model developed by VikhrModels. It is an enhanced version of mistralai/Mistral-Small-3.2-24B-Instruct-2506, specifically adapted for Russian and English. The multimodal components, including the visual encoder, have been removed, while the standard MistralForCausalLM architecture is retained.
Key Capabilities & Performance
- Bilingual Optimization: Primarily adapted and optimized for instruction following in Russian and English.
- Strong Russian Performance: Achieves a 96.1% winrate on the ru-arena-general open-source side-by-side (SbS) benchmark, outperforming the base Mistral-Small-3.2-24B-Instruct-2506 (92.1%).
- Instruction Following: Designed for accurate and complete execution of instructions.
Usage Recommendations & Limitations
- Safety: The model has a low level of response safety; users should implement their own safety measures and testing.
- System Prompts: Best used for specifying response style (e.g., "answer only in JSON format"); they are most effective when written in English.
- Generation Parameters: Use low temperatures (0.1-0.5) and top_k values of 30-50 to avoid generation defects.
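The recommendations above can be sketched with Hugging Face transformers. This is a minimal example, not an official snippet from the model card: the repo id `Vikhrmodels/Vistral-24B-Instruct` is an assumption based on the model name, and the prompt contents are placeholders; only the sampling ranges (temperature 0.1-0.5, top_k 30-50) and the English-system-prompt advice come from the text above.

```python
# Sketch of applying the model card's recommendations; the repo id is an
# assumption based on the model name and may need adjusting.
MODEL_ID = "Vikhrmodels/Vistral-24B-Instruct"  # assumed Hugging Face repo id

def build_messages(user_prompt: str) -> list[dict]:
    # English system prompts are reported to be most effective; this one
    # pins the response style, as suggested in the recommendations.
    return [
        {"role": "system", "content": "Answer only in JSON format."},
        {"role": "user", "content": user_prompt},
    ]

# Recommended sampling parameters: low temperature (0.1-0.5), top_k 30-50.
GENERATION_KWARGS = {
    "do_sample": True,
    "temperature": 0.3,
    "top_k": 40,
    "max_new_tokens": 512,
}

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages("Name three rivers in Russia."),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, **GENERATION_KWARGS)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Remember that the model ships with low response safety, so any deployment around a call like this should add its own input and output filtering.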
VikhrModels provides the training code in their effective_llm_alignment GitHub repository and datasets on their Hugging Face profile.