Overview
This model, msu-rcc-lair/RuadaptQwen2.5-32B-Instruct, is a 32.8-billion-parameter instruction-tuned variant of the Qwen2.5 architecture adapted for the Russian language. It uses a custom tokenizer (the tiktoken cl100k vocabulary extended with a 48k unigram tokenizer) and underwent continued pretraining on a substantial Russian corpus. A key innovation is the application of Learned Embedding Propagation (LEP) to further improve its performance.
Key Capabilities & Differentiators
- Enhanced Russian Language Performance: Achieves up to a 60% increase in Russian text generation speed (characters/words per second) over the original Qwen2.5-32B-Instruct, attributed to its specialized tokenizer and continued pretraining.
- Instruction-Tuned: Designed to follow instructions effectively, making it suitable for various conversational and task-oriented applications.
- Evaluated on Russian Benchmarks: Performance has been assessed on Ru-Arena-General (with repetition_penalty=1.1) and MERA, demonstrating its capabilities in Russian language understanding and generation tasks. A custom system prompt was used for MERA submissions to mitigate evaluation shortcomings on coding tasks.
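The speed gain above comes from tokenizer efficiency rather than faster decoding: a Russian-adapted vocabulary packs more characters into each token, so the same tokens-per-second rate yields more text per second. The sketch below illustrates this arithmetic with hypothetical numbers (the chars-per-token and tokens-per-second figures are assumptions for illustration, not measurements of this model).

```python
# Illustrative arithmetic only (numbers are assumed, not measured):
# why a Russian-adapted tokenizer raises characters-per-second output
# even at an unchanged tokens-per-second decode rate.

def chars_per_second(tokens_per_second: float, chars_per_token: float) -> float:
    """Effective text speed: decode rate times characters yielded per token."""
    return tokens_per_second * chars_per_token

# Hypothetical figures: the adapted vocabulary yields more Russian
# characters per token at the same decode rate.
baseline = chars_per_second(tokens_per_second=30.0, chars_per_token=2.5)
adapted = chars_per_second(tokens_per_second=30.0, chars_per_token=4.0)

speedup = adapted / baseline - 1.0  # relative gain in characters per second
print(f"baseline: {baseline:.1f} chars/s, adapted: {adapted:.1f} chars/s, "
      f"speedup: {speedup:.0%}")
```

With these example figures, a 1.6x improvement in characters per token translates directly into the kind of 60% characters-per-second gain the card reports.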
Use Cases
This model is particularly well-suited for applications requiring high-quality and efficient Russian language processing, including:
- Generating Russian text.
- Engaging in Russian-language instruction-following tasks.
- Latency-sensitive applications where fast Russian text output is critical.
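As a starting point for such applications, the sketch below assembles a chat payload and generation settings. Only repetition_penalty=1.1 comes from the evaluation note above; the system prompt, user message, and max_new_tokens value are illustrative assumptions. The transformers calls are shown in comments so the sketch stays self-contained.

```python
# Hypothetical inference setup for this model. Only repetition_penalty=1.1
# is taken from the card's evaluation note; everything else is illustrative.
MODEL_ID = "msu-rcc-lair/RuadaptQwen2.5-32B-Instruct"

# Standard chat-format messages (prompt text is an assumption).
messages = [
    {"role": "system", "content": "Ты — полезный ассистент."},  # "You are a helpful assistant."
    {"role": "user", "content": "Кратко объясни, что такое токенизация."},  # "Briefly explain tokenization."
]

generation_kwargs = {
    "max_new_tokens": 512,      # illustrative value
    "repetition_penalty": 1.1,  # setting used for the Ru-Arena-General evaluation
}

# With the transformers library installed, inference would follow the usual
# pattern (sketch, not executed here):
#   tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
#   model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
#   inputs = tokenizer.apply_chat_template(
#       messages, add_generation_prompt=True, return_tensors="pt"
#   ).to(model.device)
#   output = model.generate(inputs, **generation_kwargs)
#   print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
print(MODEL_ID, generation_kwargs)
```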