RuadaptQwen2.5-14B-R1-distill-preview-v1 Overview
This model is an instruction-tuned, Russian-adapted version of the deepseek-ai/DeepSeek-R1-Distill-Qwen-14B architecture, developed by RefalMachine. It has undergone significant modifications to enhance its performance and efficiency for the Russian language.
Key Adaptations and Features
- Tokenizer Replacement: The original tokenizer was replaced with an extended version of the tiktoken cl100k tokenizer, augmented with a 48,000-token unigram tokenizer trained specifically for Russian.
- Continued Pretraining: The model was further pretrained on a substantial Russian corpus.
- Learned Embedding Propagation (LEP): This technique was applied to further improve the model's adaptation.
- Enhanced Russian Generation Speed: These adaptations yield up to a 60% increase in Russian text generation speed (measured in characters or words per second) over the original deepseek-ai/DeepSeek-R1-Distill-Qwen-14B model.
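The speedup above is expressed in characters (or words) per second. As a rough illustration of how such a metric is computed (this is not the authors' benchmark code, and the timings below are hypothetical), a minimal helper might look like:

```python
# Minimal sketch for comparing generation throughput in characters per second.
# Illustrative only; not the benchmark used by the model authors.

def chars_per_second(text: str, seconds: float) -> float:
    """Throughput of a generation run, in characters per second."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return len(text) / seconds

def speedup_percent(baseline_cps: float, adapted_cps: float) -> float:
    """Relative speedup of the adapted model over the baseline, in percent."""
    return (adapted_cps / baseline_cps - 1.0) * 100.0

# Hypothetical timings: the same 4,000-character answer takes 50 s on the
# baseline model and 31.25 s on the adapted one.
baseline = chars_per_second("x" * 4000, 50.0)   # 80 chars/s
adapted = chars_per_second("x" * 4000, 31.25)   # 128 chars/s
print(f"speedup: {speedup_percent(baseline, adapted):.0f}%")  # prints "speedup: 60%"
```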
Performance and Evaluation
The model is currently undergoing evaluation on several Russian-specific benchmarks, including Ru-Arena-General, MERA, and llmtf_open. Preliminary Ru-Arena-General evaluations were run with repetition_penalty=1.1, and custom system prompts were used for MERA to mitigate issues with code tasks.
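The repetition_penalty=1.1 setting is passed as an ordinary sampling parameter at generation time. A hedged sketch of such an evaluation configuration follows; the parameter names mirror common Hugging Face-style generation kwargs, and every value except repetition_penalty is an assumption, not the authors' setting:

```python
# Sketch of generation settings for a Ru-Arena-General-style run.
# Only repetition_penalty=1.1 comes from the model card; the remaining
# values are illustrative assumptions.
generation_kwargs = {
    "repetition_penalty": 1.1,   # stated setting for the preliminary evaluation
    "temperature": 0.7,          # assumed sampling temperature
    "top_p": 0.95,               # assumed nucleus-sampling cutoff
    "max_new_tokens": 2048,      # assumed generation budget
}

# With Hugging Face transformers, these would typically be forwarded as
# model.generate(**inputs, **generation_kwargs).
print(generation_kwargs["repetition_penalty"])
```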
Research Context
The methodologies behind this adaptation are detailed in research papers such as "Facilitating large language model Russian adaptation with Learned Embedding Propagation" (Preprint: https://arxiv.org/abs/2412.21140) and "Impact of Tokenization on LLaMa Russian Adaptation."
Intended Use Cases
This model is particularly well-suited for applications requiring efficient and accurate Russian language generation, where the speed of output is a critical factor.
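One intuition for why the tokenizer replacement improves speed: a tokenizer adapted to Russian emits fewer tokens per character, so each generated token covers more output text. The toy sketch below illustrates this "fertility" (characters per token) comparison; the two stand-in tokenizers are hypothetical, and a real measurement would use the original and Russian-adapted Hugging Face tokenizers instead:

```python
# Toy illustration of tokenizer fertility (characters per token).
# The "tokenizers" below are stand-ins, not the real ones.

def chars_per_token(text: str, tokenize) -> float:
    """Average number of characters covered by each token."""
    tokens = tokenize(text)
    return len(text) / len(tokens)

text = "Привет, мир! Это пример русского текста."

# Stand-in for a tokenizer with poor Russian coverage: ~1 token per character.
char_level = lambda s: list(s)
# Stand-in for an adapted tokenizer: roughly word-level pieces.
word_level = lambda s: s.split()

print(f"char-level: {chars_per_token(text, char_level):.2f} chars/token")
print(f"word-level: {chars_per_token(text, word_level):.2f} chars/token")
# The more characters each token covers, the more output text every decoded
# token yields, which is where the characters-per-second speedup comes from.
```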