Model Overview
RefalMachine/RuadaptQwen3-4B-Hybrid is a 4-billion-parameter model based on Qwen/Qwen3-4B and adapted for the Russian language. Developed by RefalMachine, it combines a replaced tokenizer, continued pre-training on a Russian corpus, and the Learned Embedding Propagation (LEP) technique.
Key Capabilities & Features
- Enhanced Russian Language Performance: The model features a new tokenizer, an extended tiktoken cl100k augmented with 48k Russian tokens, which speeds up Russian text generation by up to 100% compared to the original Qwen/Qwen3-4B.
- Hybrid Reasoner: Like its base model, RuadaptQwen3-4B-Hybrid includes a hybrid reasoner, which is enabled by default. Users can toggle this reasoning mode on or off with the /no_think and /think tokens, or programmatically via the enable_thinking parameter in the tokenizer.
- Adaptation for Russian: The model's continued pre-training on a Russian corpus and the LEP technique are central to its improved performance and fluency in Russian.
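The two ways of toggling the reasoning mode can be sketched as follows. This is a minimal illustration, not the authors' reference code. It assumes the model's chat template accepts enable_thinking through apply_chat_template in the same way as the base Qwen3 template, and that the /think and /no_think tokens are appended to the user message. The helper names with_soft_switch, render_prompt, and load_tokenizer are hypothetical.

```python
from copy import deepcopy


def with_soft_switch(messages, thinking):
    """Return a copy of `messages` with /think or /no_think appended
    to the most recent user turn (assumed placement of the token)."""
    tag = "/think" if thinking else "/no_think"
    patched = deepcopy(messages)
    for message in reversed(patched):
        if message["role"] == "user":
            message["content"] = f"{message['content']} {tag}"
            break
    return patched


def render_prompt(tokenizer, messages, thinking=True):
    """Render a prompt string, switching reasoning on or off through the
    enable_thinking argument (assumed to be read by the chat template)."""
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=thinking,
    )


def load_tokenizer(model_id="RefalMachine/RuadaptQwen3-4B-Hybrid"):
    """Load the tokenizer from the Hugging Face Hub (needs `transformers`)."""
    from transformers import AutoTokenizer  # imported lazily on purpose

    return AutoTokenizer.from_pretrained(model_id)
```

With a loaded tokenizer, render_prompt(load_tokenizer(), messages, thinking=False) would produce a prompt with reasoning disabled, while with_soft_switch(messages, False) expresses the same intent inside the conversation itself.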
Recommended Usage
- Generation Parameters: For stable output, use a low temperature (0.0-0.3), top_p between 0.85 and 0.95, and a repetition_penalty of 1.05. Adjust repetition_penalty to the task: it may need to be lowered for RAG or raised to prevent loops.
- Citation: If you use this model, please cite the associated paper: Tikhomirov M., Chernyshov D. Facilitating Large Language Model Russian Adaptation with Learned Embedding Propagation // Journal of Language and Education. – 2024. – Vol. 10. – No. 4. – pp. 130-145.
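The generation parameters recommended above can be collected into keyword arguments for a Hugging Face generate call, as in the sketch below. The helper name generation_kwargs and the max_new_tokens default are assumptions for illustration and do not come from the model card. A temperature of 0.0 is mapped to greedy decoding here, because transformers expects a strictly positive temperature when sampling.

```python
def generation_kwargs(temperature=0.2, top_p=0.9,
                      repetition_penalty=1.05, max_new_tokens=512):
    """Build keyword arguments for `model.generate` from the recommended
    ranges: temperature 0.0-0.3, top_p 0.85-0.95, repetition_penalty 1.05.

    `max_new_tokens` is an illustrative default, not a recommendation
    taken from the model card.
    """
    kwargs = {
        "repetition_penalty": repetition_penalty,
        "max_new_tokens": max_new_tokens,
    }
    if temperature == 0.0:
        # Sampling with temperature 0 is not valid, so fall back to
        # greedy decoding, which is the deterministic equivalent.
        kwargs["do_sample"] = False
    else:
        kwargs.update(do_sample=True, temperature=temperature, top_p=top_p)
    return kwargs
```

Given a loaded model and tokenized inputs, a call would look like model.generate(**inputs, **generation_kwargs()). For RAG, passing a lower value such as repetition_penalty=1.0 is one way to follow the advice on lowering the penalty.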