RefalMachine/ruadapt_qwen2.5_3B_ext_u48_instruct_v4

Warm
Public
3.1B
BF16
32768
1
Oct 18, 2024
Hugging Face
Overview

Model Overview

RefalMachine/ruadapt_qwen2.5_3B_ext_u48_instruct_v4 is an instruction-tuned variant of the Qwen2.5-3B model, specifically adapted for the Russian language. The development involved replacing the original tokenizer with an extended tiktoken cl100k tokenizer (48k unigram tokens) and subsequent continued pretraining on a Russian corpus. A key innovation is the application of Learned Embedding Propagation (LEP) technique.

Key Capabilities & Differentiators

  • Optimized Russian Generation: Achieves up to 60% faster generation of Russian texts compared to the base Qwen-2.5-3B-Instruct model due to its specialized tokenizer and training.
  • Russian Language Adaptation: Underwent extensive pretraining on Russian datasets to enhance its proficiency in the language.
  • Instruction Following: Fine-tuned to follow instructions effectively, making it suitable for various conversational and task-oriented applications.

Performance Metrics

The model has been evaluated on Russian-specific benchmarks, including Ru-Arena-General and MERA. On Ru-Arena-General, it achieved a winrate of 66.1% (with a 95% CI of +2.2 / -1.9), demonstrating competitive performance against larger models in its class. Further evaluations on MERA and llmtf_open are also provided, highlighting its capabilities in Russian language understanding and generation.