RefalMachine/ruadapt_qwen2.5_3B_ext_u48_instruct_v4

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kPublished:Oct 18, 2024Architecture:Transformer0.0K Warm

RefalMachine/ruadapt_qwen2.5_3B_ext_u48_instruct_v4 is a 3.1 billion parameter instruction-tuned language model based on the Qwen2.5 architecture, developed by RefalMachine. This model features a replaced tokenizer and continued pretraining on a Russian corpus, followed by Learned Embedding Propagation (LEP). It is specifically optimized for Russian language generation, achieving up to 60% faster generation speed for Russian texts compared to the original Qwen-2.5-3B-Instruct model.

Loading preview...

Model Overview

RefalMachine/ruadapt_qwen2.5_3B_ext_u48_instruct_v4 is an instruction-tuned variant of the Qwen2.5-3B model, specifically adapted for the Russian language. The development involved replacing the original tokenizer with an extended tiktoken cl100k tokenizer (48k unigram tokens) and subsequent continued pretraining on a Russian corpus. A key innovation is the application of Learned Embedding Propagation (LEP) technique.

Key Capabilities & Differentiators

  • Optimized Russian Generation: Achieves up to 60% faster generation of Russian texts compared to the base Qwen-2.5-3B-Instruct model due to its specialized tokenizer and training.
  • Russian Language Adaptation: Underwent extensive pretraining on Russian datasets to enhance its proficiency in the language.
  • Instruction Following: Fine-tuned to follow instructions effectively, making it suitable for various conversational and task-oriented applications.

Performance Metrics

The model has been evaluated on Russian-specific benchmarks, including Ru-Arena-General and MERA. On Ru-Arena-General, it achieved a winrate of 66.1% (with a 95% CI of +2.2 / -1.9), demonstrating competitive performance against larger models in its class. Further evaluations on MERA and llmtf_open are also provided, highlighting its capabilities in Russian language understanding and generation.