RefalMachine/RuadaptQwen2.5-14B-Instruct

Text generation · Model size: 14.8B · Quantization: FP8 · Context length: 32k · Published: Feb 3, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

RefalMachine/RuadaptQwen2.5-14B-Instruct is a 14.8 billion parameter instruction-tuned language model, adapted from Qwen2.5-14B for enhanced Russian language performance. It features a replaced tokenizer and continued pretraining on a Russian corpus, followed by Learned Embedding Propagation (LEP). The model is optimized for generating Russian text, achieving up to 60% faster generation compared to the original Qwen2.5-14B-Instruct on identical Russian content.
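A minimal loading-and-generation sketch with Hugging Face transformers is shown below. The chat-template call and generation settings are generic transformers usage, not settings published by the authors, and the prompt is illustrative; actually running `main()` requires downloading the full model weights.

```python
# Minimal sketch of loading the model with Hugging Face transformers.
# Prompt and sampling settings are illustrative, not values recommended
# by the RuadaptQwen authors.

MODEL_ID = "RefalMachine/RuadaptQwen2.5-14B-Instruct"

def build_messages(user_prompt: str) -> list[dict]:
    """Assemble a chat-format conversation for the instruct model."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_prompt},
    ]

def main() -> None:
    # Imported lazily: requires `pip install transformers torch`
    # plus downloading the 14.8B-parameter weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # "Tell me about Moscow in two sentences."
    messages = build_messages("Расскажи о Москве в двух предложениях.")
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:],
                           skip_special_tokens=True))

# Call main() to generate; it is not invoked here to avoid the
# multi-gigabyte download in environments without the weights.
```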


Overview

RefalMachine/RuadaptQwen2.5-14B-Instruct is an instruction-tuned variant of the Qwen2.5-14B model, specifically adapted for the Russian language. Developed by RefalMachine, this model incorporates a new tokenizer and undergoes continued pretraining on a substantial Russian corpus. A key innovation is the application of Learned Embedding Propagation (LEP) to further enhance its capabilities.

Key Adaptations & Features

  • Tokenizer Replacement: The original tokenizer was replaced with a 48,000-token unigram tokenizer (an extension of tiktoken's cl100k vocabulary), specifically optimized for Russian.
  • Continued Pretraining: The model underwent additional pretraining on a Russian-language dataset to improve its understanding and generation of Russian text.
  • Learned Embedding Propagation (LEP): This technique was applied post-pretraining to further refine the model's performance.
  • Enhanced Russian Generation Speed: Because the specialized tokenizer encodes Russian text in fewer tokens, the model delivers up to a 60% increase in Russian generation speed (characters/words per second) compared to the base Qwen2.5-14B-Instruct on identical text sequences.
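The speed gain follows directly from tokenizer density: at a fixed tokens-per-second decoding rate, fewer tokens per character means more characters per second. The sketch below uses illustrative token counts (not measured values for these tokenizers) to show the arithmetic behind a 60% speedup.

```python
# Back-of-the-envelope sketch of why a denser tokenizer speeds up
# generation in characters per second. Token counts below are assumed
# for illustration, not measured values for these models.

def chars_per_second(n_chars: int, n_tokens: int, tokens_per_sec: float) -> float:
    """Character throughput = (chars per token) * (tokens per second)."""
    return (n_chars / n_tokens) * tokens_per_sec

# Same hypothetical Russian text encoded by each tokenizer:
text_chars = 1000
base_tokens = 400      # assumed: original Qwen2.5 tokenizer
ruadapt_tokens = 250   # assumed: extended 48k unigram tokenizer

speed = 40.0  # tokens/sec, same hardware for both models
base = chars_per_second(text_chars, base_tokens, speed)        # 100 chars/s
adapted = chars_per_second(text_chars, ruadapt_tokens, speed)  # 160 chars/s
print(f"speedup: {adapted / base - 1:.0%}")  # → speedup: 60%
```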

Current Status & Evaluation

This model is currently a work in progress (v1). Evaluation is planned or ongoing across several benchmarks, including Ru-Arena-General, MERA, and llmtf_open. Preliminary measurements on Ru-Arena-General were conducted with repetition_penalty=1.1. Custom system prompts were prepared for the MERA submissions to mitigate issues with code-related tasks.
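For readers unfamiliar with the `repetition_penalty=1.1` setting used in those measurements, the sketch below shows the underlying rule in pure Python: tokens that already appeared in the output have their logits pushed down, following the CTRL-style scheme used by transformers' `RepetitionPenaltyLogitsProcessor`. The vocabulary and logit values are invented for illustration.

```python
# Pure-Python sketch of how repetition_penalty=1.1 reshapes next-token
# logits: previously generated tokens become less likely, with positive
# logits divided by the penalty and negative logits multiplied by it
# (the CTRL-style rule used in Hugging Face transformers).

def apply_repetition_penalty(
    logits: list[float], prev_token_ids: list[int], penalty: float = 1.1
) -> list[float]:
    penalized = list(logits)
    for token_id in set(prev_token_ids):
        score = penalized[token_id]
        penalized[token_id] = score / penalty if score > 0 else score * penalty
    return penalized

# Illustrative logits over a 4-token vocabulary; tokens 0 and 2 were
# already generated, so their scores are pushed down.
logits = [2.2, 1.0, -0.5, 0.3]
print(apply_repetition_penalty(logits, prev_token_ids=[0, 2]))
```

A penalty of 1.1 is a mild setting: it discourages verbatim loops without strongly distorting the distribution, which is why it is a common choice for arena-style evaluation.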

Good for

  • Applications requiring efficient and high-quality Russian text generation.
  • Developers looking for a Qwen2.5-based model with strong Russian language capabilities.
  • Use cases where generation speed for Russian content is a critical factor.