Name: RefalMachine/RuadaptQwen3-4B-Hybrid API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: RefalMachine

Model Overview

RefalMachine/RuadaptQwen3-4B-Hybrid is a 4 billion parameter model based on Qwen/Qwen3-4B, specifically engineered for the Russian language. Developed by RefalMachine, this model incorporates a replaced tokenizer, continued pre-training on a Russian corpus, and the application of Learned Embedding Propagation (LEP) technique.

Key Capabilities & Features

Enhanced Russian Language Performance: The model features a new tokenizer, an extended tiktoken cl100k augmented with 48k Russian tokens, which significantly increases Russian text generation speed by up to 100% compared to the original Qwen/Qwen3-4B.
Hybrid Reasoner: Like its base model, RuadaptQwen3-4B-Hybrid includes a hybrid reasoner, which is enabled by default. Users can toggle this reasoning mode on or off using /no_think and /think tokens or programmatically via enable_thinking parameter in the tokenizer.
Adaptation for Russian: The model's pre-training on a Russian corpus and the LEP technique are central to its improved performance and fluency in Russian.

Recommended Usage

Generation Parameters: For stable output, it is recommended to use low temperatures (0.0-0.3), top_p between 0.85 and 0.95, and a repetition_penalty of 1.05. Adjust repetition_penalty based on task requirements, potentially lowering it for RAG or increasing it to prevent loops.
Citation: If you use this model, please cite the associated paper: Tikhomirov M., Chernyshov D. Facilitating Large Language Model Russian Adaptation with Learned Embedding Propagation //Journal of Language and Education. – 2024. – Т. 10. – №. 4. – С. 130-145.

Overview

Model Overview

Key Capabilities & Features

Recommended Usage

Full Model Card (README)