AvitoTech/avibe: Russian-Optimized Qwen3-8B

8B parameters · FP8 · 32768 context length · License: apache-2.0

AvitoTech/avibe is an 8-billion-parameter large language model developed by AvitoTech, a subsidiary of Avito. Built on the open-source Qwen3-8B-Base, it has been extensively adapted to improve performance on the Russian language and the Avito domain.
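
To get started, here is a minimal inference sketch using the Hugging Face transformers library; the Russian prompt and the generation settings are illustrative assumptions rather than documented defaults of this model:

  # Minimal sketch: load AvitoTech/avibe and generate a reply to a chat prompt.
  # The prompt text and max_new_tokens value are illustrative, not prescribed.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "AvitoTech/avibe"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

  messages = [{"role": "user", "content": "Составь краткое описание объявления о продаже велосипеда."}]
  input_ids = tokenizer.apply_chat_template(
      messages, add_generation_prompt=True, return_tensors="pt"
  ).to(model.device)

  output_ids = model.generate(input_ids, max_new_tokens=256)
  # Decode only the newly generated tokens, skipping the prompt.
  print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))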

Key Optimizations & Capabilities

  • Custom Tokenizer: A proprietary tokenizer optimized for both Russian and English gives higher tokenization density: the same Russian text encodes into fewer tokens, yielding 15-25% faster processing than the original Qwen3-8B (see the comparison sketch after this list).
  • Domain Adaptation: The model was extensively trained on a large corpus of data relevant to the Avito domain.
  • Performance Improvements: Through Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) stages, AvitoTech/avibe surpasses the instruct version of Qwen3-8B on numerous Russian language benchmarks.
  • Enhanced Function Calling & Math: The SFT and RL phases also significantly improved the model's Function Calling abilities and its proficiency in solving mathematical problems.
  • Reduced Model Size: The final model is slightly smaller at 7.9B parameters, down from Qwen3-8B's 8.2B.
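
As an illustration of the tokenizer claim above, the following sketch compares token counts for the same Russian string; "Qwen/Qwen3-8B" is assumed as the reference checkpoint, and the sample sentence is invented:

  # Sketch: compare tokenization density of avibe's custom tokenizer
  # against the original Qwen3-8B tokenizer on the same Russian text.
  from transformers import AutoTokenizer

  avibe_tok = AutoTokenizer.from_pretrained("AvitoTech/avibe")
  qwen_tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")  # assumed reference

  text = "Продаю горный велосипед в отличном состоянии, самовывоз из Москвы."
  n_avibe = len(avibe_tok(text)["input_ids"])
  n_qwen = len(qwen_tok(text)["input_ids"])
  print(f"avibe: {n_avibe} tokens, Qwen3-8B: {n_qwen} tokens")
  # A denser tokenizer produces fewer tokens for the same text,
  # which translates into faster and cheaper processing.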

Benchmark Highlights

AvitoTech/avibe demonstrates notable gains across several benchmarks, particularly in Russian language and mathematical tasks:

  • mmlu_ru: 0.718 (vs. Qwen3-8B's 0.701)
  • gpqa_diamond_ru: 0.343 (vs. Qwen3-8B's 0.318)
  • math_500_ru: 0.686 (vs. Qwen3-8B's 0.546)
  • DOoM: 0.280 (vs. Qwen3-8B's 0.240)
  • MERA_text: 0.618 (vs. Qwen3-8B's 0.510)

Ideal Use Cases

This model is particularly well-suited for applications requiring strong Russian language understanding and generation, especially within the Avito domain, as well as for tasks involving mathematical reasoning or Function Calling.
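
For Function Calling, a hedged sketch of passing a tool schema through the chat template is shown below; the tool itself (get_listing_price) is hypothetical, and Qwen3-style tool-call formatting is an assumption based on the model's lineage, not something confirmed by this card:

  # Sketch: serialize a tool schema into the prompt via the chat template,
  # assuming the model inherits Qwen3-style tool-call formatting.
  from transformers import AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("AvitoTech/avibe")

  tools = [{
      "type": "function",
      "function": {
          "name": "get_listing_price",  # hypothetical tool, for illustration only
          "description": "Return the price of an Avito listing by its id.",
          "parameters": {
              "type": "object",
              "properties": {"listing_id": {"type": "string"}},
              "required": ["listing_id"],
          },
      },
  }]

  messages = [{"role": "user", "content": "Сколько стоит объявление 12345?"}]
  prompt = tokenizer.apply_chat_template(
      messages, tools=tools, add_generation_prompt=True, tokenize=False
  )
  print(prompt)  # inspect how the tool schema is rendered into the prompt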