Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct

Parameters: 1B
Tensor type: BF16
Context length: 32768
License: llama3.2

Overview

Vikhr-Llama-3.2-1B-Instruct is a compact, instruction-tuned language model developed by Vikhrmodels on top of Llama-3.2-1B-Instruct. It is fine-tuned specifically for Russian on the proprietary GrandMaster-PRO-MAX dataset, a collection of 150k instructions with Chain-of-Thought (CoT) support gathered using tailored GPT-4-turbo prompts.
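
A minimal inference sketch with the Hugging Face transformers library is shown below; the prompt and generation settings are illustrative and are not taken from the model card.

```python
# Minimal inference sketch (not an official snippet from the model card).
# Assumes the Hugging Face transformers library, accelerate, and PyTorch are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the weights are published in BF16
    device_map="auto",
)

# Russian instruction: "Write a short poem about programming."
messages = [
    {"role": "user", "content": "Напиши короткое стихотворение о программировании."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids, max_new_tokens=256, do_sample=True, temperature=0.7
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```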

Key Capabilities & Features

  • Russian Language Specialization: Optimized for high performance in Russian language tasks.
  • Efficiency: Demonstrates 5 times greater efficiency than its base model, making it suitable for resource-constrained environments.
  • Compact Size: With a model size under 3GB, it is designed for deployment on low-power and mobile devices (a low-memory loading sketch follows this list).
  • Supervised Fine-Tuning (SFT): Trained using SFT on a synthetic dataset to enhance instruction following.
  • Competitive Performance: Scores 19.04 on the ru_arena_general benchmark, a significant improvement over the base model's 4.04.
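
As a rough illustration of the low-memory scenario mentioned above, the sketch below loads the model with 4-bit quantization through bitsandbytes. The settings are assumptions chosen for illustration, not recommendations from the model card, and this path requires a CUDA-capable device.

```python
# Sketch: 4-bit quantized loading via bitsandbytes to shrink the memory footprint
# for constrained hardware. These quantization settings are assumptions, not
# recommendations from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # matches the published BF16 weights
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```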

Use Cases

This model is ideal for applications requiring a highly efficient and compact Russian-language LLM. Its small footprint and optimized performance make it particularly well-suited for:

  • Mobile Applications: Integrating advanced language capabilities into mobile devices.
  • Edge Computing: Deploying LLM functionality on devices with limited computational resources.
  • Russian-centric NLP Tasks: Instruction-following, text generation, and conversational AI in Russian.
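
For the conversational use case, one possible setup is the transformers text-generation pipeline, which accepts chat-style messages in recent library versions. The system prompt and question below are illustrative, not the authors' reference configuration.

```python
# Sketch of a simple Russian conversational setup using the transformers
# text-generation pipeline (chat-message input requires a recent transformers release).
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct",
    device_map="auto",
)

messages = [
    # System prompt: "You are a helpful assistant; answer in Russian."
    {"role": "system", "content": "Ты полезный ассистент, отвечай по-русски."},
    # User question: "Explain gradient descent in simple terms."
    {"role": "user", "content": "Объясни градиентный спуск простыми словами."},
]

result = chat(messages, max_new_tokens=256)
# The pipeline returns the whole conversation; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```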