Vikhrmodels/Vikhr-Llama3.1-8B-Instruct-R-21-09-24
Vikhr-Llama3.1-8B-Instruct-R-21-09-24 is an 8 billion parameter unimodal large language model developed by VikhrModels, based on Meta-Llama-3.1-8B-Instruct. It is specifically optimized for high-quality generation in Russian and English, featuring advanced RAG capabilities and support for up to 128k context tokens. The model excels in reasoning, summarization, code generation, and roleplay, aiming to surpass GPT-3.5-turbo in many tasks.
Model Overview
Vikhr-Llama3.1-8B-Instruct-R-21-09-24 is an 8 billion parameter instruction-tuned large language model developed by VikhrModels. It is an enhanced version of meta-llama/Meta-Llama-3.1-8B-Instruct, primarily adapted for Russian and English languages through a multi-stage training process involving SFT and SMPO (a proprietary DPO variation).
Key Capabilities & Features
- Multilingual Generation: Optimized for high-quality outputs in Russian and English, with support for other languages, leveraging the GrandMaster-PRO-MAX dataset.
- Extended Context: Supports up to 128k tokens of context length thanks to the base model's RoPE scaling.
- Advanced RAG Mode: Features a unique "Grounded RAG" mode with a dedicated `documents` role, enabling the model to identify and utilize relevant document identifiers when answering user questions, inspired by Command-R.
- System Prompt Support: Allows regulating response style using system prompts.
- Diverse Use Cases: Optimized for reasoning, summarization, code generation, roleplay, and dialogue maintenance.
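The "Grounded RAG" mode described above passes documents to the model as a dedicated chat turn with the `documents` role, serialized as a JSON array of dictionaries. A minimal sketch of assembling such a message list follows; the document field names (`doc_id`, `title`, `content`) and the system prompt placeholder are illustrative assumptions, not the model card's exact schema:

```python
import json

# Hypothetical document set; the field names here ("doc_id", "title",
# "content") are assumptions for illustration only.
documents = [
    {"doc_id": 0, "title": "Company FAQ",
     "content": "Returns are accepted within 30 days."},
    {"doc_id": 1, "title": "Shipping policy",
     "content": "Orders ship within 2 business days."},
]

# In Grounded RAG mode the documents travel as their own chat turn
# with the "documents" role, as a JSON array of dictionaries.
messages = [
    {"role": "system", "content": "<GROUNDED_SYSTEM_PROMPT goes here>"},
    {"role": "documents", "content": json.dumps(documents, ensure_ascii=False)},
    {"role": "user", "content": "What is the return window?"},
]
```

With a chat template that understands the `documents` role, this list would then be rendered via the tokenizer's `apply_chat_template` before generation; the model is expected to cite the relevant document identifiers in its answer.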
Performance & Benchmarks
The model was evaluated on VikhrModels' open-source Russian-language SbS benchmark, ru-arena-general, where it achieved a 63.4% winrate against gpt-3.5-turbo-0125 (the 50% reference baseline). In RAG benchmarks it demonstrated strong performance, with a 64% judge-correct percentage on in-domain questions and 89% on out-of-domain questions, outperforming gpt-3.5-turbo-0125 in both categories.
Training Methodology
Training involved a large synthetic instructional dataset (Vikhrmodels/GrandMaster-PRO-MAX) with built-in CoT, and a RAG grounding dataset (Vikhrmodels/Grounded-RAG-RU-v2). Alignment was achieved using SMPO, a custom preference optimization method, after training a custom Reward Model and performing Rejection Sampling.
Usage Recommendations
- RAG Mode: Requires a specific `GROUNDED_SYSTEM_PROMPT` and structured `documents` input (a JSON array of dictionaries).
- Safety: Has a low safety level; users should test independently. System prompts can partially mitigate this.
- System Prompts: Best used for specifying response style (e.g., "answer only in json format") and preferably written in English.
- Generation Settings: Recommended to use a low temperature (0.1-0.4), beam search, and `top_k` of 30-50.
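The recommended decoding settings above can be expressed as keyword arguments for a Hugging Face `transformers` `generate()` call. The specific values picked within the recommended ranges are illustrative choices, not prescriptions from the model card:

```python
# Sketch of generation kwargs following the recommendations above:
# low temperature (0.1-0.4), beam search, and top_k in 30-50.
generation_kwargs = {
    "do_sample": True,
    "temperature": 0.2,     # within the recommended 0.1-0.4 range
    "top_k": 40,            # within the recommended 30-50 range
    "num_beams": 2,         # beam search as recommended
    "max_new_tokens": 512,  # an assumed output budget, not from the card
}

# Usage (assuming `model`, `tokenizer`, and tokenized `inputs` are loaded):
# outputs = model.generate(**inputs, **generation_kwargs)
```

Note that combining `do_sample=True` with `num_beams > 1` gives beam-sample decoding in `transformers`; for fully deterministic beam search, set `do_sample=False` and drop `temperature`/`top_k`.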