Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24

Text generation · Concurrency cost: 1 · Model size: 12B · Quant: FP8 · Context length: 32k · Published: Sep 20, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

VikhrModels' Vikhr-Nemo-12B-Instruct-R-21-09-24 is a 12-billion-parameter unimodal LLM, an enhanced version of Mistral-Nemo-Instruct-2407 adapted primarily for Russian and English. It is served here with a 32,768-token context window and is optimized for reasoning, summarization, code generation, roleplay, dialogue, and high-performance RAG. The model was trained with SFT and SMPO (a custom DPO variation) and includes a distinctive Grounded RAG mode for document-based question answering.


Vikhr-Nemo-12B-Instruct-R-21-09-24: Enhanced Bilingual LLM

Vikhr-Nemo-12B-Instruct-R-21-09-24 is a 12-billion-parameter large language model developed by VikhrModels on top of the Mistral-Nemo-Instruct-2407 architecture. It is specifically adapted and optimized for high-quality generation in Russian and English, with support for other languages. The architecture supports a context length of up to 128k tokens, inherited from its base model, though it is served here with a 32k window.

Key Capabilities & Features

  • Bilingual Proficiency: High-quality generation in Russian and English, supported by the custom Grandmaster-PRO-MAX dataset.
  • Optimized for Diverse Tasks: Excels in reasoning, summarization, code generation, roleplay, and dialogue.
  • Advanced RAG Mode: Features a unique "Grounded RAG" mode, inspired by Command-R, in which the model first identifies the identifiers of the relevant documents and then uses them to produce grounded answers. This mode requires a specific GROUNDED_SYSTEM_PROMPT and accepts document content as Markdown, HTML, or plain text (see the sketch after this list).
  • System Prompt Support: Allows for regulating response style, ideally using English system prompts.
  • Training Methodology: Developed via a multi-stage process: SFT on a synthetic dataset of ~150k instructions, followed by alignment with SMPO, a custom DPO variation, to improve answer quality.
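
To make the Grounded RAG flow concrete, here is a minimal sketch that sends a document list alongside the grounded system prompt through an OpenAI-compatible chat API. The base URL, the abbreviated GROUNDED_SYSTEM_PROMPT text, and the doc_id/title/content document schema are illustrative assumptions; the authoritative prompt text and document format are published on the model's Hugging Face card.

```python
import json
from openai import OpenAI

# Hypothetical OpenAI-compatible endpoint; substitute your provider's URL and key.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

# Placeholder: use the exact GROUNDED_SYSTEM_PROMPT from the model card.
GROUNDED_SYSTEM_PROMPT = "Your task is to answer using only the provided documents..."

# Illustrative document schema: an id plus content (Markdown, HTML, or plain text).
documents = [
    {"doc_id": 0, "title": "Company FAQ", "content": "Refunds are processed within 14 days."},
    {"doc_id": 1, "title": "Shipping policy", "content": "Orders ship within 2 business days."},
]

messages = [
    {"role": "system", "content": GROUNDED_SYSTEM_PROMPT},
    # The model card defines its own way of embedding documents in the conversation;
    # here they are simply serialized to JSON inside the user turn for illustration.
    {
        "role": "user",
        "content": "Documents:\n"
        + json.dumps(documents, ensure_ascii=False)
        + "\n\nHow long do refunds take?",
    },
]

response = client.chat.completions.create(
    model="Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24",
    messages=messages,
    temperature=0.3,  # low temperature, per the recommendations below
)

# In Grounded RAG mode the model first emits the relevant doc_ids and then the
# grounded answer, so parse the output accordingly.
print(response.choices[0].message.content)
```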

Performance Highlights

On the ru-arena-general benchmark, Vikhr-Nemo-12B-Instruct-R-21-09-24 achieved a 79.8% winrate against gpt-3.5-turbo-0125 (which is pinned at 50% as the reference baseline). In RAG benchmarks it also performed strongly, scoring 68% judge-correct-percent on in-domain questions and 92% on out-of-domain questions, outperforming gpt-4o-mini and gpt-3.5-turbo-0125 on some metrics.

Limitations & Recommendations

  • Safety: The model has a low level of safety by default, prioritizing instruction following. Users should implement their own safety measures.
  • System Prompts: Best used for style specification (e.g., "answer only in json format") and preferably in English.
  • Sampling: Use a low temperature (0.1-0.5) and a top_k of 30-50 to avoid generation defects (see the sketch after this list).
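
As a concrete application of these recommendations, the sketch below generates a response with transformers using a temperature and top_k inside the suggested ranges. It assumes the standard chat-template workflow; the prompt content is illustrative, and loading details (dtype, device) will vary with your hardware.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    # English system prompt used for style specification, as the card recommends.
    {"role": "system", "content": "Answer only in json format"},
    {"role": "user", "content": "List three facts about Mistral-Nemo."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Recommended decoding settings: low temperature, moderate top_k.
outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.3,  # within the recommended 0.1-0.5 range
    top_k=40,         # within the recommended 30-50 range
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```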

Popular Sampler Settings

The parameter combinations most used by Featherless users for this model tune: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.