YandexGPT-5-Lite-8B-instruct Overview

YandexGPT-5-Lite-8B-instruct is an 8 billion parameter instruction-tuned large language model developed by Yandex. It is built upon the YandexGPT 5 Lite Pretrain base model, without incorporating weights from third-party models. The alignment process for this Lite version, detailed in a Habr article, involves Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), mirroring the approach used for YandexGPT 5 Pro.

Key Capabilities & Performance

Strong Benchmark Performance: YandexGPT 5 Lite closely matches and, in some scenarios, surpasses analogues like Llama-3.1-8B-instruct and Qwen-2.5-7B-instruct in international benchmarks and their Russian adaptations.
Cultural and Factual Knowledge: A notable strength is its superior performance in tasks requiring knowledge of Russian culture and facts.
Context Length: The model supports an 8192 token context length.
Quantized Version Available: A quantized GGUF version is provided in a separate repository for use with tools like llama.cpp and ollama.

Unique Features

Custom Tokenization: The model utilizes a specific tokenization approach, recommending the original sentencepiece for full compatibility. It tokenizes each dialogue replica separately, introducing a space at the beginning of each replica and replacing \n with [NL] tokens.
Non-Standard Dialogue Template: It employs a unique dialogue template where the model is trained to generate only one reply after the sequence Ассистент:[SEP], ending with the </s> token. This can lead to different results in interactive mode versus fixed dialogue generation.

Overview

YandexGPT-5-Lite-8B-instruct Overview

Key Capabilities & Performance

Unique Features

Full Model Card (README)