YandexGPT-5-Lite-8B-instruct Overview
YandexGPT-5-Lite-8B-instruct is an 8-billion-parameter instruction-tuned large language model developed by Yandex. It is built on the YandexGPT 5 Lite Pretrain base and supports a 32k-token context length, making it suitable for processing longer inputs and generating detailed responses. Its alignment process includes Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), the same approach used for YandexGPT 5 Pro.
Key Capabilities & Differentiators
- Strong Performance: Benchmarks indicate that YandexGPT 5 Lite closely matches or exceeds comparable models such as Llama-3.1-8B-instruct and Qwen-2.5-7B-instruct across a range of evaluation scenarios.
- Russian Cultural & Factual Knowledge: A notable strength is its superior performance in tasks requiring knowledge of Russian culture and specific facts.
- Independent Training: The model was trained from scratch without incorporating weights from any third-party models.
- Quantized Version Available: A GGUF quantized version is provided for efficient deployment with tools like llama.cpp and ollama.
Usage & Technical Notes
Developers can integrate YandexGPT-5-Lite-8B-instruct using popular libraries such as Hugging Face Transformers and vLLM. The model uses a SentencePiece tokenizer, and the original tokenizer is recommended for full compatibility. It also employs a distinctive dialogue template: the model is trained to generate exactly one response after the Ассистент:[SEP] sequence, terminating with the </s> token. Because of this template, results in interactive (multi-turn) mode may differ from generation over a fixed dialogue.
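To make the single-response template concrete, here is a minimal, hypothetical prompt builder. Only the Ассистент:[SEP] generation prefix and the </s> stop token come from the description above; the role label Пользователь: and the newline separators are assumptions for illustration, so consult the model card for the exact format.

```python
# Hypothetical sketch of the single-response dialogue template.
# Assumptions (not specified above): the "Пользователь:" role label and
# newline turn separators. Confirmed by the description: the model
# generates one reply after "Ассистент:[SEP]" and stops at "</s>".

def build_prompt(turns: list[dict]) -> str:
    """Format a dialogue so the model generates one reply after 'Ассистент:[SEP]'."""
    parts = []
    for turn in turns:
        if turn["role"] == "user":
            parts.append(f"Пользователь: {turn['content']}")
        else:
            # A previous assistant reply, closed with the </s> stop token.
            parts.append(f"Ассистент:[SEP]{turn['content']}</s>")
    # Leave the final assistant prefix open: the model completes it
    # and terminates with </s>.
    parts.append("Ассистент:[SEP]")
    return "\n".join(parts)

prompt = build_prompt([{"role": "user", "content": "Привет!"}])
print(prompt)
```

When generating with llama.cpp, vLLM, or Transformers against a prompt like this, configure </s> as the stop sequence so decoding halts after the single trained response.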