Overview
T-lite-instruct-0.1 is an 8-billion-parameter instruction-tuned model, developed by AnatoliiPotapov and built on the T-lite-0.1 base model. It is intended as a starting point for further fine-tuning rather than as a ready-to-use conversational assistant, and responsibility for ethical and safety oversight in deployment rests with the user.
Key Capabilities & Training
- Instruction Tuning: Trained with supervised fine-tuning (SFT) using a strong teacher model, followed by multi-stage preference tuning (SPiN and SLiC-HF) guided by a robust reward model (a minimal inference sketch follows this list).
- Multilingual Data: Trained on a diverse instruction dataset that includes machine translations of English open-source datasets (UltraFeedback, HelpSteer, SHP) and synthetic grounded-QA contexts, with careful filtering of the translated contexts and preference data.
- Performance: Achieves a total MT-Bench score of 6.458 and an Arena General score of 57.26 (with gpt-3.5-turbo-0125 as the baseline), demonstrating competitive performance, especially on Russian-language tasks, where it surpasses several Llama-3-8B and Qwen2-7B variants.
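
Because the model is instruction-tuned, a quick chat-format inference check is a reasonable way to sanity-check its behaviour before any further tuning. Below is a minimal sketch using Hugging Face transformers; the repository ID `AnatoliiPotapov/T-lite-instruct-0.1` and the presence of a chat template in the tokenizer are assumptions, so adjust them to the checkpoint you actually use.

```python
# Minimal inference sketch (assumes the repo id "AnatoliiPotapov/T-lite-instruct-0.1"
# and a chat template shipped with the tokenizer; adjust to your checkpoint).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AnatoliiPotapov/T-lite-instruct-0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A single-turn prompt in Russian: "Briefly explain what instruction tuning is."
messages = [
    {"role": "user", "content": "Кратко объясни, что такое инструктивное дообучение."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```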
Intended Use
- Fine-tuning Base: Primarily intended as a foundation for further fine-tuning to create specialized conversational agents or task-specific models; see the LoRA sketch after this list.
- Multilingual Applications: Particularly suitable for applications requiring strong performance in Russian, given its benchmark results on translated MT-Bench and Arena.
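
Since the model is positioned as a base for further fine-tuning rather than a finished assistant, one common path is parameter-efficient SFT on a domain-specific instruction dataset. The sketch below uses TRL's SFTTrainer with a LoRA adapter; this is one possible setup, not the authors' recipe, and the dataset file `my_instruction_dataset.jsonl` and output directory are hypothetical placeholders.

```python
# Minimal LoRA fine-tuning sketch (hypothetical dataset and output paths;
# TRL's SFTTrainer + peft is one common choice, not the authors' method).
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

model_id = "AnatoliiPotapov/T-lite-instruct-0.1"  # assumed repo id

# Expects a "messages" column in chat format, e.g.
# [{"role": "user", "content": ...}, {"role": "assistant", "content": ...}].
dataset = load_dataset("json", data_files="my_instruction_dataset.jsonl", split="train")

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model_id,                 # SFTTrainer loads the checkpoint from the id
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="t-lite-instruct-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=1e-4,
        num_train_epochs=1,
        bf16=True,
    ),
)
trainer.train()
```

Training only LoRA adapters keeps memory requirements modest for an 8B model and leaves the base weights untouched, which fits the card's framing of T-lite-instruct-0.1 as a foundation rather than a final assistant.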