Radu1999/Mistral-Instruct-Ukrainian-SFT
Radu1999/Mistral-Instruct-Ukrainian-SFT is a 7-billion-parameter instruction-tuned language model developed by Radu Chivereanu, based on the Mistral-7B-v0.2 architecture. It is fine-tuned on Ukrainian datasets, including UA-SQUAD and Ukrainian StackExchange, to improve its performance in Ukrainian. It uses Grouped-Query Attention, Sliding-Window Attention, and a Byte-fallback BPE tokenizer. The model is optimized for instruction-following tasks, which makes it suitable for applications that require natural language understanding and generation in Ukrainian.
Overview
Radu1999/Mistral-Instruct-Ukrainian-SFT is an instruction-tuned language model built upon the Mistral-7B-v0.2 architecture. Developed by Radu Chivereanu, this model focuses on providing strong performance for tasks in the Ukrainian language through supervised fine-tuning.
Key Capabilities
- Ukrainian Language Proficiency: Specifically fine-tuned on a diverse set of Ukrainian datasets, including UA-SQUAD, Ukrainian StackExchange, and a Ukrainian subset of the Belebele Dataset.
- Instruction Following: Designed to respond effectively to instructions, using the [INST] and [/INST] token format for prompts.
- Mistral Architecture Benefits: Inherits architectural features from Mistral-7B-v0.2, such as Grouped-Query Attention, Sliding-Window Attention, and a Byte-fallback BPE tokenizer, contributing to efficient processing.
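The [INST] / [/INST] prompt format mentioned above can be sketched as follows. This is a minimal illustration, not the author's code: the helper name build_prompt is hypothetical, and the exact whitespace around the markers is an assumption based on the common Mistral instruct convention. In practice the chat template shipped with the model's tokenizer is the authoritative source for the format, including any beginning-of-sequence token.

```python
# Hypothetical helper: wraps one user instruction in the [INST] ... [/INST]
# markers. The single spaces around the instruction are an assumption;
# the tokenizer's own chat template should be preferred when available.
INST_OPEN = "[INST]"
INST_CLOSE = "[/INST]"


def build_prompt(instruction: str) -> str:
    """Return a single-turn prompt string for an instruction-tuned model."""
    text = instruction.strip()
    if not text:
        raise ValueError("instruction must not be empty")
    return f"{INST_OPEN} {text} {INST_CLOSE}"


if __name__ == "__main__":
    # Example Ukrainian instruction: "What is the capital of Ukraine?"
    print(build_prompt("Яка столиця України?"))
```

The model's answer is expected to follow the closing [/INST] marker, so the prompt deliberately ends there.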
Performance
The Open LLM Leaderboard reports the following scores for this model:
- Average Score: 62.17
- HellaSwag (10-Shot): 83.12
- MMLU (5-Shot): 60.95
- Winogrande (5-Shot): 77.51
Good For
- Applications requiring instruction-based text generation and understanding in Ukrainian.
- Researchers and developers working on Ukrainian NLP tasks who need a specialized model.