RaduGabriel/SirUkrainian
RaduGabriel/SirUkrainian is a 7 billion parameter language model, merged from four Mistral-based models using task arithmetic, with a 4096-token context length. It is built on the mistralai/Mistral-7B-v0.1 base and includes components fine-tuned specifically for Ukrainian language processing. The model achieves a strong average score of 70.50 on the Open LLM Leaderboard benchmarks, indicating solid general reasoning and language understanding capabilities.
Model Overview
RaduGabriel/SirUkrainian is a 7 billion parameter language model created by RaduGabriel. It is a merged model, combining four distinct Mistral-based models using a task arithmetic merge method. The base model for this merge is mistralai/Mistral-7B-v0.1, and it has a context length of 4096 tokens.
Key Capabilities
- Merged Architecture: Leverages the strengths of RaduGabriel/MUZD, RaduGabriel/Mistral-Instruct-Ukrainian-SFT, Radu1999/MisterUkrainianDPO, and CultriX/NeuralTrix-7B-dpo through a weighted task arithmetic merge.
- Ukrainian Language Focus: Includes models specifically fine-tuned for Ukrainian instruction following and DPO, suggesting enhanced performance on Ukrainian language tasks.
- General Reasoning: Achieves an average score of 70.50 on the Open LLM Leaderboard, with notable scores in:
- AI2 Reasoning Challenge (25-Shot): 67.32
- HellaSwag (10-Shot): 85.54
- MMLU (5-Shot): 63.14
- Winogrande (5-Shot): 81.53
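Task arithmetic merges add weighted "task vectors" (the parameter deltas between each fine-tuned model and the shared base) back onto the base weights. The toy sketch below illustrates the idea on small parameter vectors; the example weights and values are illustrative assumptions, not the actual merge recipe used for this model.

```python
# Toy sketch of a task-arithmetic merge: each fine-tuned model contributes
# its delta from the shared base model, scaled by a per-model weight.
# All vectors and weights here are illustrative, not the card's recipe.

def task_arithmetic_merge(base, finetuned_models, weights):
    """merged = base + sum_i w_i * (finetuned_i - base), element-wise."""
    merged = list(base)
    for params, w in zip(finetuned_models, weights):
        for j, (p, b) in enumerate(zip(params, base)):
            merged[j] += w * (p - b)
    return merged

base = [1.0, 2.0, 3.0]
models = [
    [1.5, 2.0, 3.0],   # e.g. a Ukrainian SFT checkpoint's parameters
    [1.0, 2.5, 3.0],   # e.g. a DPO checkpoint's parameters
]
weights = [0.5, 0.5]

print(task_arithmetic_merge(base, models, weights))  # → [1.25, 2.25, 3.0]
```

In a real merge each "vector" is every tensor of a 7B-parameter checkpoint, and a tool such as mergekit applies the same arithmetic layer by layer.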
Good For
- Applications requiring a 7B parameter model with solid general reasoning abilities.
- Tasks involving the Ukrainian language, given the inclusion of Ukrainian-specific fine-tuned components.
- Developers looking for a performant model based on the Mistral architecture.
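Since the model is a standard Mistral-architecture checkpoint, it can presumably be loaded with the Hugging Face transformers auto classes. A minimal usage sketch (the prompt and generation settings are illustrative assumptions, not recommendations from the card):

```python
# Hypothetical loading/generation sketch for RaduGabriel/SirUkrainian
# using the Hugging Face transformers library.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "RaduGabriel/SirUkrainian"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the merged model and return a text completion for `prompt`."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Ukrainian-language prompt, matching the model's fine-tuning focus.
    print(generate("Привіт! Розкажи коротко про Київ."))
```

Keep the input within the model's 4096-token context length.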