tartuNLP/Apertus-EstLLM-8B-Instruct-0326
The tartuNLP/Apertus-EstLLM-8B-Instruct-0326 is an 8 billion parameter instruction-tuned causal language model, derived from tartuNLP/Apertus-EstLLM-8B-Instruct-1125 using a chat-vector merge approach. This model specializes in Estonian language processing, demonstrating strong performance in instruction-following, multiple-choice questions, and English-to-Estonian translation tasks. With a 32768 token context length, it is optimized for applications requiring robust Estonian language understanding and generation.
Loading preview...
Apertus EstLLM 8B 0326 Instruct Overview
The tartuNLP/Apertus-EstLLM-8B-Instruct-0326 is an 8 billion parameter instruction-tuned language model developed by tartuNLP. It is built upon the tartuNLP/Apertus-EstLLM-8B-Instruct-1125 model, enhanced through a chat-vector merge approach. This model is particularly focused on excelling in Estonian language tasks, while also maintaining competitive performance in English.
Key Capabilities & Performance
- Estonian Language Proficiency: Demonstrates strong instruction-following capabilities in Estonian, scoring 0.5608 on IFEval-et. It also performs well in Estonian language competence benchmarks, including Grammar-et (0.713), Inflection-et (0.4326), and Word-Meanings-et (0.9438).
- Knowledge & Reasoning (Estonian): Achieves notable results in Estonian knowledge and reasoning tasks, with scores like 0.5976 on Winogrande-et and 0.64 on GlobalPIQA-et.
- English Language Performance: Shows solid instruction-following in English (0.7089 on IFEval-en) and competitive scores in English knowledge and reasoning benchmarks such as Winogrande (0.5699) and GlobalPIQA-en (0.69).
- English-to-Estonian Translation: Ranks highly in translation tasks, achieving a BLEU score of 0.2676 on the wmt24pp dataset for English-to-Estonian translation, making it one of the top performers among comparable models.
Ideal Use Cases
- Estonian-centric Applications: Excellent for chatbots, content generation, and instruction-following systems requiring high accuracy in Estonian.
- Multilingual Translation: Particularly strong for English-to-Estonian translation services.
- Research and Development: Suitable for researchers exploring advanced techniques in low-resource language modeling and multilingual LLMs, especially those interested in the chat-vector merge approach.