tartuNLP/Apertus-EstLLM-8B-Instruct-0326

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Mar 25, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The tartuNLP/Apertus-EstLLM-8B-Instruct-0326 is an 8 billion parameter instruction-tuned causal language model, derived from tartuNLP/Apertus-EstLLM-8B-Instruct-1125 using a chat-vector merge approach. This model specializes in Estonian language processing, demonstrating strong performance in instruction-following, multiple-choice questions, and English-to-Estonian translation tasks. With a 32768 token context length, it is optimized for applications requiring robust Estonian language understanding and generation.

Loading preview...

Apertus EstLLM 8B 0326 Instruct Overview

The tartuNLP/Apertus-EstLLM-8B-Instruct-0326 is an 8 billion parameter instruction-tuned language model developed by tartuNLP. It is built upon the tartuNLP/Apertus-EstLLM-8B-Instruct-1125 model, enhanced through a chat-vector merge approach. This model is particularly focused on excelling in Estonian language tasks, while also maintaining competitive performance in English.

Key Capabilities & Performance

  • Estonian Language Proficiency: Demonstrates strong instruction-following capabilities in Estonian, scoring 0.5608 on IFEval-et. It also performs well in Estonian language competence benchmarks, including Grammar-et (0.713), Inflection-et (0.4326), and Word-Meanings-et (0.9438).
  • Knowledge & Reasoning (Estonian): Achieves notable results in Estonian knowledge and reasoning tasks, with scores like 0.5976 on Winogrande-et and 0.64 on GlobalPIQA-et.
  • English Language Performance: Shows solid instruction-following in English (0.7089 on IFEval-en) and competitive scores in English knowledge and reasoning benchmarks such as Winogrande (0.5699) and GlobalPIQA-en (0.69).
  • English-to-Estonian Translation: Ranks highly in translation tasks, achieving a BLEU score of 0.2676 on the wmt24pp dataset for English-to-Estonian translation, making it one of the top performers among comparable models.

Ideal Use Cases

  • Estonian-centric Applications: Excellent for chatbots, content generation, and instruction-following systems requiring high accuracy in Estonian.
  • Multilingual Translation: Particularly strong for English-to-Estonian translation services.
  • Research and Development: Suitable for researchers exploring advanced techniques in low-resource language modeling and multilingual LLMs, especially those interested in the chat-vector merge approach.