Name: tartuNLP/Qwen2.5-3B-Instruct-hsb-dsb API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tartuNLP

Overview

The tartuNLP/Qwen2.5-3B-Instruct-hsb-dsb model is a specialized 3.1 billion parameter instruction-tuned language model, developed by TartuNLP. It is built upon the Qwen2.5-3B-Instruct architecture and has been extensively adapted for Upper Sorbian (hsb) and Lower Sorbian (dsb). This adaptation involved continued pretraining on Sorbian monolingual and parallel datasets, combined with general instruction-tuning data.

Key Capabilities

Bilingual Sorbian Support: Jointly handles both Upper Sorbian and Lower Sorbian.
Dual Task Proficiency: Excels in both machine translation (MT) and question answering (QA) for the Sorbian languages.
WMT25 Shared Task Winner: Achieved the highest ranking in the WMT25 Shared Task on Limited Resource Slavic Languages for both hsb and dsb tracks, demonstrating strong performance in both MT and QA.

Performance Highlights

In the WMT25 Shared Task, TartuNLP's model secured the top position:

Upper Sorbian (hsb): Achieved 86.33 for DE-HSB translation and 58.10 for HSB-QA, leading in QA and tying for translation.
Lower Sorbian (dsb): Achieved 78.20 for DE-DSB translation and 57.56 for DSB-QA, leading in QA and tying for translation.

Training Details

The model was trained on approximately 1.2 billion tokens with a sequence length of 4096, utilizing AMD MI250x GPUs on the LUMI supercomputer for about 139 GPU-hours.

Important Note

This model is primarily research-focused and has not been extensively tested for general-purpose usage. Users should exercise caution and conduct their own evaluations for specific applications.

Overview

Overview

Key Capabilities

Performance Highlights

Training Details

Important Note

Full Model Card (README)