baban/QwenTranslate_English_German

TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 15, 2026License:otherArchitecture:Transformer Cold

The baban/QwenTranslate_English_German model is a 3.1 billion parameter language model fine-tuned from Qwen/Qwen2.5-3B-Instruct. Developed by baban, this model specializes in English to German machine translation. It leverages a 32768-token context length, making it suitable for translation tasks requiring substantial input. The model's primary strength lies in its optimized performance for English-German translation, as indicated by its training on the MT_En_German dataset.

Loading preview...

Overview

The baban/QwenTranslate_English_German model is a specialized machine translation model, fine-tuned from the Qwen/Qwen2.5-3B-Instruct base model. With 3.1 billion parameters and a substantial 32768-token context length, it is designed for English to German translation tasks.

Key Capabilities

  • English to German Translation: The model is specifically fine-tuned on the MT_En_German dataset, indicating its primary capability in translating text from English to German.
  • Qwen2.5 Architecture: Built upon the Qwen2.5-3B-Instruct foundation, it inherits the general language understanding and generation capabilities of its base model, adapted for translation.
  • Large Context Window: A 32768-token context length allows for processing longer texts, which can be beneficial for maintaining coherence and accuracy in translation.

Training Details

The model was trained with a learning rate of 5e-05 over 3 epochs, utilizing a distributed training setup across 8 devices. The training process involved an AdamW optimizer and an inverse square root learning rate scheduler. Evaluation during training showed a loss of 0.7843.

Intended Uses

This model is best suited for applications requiring direct English to German text translation. Its fine-tuned nature suggests improved performance for this specific language pair compared to general-purpose models.