baban/QwenTranslate_English_Telugu

Hosted on Hugging Face

Text generation · Model size: 3.1B · Quantization: BF16 · Context length: 32k · Published: Aug 5, 2025 · License: other · Architecture: Transformer

baban/QwenTranslate_English_Telugu is a 3.1-billion-parameter language model fine-tuned from Qwen/Qwen2.5-3B-Instruct. It is optimized specifically for English-to-Telugu machine translation, building on the base model's language understanding and generation capabilities, and reaches a loss of 0.2652 on its evaluation set. The model is intended for applications that need accurate and efficient translation from English into Telugu.


Model Overview

baban/QwenTranslate_English_Telugu is a specialized machine translation model, fine-tuned from the Qwen/Qwen2.5-3B-Instruct base model. With 3.1 billion parameters, it focuses on translating text from English to Telugu. The model was trained using a learning rate of 5e-05 over 3 epochs, with a total training batch size of 1024 across 8 GPUs.
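The model card does not include a usage snippet, so the following is a minimal inference sketch. It assumes the checkpoint is loadable with the standard `transformers` API under the repo id above and that it follows the Qwen2.5-Instruct chat format; the exact translation prompt wording is an illustrative guess, not a documented format.

```python
# Minimal inference sketch (prompt wording and generation settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baban/QwenTranslate_English_Telugu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",
)

# Ask the model to translate an English sentence into Telugu.
messages = [
    {"role": "user",
     "content": "Translate the following English text to Telugu: The weather is pleasant today."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens (the Telugu translation).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```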

Key Capabilities

  • English to Telugu Translation: Optimized for high-quality translation from English into Telugu.
  • Fine-tuned Performance: Achieved a loss of 0.2652 on its evaluation set, indicating effective learning for the translation task.
  • Qwen2.5-3B-Instruct Base: Benefits from the robust architecture and capabilities of the Qwen2.5-3B-Instruct model.

Training Details

The model was trained with an AdamW optimizer, an inverse square root learning rate scheduler, and a distributed multi-GPU setup, processing approximately 4.9 billion input tokens in total.
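The original training script is not published; the sketch below only illustrates how the reported hyperparameters (5e-05 learning rate, 3 epochs, AdamW, inverse square root scheduler, effective batch size 1024 on 8 GPUs) could map onto Hugging Face `TrainingArguments`. The per-device batch size and gradient accumulation split are hypothetical, chosen so that 16 × 8 GPUs × 8 accumulation steps equals the reported effective batch size of 1024.

```python
# Hypothetical fine-tuning configuration reflecting the reported hyperparameters;
# output directory, per-device batch size, and accumulation steps are illustrative.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen-translate-en-te",
    learning_rate=5e-5,                 # reported learning rate
    num_train_epochs=3,                 # reported number of epochs
    optim="adamw_torch",                # AdamW optimizer
    lr_scheduler_type="inverse_sqrt",   # inverse square root schedule
    per_device_train_batch_size=16,     # 16 x 8 GPUs x 8 accumulation steps = 1024 effective
    gradient_accumulation_steps=8,
    bf16=True,                          # matches the BF16 precision listed above
    logging_steps=50,
)
```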

Good For

  • Applications requiring dedicated English-Telugu machine translation.
  • Developers seeking a specialized translation model for this language pair.