baban/QwenTranslate_Telugu_English

TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Aug 12, 2025License:otherArchitecture:Transformer Cold

The baban/QwenTranslate_Telugu_English model is a 3.1 billion parameter language model fine-tuned from Qwen/Qwen2.5-3B-Instruct. It is specifically optimized for machine translation between Telugu and English, leveraging a 32768 token context length. This model is designed for applications requiring accurate translation capabilities between these two languages.

Loading preview...

Overview

baban/QwenTranslate_Telugu_English is a specialized machine translation model, fine-tuned from the Qwen/Qwen2.5-3B-Instruct architecture. With 3.1 billion parameters and a 32768 token context length, this model focuses on translating between Telugu and English.

Key Capabilities

  • Telugu-English Machine Translation: The model is specifically trained on the MT_Telugu_En dataset to facilitate translation between these two languages.
  • Leverages Qwen2.5-3B-Instruct Base: Benefits from the foundational capabilities of the Qwen 2.5 series, adapted for a specific translation task.

Training Details

The model was trained with a learning rate of 5e-05, a total batch size of 1024 (achieved with train_batch_size 8 and gradient_accumulation_steps 16), and ran for 3 epochs. It utilized an AdamW optimizer and an inverse square root learning rate scheduler. Evaluation during training showed a loss of 1.0262.

Intended Uses

  • Applications requiring direct translation between Telugu and English.
  • Integration into systems needing to process or generate text in both languages.