baban/QwenTranslate_Tamil_English

TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Aug 12, 2025License:otherArchitecture:Transformer Cold

baban/QwenTranslate_Tamil_English is a 3.1 billion parameter language model fine-tuned from Qwen/Qwen2.5-3B-Instruct. This model is specifically optimized for machine translation between Tamil and English, demonstrating a loss of 1.1217 on its evaluation set. Its primary use case is facilitating translation tasks between these two languages, leveraging the Qwen2.5 architecture for this specialized function.

Loading preview...

Overview

This model, baban/QwenTranslate_Tamil_English, is a specialized machine translation model. It is a fine-tuned version of the Qwen/Qwen2.5-3B-Instruct base model, specifically adapted for translation tasks between Tamil and English. With 3.1 billion parameters, it leverages the robust architecture of the Qwen2.5 series to provide focused translation capabilities.

Key Capabilities

  • Tamil-English Translation: Optimized for translating text between Tamil and English.
  • Qwen2.5 Architecture: Built upon the Qwen2.5-3B-Instruct model, inheriting its foundational language understanding.
  • Performance: Achieved a loss of 1.1217 on its evaluation set, indicating its proficiency in the fine-tuned task.

Training Details

The model was trained with a learning rate of 5e-05 over 3 epochs, using a total batch size of 1024 across 8 GPUs. The optimizer used was ADAMW_TORCH with standard betas and epsilon, and an inverse_sqrt learning rate scheduler. The training involved processing approximately 4.47 billion input tokens.

Use Cases

This model is particularly well-suited for applications requiring accurate and efficient translation between Tamil and English, such as content localization, cross-lingual communication tools, or educational platforms.