baban/QwenTranslate_Tamil_English
baban/QwenTranslate_Tamil_English is a 3.1 billion parameter language model fine-tuned from Qwen/Qwen2.5-3B-Instruct. This model is specifically optimized for machine translation between Tamil and English, demonstrating a loss of 1.1217 on its evaluation set. Its primary use case is facilitating translation tasks between these two languages, leveraging the Qwen2.5 architecture for this specialized function.
Loading preview...
Overview
This model, baban/QwenTranslate_Tamil_English, is a specialized machine translation model. It is a fine-tuned version of the Qwen/Qwen2.5-3B-Instruct base model, specifically adapted for translation tasks between Tamil and English. With 3.1 billion parameters, it leverages the robust architecture of the Qwen2.5 series to provide focused translation capabilities.
Key Capabilities
- Tamil-English Translation: Optimized for translating text between Tamil and English.
- Qwen2.5 Architecture: Built upon the Qwen2.5-3B-Instruct model, inheriting its foundational language understanding.
- Performance: Achieved a loss of 1.1217 on its evaluation set, indicating its proficiency in the fine-tuned task.
Training Details
The model was trained with a learning rate of 5e-05 over 3 epochs, using a total batch size of 1024 across 8 GPUs. The optimizer used was ADAMW_TORCH with standard betas and epsilon, and an inverse_sqrt learning rate scheduler. The training involved processing approximately 4.47 billion input tokens.
Use Cases
This model is particularly well-suited for applications requiring accurate and efficient translation between Tamil and English, such as content localization, cross-lingual communication tools, or educational platforms.