baban/QwenTranslate_English_Tamil
The baban/QwenTranslate_English_Tamil model is a 3.1-billion-parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-3B-Instruct for English-to-Tamil translation. It inherits the multilingual capabilities of its base architecture and reached a final training loss of 0.3116 on its fine-tuning data. Its primary application is high-quality English-to-Tamil machine translation.
Model Overview
The baban/QwenTranslate_English_Tamil model is a specialized language model derived from the Qwen/Qwen2.5-3B-Instruct architecture. With 3.1 billion parameters, it was fine-tuned on the MT_En_Tamil dataset to excel at English-to-Tamil translation.
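A minimal inference sketch with the `transformers` library might look like the following. The chat-style prompt format is an assumption (the card does not document the exact instruction template used during fine-tuning), and `device_map="auto"` assumes `accelerate` is installed; adjust both to match your setup.

```python
MODEL_ID = "baban/QwenTranslate_English_Tamil"


def build_messages(english_text: str) -> list:
    """Wrap the source sentence in a chat-style translation instruction.

    NOTE: this system prompt is an assumption, not taken from the model card.
    """
    return [
        {"role": "system", "content": "Translate the following English text to Tamil."},
        {"role": "user", "content": english_text},
    ]


def translate(english_text: str, max_new_tokens: int = 256) -> str:
    # Imported lazily so the prompt helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Render the chat messages into the model's expected prompt string.
    prompt = tokenizer.apply_chat_template(
        build_messages(english_text), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Decode only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(translate("Good morning, how are you?"))
```

Greedy decoding (`do_sample=False`) is a reasonable default for translation, where deterministic output is usually preferred over sampling diversity.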
Key Capabilities
- Dedicated Translation: Optimized specifically for translating text from English to Tamil.
- Performance: Reached a final training loss of 0.3116 on the MT_En_Tamil fine-tuning data, indicating a close fit to the target task (training loss alone is not a direct measure of translation quality).
- Base Model: Built upon the robust Qwen2.5-3B-Instruct foundation, inheriting its general language understanding capabilities.
Training Details
The model was trained using the following key hyperparameters:
- Learning Rate: 5e-05
- Batch Size: A total training batch size of 1024 (distributed across 8 GPUs with 16 gradient accumulation steps).
- Epochs: 3.0 epochs.
- Optimizer: AdamW (the `adamw_torch` implementation) with default betas and epsilon.
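The batch-size arithmetic above can be sketched as follows. The per-device batch size of 8 is inferred from the stated totals (1024 / (8 GPUs × 16 accumulation steps)) and is not reported on the card:

```python
# Hyperparameters as reported; per_device_train_batch_size is inferred, not stated.
config = {
    "learning_rate": 5e-5,
    "num_gpus": 8,
    "gradient_accumulation_steps": 16,
    "per_device_train_batch_size": 8,  # inferred: 1024 / (8 * 16)
    "num_train_epochs": 3.0,
    "optim": "adamw_torch",
}


def total_train_batch_size(cfg: dict) -> int:
    """Effective batch per optimizer step = per-device batch x GPUs x accumulation steps."""
    return (
        cfg["per_device_train_batch_size"]
        * cfg["num_gpus"]
        * cfg["gradient_accumulation_steps"]
    )


# total_train_batch_size(config) -> 1024, matching the reported total batch size.
```

Gradient accumulation lets a large effective batch (1024) fit on hardware whose per-device memory only supports a small per-step batch, at the cost of fewer optimizer updates per epoch.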
Intended Use Cases
This model is ideal for applications requiring accurate and efficient translation between English and Tamil, such as:
- Machine translation services.
- Content localization.
- Cross-lingual communication tools.