baban/QwenTranslate_English_Bengali_100K_SFT
The baban/QwenTranslate_English_Bengali_100K_SFT model is a 3.1 billion parameter language model, fine-tuned from Qwen/Qwen2.5-3B-Instruct, specifically designed for English-to-Bengali translation. This model leverages a 32768-token context length and is optimized for high-quality translation tasks between these two languages. Its primary strength lies in its specialized training on the MT_En_Bengali dataset, making it highly effective for applications requiring accurate English-Bengali language conversion.
Loading preview...
Model Overview
The baban/QwenTranslate_English_Bengali_100K_SFT model is a specialized language model fine-tuned for English-to-Bengali translation. It is built upon the robust Qwen/Qwen2.5-3B-Instruct architecture, featuring 3.1 billion parameters and a substantial 32768-token context window.
Key Capabilities
- English-Bengali Translation: The model's core capability is translating text from English to Bengali, having been specifically fine-tuned on the
MT_En_Bengalidataset. - Instruction Following: Inherits instruction-following capabilities from its base model, Qwen2.5-3B-Instruct, which can be beneficial for structured translation tasks.
Training Details
The model was trained with a learning rate of 5e-05, a total batch size of 1024 (achieved with 8 GPUs and 32 gradient accumulation steps), and for 3 epochs. The training process resulted in a validation loss of 0.4847, indicating effective learning on the translation task.
Ideal Use Cases
- Localized Content Creation: Translating English content into Bengali for websites, applications, or marketing materials.
- Cross-Lingual Communication: Facilitating communication between English and Bengali speakers.
- Research and Development: As a base for further fine-tuning on more specific English-Bengali translation domains.