baban/QwenTranslate_Bengali_English

Hugging Face model card

Task: Text Generation | Model size: 3.1B | Quantization: BF16 | Context length: 32k | Published: Aug 12, 2025 | License: other | Architecture: Transformer

The baban/QwenTranslate_Bengali_English model is a 3.1 billion parameter language model fine-tuned from Qwen/Qwen2.5-3B-Instruct. It is specifically optimized for translation tasks between Bengali and English, leveraging its base architecture for language understanding and generation. This model is designed for applications requiring accurate and efficient bidirectional translation between these two languages. It was trained with a context length of 32768 tokens.


Model Overview

The baban/QwenTranslate_Bengali_English model is a specialized language model derived from the Qwen2.5-3B-Instruct architecture, with 3.1 billion parameters. It was fine-tuned on the MT_Bengali_En dataset to strengthen its Bengali-English translation capabilities.

Key Capabilities

  • Bidirectional Translation: Optimized for translating text between Bengali and English.
  • Qwen2.5 Base: Benefits from the robust language understanding and generation capabilities of the Qwen2.5-3B-Instruct foundation model.
  • Context Length: Supports a substantial context window of 32768 tokens, allowing for processing longer texts during translation.
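A minimal inference sketch for the capabilities above. Note the assumptions: the model card does not publish an official prompt template, so the instruction wording in `build_messages` is hypothetical, and the helper names (`build_messages`, `translate`) are illustrative rather than part of any released API.

```python
MODEL_ID = "baban/QwenTranslate_Bengali_English"


def build_messages(text: str, target_lang: str = "English") -> list[dict]:
    """Wrap source text in a chat-style translation instruction.

    The exact wording is an assumption; adjust it to whatever prompt
    format the fine-tuning actually used.
    """
    return [{
        "role": "user",
        "content": f"Translate the following text to {target_lang}:\n{text}",
    }]


def translate(text: str, target_lang: str = "English") -> str:
    # Lazy import so the prompt-building helper stays usable without
    # the heavy transformers/torch dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    # Qwen2.5-based checkpoints ship a chat template, so we reuse it here.
    input_ids = tokenizer.apply_chat_template(
        build_messages(text, target_lang),
        add_generation_prompt=True,
        return_tensors="pt",
    )
    output = model.generate(input_ids, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:],
                            skip_special_tokens=True)


if __name__ == "__main__":
    print(translate("আমি বাংলায় কথা বলি।"))
```

For English-to-Bengali, pass `target_lang="Bengali"`; the card describes the model as bidirectional, so a single instruction pattern with the target language swapped should cover both directions.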

Training Details

The model was trained for 3 epochs with a learning rate of 5e-05 and a total batch size of 1024 (8 GPUs with 16 gradient-accumulation steps). Training used the AdamW optimizer and an inverse square root learning rate scheduler. The reported evaluation loss was 1.0118.
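The reported numbers imply a per-device micro-batch size, which the card does not state directly. A quick sketch of that arithmetic (the 8-per-device split is an inference from the figures above, not a published configuration):

```python
# Reported training configuration
num_gpus = 8
grad_accum_steps = 16
total_batch_size = 1024

# Effective batch = per-device batch * num_gpus * grad_accum_steps,
# so the implied per-device micro-batch is:
per_device_batch = total_batch_size // (num_gpus * grad_accum_steps)
print(per_device_batch)  # 8
```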

Intended Use Cases

This model is particularly well-suited for applications requiring high-quality machine translation between Bengali and English, such as:

  • Document translation
  • Cross-lingual communication tools
  • Content localization for Bengali or English audiences