baban/QwenTranslate_English_Hindi_100K_SFT

TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 12, 2026License:otherArchitecture:Transformer Cold

The baban/QwenTranslate_English_Hindi_100K_SFT model is a 3.1 billion parameter, 32768-token context length language model fine-tuned from Qwen/Qwen2.5-3B-Instruct. Developed by baban, this model is specifically optimized for English-to-Hindi translation tasks, leveraging the MT_En_Hindi dataset. It is designed to provide specialized translation capabilities between these two languages.

Loading preview...

Model Overview

The baban/QwenTranslate_English_Hindi_100K_SFT is a specialized language model, fine-tuned from the Qwen/Qwen2.5-3B-Instruct architecture. With 3.1 billion parameters and a context length of 32768 tokens, this model is specifically adapted for translation tasks between English and Hindi.

Key Capabilities

  • English-Hindi Translation: The model's primary function is to translate text from English to Hindi, having been fine-tuned on the MT_En_Hindi dataset.
  • Qwen2.5-3B-Instruct Base: Benefits from the foundational capabilities of the Qwen2.5-3B-Instruct model, providing a robust base for its specialized translation function.

Training Details

The model was trained with a learning rate of 5e-05, a total batch size of 1024 (across 8 GPUs with 32 gradient accumulation steps), and an inverse square root learning rate scheduler. Training was conducted for 3 epochs, achieving a validation loss of 0.4726.

Intended Use Cases

This model is best suited for applications requiring accurate and efficient translation of text from English to Hindi, particularly in scenarios where a specialized, fine-tuned model can outperform more general-purpose language models for this specific language pair.