Overview
This model, mpasila/shisa-v2-JP-EN-Translator-v0.1-12B, is a 12 billion parameter language model developed by mpasila. It is fine-tuned from the shisa-ai/shisa-v2-mistral-nemo-12b base model and is specifically engineered for Japanese-to-English translation tasks. Training used the NilanE/ParallelFiction-Ja_En-100k dataset; this initial release was trained on 191 conversation examples drawn from it.
Key Capabilities
- Specialized Translation: Primarily designed for accurate Japanese-to-English translation.
- Context Window: Supports a substantial context window of 32768 tokens, allowing for longer translation inputs.
- ChatML Format: Uses the ChatML prompt format, with a recommended system prompt: "You are an AI assistant that translates Japanese to English accurately."
- Efficient Training: Trained using QLoRA (rank 128, alpha 32), leveraging Unsloth and Hugging Face's TRL library for faster training.
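As a rough sketch of the ChatML format mentioned above, a translation request can be assembled as shown below. The exact chat template ships with the model's tokenizer, so in practice you would rely on that; the helper function and the Japanese sample text here are purely illustrative:

```python
# Recommended system prompt from the model card above.
SYSTEM_PROMPT = "You are an AI assistant that translates Japanese to English accurately."


def build_chatml_prompt(japanese_text: str) -> str:
    """Wrap a Japanese passage in ChatML turns and open the assistant turn.

    This mirrors the generic ChatML layout (<|im_start|>role ... <|im_end|>);
    the tokenizer's own chat template is the authoritative source.
    """
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{japanese_text}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt("吾輩は猫である。")
print(prompt)
```

The string ends with an open assistant turn, so the model's generation continues directly with the English translation.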
Training Details
The model was trained on a dataset comprising 191 conversations, totaling 918,486 tokens. The average conversation length was approximately 4,808 tokens, with a maximum of 13,431 tokens. Most of these tokens sit in the human and assistant turns, making the model well suited to conversational translation scenarios.
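The reported average follows directly from the dataset totals; a quick back-of-the-envelope check, using the figures stated above:

```python
# Dataset statistics as reported in the training details above.
num_conversations = 191
total_tokens = 918_486
max_conversation_tokens = 13_431

# Integer average conversation length: 918,486 / 191 ≈ 4,808 tokens.
avg_tokens = total_tokens // num_conversations
print(f"average conversation length: ~{avg_tokens} tokens")

# The longest conversation fits comfortably inside the 32,768-token context window.
context_window = 32_768
print(f"max conversation fits in context: {max_conversation_tokens <= context_window}")
```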
Good For
- Developers requiring a dedicated model for Japanese-to-English translation.
- Applications needing to process and translate longer Japanese texts into English.
- Integration into systems where accurate, context-aware translation is critical.