Overview
ALMA-7B-Ja is a 7 billion parameter machine translation model developed by webbigdata, building upon the ALMA (Advanced Language Model-based translator) learning method. Unlike the original ALMA-7B, this version is specifically adapted for Japanese (Ja) and English (En) translation, replacing the Russian language support. The ALMA paradigm involves a two-step fine-tuning process: initial fine-tuning on monolingual data followed by optimization with high-quality parallel data, which is crucial for its translation capabilities.
Key Capabilities
- Specialized Translation: Primarily designed for Japanese to English and English to Japanese translation.
- ALMA Learning Method: Applies ALMA's two-step fine-tuning recipe (monolingual data first, then high-quality parallel data) for robust translation performance.
- Multilingual Potential: While optimized for Ja/En, it retains some translation ability for other pairs like German-English, Chinese-English, Icelandic-English, and Czech-English, though the original ALMA-13B is recommended for these.
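A minimal usage sketch with Hugging Face transformers follows. The prompt template matches the instruction format described for ALMA models; the repository id `webbigdata/ALMA-7B-Ja` and the generation settings are assumptions, so check the actual model card before relying on them.

```python
def build_prompt(source_text: str, src_lang: str = "Japanese",
                 tgt_lang: str = "English") -> str:
    """Build the translate-instruction prompt ALMA-style models are tuned on."""
    return (
        f"Translate this from {src_lang} to {tgt_lang}:\n"
        f"{src_lang}: {source_text}\n"
        f"{tgt_lang}:"
    )


def translate(source_text: str, src_lang: str = "Japanese",
              tgt_lang: str = "English") -> str:
    # Imported lazily so the prompt helper stays usable without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "webbigdata/ALMA-7B-Ja"  # assumed Hugging Face repository id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    prompt = build_prompt(source_text, src_lang, tgt_lang)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    # Strip the echoed prompt, leaving only the generated translation.
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return decoded.split(f"{tgt_lang}:")[-1].strip()
```

For example, `translate("私はエンジニアです。")` should return an English rendering such as "I am an engineer." (exact output depends on the model weights and decoding settings).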
Performance Insights
In benchmarks against models such as NLLB-54B and GPT-3.5-D, ALMA-7B-Ja's BLEU/COMET scores trail those of the original ALMA-7B and other larger models. For instance, it scores 14.82/85.40 (BLEU/COMET) on English-to-Japanese translation, versus the original ALMA-7B's 27.13/86.98 on English-to-Russian; note these are different language pairs, so the figures are only indirectly comparable.
Other Versions
- webbigdata-ALMA-7B-Ja-gguf: A llama.cpp-compatible version for CPU-based inference, developed by mmnga.
- ALMA-7B-Ja-GPTQ-Ja-En: A GPTQ-quantized version (3.9 GB) with reduced model size and memory usage, though with potentially lower performance and reduced multilingual support.
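The gguf version can be run on CPU with llama.cpp. A minimal command sketch follows; the model file name is an assumption (use the file actually published in mmnga's webbigdata-ALMA-7B-Ja-gguf repository), and flag names may vary across llama.cpp versions:

```shell
# Hedged sketch: CPU inference with a gguf build of ALMA-7B-Ja via llama.cpp.
# The .gguf file name below is assumed; substitute the quantization you downloaded.
./llama-cli -m ./webbigdata-ALMA-7B-Ja-q4_0.gguf \
  -e -p "Translate this from Japanese to English:\nJapanese: 私はエンジニアです。\nEnglish:" \
  -n 128
```

The `-e` flag tells llama.cpp to process the `\n` escapes in the prompt, and `-n 128` caps the number of generated tokens.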