webbigdata/ALMA-7B-Ja-V2

Task: Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Oct 21, 2023 | License: llama2 | Architecture: Transformer | Open Weights

webbigdata/ALMA-7B-Ja-V2 is a 7 billion parameter machine translation model developed by webbigdata, specifically optimized for Japanese-English and English-Japanese translation. This model builds upon the ALMA (Advanced Language Model-based Translator) paradigm, which involves fine-tuning on monolingual data followed by optimization with high-quality parallel data. It demonstrates improved performance over its predecessor, ALMA-7B-Ja, and also supports translation between English and German, Chinese, Icelandic, and Czech.

Overview

webbigdata/ALMA-7B-Ja-V2 is a 7 billion parameter machine translation model designed primarily for Japanese-to-English and English-to-Japanese translation. It is an enhanced version of the earlier ALMA-7B-Ja model that has received additional training to improve translation quality. It can also translate between English and German, Chinese, Icelandic, or Czech, but the Japanese-English pair remains its core focus.

Key Capabilities

  • Bidirectional Japanese-English Translation: Optimized for translating both from Japanese to English and English to Japanese.
  • Multilingual Support: Offers secondary translation capabilities between English and German, Chinese, Icelandic, and Czech.
  • Performance Improvements: Demonstrates enhanced translation quality over its predecessor, ALMA-7B-Ja, as evidenced by improved chrF++, BLEU, and COMET scores.
  • Quantized Versions Available: GPTQ and GGUF (for llama.cpp) quantized versions are provided. They reduce model size and memory footprint, which allows deployment on resource-constrained devices such as MacBooks or systems without a dedicated GPU, at the cost of a slight loss in translation performance.
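
The bidirectional usage described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' reference code: the prompt template is the one used by the upstream ALMA project and is assumed (not confirmed on this page) to apply to this model, and `LANG_NAMES`, `build_prompt`, and `translate` are hypothetical helper names. Model loading uses the standard Hugging Face `transformers` API; `device_map="auto"` additionally requires the `accelerate` package, and the generation settings are illustrative choices only.

```python
# Hypothetical helpers. The prompt template is an ASSUMPTION taken from the
# upstream ALMA project; this page does not confirm it for ALMA-7B-Ja-V2.
LANG_NAMES = {
    "ja": "Japanese", "en": "English", "de": "German",
    "zh": "Chinese", "is": "Icelandic", "cs": "Czech",
}


def build_prompt(text: str, src: str, tgt: str) -> str:
    """Build an ALMA-style translation prompt for one source text."""
    if src not in LANG_NAMES or tgt not in LANG_NAMES:
        raise ValueError(f"unsupported language pair: {src}->{tgt}")
    # The page describes every supported pair as paired with English.
    if "en" not in (src, tgt) or src == tgt:
        raise ValueError("exactly one side of the pair must be English")
    s, t = LANG_NAMES[src], LANG_NAMES[tgt]
    return f"Translate this from {s} to {t}:\n{s}: {text.strip()}\n{t}:"


def translate(text: str, src: str = "ja", tgt: str = "en",
              model_id: str = "webbigdata/ALMA-7B-Ja-V2",
              max_new_tokens: int = 256) -> str:
    """Translate one text; downloads the model weights on first use."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(text, src, tgt), return_tensors="pt")
    inputs = inputs.to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs, max_new_tokens=max_new_tokens,
            num_beams=5, do_sample=False,  # illustrative settings
        )
    # Keep only the generated continuation, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `translate("今日は良い天気です。", src="ja", tgt="en")` would run Japanese-to-English translation; swapping `src` and `tgt` gives the reverse direction. In a real application the tokenizer and model would be loaded once and reused rather than reloaded on every call.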

Benchmarks and Performance

The model's translation performance is evaluated with the BLEU, chrF++, and COMET metrics. In these benchmarks, ALMA-7B-Ja-V2 outperforms earlier ALMA models and is competitive with the NLLB-200 series for Japanese-English translation. Commercial services such as Google Translate and DeepL often achieve higher scores, but ALMA-7B-Ja-V2 offers a strong open-source alternative, particularly for specific text genres such as government documents, classical literature, and fanfiction.
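
To give a feel for what a character n-gram metric like chrF measures, here is a simplified, character-only sketch. It is not the implementation behind the scores referred to above: chrF++ additionally uses word n-grams, and published scores normally come from a standard tool such as sacreBLEU, so values from this sketch should not be compared with reported numbers. The function names `char_ngrams` and `simple_chrf` are hypothetical.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: F-beta of averaged n-gram precision and recall (0-100)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one string is shorter than n characters; skip this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # beta = 2 weights recall more heavily than precision.
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)
```

Because it works on characters rather than whitespace-delimited words, this kind of metric needs no word segmentation, which is convenient for Japanese output.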

Use Cases

  • Japanese-English Translation Applications: Suited to integration into applications that need reliable translation between Japanese and English.
  • Resource-Constrained Environments: Quantized versions (GPTQ, GGUF) enable deployment on devices with limited GPU memory or no GPU.
  • Research and Development: Serves as a strong base for further fine-tuning or research in Japanese-centric machine translation.
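
As a rough guide to why quantized versions matter on memory-limited hardware, the sketch below estimates the memory needed for the weights alone at different precisions. It assumes the nominal 7 billion parameter figure from this page and an idealized bits-per-weight value; real GPTQ and GGUF files also store scaling metadata, and inference needs further memory for activations and the KV cache, so actual requirements will be higher. `weight_memory_gib` is a hypothetical helper name.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GiB (2**30 bytes)."""
    return num_params * bits_per_weight / 8 / 2**30


NOMINAL_PARAMS = 7e9  # assumption: nominal "7B" figure, not an exact count

for label, bits in [("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
    print(f"{label}: about {weight_memory_gib(NOMINAL_PARAMS, bits):.1f} GiB")
```

Halving the bits per weight halves the weight footprint, which is what makes 4-bit variants plausible on laptops and CPU-only machines.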