haoranxu/X-ALMA-13B-Group8

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Context Length: 4K · Published: Aug 23, 2024 · License: MIT · Architecture: Transformer · Open Weights

haoranxu/X-ALMA-13B-Group8 is a 13-billion-parameter multilingual causal language model in the X-ALMA family, designed for high-quality machine translation across 50 languages. This release covers the Group 8 languages: English, Azerbaijani, Kazakh, Kyrgyz, Turkish, Uzbek, Arabic, Hebrew, and Persian. It uses a plug-and-play architecture with language-specific modules and is optimized for translation tasks, building on the ALMA-R framework.


X-ALMA-13B-Group8: Multilingual Translation Model

This model, developed by Haoran Xu and collaborators, is a 13-billion-parameter variant of the X-ALMA series, extending ALMA-R to support 50 languages. X-ALMA employs a plug-and-play architecture built around language-specific modules, which are integrated through a carefully designed training methodology.

Key Capabilities

  • Expanded Language Support: While the full X-ALMA project supports 50 languages, this specific X-ALMA-13B-Group8 release is fine-tuned for a distinct set of 9 languages: English (en), Azerbaijani (az), Kazakh (kk), Kyrgyz (ky), Turkish (tr), Uzbek (uz), Arabic (ar), Hebrew (he), and Persian (fa).
  • Modular Architecture: It utilizes language-specific LoRA modules that can be merged with a base model, offering flexibility in deployment.
  • Translation Focus: Primarily designed for machine translation, it can also handle multilingual open-ended question answering.
  • Efficient Loading Options: Supports loading as a pre-merged model for simplicity or as a base model with an attached language-specific module for more granular control.
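The pre-merged loading path and the ALMA-style translation prompt can be sketched as follows. This is a minimal example, not the project's official script: the prompt template and model ID follow the pattern published for the X-ALMA/ALMA models, but generation parameters here are illustrative assumptions, so verify them against the model card before use.

```python
def build_translation_prompt(source_text: str, src_lang: str, tgt_lang: str) -> str:
    """Build an ALMA-style translation prompt for X-ALMA models."""
    return (
        f"Translate this from {src_lang} to {tgt_lang}:\n"
        f"{src_lang}: {source_text}\n"
        f"{tgt_lang}:"
    )


def translate(source_text: str, src_lang: str, tgt_lang: str) -> str:
    """Translate with the pre-merged X-ALMA-13B-Group8 checkpoint.

    Requires `transformers`, `torch`, and a GPU with enough memory for
    13B FP16 weights; imports are kept local so the prompt helper above
    stays usable without them.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        "haoranxu/X-ALMA-13B-Group8",
        torch_dtype=torch.float16,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(
        "haoranxu/X-ALMA-13B-Group8", padding_side="left"
    )

    # Wrap the prompt in the model's chat template before generating.
    chat = [{"role": "user", "content": build_translation_prompt(source_text, src_lang, tgt_lang)}]
    input_ids = tokenizer.apply_chat_template(
        chat, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Greedy decoding; max_new_tokens is an illustrative choice.
    output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
    return tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)
```

For the module-based loading option mentioned above, the base model would instead be loaded first and the Group 8 language-specific module attached to it; the merged checkpoint shown here trades that flexibility for a single-step load.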

Good for

  • Developers requiring high-quality machine translation between English and the specified Group 8 languages.
  • Applications needing multilingual text generation or understanding in these specific linguistic contexts.
  • Research into modular language model architectures and efficient multilingual adaptation.