haoranxu/X-ALMA-13B-Group1

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:13BQuant:FP8Ctx Length:4kPublished:Aug 23, 2024License:mitArchitecture:Transformer0.0K Open Weights Warm

haoranxu/X-ALMA-13B-Group1 is a 13 billion parameter language model developed by Haoran Xu et al., building on ALMA-R. It features a plug-and-play architecture with language-specific modules, specifically supporting English, Danish, Dutch, German, Icelandic, Norwegian, Swedish, and Afrikaans. This model is optimized for multilingual translation and open-ended QA across these Group 1 languages.

Loading preview...

X-ALMA-13B-Group1: Multilingual Translation and QA

X-ALMA-13B-Group1 is a 13 billion parameter model developed by Haoran Xu et al., extending the ALMA-R architecture to support an expanded set of 50 languages. This specific release focuses on Group 1 languages, which include English (en), Danish (da), Dutch (nl), German (de), Icelandic (is), Norwegian (no), Swedish (sv), and Afrikaans (af).

Key Capabilities

  • Expanded Multilingual Support: Builds upon ALMA-R's foundation, significantly increasing language coverage.
  • Plug-and-Play Architecture: Utilizes language-specific modules, allowing for flexible integration and targeted language support.
  • Optimized Training: Incorporates a carefully designed training recipe for enhanced multilingual performance.
  • Translation: Excels at translation tasks between supported languages.
  • Multilingual Open-ended QA: Capable of performing question answering in the specified Group 1 languages.

Usage Recommendations

This model is provided as a merged model, where the language-specific module for Group 1 has been integrated into the base model. This is the recommended method for loading and using X-ALMA-13B-Group1 for translation and QA tasks in the supported languages. Alternatively, users can load the base model and then attach the language-specific LoRA module.