haoranxu/X-ALMA-13B-Group7

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Aug 23, 2024 · License: MIT · Architecture: Transformer · Open Weights

The haoranxu/X-ALMA-13B-Group7 model is a 13-billion-parameter language model developed by Haoran Xu et al., built upon the ALMA-R architecture. It features a plug-and-play design with language-specific modules; this Group 7 release supports English (en), Gujarati (gu), Hindi (hi), Marathi (mr), Nepali (ne), and Urdu (ur) for translation tasks. The broader X-ALMA family extends this modular design to 50 languages in total.


Overview

haoranxu/X-ALMA-13B-Group7 is a 13 billion parameter model from the X-ALMA family, developed by Haoran Xu et al. It extends the ALMA-R architecture with a novel plug-and-play design, incorporating language-specific modules and an adaptive training recipe. This particular release focuses on Group 7 languages, which include English (en), Gujarati (gu), Hindi (hi), Marathi (mr), Nepali (ne), and Urdu (ur).

Key Capabilities

  • Multilingual Translation: Specifically designed for high-quality translation across the six languages in Group 7.
  • Plug-and-Play Architecture: Utilizes language-specific LoRA modules that can be merged with a base model or loaded dynamically, offering flexibility in deployment.
  • Scalable Multilingual Support: The broader X-ALMA project aims to support 50 languages, with this model representing a specialized group within that ecosystem.
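The group-based design above amounts to a simple routing step: given a target language, select the group checkpoint whose module covers it. A minimal sketch, assuming only the Group 7 assignments this card names (the helper name `checkpoint_for` is illustrative; the other groups' language assignments would need to be taken from the X-ALMA release):

```python
# Hypothetical routing sketch: map a target-language code to the
# X-ALMA group checkpoint containing its language-specific module.
# Only the Group 7 languages named in this card are listed here.

GROUP7_LANGS = {"en", "gu", "hi", "mr", "ne", "ur"}


def checkpoint_for(lang: str) -> str:
    """Return the checkpoint name covering the given language code."""
    if lang in GROUP7_LANGS:
        return "haoranxu/X-ALMA-13B-Group7"
    raise ValueError(f"language {lang!r} is not covered by Group 7")


print(checkpoint_for("hi"))  # haoranxu/X-ALMA-13B-Group7
```

In a multi-group deployment, this kind of lookup is what lets an application load only the module relevant to each request instead of the full 50-language model.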

Usage and Deployment

Users can load X-ALMA-13B-Group7 in three primary ways:

  • Recommended: Load the pre-merged model, where the Group 7 language module is already integrated into the base model.
  • Recommended: Load the base haoranxu/X-ALMA-13B-Pretrain model and then attach the Group 7 specific LoRA module.
  • Advanced: Load the full haoranxu/X-ALMA model with all language-specific modules, requiring substantial GPU memory and explicit language specification during generation.
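Whichever loading path is used, the model is typically driven with a translate-instruction prompt. A minimal sketch of the prompt format commonly used with the ALMA family (the exact template should be verified against the model card; `LANG_NAMES` and `build_translation_prompt` are illustrative helpers, not part of the release):

```python
# Hypothetical sketch of the ALMA-style translation prompt.
# The exact template should be checked against the X-ALMA model card.

LANG_NAMES = {
    "en": "English", "gu": "Gujarati", "hi": "Hindi",
    "mr": "Marathi", "ne": "Nepali", "ur": "Urdu",
}


def build_translation_prompt(text: str, src: str, tgt: str) -> str:
    """Format a source sentence into a translate-instruction prompt."""
    src_name, tgt_name = LANG_NAMES[src], LANG_NAMES[tgt]
    return (
        f"Translate this from {src_name} to {tgt_name}:\n"
        f"{src_name}: {text}\n"
        f"{tgt_name}:"
    )


prompt = build_translation_prompt("I love machine translation.", "en", "hi")
# The resulting string would then be tokenized and passed to the
# loaded model's generate() call.
```

The same helper works for any source/target pair within Group 7, since each module handles all six of its languages.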

Good for

  • Developers requiring robust translation capabilities for English, Gujarati, Hindi, Marathi, Nepali, and Urdu.
  • Applications needing a modular approach to multilingual model deployment.
  • Research into plug-and-play language model architectures for diverse language sets.