haoranxu/X-ALMA-13B-Group6

TEXT GENERATIONConcurrency Cost:1Model Size:13BQuant:FP8Ctx Length:4kPublished:Aug 23, 2024License:mitArchitecture:Transformer0.0K Open Weights Cold

haoranxu/X-ALMA-13B-Group6 is a 13 billion parameter causal language model developed by Haoran Xu et al. based on the X-ALMA architecture, which extends ALMA-R to 50 languages using a plug-and-play module design. This specific model is fine-tuned for translation and multilingual open-ended QA, supporting English, Georgian, Chinese, Japanese, Korean, Finnish, and Estonian. It is optimized for high-quality machine translation across these seven languages.

Loading preview...

X-ALMA-13B-Group6 Overview

X-ALMA-13B-Group6 is a 13 billion parameter model from the X-ALMA family, developed by Haoran Xu et al. It builds upon the ALMA-R architecture, significantly expanding its multilingual capabilities from 6 to 50 languages through a novel plug-and-play module design and a specialized training approach. This particular release focuses on Group 6 languages, which include English (en), Georgian (ka), Chinese (zh), Japanese (ja), Korean (ko), Finnish (fi), and Estonian (et).

Key Capabilities

  • Multilingual Translation: Excels in translating between the seven supported Group 6 languages.
  • Multilingual Open-Ended QA: Capable of performing question answering tasks across these languages.
  • Modular Architecture: Utilizes language-specific LoRA modules that can be merged with a base model or loaded dynamically, offering flexibility in deployment.
  • Scalable Design: The X-ALMA framework is designed to support a broad range of languages, with this model representing one specific language group.

When to Use This Model

This model is ideal for applications requiring high-quality machine translation or multilingual question answering specifically involving English, Georgian, Chinese, Japanese, Korean, Finnish, or Estonian. Its modular design allows for efficient deployment for these target languages.