sail/Sailor-14B
Text Generation | Concurrency Cost: 1 | Model Size: 14.2B | Quant: FP8 | Ctx Length: 32k | Published: May 16, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

Sailor-14B is a 14.2 billion parameter language model developed by sail, part of the Sailor suite of Open Language Models. Built upon the Qwen 1.5 architecture, it is specifically tailored for South-East Asian (SEA) languages including Indonesian, Thai, Vietnamese, Malay, and Lao, while maintaining proficiency in English and Chinese. The model has a context length of 32768 tokens and is designed for tasks such as question answering and commonsense reasoning in these diverse linguistic contexts.

Sailor-14B: South-East Asian Language Model

Sailor-14B is a 14.2 billion parameter model from the Sailor suite, developed by sail, focusing on linguistic understanding and generation for South-East Asian (SEA) languages. It is built on the Qwen 1.5 architecture and has been continually pre-trained to enhance its performance across Indonesian, Thai, Vietnamese, Malay, and Lao, alongside English and Chinese.

Key Capabilities & Features

  • Multilingual Proficiency: Optimized for SEA languages, demonstrating strong performance in question answering and commonsense reasoning tasks.
  • Robust Training: Continually pre-trained on 200 billion tokens from a high-quality, deduplicated corpus that includes SlimPajama, SkyPile, CC100, and MADLAD-400.
  • Instruction-Tuned Variants: Base models are further fine-tuned with open-source datasets to create instruction-tuned versions (Sailor-Chat).
  • Context Length: Supports a context window of 32768 tokens.
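
In practice, the 32768-token window is shared between the prompt and the generated output. The sketch below shows one way to budget for that; the function name and its clamping behaviour are illustrative assumptions, not part of any Sailor tooling.

```python
CONTEXT_WINDOW = 32768  # tokens, as stated in this model card


def max_new_tokens_allowed(prompt_tokens, requested, context_window=CONTEXT_WINDOW):
    """Clamp a requested generation length to what still fits in the window.

    Hypothetical helper: prompt and output tokens are assumed to share
    one context window of `context_window` tokens.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = context_window - prompt_tokens
    # Never return a negative budget, even for an over-long prompt.
    return max(0, min(requested, remaining))


# A 32000-token prompt leaves room for at most 768 new tokens.
print(max_new_tokens_allowed(32000, 1000))
```

A result of 0 signals that the prompt already fills the window and should be shortened before generation.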

Use Cases & Differentiators

Sailor-14B is particularly suited for applications requiring deep linguistic understanding and generation in the specified South-East Asian languages. Its specialized training makes it a strong candidate for tasks where general-purpose models might underperform in these regional contexts. The model maintains strong performance in English and Chinese, offering a versatile solution for multilingual environments. For more technical details, refer to the technical report.
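
As a rough illustration of a question-answering workflow, the sketch below builds a few-shot prompt and completes it with the Hugging Face transformers library. It assumes the weights are available under the repository id sail/Sailor-14B shown at the top of this page and that the installed transformers version supports the Qwen 1.5 architecture. The "Question:/Answer:" prompt layout and both function names are illustrative choices, not a format prescribed by the Sailor authors.

```python
MODEL_ID = "sail/Sailor-14B"  # repository id from the title of this card


def build_few_shot_prompt(examples, question):
    """Format (question, answer) pairs followed by one open question.

    The "Question:/Answer:" layout is an illustrative assumption.
    """
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


def generate_answer(prompt, max_new_tokens=32):
    """Load the model and complete the prompt (downloads the full weights)."""
    # Imported lazily so the prompt helper works without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",  # use the dtype recorded in the checkpoint config
        device_map="auto",   # requires the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


examples = [("Apa ibu kota Indonesia?", "Jakarta")]
prompt = build_few_shot_prompt(examples, "Apa ibu kota Thailand?")
print(prompt)
```

Calling generate_answer(prompt) would then run the completion. Few-shot prompting is used here because Sailor-14B is a base model; for conversational use, the instruction-tuned Sailor-Chat variants are likely a better fit.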