sail/Sailor-1.8B-Chat

1.8B parameters · BF16 · 32,768-token context · License: apache-2.0
Overview

Sailor-1.8B-Chat is a 1.8 billion parameter instruction-tuned language model developed by sail, designed with a strong focus on South-East Asian (SEA) languages. Built upon the Qwen 1.5 architecture, this model is part of a suite of Sailor models ranging from 0.5B to 14B parameters. It has been continually pre-trained and fine-tuned using a combination of public datasets, with careful data curation and deduplication to enhance performance in SEA languages.

Key Capabilities

  • Multilingual Proficiency: Optimized for Indonesian, Thai, Vietnamese, Malay, and Lao, while retaining strong performance in English and Chinese.
  • Instruction Following: Fine-tuned with open-source instruction datasets like aya_collection, aya_dataset, and OpenOrca for chat-based applications.
  • Reasoning and QA: Benchmarking results indicate proficiency in tasks such as question answering and commonsense reasoning.
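For chat-based use, a conversation must be serialized into the model's expected prompt format. The sketch below assumes Sailor-1.8B-Chat inherits the ChatML template from its Qwen 1.5 base; in practice, prefer `tokenizer.apply_chat_template` from the transformers library, which reads the template shipped with the model.

```python
# Sketch: hand-building a single-turn ChatML prompt, assuming the model
# follows the ChatML format inherited from Qwen 1.5. The helper name
# build_chatml_prompt is illustrative, not part of any library.

def build_chatml_prompt(system: str, user: str) -> str:
    """Format a single-turn conversation in ChatML."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # generation continues from here
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Apa ibu kota Indonesia?",  # Indonesian: "What is the capital of Indonesia?"
)
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open so the model's completion becomes the assistant turn.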

Training Details

The model underwent continual pre-training from Qwen 1.5, leveraging large public corpora including SlimPajama, SkyPile, CC100, and MADLAD-400. Training covered 200 to 400 billion tokens, with systematic experiments to balance language weights, ensuring robust performance across the target SEA languages without significantly compromising English and Chinese capabilities. A technical report is available for more details: arxiv.org/pdf/2404.03608.pdf.
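The deduplication step mentioned above can be illustrated with a generic near-duplicate filter; the actual Sailor pipeline is described in the technical report, and the shingle/Jaccard method here is only a common, simplified example of the idea.

```python
# Illustrative sketch of near-duplicate filtering over a corpus. This is a
# generic character-shingle / Jaccard-similarity approach, not Sailor's
# actual pipeline (see the technical report for that).

def shingles(text: str, n: int = 3) -> set:
    """Character n-gram shingles of a whitespace-normalized string."""
    t = " ".join(text.lower().split())
    return {t[i:i + n] for i in range(len(t) - n + 1)}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two shingle sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

def dedup(docs: list, threshold: float = 0.8) -> list:
    """Greedily keep documents whose similarity to every kept
    document stays below the threshold."""
    kept, kept_shingles = [], []
    for d in docs:
        s = shingles(d)
        if all(jaccard(s, ks) < threshold for ks in kept_shingles):
            kept.append(d)
            kept_shingles.append(s)
    return kept

docs = [
    "Sailor models target South-East Asian languages.",
    "Sailor models target South-East Asian languages!",  # near-duplicate
    "Qwen 1.5 is the base architecture.",
]
print(dedup(docs))  # the near-duplicate second entry is dropped
```

Production pipelines typically replace the exact pairwise comparison with MinHash or similar sketching so the filter scales to billions of documents.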

Good For

  • Applications requiring strong language understanding and generation in South-East Asian languages.
  • Chatbots and conversational AI systems targeting users in Indonesia, Thailand, Vietnam, Malaysia, and Laos.
  • Research and development in multilingual NLP, particularly for low-resource SEA languages.