SeaLLMs/SeaLLMs-v3-1.5B-Chat

Hosted on Hugging Face.

  • Task: Text generation
  • Model size: 1.5B parameters
  • Quantization: BF16
  • Context length: 32k
  • Published: Jul 17, 2024
  • License: seallms
  • Architecture: Transformer

SeaLLMs-v3-1.5B-Chat is a 1.5 billion parameter causal language model from the SeaLLMs team, part of the SeaLLMs-v3 series, fine-tuned for chat and instruction following. It performs well on world knowledge, mathematical reasoning, and translation across Southeast Asian languages, and is engineered for trustworthiness, with reduced hallucination and culturally sensitive responses. The model supports a 131,072-token context window and covers English, Chinese, Indonesian, Vietnamese, Thai, Tagalog, Malay, Burmese, Khmer, Lao, Tamil, and Javanese.


SeaLLMs-v3-1.5B-Chat Overview

SeaLLMs-v3-1.5B-Chat is a 1.5 billion parameter instruction-tuned language model from the SeaLLMs-v3 family, designed to serve Southeast Asian languages. It performs well across world knowledge, mathematical reasoning, translation, and instruction-following tasks while maintaining a 131,072-token context window.

Key Capabilities

  • Multilingual Support: Tailored for a wide range of Southeast Asian languages, including English, Chinese, Indonesian, Vietnamese, Thai, Tagalog, Malay, Burmese, Khmer, Lao, Tamil, and Javanese.
  • Enhanced Instruction Following: Significantly improved capability in understanding and executing multi-turn instructions.
  • Trustworthiness: Engineered to reduce hallucination and provide safe, culturally sensitive responses, particularly for Southeast Asian contexts.
  • Competitive Performance: Achieves strong results among similarly sized models; on benchmarks such as M3Exam, it outperforms Qwen2-1.5B-Instruct and Gemma-2B-IT on average Southeast Asian language scores.

Good for

  • Applications requiring robust instruction following in a chat format.
  • Use cases involving multiple Southeast Asian languages.
  • Scenarios where reduced hallucination and culturally appropriate responses are critical.
  • Developers seeking a capable model with a smaller parameter count for efficient deployment.
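As a concrete starting point, the sketch below shows one common way to run a chat model like this with the Hugging Face `transformers` library. It is not taken from the model card: the system prompt, generation settings, and helper names (`build_messages`, `chat`) are illustrative assumptions, and the model follows the standard chat-template workflow used by `transformers`.

```python
# Hedged sketch: running SeaLLMs/SeaLLMs-v3-1.5B-Chat via transformers.
# The helper names and prompts here are illustrative, not from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "SeaLLMs/SeaLLMs-v3-1.5B-Chat"


def build_messages(user_prompt: str) -> list[dict]:
    """Build the chat-format message list the instruction-tuned model expects."""
    return [
        {"role": "system", "content": "You are a helpful multilingual assistant."},
        {"role": "user", "content": user_prompt},
    ]


def chat(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model, apply its chat template, and return the generated reply."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    # Example Indonesian prompt, since the model targets SEA languages.
    print(chat("Terjemahkan ke bahasa Inggris: Selamat pagi, apa kabar?"))
```

Using `apply_chat_template` (rather than hand-formatting the prompt) keeps the input consistent with the formatting the model was fine-tuned on.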