Overview
SeaLLM-7B-v2.5: Multilingual LLM for Southeast Asia
SeaLLM-7B-v2.5 is a large language model from the SeaLLMs project and a significant upgrade over previous SeaLLM releases. It is built on the Gemma-7b architecture (roughly 8.5 billion parameters, despite the 7B name) and has undergone extensive supervised fine-tuning (SFT) and alignment to enhance its capabilities, particularly for Southeast Asian (SEA) languages.
Key Capabilities & Performance
- Multilingual Proficiency: Optimized for English, Chinese, Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Myanmar, and Filipino.
- World Knowledge: Achieves state-of-the-art results among 7B-scale models on multilingual knowledge benchmarks such as MMLU, M3Exam, and VMLU, with particularly strong results on SEA languages, often outperforming GPT-3.5.
- Math Reasoning: Demonstrates strong performance in mathematical reasoning, scoring 79.0 on GSM8K and 34.9 on MATH, surpassing GPT-3.5 in MATH.
- Instruction Following: Enhanced through careful alignment and large-scale SFT.
Usage Considerations
- BOS Token: Requires a `<bos>` token at the start of the prompt; users must manually prepend it if the tokenizer does not do so by default.
- Repetition Penalty: Must be set to 1.0 (i.e., no penalty) to prevent degeneration in the generated text.
- Chat Format: Uses a specific chat template (`<|im_start|>role\ncontent<eos>\n`) for optimal performance in both inference and fine-tuning; see the sketch after this list.
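The following is a minimal sketch of these considerations using the Hugging Face transformers library. The repo ID `SeaLLMs/SeaLLM-7B-v2.5`, the example messages, and the generation settings are assumptions for illustration rather than part of the official usage notes.

```python
# Minimal sketch (assumed repo ID and settings): apply the BOS, repetition-penalty,
# and chat-format notes above when generating with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SeaLLMs/SeaLLM-7B-v2.5"  # assumed Hugging Face repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def format_chat(messages, eos_token):
    # Render each turn with the documented template: <|im_start|>role\ncontent<eos>\n
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}{eos_token}\n"
    # Leave the assistant turn open so the model continues from here.
    return prompt + "<|im_start|>assistant\n"

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Xin chào, bạn là ai?"},  # Vietnamese: "Hello, who are you?"
]
prompt = format_chat(messages, tokenizer.eos_token)

# Prepend <bos> manually if the tokenizer does not already add it.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
if input_ids[0, 0].item() != tokenizer.bos_token_id:
    input_ids = tokenizer(tokenizer.bos_token + prompt, return_tensors="pt").input_ids

outputs = model.generate(
    input_ids.to(model.device),
    max_new_tokens=256,
    repetition_penalty=1.0,  # keep at 1.0 (no penalty) to avoid degeneration
)
print(tokenizer.decode(outputs[0, input_ids.shape[1]:], skip_special_tokens=True))
```

The same template string can be reused when constructing training examples for fine-tuning, so prompts seen at inference time match the format the model was aligned on.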
When to Use This Model
- Applications targeting Southeast Asian languages: Ideal for tasks requiring high accuracy and nuanced understanding in the specified SEA languages.
- Reasoning and Knowledge-based tasks: Suitable for complex problem-solving, particularly in math and general knowledge domains where it shows competitive or superior performance against larger models.
- Resource-constrained environments: Offers strong performance at a 7B parameter scale, making it efficient for deployment compared to larger alternatives.