SeaLLMs/SeaLLM-7B-v2.5

8.5B parameters · FP8 · 8192 context length · License: other · Source: Hugging Face
Overview

SeaLLM-7B-v2.5: Multilingual LLM for Southeast Asia

SeaLLM-7B-v2.5 is an 8.5-billion-parameter large language model developed by the SeaLLMs team and a significant upgrade over previous versions. Built on the Gemma-7B architecture, it has undergone extensive supervised fine-tuning (SFT) and alignment to strengthen its capabilities, particularly for Southeast Asian (SEA) languages.

Key Capabilities & Performance

  • Multilingual Proficiency: Optimized for English, Chinese, Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Myanmar, and Filipino.
  • World Knowledge: Achieves state-of-the-art results among 7B models on multilingual knowledge benchmarks such as MMLU, M3Exam, and VMLU, often outperforming GPT-3.5 on SEA languages.
  • Math Reasoning: Demonstrates strong mathematical reasoning, scoring 79.0 on GSM8K and 34.9 on MATH, surpassing GPT-3.5 on MATH.
  • Instruction Following: Enhanced through careful alignment and large-scale SFT.

Usage Considerations

  • BOS Token: Prompts must start with a <bos> token; prepend it manually if the tokenizer does not add it by default.
  • Repetition Penalty: Keep repetition_penalty at 1.0 (i.e., disabled); other values can cause degeneration in generated text.
  • Chat Format: Uses a specific chat template (<|im_start|>role\ncontent<eos>\n) for both inference and fine-tuning; a minimal usage sketch follows this list.
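
The following is a minimal sketch of the points above, assuming the Hugging Face transformers library (the official model card should be consulted for the canonical snippet). It builds the prompt in the documented <|im_start|> format, prepends <bos> when the tokenizer does not, and keeps repetition_penalty at 1.0.

```python
# Minimal sketch: prompting SeaLLM-7B-v2.5 with transformers.
# Prompt construction follows the documented <|im_start|>role\ncontent<eos>\n format;
# verify tokenizer defaults against the official model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SeaLLMs/SeaLLM-7B-v2.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build the prompt in the documented chat format.
messages = [{"role": "user", "content": "Xin chào, bạn là ai?"}]
prompt = ""
for m in messages:
    prompt += f"<|im_start|>{m['role']}\n{m['content']}{tokenizer.eos_token}\n"
prompt += "<|im_start|>assistant\n"

# Prepend <bos> manually if the tokenizer does not add it automatically.
if not prompt.startswith(tokenizer.bos_token):
    prompt = tokenizer.bos_token + prompt

inputs = tokenizer(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    repetition_penalty=1.0,  # keep at 1.0 as noted above to avoid degeneration
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

If the tokenizer ships a chat template, tokenizer.apply_chat_template can replace the manual formatting; the manual version is shown here only to make the documented format explicit.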

When to Use This Model

  • Applications targeting Southeast Asian languages: Ideal for tasks requiring high accuracy and nuanced understanding in the specified SEA languages.
  • Reasoning and Knowledge-based tasks: Suitable for complex problem-solving, particularly in math and general knowledge domains where it shows competitive or superior performance against larger models.
  • Resource-constrained environments: Offers strong performance at a 7B parameter scale, making it efficient to deploy compared to larger alternatives; see the quantized-loading sketch below for one option.
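
For deployment under tight memory budgets, one common option is weight quantization at load time; the sketch below uses 4-bit loading via bitsandbytes through transformers. These settings are illustrative assumptions, not recommendations from the model card.

```python
# Illustrative only: loading the model with 4-bit quantization via bitsandbytes
# to reduce memory footprint. The quantization settings are assumptions,
# not official SeaLLM recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "SeaLLMs/SeaLLM-7B-v2.5"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```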