Agnes-AI/Agnes-SeaLLM-8b

Text Generation · Published: Jan 8, 2026 · License: apache-2.0 · Open Weights

  • Model Size: 8B
  • Quantization: FP8
  • Context Length: 32k
  • Architecture: Transformer
  • Concurrency Cost: 1

Agnes-SeaLLM-8B is a compact 8-billion-parameter large language model developed by Agnes-AI, specifically optimized for Southeast Asian languages with a 32,768-token context length. It delivers performance comparable to much larger models in mathematical reasoning, translation, and instruction following. The model is engineered to minimize hallucinations and provide culturally sensitive responses, excelling across multi-dimensional benchmarks including M3Exam and MMLU.


Agnes-SeaLLM-8B: Compact, Culturally Aware, and High-Performing

Agnes-SeaLLM-8B is an 8-billion-parameter large language model (LLM) from Agnes-AI, designed for efficient deployment and strong performance in Southeast Asian languages. It offers a 32,768-token context window and is meticulously tuned to reduce hallucinations and ensure culturally sensitive responses, while maintaining strong performance in English and Chinese.
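In practice, the 32,768-token context window means long multi-turn histories must be trimmed before inference. A minimal sketch of one way to do that is below; the helper name, the 1,024-token reply reserve, and the whitespace "tokenizer" in the example are all illustrative assumptions, and in real use the token count would come from the model's actual tokenizer.

```python
# Hypothetical helper: drop the oldest turns so a conversation fits the
# model's 32,768-token context window. The caller supplies the token
# counter (e.g. the length of a tokenizer's encoding of each turn).

CTX_LEN = 32_768  # Agnes-SeaLLM-8B context length

def trim_to_context(turns, count_tokens, reserve=1_024, ctx_len=CTX_LEN):
    """Keep the newest turns whose combined token cost, plus a reply
    budget of `reserve` tokens, fits inside `ctx_len`."""
    budget = ctx_len - reserve
    kept, total = [], 0
    # Walk from newest to oldest so recent context survives.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if total + cost > budget:
            break
        kept.append(turn)
        total += cost
    kept.reverse()
    return kept

# Example with a crude whitespace "tokenizer": the 40,000-word turn
# exceeds the budget, so only the newest, short turn is kept.
history = ["hello " * 40_000, "short question?"]
trimmed = trim_to_context(history, lambda t: len(t.split()))
```

The newest-first walk is a deliberate choice: when the budget is tight, it is usually the oldest turns that are safest to discard.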

Key Capabilities & Differentiators

  • Compact Efficiency: Enables high-speed inference and low-resource deployment, making it ideal for edge devices.
  • Top-Tier Performance: Outperforms comparable open-source models across academic examinations, complex instruction following, mathematics, and high-precision translation.
  • Superior Instruction Following: Excels in multi-turn dialogues and executing nuanced tasks with high fidelity.
  • Culturally Aware & Reliable: Engineered for reduced hallucinations and increased sensitivity to Southeast Asian cultural nuances.
  • Balanced Multilingual Mastery: Achieves consistent, high-quality output across a broad linguistic spectrum, avoiding the "seesaw effect" common in regional models.
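The multi-turn instruction following above assumes conversations arrive in a structured role/content form. A small sketch of assembling such a history follows; the role/content layout is the common Hugging Face chat-template convention, and the system prompt and helper name are made-up examples, not part of the model's documented interface.

```python
# Illustrative only: build a multi-turn message list in the common
# role/content format consumed by Hugging Face chat templates. The
# actual prompt layout for Agnes-SeaLLM-8B is defined by its
# tokenizer's chat template, not by this helper.

def build_conversation(system_prompt, exchanges):
    """exchanges: list of (user_text, assistant_text_or_None) pairs;
    None marks a turn still awaiting the model's reply."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in exchanges:
        messages.append({"role": "user", "content": user_text})
        if assistant_text is not None:
            messages.append({"role": "assistant", "content": assistant_text})
    return messages

# A two-turn translation dialogue, second reply pending.
msgs = build_conversation(
    "You are a helpful assistant for Southeast Asian languages.",
    [("Terjemahkan 'good morning' ke Bahasa Indonesia.", "Selamat pagi."),
     ("Dan dalam bahasa Vietnam?", None)],
)
```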

Performance Highlights

Agnes-SeaLLM-8B achieves an average score of 75.32 on SeaExam and 74.13 on MMLU, surpassing 8B-class peers and even outperforming larger models such as Sailor2-20B and Meta-Llama-3-70B on these benchmarks. Notably, it scores 93.24% on M3Exam English, 72.97% on M3Exam Indonesian, and 70.03% on M3Exam Vietnamese, demonstrating strong global reasoning alongside regional linguistic nuance.

Good For

  • Applications requiring efficient, high-performance LLMs in resource-constrained environments.
  • Use cases demanding accurate and culturally appropriate responses in Southeast Asian languages, English, and Chinese.
  • Tasks involving mathematical reasoning, translation, and complex instruction following.
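For the deployment scenarios above, a minimal inference sketch is shown below, assuming the model is published on the Hugging Face Hub under the id at the top of this page and exposes a standard chat template. The dtype and sampling settings are illustrative defaults, not official recommendations, and FP8 execution depends on your hardware and runtime.

```python
# Minimal inference sketch (assumptions: Hub model id, standard chat
# template, bfloat16 fallback; none of these are confirmed by this page).

MODEL_ID = "Agnes-AI/Agnes-SeaLLM-8b"

# Illustrative sampling defaults, not official recommendations.
SAMPLING_DEFAULTS = {
    "max_new_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.9,
    "do_sample": True,
}

def main():
    # Imported lazily so the constants above stay importable without
    # the heavyweight dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )

    messages = [{"role": "user",
                 "content": "Dịch 'thank you' sang tiếng Việt."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, **SAMPLING_DEFAULTS)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(output[0][inputs.shape[-1]:],
                           skip_special_tokens=True))

if __name__ == "__main__":
    main()
```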