AIDC-AI/Marco-LLM-SEA
AIDC-AI/Marco-LLM-SEA is a 7.6-billion-parameter Transformer-based language model fine-tuned for Southeast Asian languages, including Indonesian, Malay, Thai, and Vietnamese. Developed by AIDC-AI, it underwent extensive continued pretraining on approximately 56 billion tokens to strengthen regional language capabilities while remaining competitive on general benchmarks. The model features an improved tokenizer adapted to multiple Southeast Asian languages and scripts, making it well suited to applications that demand strong performance in these linguistic contexts.
Overview
Marco-LLM-SEA belongs to a series of language models developed by AIDC-AI, specifically designed and fine-tuned for Southeast Asian languages. This 7.6-billion-parameter model is part of a larger family ranging from 7B to 72B parameters, with both base and instruction-tuned variants.
Key Capabilities
- Multilingual Focus: Enhanced capabilities across Indonesian, Malay, Thai, Vietnamese, and other Southeast Asian languages.
- Continued Pretraining: Underwent extensive continued pretraining on approximately 56 billion tokens, improving its proficiency in target languages.
- Advanced Architecture: Based on the Transformer architecture, incorporating SwiGLU activation, attention QKV bias, and grouped-query attention (GQA).
- Adaptive Tokenizer: Utilizes an improved tokenizer specifically adapted for multiple Southeast Asian languages and scripts.
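To illustrate the grouped-query attention mentioned above: each key/value head is shared by a group of query heads, shrinking the KV cache relative to full multi-head attention. The following is a minimal NumPy sketch of the head-grouping idea only, not the model's actual implementation (head counts and dimensions below are illustrative):

```python
import numpy as np

def grouped_query_attention(q, k, v):
    # q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d), n_kv_heads < n_q_heads
    n_q_heads, seq, d = q.shape
    group = n_q_heads // k.shape[0]
    # Each group of query heads attends using one shared key/value head.
    k = np.repeat(k, group, axis=0)   # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4, 16))   # 8 query heads
k = rng.normal(size=(2, 4, 16))   # only 2 KV heads
v = rng.normal(size=(2, 4, 16))
out = grouped_query_attention(q, k, v)
print(out.shape)  # (8, 4, 16)
```

With 8 query heads sharing 2 KV heads, the KV cache is a quarter the size of standard multi-head attention while the output shape is unchanged.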
Usage Recommendations
This base model is not intended for direct text generation without further adaptation. Developers are advised to apply post-training methods such as Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), or additional continued pretraining to tailor it to specific use cases. For more details, refer to the Hugging Face page and the associated research paper: Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement.
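As a starting point for such post-training, the checkpoint can presumably be loaded with the standard Hugging Face transformers API. The repo id comes from this card; everything else (dtype choice, the helper name) is an illustrative assumption, not an official usage example:

```python
# Sketch: loading the base checkpoint for further adaptation (e.g. SFT).
# Assumes the standard transformers AutoModel API; downloading the 7.6B
# weights requires roughly 16 GB of disk and memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "AIDC-AI/Marco-LLM-SEA"

def load_base_model(model_id: str = MODEL_ID):
    """Return (tokenizer, model) ready for a fine-tuning loop."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model
```

The returned model can then be passed to a training framework of your choice (e.g. the transformers `Trainer`) together with SFT data in the target languages.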