AIDC-AI/Marco-LLM-ES

Model size: 7.6B parameters
Tensor type: FP8
Context length: 131,072 tokens
License: apache-2.0

Overview

Marco-LLM-ES is a series of language models developed by AIDC-AI, specifically enhanced for languages prevalent in Spain: Catalan, Basque, Galician, and Spanish. This 7.6-billion-parameter base model has undergone continued pretraining on a 50-billion-token dataset, with a focus on improving performance in these regional languages while remaining competitive on general benchmarks.

Key Capabilities

  • Multilingual Specialization: Optimized for Catalan, Basque, Galician, and Spanish through extensive continued pretraining.
  • Transformer Architecture: Uses a Transformer architecture with SwiGLU activation, attention QKV bias, and grouped-query attention.
  • Adaptive Tokenizer: Features an improved tokenizer adapted to multiple languages (a short tokenization sketch follows this list).
  • Scalable Series: Part of a larger series ranging from 7B to 72B parameters, with both base and instruction-tuned variants.
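
As a quick way to inspect the multilingual tokenizer, the sketch below loads it with the Hugging Face transformers library and counts tokens for short sentences in each target language. The example sentences and the per-language comparison are illustrative assumptions, not official examples from the training data.

```python
# Sketch: inspect the tokenizer on short sentences in the four target languages.
# The example sentences are illustrative; they are not taken from the training data.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AIDC-AI/Marco-LLM-ES")

samples = {
    "Spanish":  "Este modelo habla español.",
    "Catalan":  "Aquest model parla català.",
    "Basque":   "Eredu honek euskaraz hitz egiten du.",
    "Galician": "Este modelo fala galego.",
}

for language, sentence in samples.items():
    token_ids = tokenizer(sentence)["input_ids"]
    print(f"{language:9s} {len(token_ids):3d} tokens -> {tokenizer.convert_ids_to_tokens(token_ids)}")
```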

Usage Recommendations

This base model is not recommended for direct text generation. Developers should apply post-training methods such as supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), or further continued pretraining to adapt it for specific applications.
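
Since the base model is intended for post-training rather than direct use, the following is a minimal SFT sketch using the Hugging Face transformers Trainer. The toy dataset, hyperparameters, and sequence length are placeholder assumptions; adapt them to your task and hardware.

```python
# Minimal SFT sketch with the Hugging Face Trainer. The toy dataset, hyperparameters,
# and max_length below are placeholder assumptions, not recommended settings.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "AIDC-AI/Marco-LLM-ES"
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Toy instruction-style examples; replace with a real SFT corpus.
texts = [
    "Pregunta: Quina és la capital de Catalunya?\nResposta: Barcelona.",
    "Pregunta: ¿Cuál es la capital de Galicia?\nRespuesta: Santiago de Compostela.",
]

class TextDataset(torch.utils.data.Dataset):
    def __init__(self, texts, tokenizer, max_length=512):
        self.encodings = [
            tokenizer(t, truncation=True, max_length=max_length) for t in texts
        ]

    def __len__(self):
        return len(self.encodings)

    def __getitem__(self, idx):
        return self.encodings[idx]

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="marco-llm-es-sft",
        per_device_train_batch_size=1,
        num_train_epochs=1,
        learning_rate=1e-5,
        bf16=True,
    ),
    train_dataset=TextDataset(texts, tokenizer),
    # Causal LM collator: pads batches and copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```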

Performance Highlights

Evaluations show enhanced performance on Spanish-specific tasks: the 7B model achieves an average score of 34.16 across Spanish, Catalan, Basque, and Galician benchmarks on La Leaderboard (5-shot).
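
For context on the 5-shot setup, the snippet below shows how a few-shot prompt is typically assembled for this kind of evaluation. The task, example items, and formatting are illustrative assumptions and do not reproduce La Leaderboard's actual harness.

```python
# Illustrative 5-shot prompt construction; the items and format are hypothetical
# and do not reproduce La Leaderboard's actual evaluation harness.
few_shot_examples = [
    ("¿Cuál es la capital de España?", "Madrid"),
    ("¿Cuál es la capital de Galicia?", "Santiago de Compostela"),
    ("Quina és la capital de Catalunya?", "Barcelona"),
    ("Zein da Euskadiko hiriburua?", "Vitoria-Gasteiz"),
    ("¿Cuál es la capital de Aragón?", "Zaragoza"),
]

def build_prompt(question: str) -> str:
    """Prepend five solved examples, then the unanswered question."""
    shots = "\n\n".join(f"Pregunta: {q}\nRespuesta: {a}" for q, a in few_shot_examples)
    return f"{shots}\n\nPregunta: {question}\nRespuesta:"

print(build_prompt("¿Cuál es la capital de Andalucía?"))
```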