swiss-ai/Apertus-8B-Instruct-2509

Hugging Face · Text generation
Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Aug 13, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

Apertus-8B-Instruct-2509 by swiss-ai is an 8-billion-parameter, decoder-only transformer model designed for massive multilingual support, natively handling 1811 languages. Pretrained on 15T tokens with a staged curriculum of web, code, and math data, it features a 32,768-token context length and uses the xIELU activation function and the AdEMAMix optimizer. The model emphasizes full transparency with open weights, data, and training details, and respects data owners' opt-out consent, making it suitable for globally focused, privacy-conscious applications that require broad language coverage and tool use.

Apertus-8B-Instruct-2509: A Massively Multilingual and Open LLM

Apertus-8B-Instruct-2509, developed by swiss-ai, is an 8-billion-parameter, decoder-only transformer model engineered to push the boundaries of fully open, multilingual, and transparent language models. It natively supports 1811 languages and offers a long context window: 32,768 tokens in this deployment, with the model itself supporting up to 65,536 tokens. The model is pretrained on 15 trillion tokens using a staged curriculum of web, code, and math data, and incorporates the novel xIELU activation function and the AdEMAMix optimizer.
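
For a quick start, the sketch below loads the model with the Hugging Face transformers library and runs one multilingual chat turn. This is standard transformers usage rather than anything documented by swiss-ai, so treat the dtype, device placement, and generation settings as illustrative assumptions.

```python
# Minimal sketch: loading Apertus-8B-Instruct-2509 via transformers.
# Assumes the repository ships a tokenizer with a chat template;
# generation settings here are illustrative, not documented defaults.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swiss-ai/Apertus-8B-Instruct-2509"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights fit on a single large GPU
    device_map="auto",
)

# A non-English prompt; any of the 1811 supported languages follows the same path.
messages = [{"role": "user", "content": "Explique la photosynthèse en deux phrases."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```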

Key Capabilities

  • Massively Multilingual: Natively supports 1811 languages, making it highly versatile for global applications.
  • Fully Open and Compliant: Features open weights, open training data, and complete training details, including data reconstruction scripts and intermediate checkpoints. It respects opt-out consent of data owners and avoids memorization of training data.
  • Long Context Processing: Capable of handling context lengths up to 65,536 tokens.
  • Tool Use Support: Designed to support agentic usage with tool integration; see the sketch after this list.
  • Strong General Language Understanding: Achieves competitive performance on general language understanding tasks, scoring 65.8% on average across ARC, HellaSwag, WinoGrande, XNLI, XCOPA, and PIQA benchmarks, comparable to other open-weight models in its class.
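
As a hedged illustration of the tool-use path: transformers' chat-template API accepts a tools argument and converts an annotated Python function into a tool schema. Whether Apertus's chat template emits tool calls in this format is an assumption here (consult the model card for the exact schema), and get_weather is a purely hypothetical example tool.

```python
# Hypothetical tool-use sketch; assumes the model's chat template defines a
# tool-calling format compatible with transformers' `tools=` argument.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "swiss-ai/Apertus-8B-Instruct-2509"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def get_weather(city: str) -> str:
    """
    Get the current weather for a city. (Hypothetical example tool.)

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 22 °C"

messages = [{"role": "user", "content": "What's the weather in Bern right now?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],  # transformers derives a JSON schema from the signature/docstring
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# If the template supports tools, the reply should contain a structured tool call;
# parsing and executing it is left to the surrounding agent loop.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```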

Good for

  • Applications requiring extensive multilingual support across a vast number of languages.
  • Use cases where transparency, open data, and compliance with data privacy (e.g., opt-out consent) are critical.
  • Tasks benefiting from long context understanding and generation.
  • Developing agentic systems that leverage tool use.
  • Researchers and developers seeking a fully auditable and reproducible LLM with detailed training insights.

For more in-depth information, refer to the Apertus Technical Report.