Name: swiss-ai/Apertus-70B-2509 API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: swiss-ai

Apertus-70B-2509: A Transparent, Multilingual LLM

Apertus-70B-2509, developed by swiss-ai, is a 70 billion parameter decoder-only transformer model pretrained on an extensive 15 trillion tokens. It stands out for its commitment to full transparency, offering open weights, data, and complete training details, including data reconstruction scripts and intermediate checkpoints. The model was trained from scratch using a staged curriculum of web, code, and math data, incorporating a novel xIELU activation function and the AdEMAMix optimizer, followed by supervised fine-tuning and alignment via QRPO.

Key Capabilities

Massively Multilingual: Natively supports over 1800 languages, making it suitable for global applications.
Long Context: Features a default context length of 32,768 tokens, extendable up to 65,536 tokens.
Ethically Compliant: Trained exclusively on fully compliant and open data, respecting opt-out consent and avoiding memorization.
Tool Use: Supports agentic usage with tool integration capabilities.
Performance: Achieves competitive performance on general language understanding tasks, comparable to models with closed training methodologies.

Good for

Applications requiring extensive multilingual support across a vast number of languages.
Use cases demanding transparency in model development, data, and training processes.
Tasks benefiting from long context understanding and generation.
Developers seeking a powerful, openly documented model for research and deployment, with support for tool use and various inference frameworks like Transformers, vLLM, and MLX.

Overview

Apertus-70B-2509: A Transparent, Multilingual LLM

Key Capabilities

Good for

Full Model Card (README)