ALMA-13B: Advanced Language Model-based Translator
ALMA-13B is a 13 billion parameter model from the ALMA (Advanced Language Model-based Translator) family, developed by Haoran Xu and collaborators. It follows a new training paradigm for machine translation: the model is built on the LLaMA-2 architecture and optimized through a two-stage fine-tuning process.
Key Capabilities & Training
- Two-Step Fine-tuning: ALMA models are initially fine-tuned on a large corpus of monolingual data (12 billion tokens for ALMA-13B) to establish strong language understanding. This is followed by a second stage of fine-tuning on high-quality human-written parallel data, specifically targeting translation performance.
- Translation Optimization: The model is explicitly designed and optimized for machine translation, aiming to deliver robust, accurate translation across language pairs.
- ALMA-R Variants: Newer ALMA-R versions (e.g., ALMA-13B-R) further enhance translation quality through Contrastive Preference Optimization (CPO) on triplet preference data, which has been shown to match or exceed the performance of models such as GPT-4 and WMT competition winners.
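To make the CPO step concrete, the sketch below implements a simplified, scalar version of a CPO-style objective: a DPO-like preference term (without a frozen reference model) plus a negative log-likelihood regularizer on the preferred translation. The function name, scalar sequence log-probabilities, and `beta` value are illustrative assumptions, not the authors' implementation, which operates on batched token log-probabilities.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def cpo_loss(logp_preferred: float,
             logp_dispreferred: float,
             beta: float = 0.1) -> float:
    """Simplified CPO-style objective on scalar sequence log-probs
    (illustrative sketch, not the authors' API)."""
    # Preference term: push the preferred translation's likelihood
    # above the dispreferred one's. Unlike DPO, there is no frozen
    # reference model in the margin.
    prefer = -math.log(sigmoid(beta * (logp_preferred - logp_dispreferred)))
    # NLL term: keep the model anchored to the preferred output.
    nll = -logp_preferred
    return prefer + nll

# Toy log-probabilities: the loss is lower when the model already
# prefers the better translation.
loss_good = cpo_loss(logp_preferred=-5.0, logp_dispreferred=-9.0)
loss_bad = cpo_loss(logp_preferred=-9.0, logp_dispreferred=-5.0)
assert loss_good < loss_bad
```

In the real training setup, the triplet preference data supplies the preferred and dispreferred translations for each source sentence, and the log-probabilities come from the model being fine-tuned.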
Use Cases
- Machine Translation: Ideal for applications requiring high-quality translation between languages.
- Research & Development: Provides a strong baseline for further research into LLM-based translation paradigms and preference optimization techniques.
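For translation use, the ALMA repository's examples use a simple instruction-style prompt of the form "Translate this from X to Y:". The helper below is an illustrative sketch of that prompt format; the function name is an assumption, and the model identifier `haoranxu/ALMA-13B` refers to the authors' Hugging Face release.

```python
def build_alma_prompt(src_lang: str, tgt_lang: str, text: str) -> str:
    """Build the instruction-style translation prompt used in the
    ALMA repository's examples (helper name is illustrative)."""
    return (
        f"Translate this from {src_lang} to {tgt_lang}:\n"
        f"{src_lang}: {text}\n"
        f"{tgt_lang}:"
    )

prompt = build_alma_prompt("German", "English", "Guten Morgen!")
# The prompt is then tokenized and passed to a causal LM loaded via
# Hugging Face transformers, e.g.
#   AutoModelForCausalLM.from_pretrained("haoranxu/ALMA-13B")
# followed by model.generate(...) and decoding of the continuation.
```

The model completes the text after the final `English:` marker, so the translation is whatever the model generates past the prompt.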