Name: haoranxu/ALMA-13B-Pretrain API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: haoranxu

ALMA-13B-Pretrain: Foundation for Advanced Machine Translation

ALMA-13B-Pretrain is a 13 billion parameter model based on LLaMA-2, developed by Haoran Xu and collaborators. It represents the first stage of the ALMA (Advanced Language Model-based Translator) paradigm, which focuses on boosting translation performance of large language models. This model has undergone extensive monolingual fine-tuning on 12 billion tokens.

Key Characteristics

Two-Stage Fine-tuning Paradigm: ALMA models are initially fine-tuned on monolingual data (as seen in ALMA-13B-Pretrain) and then further optimized with high-quality parallel data for translation tasks.
Foundation for ALMA-13B-LoRA: This specific model (haoranxu/ALMA-13B-Pretrain) is designed to be used in conjunction with its corresponding LoRA model (haoranxu/ALMA-13B-Pretrain-LoRA) to achieve translation capabilities.
ALMA-R Series: Builds upon ALMA models, utilizing Contrastive Preference Optimization (CPO) for further LoRA fine-tuning, with ALMA-13B-R matching or exceeding GPT-4 and WMT winners in translation performance.

Intended Use

Base for Translation Fine-tuning: This model is intended as a base for further LoRA fine-tuning with parallel data to create specialized machine translation systems.
Research and Development: Suitable for researchers exploring advanced fine-tuning techniques for LLM-based translation, particularly those interested in the ALMA paradigm and Contrastive Preference Optimization.

Overview

ALMA-13B-Pretrain: Foundation for Advanced Machine Translation

Key Characteristics

Intended Use

Full Model Card (README)