ALMA-7B-Pretrain: A Foundation for Advanced Machine Translation
This model, haoranxu/ALMA-7B-Pretrain, is a 7 billion parameter language model based on LLaMA-2-7B, specifically designed as a pre-trained base for the ALMA (Advanced Language Model-based Translator) series. It has undergone an initial fine-tuning phase on 20 billion monolingual tokens, establishing a strong linguistic foundation.
Key Characteristics:
- Pre-training Stage: Represents the first stage of the ALMA translation paradigm, focusing on monolingual data fine-tuning.
- Not a Standalone Translator: This Pretrain version is explicitly noted as not a translation model on its own; it requires further fine-tuning with LoRA weights to become a functional translator.
- Foundation for ALMA-7B-LoRA and ALMA-7B-R: It serves as the base model for ALMA-7B-LoRA (which adds LoRA fine-tuning on human-written parallel data) and ALMA-7B-R (which further applies Contrastive Preference Optimization).
Intended Use:
This model is intended to be used in conjunction with its corresponding LoRA models (e.g., haoranxu/ALMA-7B-Pretrain-LoRA or haoranxu/ALMA-7B-R) to perform high-quality machine translation. Developers should load this base model and then apply the specific LoRA weights for translation tasks, following the two-step fine-tuning process outlined in the ALMA paper.
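A minimal sketch of this two-step loading pattern, using Hugging Face `transformers` together with `peft`: the base model is loaded first, then the LoRA adapter is applied on top. The generation parameters (beam count, max tokens) are illustrative choices, not values prescribed by the model card, and running this requires downloading the full model weights and a CUDA device.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, LlamaTokenizer

# Step 1: load the pre-trained base model (not yet a translator on its own).
model = AutoModelForCausalLM.from_pretrained(
    "haoranxu/ALMA-7B-Pretrain",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Step 2: apply the corresponding LoRA weights to obtain a functional translator.
model = PeftModel.from_pretrained(model, "haoranxu/ALMA-7B-Pretrain-LoRA")

tokenizer = LlamaTokenizer.from_pretrained(
    "haoranxu/ALMA-7B-Pretrain", padding_side="left"
)

# Example translation prompt; the prompt template here is an assumption
# about the instruction format, not an official specification.
prompt = "Translate this from German to English:\nGerman: Maschinelle Übersetzung ist nützlich.\nEnglish:"
input_ids = tokenizer(prompt, return_tensors="pt", padding=True).input_ids.to(model.device)

with torch.no_grad():
    generated_ids = model.generate(
        input_ids=input_ids,
        num_beams=5,        # illustrative decoding settings
        max_new_tokens=30,
    )
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
```

The same pattern applies when pairing the base model with other adapters in the series; only the adapter repository name passed to `PeftModel.from_pretrained` changes.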