haoranxu/ALMA-7B-Pretrain
Text Generation · Open Weights
Model Size: 7B · Quantization: FP8 · Context Length: 4k · Concurrency Cost: 1
Published: Sep 17, 2023 · License: MIT · Architecture: Transformer

ALMA-7B-Pretrain by Haoran Xu is a 7-billion-parameter language model with a 4096-token context length that serves as the base for the ALMA translation models. It is the product of ALMA's first training stage: continued fine-tuning of LLaMA-2 on monolingual data in the target languages. Rather than acting as a standalone translation model, it is intended to be paired with translation-specific LoRA weights from the second training stage for machine translation tasks, as sketched below.
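As a minimal sketch of that two-part setup, the snippet below loads the base model and applies a LoRA adapter with Hugging Face transformers and peft. The adapter repository name (haoranxu/ALMA-7B-Pretrain-LoRA) and the translation prompt format follow ALMA project conventions and are assumptions, not part of this page.

```python
# Sketch: load ALMA-7B-Pretrain and apply translation LoRA weights.
# Assumptions: the adapter repo "haoranxu/ALMA-7B-Pretrain-LoRA" and the
# ALMA-style prompt format below are taken from the ALMA project, not this page.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "haoranxu/ALMA-7B-Pretrain", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "haoranxu/ALMA-7B-Pretrain-LoRA")
tokenizer = AutoTokenizer.from_pretrained("haoranxu/ALMA-7B-Pretrain")

# ALMA-style translation prompt (assumed format).
prompt = "Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64, num_beams=5)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Keeping the translation capability in a separate LoRA adapter means the 7B base weights are shared across language-pair adapters instead of being duplicated per fine-tune.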
