haoranxu/ALMA-13B-R
Text Generation | Concurrency Cost: 1 | Model Size: 13B | Quant: FP8 | Ctx Length: 4k | Published: Jan 17, 2024 | License: MIT | Architecture: Transformer | 0.1K | Open Weights | Warm
ALMA-13B-R is a 13-billion-parameter language model from Haoran Xu and collaborators, fine-tuned specifically for machine translation. It builds on ALMA and is further trained with Contrastive Preference Optimization (CPO), a preference-based objective that steers the model away from translations that are adequate but not perfect. It is reported to match or exceed GPT-4 and WMT competition winners on translation tasks. The 4096-token context length bounds the combined size of the source text and the generated translation.
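Because the model is tuned for translation rather than open-ended chat, it is usually prompted with an explicit translation template. The sketch below builds such a prompt and does a rough check against the 4096-token context length stated above. The template wording follows the format commonly shown for ALMA models and is an assumption here, not something this page specifies. The helper names and the characters-per-token ratio are likewise hypothetical, and the budget check is a crude heuristic, not a real tokenizer count.

```python
CONTEXT_LENGTH = 4096  # context length stated on this page


def build_translation_prompt(text: str, src_lang: str, tgt_lang: str) -> str:
    """Build an ALMA-style translation prompt.

    Assumption: the template wording is illustrative; check the model's
    own documentation for the exact format it was trained on.
    """
    return (
        f"Translate this from {src_lang} to {tgt_lang}:\n"
        f"{src_lang}: {text.strip()}\n"
        f"{tgt_lang}:"
    )


def fits_context(prompt: str, max_new_tokens: int,
                 chars_per_token: float = 3.0) -> bool:
    """Roughly check that prompt plus output fits in the context window.

    Assumption: tokens are estimated from character count. A real
    tokenizer will give different numbers, especially for non-Latin scripts.
    """
    estimated_prompt_tokens = int(len(prompt) / chars_per_token) + 1
    return estimated_prompt_tokens + max_new_tokens <= CONTEXT_LENGTH


if __name__ == "__main__":
    prompt = build_translation_prompt("Guten Morgen!", "German", "English")
    print(prompt)
    print(fits_context(prompt, max_new_tokens=256))
```

Long documents would need to be split into chunks so that each prompt and its translation stay within the window. For an exact budget, replace the character estimate with a count from the model's tokenizer.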