chujiezheng/LLaMA3-iterative-DPO-final-ExPO
Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 8k · Published: May 18, 2024 · License: llama3 · Architecture: Transformer
chujiezheng/LLaMA3-iterative-DPO-final-ExPO is an 8-billion-parameter LLaMA3-based language model with an 8192-token context length. Published by chujiezheng, it is produced with the model extrapolation (ExPO) technique with alpha = 0.3, extrapolating the weights of RLHFlow's LLaMA3-iterative-DPO-final away from its LLaMA3-SFT checkpoint. The goal is stronger alignment with human preferences: the model reports higher win rates on AlpacaEval 2.0 and higher MT-Bench scores than its base models and comparable LLMs.
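ExPO requires no additional training: it takes the two existing checkpoints and moves each weight tensor past the DPO-aligned model along the SFT-to-DPO direction. A minimal sketch of that arithmetic on toy tensors (the helper name and dict-of-arrays layout are illustrative, not the actual checkpoint format):

```python
import numpy as np

def expo_extrapolate(theta_sft, theta_dpo, alpha=0.3):
    """ExPO weight extrapolation: theta = theta_dpo + alpha * (theta_dpo - theta_sft),
    i.e. continue past the aligned model along the SFT -> DPO update direction."""
    return {name: w_dpo + alpha * (w_dpo - theta_sft[name])
            for name, w_dpo in theta_dpo.items()}

# Toy one-tensor "models" to show the arithmetic with alpha = 0.3.
sft = {"w": np.array([1.0, 2.0])}
dpo = {"w": np.array([2.0, 4.0])}
merged = expo_extrapolate(sft, dpo, alpha=0.3)
print(merged["w"])  # delta is [1, 2], so the result is [2.3, 4.6]
```

With alpha = 0 this reduces to the DPO model itself; larger alpha pushes further along the alignment direction, which is why the paper treats it as a tunable strength knob.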
Popular Sampler Settings
The top 3 parameter combinations used by Featherless users for this model cover the following sampler settings:
- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
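These parameters shape the next-token distribution: the penalties (frequency, presence, repetition) adjust the logits of tokens already generated, while temperature, top_k, top_p, and min_p filter and reshape the resulting probabilities. As an illustrative sketch of the filtering stages (not Featherless's actual implementation; the function name is hypothetical):

```python
import numpy as np

def filter_logits(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Apply temperature, top-k, nucleus (top-p), and min-p filtering to a
    logit vector; return the renormalized sampling distribution."""
    logits = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    keep = np.ones_like(probs, dtype=bool)
    if top_k > 0:
        # Keep only the k highest-probability tokens.
        kth = np.sort(probs)[-top_k]
        keep &= probs >= kth
    if top_p < 1.0:
        # Keep the smallest set of tokens whose cumulative mass reaches top_p.
        order = np.argsort(probs)[::-1]
        csum = np.cumsum(probs[order])
        cutoff = np.searchsorted(csum, top_p) + 1
        mask = np.zeros_like(keep)
        mask[order[:cutoff]] = True
        keep &= mask
    if min_p > 0.0:
        # Drop tokens below min_p times the top token's probability.
        keep &= probs >= min_p * probs.max()
    probs = np.where(keep, probs, 0.0)
    return probs / probs.sum()

# With top_p = 0.9, the lowest-probability token falls outside the nucleus.
dist = filter_logits([2.0, 1.0, 0.0, -1.0], top_p=0.9)
print(dist)
```

Lower temperature sharpens the distribution before filtering; top_k and top_p prune the tail by count and by cumulative mass respectively, while min_p prunes relative to the most likely token.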