chujiezheng/Starling-LM-7B-beta-ExPO is a 7-billion-parameter language model obtained by applying the ExPO (model extrapolation) method to Nexusflow/Starling-LM-7B-beta and openchat/openchat-3.5-0106. Rather than further training, ExPO extrapolates in weight space from the SFT checkpoint through the DPO/RLHF-aligned checkpoint, yielding a model more strongly aligned with human preferences. The resulting model reports higher win rates on the AlpacaEval 2.0 benchmark and higher MT-Bench scores than its base models and other 7B-class LLMs.
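The weight-space extrapolation behind ExPO can be sketched as follows. This is a minimal illustration, not the authors' release script: the toy checkpoint dictionaries and the `alpha=0.5` value are assumptions for demonstration, and real checkpoints would be full model state dicts loaded from disk.

```python
import numpy as np

def expo_extrapolate(sft_weights, rlhf_weights, alpha):
    """Extrapolate past the aligned checkpoint along the SFT -> RLHF direction:
    theta_expo = theta_rlhf + alpha * (theta_rlhf - theta_sft)."""
    return {
        name: rlhf_weights[name] + alpha * (rlhf_weights[name] - sft_weights[name])
        for name in rlhf_weights
    }

# Toy two-tensor "checkpoints" standing in for real SFT and DPO/RLHF weights.
sft = {"layer.w": np.array([1.0, 2.0]), "layer.b": np.array([0.5])}
rlhf = {"layer.w": np.array([1.2, 2.4]), "layer.b": np.array([0.6])}

# alpha > 0 moves beyond the aligned model; alpha is a hypothetical value here.
expo = expo_extrapolate(sft, rlhf, alpha=0.5)
# expo["layer.w"] -> [1.3, 2.6]; expo["layer.b"] -> [0.65]
```

Because the method is pure weight arithmetic, it needs no gradient updates or preference data at extrapolation time; the only hyperparameter is the extrapolation coefficient.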