haoranxu/Llama-3-Instruct-8B-CPO-SimPO
Text Generation · Open Weights
Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Concurrency Cost: 1
Published: Jun 19, 2024 · License: MIT · Architecture: Transformer
The haoranxu/Llama-3-Instruct-8B-CPO-SimPO model is an 8-billion-parameter language model based on Llama-3-Instruct. It is fine-tuned with a joint preference-optimization objective that combines CPO (Contrastive Preference Optimization) and SimPO (Simple Preference Optimization), aiming to draw on the strengths of both methods for improved alignment and instruction-following performance.
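To make the joint objective concrete, here is a minimal PyTorch sketch of one plausible CPO+SimPO combination: a SimPO-style, length-normalized, reference-free margin loss plus a CPO-style NLL term on the chosen response. The function name, the hyperparameters `beta`, `gamma`, and `nll_weight`, and the exact way the two terms are weighted are illustrative assumptions, not the authors' published training code.

```python
import torch
import torch.nn.functional as F

def cpo_simpo_loss(policy_chosen_logps, policy_rejected_logps,
                   chosen_lengths, rejected_lengths,
                   beta=2.0, gamma=0.5, nll_weight=1.0):
    """Sketch of a joint CPO + SimPO preference loss (assumed formulation).

    policy_chosen_logps / policy_rejected_logps: summed token log-probs of the
    chosen / rejected responses under the current policy (no reference model).
    chosen_lengths / rejected_lengths: response lengths in tokens.
    """
    # SimPO-style implicit reward: length-normalized log-probability,
    # compared with a target reward margin gamma.
    chosen_reward = beta * policy_chosen_logps / chosen_lengths
    rejected_reward = beta * policy_rejected_logps / rejected_lengths
    preference_loss = -F.logsigmoid(chosen_reward - rejected_reward - gamma)

    # CPO-style NLL (behavior-cloning) term on the chosen response,
    # normalized by its length so it scales comparably to the margin term.
    nll_loss = -(policy_chosen_logps / chosen_lengths)

    return (preference_loss + nll_weight * nll_loss).mean()
```

In practice the log-probabilities would come from a forward pass of the Llama-3-Instruct policy over paired preference data; the sketch only shows how the two loss terms could be combined per batch.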