princeton-nlp/Mistral-7B-Base-SFT-CPO
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4K · Published: Jul 6, 2024 · Architecture: Transformer

princeton-nlp/Mistral-7B-Base-SFT-CPO is a 7-billion-parameter language model built on the Mistral architecture and fine-tuned with Contrastive Preference Optimization (CPO). It was released by princeton-nlp alongside the preprint "SimPO: Simple Preference Optimization with a Reference-Free Reward," where CPO serves as one of the compared preference-optimization methods. Like SimPO, CPO optimizes preferences without a separate frozen reference model, scoring candidate responses directly by the policy's own log-likelihoods. Its primary use case is text generation and the study of preference-optimization techniques for improving alignment.
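The following is a minimal sketch of the reference-free preference loss that CPO-style methods build on, assuming per-sequence log-probabilities have already been computed. The function name and `beta` value are illustrative, and the full objective in the literature adds further terms (CPO, for example, includes an NLL regularizer on the chosen responses); see the preprint for the exact formulation.

```python
import torch
import torch.nn.functional as F

def reference_free_preference_loss(
    chosen_logps: torch.Tensor,    # summed log-probs of preferred responses, shape (batch,)
    rejected_logps: torch.Tensor,  # summed log-probs of dispreferred responses, shape (batch,)
    beta: float = 0.1,             # illustrative scaling; not taken from the model card
) -> torch.Tensor:
    # Reward each response by the policy's own log-likelihood, scaled by beta.
    # Unlike DPO, there is no frozen reference model in the margin.
    margin = beta * (chosen_logps - rejected_logps)
    # Bradley-Terry style objective: push the margin to favor the chosen response.
    return -F.logsigmoid(margin).mean()
```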

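For local experimentation, here is a minimal sketch of loading the checkpoint with Hugging Face transformers. The bfloat16 dtype, prompt, and generation settings below are illustrative assumptions rather than settings from the model card (the hosted version above is quantized to FP8).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "princeton-nlp/Mistral-7B-Base-SFT-CPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype for local use
    device_map="auto",           # requires the accelerate package
)

prompt = "Explain preference optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```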