princeton-nlp/Mistral-7B-Base-SFT-CPO

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jul 6, 2024 · Architecture: Transformer · Concurrency cost: 1

The princeton-nlp/Mistral-7B-Base-SFT-CPO is a 7 billion parameter language model based on the Mistral architecture. As its name indicates, it was fine-tuned with Contrastive Preference Optimization (CPO) on top of a supervised fine-tuned (SFT) Mistral-7B base. Developed by princeton-nlp, it was released alongside the preprint "SimPO: Simple Preference Optimization with a Reference-Free Reward", where CPO is one of the preference optimization methods evaluated. Its primary use case is research on preference optimization for aligning language models.


Overview

princeton-nlp/Mistral-7B-Base-SFT-CPO is a 7 billion parameter language model developed by princeton-nlp and released as part of the artifacts accompanying the preprint "SimPO: Simple Preference Optimization with a Reference-Free Reward". This particular checkpoint was fine-tuned using CPO (Contrastive Preference Optimization), one of the preference optimization methods compared in that work. Like SimPO, CPO performs preference optimization without a separate reference reward model, which simplifies the alignment process.

Key Capabilities

  • Preference Optimization: Fine-tuned with Contrastive Preference Optimization (CPO) on an SFT checkpoint.
  • Reference-Free Reward: Achieves alignment without an explicit reference reward model.
  • Mistral-7B Base: Built on the Mistral-7B architecture, providing a strong foundation for language understanding and generation.
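To make the preference-optimization bullet concrete, here is a minimal sketch of a CPO-style loss on a single preference pair, as described in the CPO paper (Xu et al., 2024): a reference-free sigmoid preference term plus a negative log-likelihood regularizer on the chosen response. The scalar inputs, the `beta`/`nll_weight` values, and the exact weighting are illustrative assumptions, not the released training configuration.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def cpo_loss(logp_chosen: float, logp_rejected: float,
             beta: float = 0.1, nll_weight: float = 1.0) -> float:
    """Sketch of a CPO-style loss for one preference pair.

    logp_chosen / logp_rejected: summed log-probabilities the policy assigns
    to the preferred and dispreferred responses. No reference model appears
    anywhere in the computation -- that is the "reference-free" property.
    """
    # Preference term: push the chosen response's log-prob above the rejected one's.
    preference = -math.log(sigmoid(beta * (logp_chosen - logp_rejected)))
    # Behavior-cloning regularizer: plain NLL on the chosen response.
    nll = -logp_chosen
    return preference + nll_weight * nll
```

A larger margin between the chosen and rejected log-probabilities shrinks the preference term; the NLL term keeps the policy anchored to the preferred data.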

Good For

  • Researchers and developers interested in novel preference optimization techniques.
  • Experimenting with reference-free alignment methods for large language models.
  • Applications that need a 7B instruction-following model whose alignment comes from preference optimization rather than a full RLHF pipeline.
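For the use cases above, the checkpoint can be loaded with the Hugging Face `transformers` library. The sketch below is a minimal, hedged example: the message helper assumes a single-turn chat with no system prompt, and whether the tokenizer ships a chat template should be verified before relying on `apply_chat_template`. The demo is gated behind a flag because it downloads a 7B model.

```python
def build_messages(instruction: str) -> list[dict]:
    # Minimal single-turn chat; system prompt deliberately omitted (assumption).
    return [{"role": "user", "content": instruction}]

RUN_DEMO = False  # set True to actually download and run the 7B model

if RUN_DEMO:
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "princeton-nlp/Mistral-7B-Base-SFT-CPO"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Render the chat as a prompt string; assumes the tokenizer defines a chat template.
    prompt = tok.apply_chat_template(
        build_messages("Summarize CPO in one sentence."),
        tokenize=False, add_generation_prompt=True,
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    # Decode only the newly generated tokens.
    print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```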

For more in-depth technical details, refer to the SimPO research preprint and the associated GitHub repository.