princeton-nlp/Mistral-7B-Base-SFT-SLiC-HF
The princeton-nlp/Mistral-7B-Base-SFT-SLiC-HF model is a 7-billion-parameter language model based on the Mistral architecture, fine-tuned with the SLiC-HF (Sequence Likelihood Calibration with Human Feedback) method. Developed by Princeton NLP, it comes out of the group's research on SimPO (Simple Preference Optimization with a Reference-Free Reward). The model is intended for general language generation tasks and supports a 4096-token context window.
Model Overview
This model, princeton-nlp/Mistral-7B-Base-SFT-SLiC-HF, is a 7 billion parameter language model built upon the Mistral architecture. It was developed by Princeton NLP as part of their research into preference optimization techniques.
Key Characteristics
- Architecture: Based on the Mistral-7B-Base model.
- Fine-tuning Method: Uses SLiC-HF (Sequence Likelihood Calibration with Human Feedback), as detailed in the associated research paper.
- Research Origin: This model was released alongside the preprint SimPO: Simple Preference Optimization with a Reference-Free Reward, which proposes a simple, reference-free reward formulation for preference optimization; SLiC-HF is one of the baseline methods evaluated in that work.
- Context Length: Supports a context window of 4096 tokens.
Intended Use Cases
This model is suitable for a variety of general natural language processing tasks, particularly where SLiC-HF fine-tuning for improved preference alignment is beneficial. It is also a useful starting point for developers who want to experiment with checkpoints produced by preference optimization research.
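As a starting point, the checkpoint can be loaded like any causal language model with Hugging Face `transformers`. The sketch below uses the model ID from this card; the dtype, device placement, and generation settings are illustrative assumptions, not official recommendations.

```python
# Minimal usage sketch (assumes `transformers` and `torch` are installed,
# and enough GPU/CPU memory for a 7B model).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "princeton-nlp/Mistral-7B-Base-SFT-SLiC-HF"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for `prompt` with the SLiC-HF checkpoint."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: hardware with bf16 support
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Prompt plus new tokens should stay within the 4096-token context window.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Explain preference optimization in one paragraph."))
```

Since this is a fine-tuned base model rather than a chat model, plain-text prompting (as above) is the safe default; no particular chat template is assumed here.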