Model Overview
statking/zephyr-7b-sft-full-orpo is a 7-billion-parameter language model based on Mistral-7B-v0.1. It was fine-tuned with ORPO (Odds Ratio Preference Optimization), a reference-model-free method that folds supervised fine-tuning and preference alignment into a single training stage, on the HuggingFaceH4/ultrafeedback_binarized dataset, which pairs prompts with chosen and rejected responses to align model outputs with human preferences.
Key Characteristics
- Base Model: Mistral-7B-v0.1
- Fine-tuning Method: ORPO, a monolithic preference-optimization method that requires no separate reward or reference model.
- Training Data: HuggingFaceH4/ultrafeedback_binarized, a preference dataset of prompts paired with chosen and rejected responses.
- Performance Metrics: On the evaluation set, the model reached a rewards accuracy of 0.6587 (it assigned higher reward to the chosen response about 66% of the time), with a mean chosen log-probability of -0.7282 versus -0.9978 for rejected responses, confirming it favors chosen over rejected completions.
- Context Length: Supports an 8192-token context window.
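To make the reported metrics concrete, the sketch below computes the odds-ratio term of the ORPO loss from the chosen and rejected log-probabilities listed above. This is only the relative-ratio component (the full ORPO objective adds a standard supervised fine-tuning loss), and it assumes the reported values can be treated as sequence-level average log-probabilities:

```python
import math

def odds(logp: float) -> float:
    """Odds p / (1 - p) of a response, given its log-probability."""
    p = math.exp(logp)
    return p / (1.0 - p)

def orpo_odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """Odds-ratio term of the ORPO loss:
    -log sigmoid(log(odds(chosen) / odds(rejected)))."""
    log_odds_ratio = math.log(odds(logp_chosen) / odds(logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))

# Evaluation log-probabilities reported for this model
loss = orpo_odds_ratio_loss(-0.7282, -0.9978)
```

Because the chosen log-probability exceeds the rejected one, the log odds ratio is positive and this term falls below log 2 (its value at indifference), which is what the training objective pushes toward.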
Intended Use Cases
This model is particularly well-suited for applications where preference alignment is critical, that is, where the goal is to produce the response a human would favor over alternatives. Its training on a binarized feedback dataset suggests strengths in:
- Instruction Following: Generating responses that adhere to user instructions and preferences.
- Dialogue Systems: Producing more helpful or preferred conversational turns.
- Content Generation: Creating outputs better aligned with criteria derived from human feedback.
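For the use cases above, a chat-style inference sketch with the `transformers` library is shown below. It assumes the model's tokenizer ships a chat template (typical for Zephyr-style models); the sampling parameters are illustrative, not recommendations from the model authors:

```python
MODEL_ID = "statking/zephyr-7b-sft-full-orpo"

def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the message format expected by
    tokenizer.apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    # Imported lazily so the lightweight helper above stays usable
    # without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(
        inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.7
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

# Example (downloads ~14 GB of weights on first run):
# print(generate("Explain ORPO in one paragraph."))
```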