eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO
The eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO model is a 7-billion-parameter language model fine-tuned with Direct Preference Optimization (DPO) on the OpenHermesPreferences dataset. Developed by eren23, it performs strongly across standard benchmarks, achieving an average score of 76.45 on the Open LLM Leaderboard, and is suited to general-purpose language generation and reasoning tasks.
Model Overview
eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO is a 7-billion-parameter language model developed by eren23. It is a DPO (Direct Preference Optimization) fine-tune of an existing model merge, trained on the OpenHermesPreferences dataset. Although it began as an experimental project, it has shown promising results in evaluations.
Key Capabilities & Performance
Despite its experimental origins, the model performs well on standard benchmarks, achieving an average score of 76.45 on the Open LLM Leaderboard. Specific benchmark results include:
- AI2 Reasoning Challenge (25-shot): 73.12
- HellaSwag (10-shot): 89.09
- MMLU (5-shot): 64.80
- TruthfulQA (0-shot): 77.45
- Winogrande (5-shot): 84.77
- GSM8k (5-shot): 69.45
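The leaderboard average is the unweighted mean of the six benchmark scores above, which is easy to verify:

```python
# Verify that the Open LLM Leaderboard average matches the per-benchmark scores.
scores = {
    "ARC (25-shot)": 73.12,
    "HellaSwag (10-shot)": 89.09,
    "MMLU (5-shot)": 64.80,
    "TruthfulQA (0-shot)": 77.45,
    "Winogrande (5-shot)": 84.77,
    "GSM8k (5-shot)": 69.45,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 76.45
```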
Usage Notes
A GGUF quantized version of this model is also available. Although early training signals suggested possible performance issues, subsequent benchmark evaluations demonstrated its effectiveness. Because the model originated from an experimental fine-tuning process, users should still validate its outputs for their own use cases.
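A minimal usage sketch with Hugging Face transformers is shown below. The ChatML prompt format is an assumption: models fine-tuned on OpenHermes-style data typically use it, but check the model's tokenizer config before relying on it. The `generate` helper requires downloading the 7B weights (roughly 16 GB of GPU memory for fp16 inference), so model loading is kept inside the function.

```python
# Sketch: prompting the model via transformers. The ChatML template below is
# an assumption based on OpenHermes conventions, not confirmed by the model card.
from typing import Dict, List

MODEL_ID = "eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO"


def build_chatml_prompt(messages: List[Dict[str, str]]) -> str:
    """Render a list of {role, content} messages in ChatML format,
    ending with an open assistant turn for the model to complete."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model lazily and return the generated continuation.
    Requires `transformers` and `torch`, plus enough GPU memory for a 7B model."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Example: `generate(build_chatml_prompt([{"role": "user", "content": "Explain DPO in one sentence."}]))`.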