eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Feb 27, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer

The eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO model is a 7-billion-parameter language model fine-tuned on the OpenHermesPreferences dataset. Developed by eren23, it performs strongly across standard benchmarks, achieving an average score of 76.45 on the Open LLM Leaderboard. Its DPO fine-tuning makes it well suited to general-purpose language generation and reasoning tasks.


Model Overview

eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO is a 7-billion-parameter language model developed by eren23. It is a fine-tuned version of an existing model merge, trained on the OpenHermesPreferences dataset with DPO (Direct Preference Optimization). Initially an experimental project, it has shown promising results in evaluations.
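
Since the weights are published as a standard Hugging Face causal language model, the model can be loaded with the transformers library. The snippet below is a minimal sketch; the prompt and generation settings are illustrative assumptions rather than values published with the model.

    # Minimal sketch: load the model with Hugging Face transformers.
    # The prompt and generation settings are illustrative assumptions,
    # not values published with this model. device_map="auto" requires
    # the accelerate package.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision fits a 7B model on one 24 GB GPU
        device_map="auto",
    )

    prompt = "Explain Direct Preference Optimization in one paragraph."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))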

Key Capabilities & Performance

Despite its experimental origins, the model performs well on standard benchmarks, achieving an average score of 76.45 on the Open LLM Leaderboard. Specific benchmark results include:

  • AI2 Reasoning Challenge (25-shot): 73.12
  • HellaSwag (10-shot): 89.09
  • MMLU (5-shot): 64.80
  • TruthfulQA (0-shot): 77.45
  • Winogrande (5-shot): 84.77
  • GSM8k (5-shot): 69.45
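
For reference, the reported leaderboard average is simply the arithmetic mean of these six scores:

    # The Open LLM Leaderboard average is the mean of the six benchmark scores.
    scores = [73.12, 89.09, 64.80, 77.45, 84.77, 69.45]
    print(round(sum(scores) / len(scores), 2))  # 76.45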

Usage Notes

A GGUF version of the model is also available. While initial training suggested potential performance issues, subsequent benchmark evaluations demonstrated strong results. Because the model originated from an experimental fine-tuning process, users should still exercise caution and test it against their own use case.
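
For CPU or low-VRAM inference, the GGUF build can be run with llama.cpp bindings. The sketch below uses llama-cpp-python; the repository ID and file name of the GGUF artifact are hypothetical placeholders, so substitute the actual ones linked from the model card.

    # Minimal sketch: run a GGUF build with llama-cpp-python.
    # NOTE: repo_id and filename are hypothetical placeholders; use the
    # actual GGUF repository and file linked from the model card.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-GGUF",  # hypothetical
        filename="*Q4_K_M.gguf",  # hypothetical quantization choice
        n_ctx=8192,               # matches the model's 8k context length
    )

    out = llm("Explain DPO fine-tuning in two sentences.", max_tokens=128)
    print(out["choices"][0]["text"])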

Popular Sampler Settings

The most popular sampler configurations among Featherless users for this model tune the following parameters (an example request applying them is sketched after the list):

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
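
These parameters map directly onto a generation request. The sketch below assumes Featherless's OpenAI-compatible endpoint; the base URL and all sampler values are illustrative assumptions, and the parameters outside the OpenAI spec (top_k, repetition_penalty, min_p) are passed via extra_body.

    # Minimal sketch: apply a sampler configuration through an
    # OpenAI-compatible API. The base URL and all values below are
    # illustrative assumptions, not a published Featherless config.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.featherless.ai/v1",  # assumed endpoint
        api_key="YOUR_API_KEY",
    )

    response = client.chat.completions.create(
        model="eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO",
        messages=[{"role": "user", "content": "Write a haiku about model merging."}],
        temperature=0.8,          # illustrative sampler values
        top_p=0.95,
        frequency_penalty=0.0,
        presence_penalty=0.0,
        extra_body={              # sampler knobs outside the OpenAI spec
            "top_k": 40,
            "repetition_penalty": 1.1,
            "min_p": 0.05,
        },
    )
    print(response.choices[0].message.content)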