eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Mar 2, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights · Cold

eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3 is a 7-billion-parameter language model fine-tuned with DPO on the argilla/dpo-mix-7k dataset. It iterates on ogno-monarch-jaskier-merge-7b-OH-PREF-DPO, further improving its conversational and instruction-following ability. With an 8192-token context length, it is suited to general-purpose text generation and understanding tasks, and it achieves an average score of 76.40 on the Open LLM Leaderboard.


Model Overview

The eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3 is a 7 billion parameter language model, representing a further DPO fine-tuned version of the ogno-monarch-jaskier-merge-7b-OH-PREF-DPO model. This iteration leverages the argilla/dpo-mix-7k dataset for its fine-tuning process, aiming to enhance its instruction-following and conversational abilities.
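DPO optimizes the policy directly on preference pairs, with no separate reward model: it pushes the policy to prefer the chosen response over the rejected one more strongly than a frozen reference model does. A minimal sketch of the per-pair DPO loss, assuming per-sequence log-probabilities have already been computed (the function name, example values, and the beta setting are illustrative, not taken from this model's actual training configuration):

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    # Numerically stable -log(sigmoid(x)) == log(1 + exp(-x))
    return math.log1p(math.exp(-logits))

# When the policy prefers the chosen response more strongly than the
# reference does, the margin is positive and the loss falls below log(2).
loss = dpo_loss(-10.0, -14.0, -11.0, -12.0)
```

When the two margins are equal the loss sits exactly at log(2), so training pressure comes entirely from widening the policy's preference gap relative to the reference.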

Key Capabilities

  • Instruction Following: Improved response generation based on direct instructions due to DPO fine-tuning.
  • General Text Generation: Capable of various text generation tasks, including creative writing, summarization, and question answering.
  • Reasoning: Achieves a score of 73.04 on the AI2 Reasoning Challenge (25-shot) and 69.22 on GSM8k (5-shot).
  • Context Handling: Supports a context length of 8192 tokens, allowing for processing longer inputs.

Performance Highlights

Evaluated on the Open LLM Leaderboard, the model demonstrates competitive performance:

  • Average Score: 76.40
  • HellaSwag (10-shot): 89.11
  • MMLU (5-shot): 64.79
  • TruthfulQA (0-shot): 77.48
  • Winogrande (5-shot): 84.77
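The leaderboard average is the unweighted mean of six benchmark scores: the four above plus the ARC and GSM8k results listed under Key Capabilities. A quick sanity check in Python:

```python
# Open LLM Leaderboard scores for ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3
scores = {
    "ARC (25-shot)": 73.04,
    "HellaSwag (10-shot)": 89.11,
    "MMLU (5-shot)": 64.79,
    "TruthfulQA (0-shot)": 77.48,
    "Winogrande (5-shot)": 84.77,
    "GSM8k (5-shot)": 69.22,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # → 76.4
```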

Usage Considerations

As noted in the original model repository, this model is not yet fully tested and should be used with caution, particularly for out-of-the-box applications. Users are advised to conduct thorough testing for their specific use cases.

Popular Sampler Settings

The parameter combinations most commonly used by Featherless users for this model involve the following samplers: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
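These samplers map directly onto the request body of an OpenAI-compatible completions call. A minimal sketch of assembling such a payload, where every numeric value is an illustrative placeholder rather than a recommended setting, and where top_k, repetition_penalty, and min_p are extension fields that not every OpenAI-compatible endpoint accepts:

```python
import json

model_id = "eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3"

# All sampler values below are placeholders for illustration, not recommendations.
payload = {
    "model": model_id,
    "prompt": "Explain Direct Preference Optimization in one paragraph.",
    "max_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,                  # extension field; not in the core OpenAI spec
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,    # extension field
    "min_p": 0.05,                # extension field
}

body = json.dumps(payload)        # ready to POST to a /v1/completions endpoint
```

Because unsupported extension fields are often rejected rather than ignored, it is worth checking the target endpoint's documentation before including them.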