eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3
The eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3 is a 7-billion-parameter language model fine-tuned with DPO on the argilla/dpo-mix-7k dataset. It iterates on ogno-monarch-jaskier-merge-7b-OH-PREF-DPO, further strengthening its conversational and instruction-following capabilities. With an 8192-token context length, it suits general-purpose text generation and understanding tasks, and it scores an average of 76.40 on the Open LLM Leaderboard.
Model Overview
eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3 is a 7-billion-parameter language model, a further DPO fine-tune of the ogno-monarch-jaskier-merge-7b-OH-PREF-DPO model. This iteration uses the argilla/dpo-mix-7k dataset for fine-tuning, aiming to strengthen its instruction-following and conversational abilities.
Key Capabilities
- Instruction Following: Improved response generation based on direct instructions due to DPO fine-tuning.
- General Text Generation: Capable of various text generation tasks, including creative writing, summarization, and question answering.
- Reasoning: Achieves a 73.04 score on the AI2 Reasoning Challenge (25-Shot) and 69.22 on GSM8k (5-shot).
- Context Handling: Supports a context length of 8192 tokens, allowing it to process longer inputs.
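The 8192-token window sets a hard budget shared between the prompt and the generated text: whatever the input consumes, the remainder bounds how many new tokens can be produced. A minimal sketch of that arithmetic (the helper is illustrative, not part of any released tooling):

```python
CONTEXT_LENGTH = 8192  # model's maximum context window, per this card


def remaining_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens can still be generated after the prompt."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt already fills or exceeds the context window")
    return context_length - prompt_tokens


# A 7000-token prompt leaves room for at most 1192 generated tokens.
print(remaining_budget(7000))
```

In practice, pass this remainder (or less) as `max_new_tokens` to your generation call so long prompts do not silently truncate the output.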
Performance Highlights
Evaluated on the Open LLM Leaderboard, the model demonstrates competitive performance:
- Average Score: 76.40
- HellaSwag (10-Shot): 89.11
- MMLU (5-Shot): 64.79
- TruthfulQA (0-shot): 77.48
- Winogrande (5-shot): 84.77
Usage Considerations
As noted in the original model repository, this model has not been fully tested and should be used with caution, particularly out of the box. Users should test it thoroughly for their specific use cases.
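For such testing, the model can be loaded through the Hugging Face `transformers` library. The sketch below assumes the model inherits the Mistral-style `[INST] … [/INST]` prompt template from its parent merges; confirm the actual template against the tokenizer's chat template before relying on it:

```python
MODEL_ID = "eren23/ogno-monarch-jaskier-merge-7b-OH-PREF-DPO-v3"


def format_prompt(user_message: str) -> str:
    """Wrap a user message in a Mistral-style instruction template.

    Assumption: the merged model uses this template; verify with
    tokenizer.apply_chat_template before production use.
    """
    return f"<s>[INST] {user_message} [/INST]"


def generate(user_message: str, max_new_tokens: int = 256) -> str:
    """Generate a completion. transformers/torch are imported lazily so the
    template helper above remains usable without them installed."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(format_prompt(user_message), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and return only the newly generated text.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Running `generate(...)` downloads roughly 14 GB of fp16 weights, so a GPU with sufficient memory (or an offloading setup via `device_map`) is advisable.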