Locutusque/Hyperion-3.0-Mistral-7B-DPO
Locutusque/Hyperion-3.0-Mistral-7B-DPO is a 7 billion parameter language model based on Mistral-7B-v0.1, fine-tuned using Direct Preference Optimization (DPO) on 20,000 GPT-4 generated preference pairs. It is designed for superior performance across complex tasks including question answering, conversational AI, code generation, medical text comprehension, mathematical reasoning, and logical reasoning. The model offers an 8192 token context length and achieves an MMLU score of 0.5833, demonstrating broad multi-domain proficiency.
Hyperion-3.0-Mistral-7B-DPO Overview
Locutusque/Hyperion-3.0-Mistral-7B-DPO is a 7 billion parameter language model built upon the mistralai/Mistral-7B-v0.1 base. It has been fine-tuned using Direct Preference Optimization (DPO) on a curated dataset of 20,000 high-quality preference pairs, of which 4,000 examples were used for the DPO fine-tuning stage. These preference pairs were generated by GPT-4, ensuring high relevance and quality across diverse domains.
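For context, a DPO preference pair couples one prompt with a preferred and a dispreferred response. The sketch below shows a typical record; the field names ("prompt", "chosen", "rejected") follow common DPO dataset conventions and are an assumption, not this model card's documented schema.

```python
# Illustrative DPO preference-pair record. Field names are an assumed
# convention, not the exact schema used to train this model.
preference_pair = {
    "prompt": "Explain why the sky appears blue.",
    "chosen": (
        "The sky appears blue because of Rayleigh scattering: shorter "
        "(blue) wavelengths of sunlight are scattered more strongly by "
        "air molecules than longer (red) wavelengths."
    ),
    "rejected": "The sky is blue because the ocean reflects onto it.",
}

# During DPO training, the model is pushed to assign higher likelihood
# to the "chosen" response than to the "rejected" one for the prompt.
assert set(preference_pair) == {"prompt", "chosen", "rejected"}
```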
Key Capabilities
- Multi-domain Proficiency: Excels in question answering, conversational AI, and code generation.
- Specialized Reasoning: Demonstrates strong capabilities in medical text comprehension, mathematical reasoning, and logical reasoning.
- DPO Fine-tuning: Leverages Direct Preference Optimization to align model outputs with human preferences, enhancing overall performance and reliability.
- Broad Application: Suitable for intelligent tutoring systems, advanced chatbots, code analysis tools, and medical information retrieval.
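The DPO fine-tuning mentioned above optimizes a simple contrastive objective over chosen/rejected pairs rather than training a separate reward model. A minimal sketch of the per-example loss, assuming the standard DPO formulation (the beta value and log-probabilities here are illustrative, not settings documented for this model):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-example DPO loss from sequence log-probabilities.

    pi_* are log-probs under the policy being trained; ref_* are
    log-probs under the frozen reference (base) model. beta scales the
    implicit reward; 0.1 is a common default, not a documented setting
    for this model.
    """
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # -log(sigmoid(x)) written stably as softplus(-x) = log(1 + e^-x)
    return math.log1p(math.exp(-beta * margin))

# Here the chosen response has gained probability relative to the
# reference and the rejected one has lost it, so the loss falls below
# the zero-margin value log(2) ~ 0.693.
loss = dpo_loss(pi_chosen=-5.0, pi_rejected=-7.0,
                ref_chosen=-6.0, ref_rejected=-6.0)
print(round(loss, 4))  # → 0.5981
```

Minimizing this loss widens the log-probability margin between preferred and dispreferred responses while the reference-model terms keep the policy from drifting far from the base model.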
Evaluation Highlights
The model achieved an overall MMLU (Massive Multitask Language Understanding) score of 0.5833 under a FLAN CoT (chain-of-thought) 5-shot evaluation. Notably, it scored 0.7003 in the social sciences category and 0.6833 in the "other" category, indicating robust performance across various academic and professional subjects. While generally compliant, users should be aware of potential biases and consider further fine-tuning for enterprise-specific requirements.
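In a 5-shot setup like the one behind the MMLU score above, each query is preceded by five solved exemplars. The sketch below assembles such a prompt; the template and exemplars are invented for illustration and do not reproduce the actual FLAN CoT harness used for the reported 0.5833 score.

```python
# Hypothetical 5-shot MMLU-style prompt builder; template and exemplars
# are illustrative, not the evaluation harness actually used.
def build_few_shot_prompt(exemplars, question, choices):
    parts = []
    for ex_q, ex_choices, ex_answer in exemplars:
        lettered = "\n".join(f"{chr(65 + i)}. {c}"
                             for i, c in enumerate(ex_choices))
        parts.append(f"Question: {ex_q}\n{lettered}\nAnswer: {ex_answer}")
    lettered = "\n".join(f"{chr(65 + i)}. {c}"
                         for i, c in enumerate(choices))
    # The final question is left unanswered for the model to complete.
    parts.append(f"Question: {question}\n{lettered}\nAnswer:")
    return "\n\n".join(parts)

exemplars = [
    ("What is 2 + 2?", ["3", "4", "5", "6"], "B"),
] * 5  # five shots; repeated here only to keep the sketch short
prompt = build_few_shot_prompt(exemplars, "What is 3 * 3?",
                               ["6", "9", "12", "8"])
assert prompt.count("Answer:") == 6  # 5 exemplars + 1 open query
```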