Locutusque/Hyperion-3.0-Mistral-7B-DPO
Locutusque/Hyperion-3.0-Mistral-7B-DPO is a 7 billion parameter language model based on Mistral-7B-v0.1, fine-tuned using Direct Preference Optimization (DPO) on 20,000 GPT-4 generated preference pairs. It is designed for superior performance across complex tasks including question answering, conversational AI, code generation, medical text comprehension, mathematical reasoning, and logical reasoning. The model offers an 8192 token context length and achieves an MMLU score of 0.5833, demonstrating broad multi-domain proficiency.
Hyperion-3.0-Mistral-7B-DPO Overview
Locutusque/Hyperion-3.0-Mistral-7B-DPO is a 7 billion parameter language model built upon the mistralai/Mistral-7B-v0.1 base. It has been fine-tuned using Direct Preference Optimization (DPO) on a curated dataset of 20,000 high-quality preference pairs, of which 4,000 examples were used for the DPO fine-tuning stage. These preference pairs were generated by GPT-4, ensuring high relevance and quality across diverse domains.
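For context, a DPO preference pair couples one prompt with a preferred and a dispreferred response. The sketch below shows a typical record; the field names ("prompt", "chosen", "rejected") follow common DPO dataset conventions and are an assumption, not this model card's documented schema.

```python
# Illustrative DPO preference-pair record. Field names are an assumed
# convention, not the exact schema used to train this model.
preference_pair = {
    "prompt": "Explain why the sky appears blue.",
    "chosen": (
        "The sky appears blue because of Rayleigh scattering: shorter "
        "(blue) wavelengths of sunlight are scattered more strongly by "
        "air molecules than longer (red) wavelengths."
    ),
    "rejected": "The sky is blue because the ocean reflects onto it.",
}

# During DPO training, the model is pushed to assign higher likelihood
# to the "chosen" response than to the "rejected" one for the prompt.
assert set(preference_pair) == {"prompt", "chosen", "rejected"}
```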
Key Capabilities
- Multi-domain Proficiency: Excels in question answering, conversational AI, and code generation.
- Specialized Reasoning: Demonstrates strong capabilities in medical text comprehension, mathematical reasoning, and logical reasoning.
- DPO Fine-tuning: Leverages Direct Preference Optimization to align model outputs with human preferences, enhancing overall performance and reliability.
- Broad Application: Suitable for intelligent tutoring systems, advanced chatbots, code analysis tools, and medical information retrieval.
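The DPO fine-tuning mentioned above optimizes a simple contrastive objective over chosen/rejected pairs rather than training a separate reward model. A minimal sketch of the per-example loss, assuming the standard DPO formulation (the beta value and log-probabilities here are illustrative, not settings documented for this model):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-example DPO loss from sequence log-probabilities.

    pi_* are log-probs under the policy being trained; ref_* are
    log-probs under the frozen reference (base) model. beta scales the
    implicit reward; 0.1 is a common default, not a documented setting
    for this model.
    """
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # -log(sigmoid(x)) written stably as softplus(-x) = log(1 + e^-x)
    return math.log1p(math.exp(-beta * margin))

# Here the chosen response has gained probability relative to the
# reference and the rejected one has lost it, so the loss falls below
# the zero-margin value log(2) ~ 0.693.
loss = dpo_loss(pi_chosen=-5.0, pi_rejected=-7.0,
                ref_chosen=-6.0, ref_rejected=-6.0)
print(round(loss, 4))  # → 0.5981
```

Minimizing this loss widens the log-probability margin between preferred and dispreferred responses while the reference-model terms keep the policy from drifting far from the base model.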
Evaluation Highlights
The model achieved an overall MMLU (Massive Multitask Language Understanding) score of 0.5833 under a FLAN CoT (chain-of-thought) 5-shot evaluation. Notably, it scored 0.7003 in the social sciences category and 0.6833 in the "other" category, indicating robust performance across various academic and professional subjects. While generally compliant, users should be aware of potential biases and consider further fine-tuning for enterprise-specific requirements.
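In a 5-shot setup like the one behind the MMLU score above, each query is preceded by five solved exemplars. The sketch below assembles such a prompt; the template and exemplars are invented for illustration and do not reproduce the actual FLAN CoT harness used for the reported 0.5833 score.

```python
# Hypothetical 5-shot MMLU-style prompt builder; template and exemplars
# are illustrative, not the evaluation harness actually used.
def build_few_shot_prompt(exemplars, question, choices):
    parts = []
    for ex_q, ex_choices, ex_answer in exemplars:
        lettered = "\n".join(f"{chr(65 + i)}. {c}"
                             for i, c in enumerate(ex_choices))
        parts.append(f"Question: {ex_q}\n{lettered}\nAnswer: {ex_answer}")
    lettered = "\n".join(f"{chr(65 + i)}. {c}"
                         for i, c in enumerate(choices))
    # The final question is left unanswered for the model to complete.
    parts.append(f"Question: {question}\n{lettered}\nAnswer:")
    return "\n\n".join(parts)

exemplars = [
    ("What is 2 + 2?", ["3", "4", "5", "6"], "B"),
] * 5  # five shots; repeated here only to keep the sketch short
prompt = build_few_shot_prompt(exemplars, "What is 3 * 3?",
                               ["6", "9", "12", "8"])
assert prompt.count("Answer:") == 6  # 5 exemplars + 1 open query
```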