amazingvince/openhermes-7b-dpo
amazingvince/openhermes-7b-dpo is an experimental 7-billion-parameter, DPO-tuned, Mistral-based language model with a 4096-token context length. It builds on OpenHermes 2.5, which extended OpenHermes 2 with additional code datasets during fine-tuning. The model is notable for improved performance on non-code benchmarks such as TruthfulQA, AGIEval, and the GPT4All suite, an effect attributed to its code instruction training, making it suitable for a range of general language understanding and generation tasks.
OpenHermes 2.5 Mistral 7B DPO Tune
This model, amazingvince/openhermes-7b-dpo, is an experimental DPO (Direct Preference Optimization) fine-tune based on the Mistral 7B architecture. It builds upon OpenHermes 2.5, itself an evolution of OpenHermes 2 that incorporated additional code datasets during training.
Key Capabilities
- Enhanced General Reasoning: Training with a significant ratio of code instruction data (estimated at 7-14% of the total dataset) has been shown to boost performance on several non-code benchmarks, including TruthfulQA, AGIEval, and the GPT4All suite.
- Code-Awareness: Benefits from the additional code datasets in its training lineage, which contribute to its improved overall abilities.
- DPO Optimization: Utilizes Direct Preference Optimization with various datasets to refine its responses and capabilities.
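The DPO optimization mentioned above replaces a learned reward model with a simple classification-style loss over preference pairs. As a minimal sketch (the beta value and log-probability inputs below are illustrative, not taken from this model's training configuration), the per-pair objective can be written in plain Python:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected response under the policy being trained or under the
    frozen reference model (here, the pre-DPO checkpoint).
    """
    # Implicit reward margins: how far the policy has shifted
    # probability mass relative to the reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # Loss = -log sigmoid(beta * (chosen margin - rejected margin));
    # it falls as the policy favors the chosen response more strongly.
    logits = beta * (chosen_margin - rejected_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference, both margins are zero and the loss is log 2; favoring the chosen response over the rejected one drives the loss below that baseline.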
Good for
- General language understanding and generation tasks where improved reasoning is beneficial.
- Applications requiring a balance of code and non-code related intelligence.
- Experimentation with DPO-tuned models for diverse NLP challenges.
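For experimentation, prompt formatting matters: the OpenHermes 2.5 family uses the ChatML chat format, and it is a reasonable assumption (not confirmed by this card) that the DPO tune inherits it. A minimal prompt builder under that assumption:

```python
def build_chatml_prompt(messages):
    """Format a list of {'role', 'content'} dicts as a ChatML prompt
    (the format used by the OpenHermes 2.5 family; assumed here to
    carry over to this DPO tune) and append the generation cue."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    # Open an assistant turn so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what DPO does."},
])
```

The resulting string can be passed to any standard text-generation pipeline for the model; if the repository ships a tokenizer chat template, prefer that over hand-rolled formatting.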