DPOpenHermes-7B: DPO Fine-tuned Chat Model
DPOpenHermes-7B is a 7-billion-parameter model from openaccess-ai-collective, built on Teknium's OpenHermes-2.5-Mistral-7B. It is further refined with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs and argilla/ultrafeedback-binarized-preferences datasets, trained with QLoRA on a single H100 80GB GPU for approximately 10 hours.
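DPO optimizes the policy directly on preference pairs, without a separate reward model: for each pair it pushes the policy's implicit reward for the chosen response above that of the rejected one, relative to a frozen reference model. A minimal sketch of the per-pair objective (the function name and the beta value are illustrative, not taken from this model's training config):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    """
    # Implicit rewards: how much more the policy likes each response
    # than the reference model does, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Negative log-sigmoid of the reward margin: the loss shrinks as the
    # policy widens the gap in favor of the chosen response.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# With no margin the loss is log(2); a positive margin lowers it.
print(dpo_loss(-10.0, -12.0, -11.0, -11.5))
```

In practice this is averaged over a batch of pairs from the preference datasets listed above, with gradients flowing only through the policy's log-probabilities.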
Key Capabilities
- Enhanced Chat Dialogue: Fine-tuned for multi-turn conversations, supporting structured system prompts.
- ChatML Format: Uses the ChatML prompt format (the multi-turn convention popularized by OpenAI's chat models), enabling explicit system instructions.
- System Prompt Utilization: Designed to strongly engage with system prompts, allowing for more consistent and controlled model behavior across multiple turns.
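In ChatML, each turn is wrapped in `<|im_start|>` / `<|im_end|>` tokens with a role header, and generation is prompted by an open assistant turn. A minimal sketch of prompt assembly (the helper name is hypothetical; real pipelines typically use the tokenizer's built-in chat template):

```python
def build_chatml_prompt(system, turns):
    """Assemble a ChatML prompt from a system message and (role, content) turns.

    The final assistant turn is left open so the model continues from there.
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, content in turns:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    # Open an assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    [("user", "What is Direct Preference Optimization?")],
)
print(prompt)
```

The system block at the top is what lets explicit instructions persist across multiple turns of dialogue.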
Benchmarks
An initial DPO-only version had trouble reliably emitting the end-of-sequence (EOS) token; this release adds a round of Supervised Fine-Tuning (SFT) to fix that, at a slight cost to benchmark scores. Notable average scores include:
- AGIEval: 0.4364
- BigBench Hard: 0.4321
- GPT4All: 0.7422
Good For
- Applications requiring structured, multi-turn chat interactions.
- Developers familiar with OpenAI's ChatML format seeking a 7B parameter alternative.
- Use cases where explicit system instructions are crucial for guiding model responses.