LHC88/DPOpenHermes-7B-v2-PerfLaser
LHC88/DPOpenHermes-7B-v2-PerfLaser is a 7 billion parameter language model and a second-generation DPO-tuned variant of Teknium's OpenHermes-2.5-Mistral-7B. It was fine-tuned with Direct Preference Optimization (DPO) on decontaminated preference datasets, improving instruction following and multi-turn chat. Because it uses the ChatML prompt format, it works with tools that expect OpenAI-style chat endpoints, making it suitable for structured conversational AI applications.
Overview
LHC88/DPOpenHermes-7B-v2-PerfLaser is a 7 billion parameter language model building upon Teknium's OpenHermes-2.5-Mistral-7B. This second-generation model was fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs and allenai/ultrafeedback_binarized_cleaned preference datasets. A key distinction from its v1 predecessor is the use of decontaminated datasets: the TruthfulQA data present in earlier versions has been removed.
Key Capabilities
- Enhanced Instruction Following: Optimized through DPO for better adherence to instructions, particularly in multi-turn conversations.
- ChatML Prompt Format: Supports the ChatML format, enabling structured system prompts and multi-turn dialogue, similar to OpenAI's API.
- System Prompt Utilization: Designed to effectively use system prompts to guide its behavior over extended interactions.
- LoRA Training: Trained using 16-bit LoRA on a single H100 80GB GPU for approximately 13 hours.
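The ChatML layout described above can be sketched as a small Python helper. The function below is illustrative, not part of the model's tooling; in practice the chat template bundled with the model's tokenizer (via `tokenizer.apply_chat_template`) should be preferred, since it is the authoritative source for the exact token layout.

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts into a ChatML prompt.

    Each turn is wrapped in <|im_start|>{role} ... <|im_end|> markers,
    matching the ChatML format this model expects. Hypothetical helper
    for illustration only.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is DPO?"},
])
print(prompt)
```

The system turn at the top is what lets the model carry persona and behavioral instructions across an extended multi-turn conversation.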
Use Cases
- Conversational AI: Ideal for applications requiring structured, multi-turn chat dialogues.
- Instruction-Following Tasks: Well-suited for scenarios where precise instruction adherence is critical.
- OpenAI API Compatibility: Its ChatML format makes it compatible with tools and workflows designed for OpenAI endpoints, such as LM Studio.
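Because the model speaks ChatML, it can be served behind an OpenAI-compatible chat-completions endpoint (as LM Studio does). The sketch below builds a standard chat-completions payload; the endpoint URL, port, and parameter defaults are assumptions for illustration and should be taken from your server's documentation.

```python
import json

def build_chat_request(messages, model="LHC88/DPOpenHermes-7B-v2-PerfLaser",
                       temperature=0.7, max_tokens=256):
    """Assemble an OpenAI-style /v1/chat/completions request body.

    Only standard chat-completions fields are used; defaults here are
    illustrative, not recommendations from the model card.
    """
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

payload = build_chat_request([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DPO in one sentence."},
])

# Sending it to a hypothetical local server (e.g. LM Studio, which
# defaults to a port such as 1234) could look like:
#
#   import urllib.request
#   req = urllib.request.Request(
#       "http://localhost:1234/v1/chat/completions",
#       data=json.dumps(payload).encode("utf-8"),
#       headers={"Content-Type": "application/json"},
#   )
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```

The server translates the `messages` list into the model's ChatML prompt, so clients never need to handle the `<|im_start|>` markers directly.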