Locutusque/ChatHercules-2.5-Mistral-7B-DPO

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Mar 10, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Locutusque/ChatHercules-2.5-Mistral-7B-DPO is a 7 billion parameter language model based on the Mistral architecture, created by Locutusque. This model is a merge of Hercules-2.5-Mistral-7B and openchat-3.5-0106, further fine-tuned using DPO on a subset of the argilla/distilabel-intel-orca-dpo-pairs dataset. It is designed for general conversational AI tasks, leveraging its merged base models and DPO fine-tuning for improved instruction following and response quality within an 8192 token context length.


ChatHercules-2.5-Mistral-7B-DPO Overview

ChatHercules-2.5-Mistral-7B-DPO is a 7 billion parameter language model developed by Locutusque, built upon the Mistral architecture. It is a composite model, created by merging two distinct base models: Locutusque/Hercules-2.5-Mistral-7B and openchat/openchat-3.5-0106. This merging process utilized LazyMergekit with a slerp method, combining the strengths of both foundational models.
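The slerp method mentioned above interpolates corresponding parameter tensors along the arc between them rather than along a straight line, which tends to preserve the geometry of both parent models. A minimal sketch of spherical linear interpolation on plain Python lists (the actual LazyMergekit implementation operates on full weight tensors and is configured per-layer; this is only illustrative):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0.0 returns v0, t=1.0 returns v1; intermediate t values follow
    the great-circle arc between the two (normalized) directions.
    """
    dot = sum(a * b for a, b in zip(v0, v1))
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (n0 * n1)))
    theta = math.acos(cos_theta)
    if abs(theta) < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```

For orthogonal unit vectors, the t=0.5 midpoint lies on the unit circle between them, whereas a plain average would shrink its norm to about 0.707.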

Key Capabilities & Training

Following the initial merge, the model underwent further refinement through Direct Preference Optimization (DPO). This fine-tuning was conducted on 20% of the argilla/distilabel-intel-orca-dpo-pairs dataset, enhancing its ability to align with human preferences and generate more helpful and coherent responses. The model supports an 8192 token context length, making it suitable for handling moderately long conversations and documents.
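DPO trains directly on preference pairs: it increases the log-probability margin of the chosen response over the rejected one, relative to a frozen reference model, without fitting an explicit reward model. A minimal sketch of the per-pair DPO loss (the training details for this model beyond the dataset fraction are not published; variable names here are illustrative):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    logp_* are summed token log-probabilities of the chosen/rejected
    responses under the policy being trained; ref_logp_* are the same
    quantities under the frozen reference model. beta controls how far
    the policy is allowed to drift from the reference.
    """
    chosen_margin = logp_chosen - ref_logp_chosen
    rejected_margin = logp_rejected - ref_logp_rejected
    # -log(sigmoid(x)) written as log(1 + exp(-x)) for numerical stability.
    return math.log1p(math.exp(-beta * (chosen_margin - rejected_margin)))
```

When the policy already prefers the chosen response more than the reference does, the margin difference is positive and the loss drops below log(2); at initialization (policy equals reference) the loss is exactly log(2).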

Usage & Performance

Developers can integrate ChatHercules-2.5-Mistral-7B-DPO into their applications via the Hugging Face transformers library; the original README provides Python examples for text generation. Benchmark charts in that README compare its performance against other models and suggest it is well suited to general-purpose conversational AI and instruction-following tasks.
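A minimal sketch of loading the model with transformers, assuming a ChatML-style prompt format (common to the Hercules lineage; verify against the model card's prompt template) and illustrative sampling values:

```python
def build_chatml_prompt(messages):
    """Format a message list in ChatML style (assumed format for this model)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

def generate(prompt, max_new_tokens=256):
    """Generate a completion; requires transformers, torch, and a GPU for FP16."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Locutusque/ChatHercules-2.5-Mistral-7B-DPO"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,  # illustrative sampling settings, not published defaults
        top_p=0.9,
    )
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Usage: `generate(build_chatml_prompt([{"role": "user", "content": "Hello!"}]))` returns the assistant's reply as a string.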

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Configurable sampler parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, min_p.