wolfeidau/NeuralHermes-2.5-Mistral-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 24, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

NeuralHermes-2.5-Mistral-7B by wolfeidau is a 7 billion parameter Mistral-based language model fine-tuned using Direct Preference Optimization (DPO). This model specializes in instruction following and conversational tasks, leveraging the Intel/orca_dpo_pairs dataset for its DPO training. It is designed for general-purpose assistant chatbot applications, offering enhanced response quality through preference-based learning.


NeuralHermes-2.5-Mistral-7B Overview

NeuralHermes-2.5-Mistral-7B is a 7 billion parameter language model developed by wolfeidau. It is built on the Mistral architecture and fine-tuned from OpenHermes-2.5 using Direct Preference Optimization (DPO). The DPO training uses the Intel/orca_dpo_pairs dataset, which pairs preferred and rejected responses to align model outputs with human preferences, yielding improved instruction following and conversational quality.

Key Capabilities

  • Enhanced Instruction Following: Benefits from DPO training on preference data, leading to more aligned and helpful responses.
  • Conversational AI: Optimized for chatbot applications and interactive dialogue generation.
  • Mistral-7B Foundation: Inherits the strong base capabilities of the Mistral-7B model.
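Models in the OpenHermes-2.5 lineage typically expect prompts in the ChatML format. As a minimal sketch (the exact template for this fine-tune is an assumption), a conversation can be assembled like this before being passed to the tokenizer:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Format a single-turn conversation in ChatML style.

    This assumes the model follows the ChatML convention of the
    OpenHermes-2.5 family; verify against the model's chat template.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # generation continues from here
    )


prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Summarize Direct Preference Optimization in one sentence.",
)
print(prompt)
```

In practice you would pass `prompt` to the model's tokenizer and call `generate`, or use the tokenizer's built-in chat template if one is provided.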

Training Details

The model was fine-tuned with LoRA adapters (r=16, lora_alpha=16, lora_dropout=0.05) using the paged_adamw_32bit optimizer for 200 steps. DPO training used a beta of 0.1, a max_prompt_length of 1024 tokens, and a max_length of 1536 tokens.
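The hyperparameters above can be grouped as they would appear in a TRL/PEFT-style DPO setup. This is an illustrative arrangement of the stated values, not the author's actual training script; the key names mirror common `peft.LoraConfig` and `trl.DPOTrainer` arguments:

```python
# LoRA adapter settings (names follow peft.LoraConfig conventions)
lora_config = {
    "r": 16,             # adapter rank
    "lora_alpha": 16,    # scaling factor (alpha/r = 1.0 here)
    "lora_dropout": 0.05,
}

# DPO training settings (names follow trl.DPOTrainer conventions)
dpo_config = {
    "beta": 0.1,                 # weight of the KL penalty toward the reference model
    "max_steps": 200,            # total optimizer steps
    "optim": "paged_adamw_32bit",
    "max_prompt_length": 1024,   # prompt tokens only
    "max_length": 1536,          # prompt + response tokens
}
```

The beta of 0.1 is a common default in DPO: smaller values let the policy drift further from the reference model, larger values keep it closer.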

Good For

  • Developing helpful assistant chatbots.
  • Applications requiring models with improved alignment to human preferences.
  • General-purpose text generation where conversational quality is important.