shahzebnaveed/NeuralHermes-2.5-Mistral-7B
NeuralHermes-2.5-Mistral-7B by shahzebnaveed is a 7 billion parameter language model based on the teknium/OpenHermes-2.5-Mistral-7B architecture. It has been fine-tuned using Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset, reformatted with the ChatML template. The model aims to improve performance through an RLHF-inspired process, similar to Intel/neural-chat-7b-v3-1, and is suitable for general conversational AI tasks, leveraging its Mistral base and DPO fine-tuning.
NeuralHermes 2.5 - Mistral 7B Overview
NeuralHermes 2.5 - Mistral 7B is a 7 billion parameter language model developed by shahzebnaveed. It builds upon the teknium/OpenHermes-2.5-Mistral-7B base model and incorporates further fine-tuning using Direct Preference Optimization (DPO).
Key Characteristics
- Base Model: Built on the teknium/OpenHermes-2.5-Mistral-7B architecture.
- Fine-tuning Method: Employs Direct Preference Optimization (DPO) for enhanced performance.
- Training Data: Fine-tuned on the Intel/orca_dpo_pairs dataset, reformatted with the ChatML template.
- Inspiration: The fine-tuning process is inspired by the RLHF methodology described by the authors of Intel/neural-chat-7b-v3-1.
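At its core, DPO replaces the separate reward model of classic RLHF with a direct loss over preference pairs: the policy is pushed to assign a higher log-probability margin (relative to the frozen reference model) to the chosen response than to the rejected one. A minimal sketch of the per-pair loss follows; this is an illustration of the DPO objective in plain Python, not the model's actual training code, and the function name and beta value are assumptions:

```python
import math

def dpo_loss(policy_chosen, policy_rejected,
             ref_chosen, ref_rejected, beta=0.1):
    """Per-pair DPO loss from sequence log-probabilities.

    Each argument is the total log-probability a model assigns to a
    response; `ref_*` come from the frozen reference (base) model.
    """
    # Policy's log-ratio advantage on chosen vs. rejected responses.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)): small when the policy prefers
    # the chosen response more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

The loss shrinks as the policy widens the margin in favor of the chosen response, which is what drives the preference alignment described above.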
Important Considerations
- The DPO fine-tuning process was incomplete, running for only two steps due to GPU memory limitations.
- Parameter settings (e.g., the number of LoRA adapters and the alpha value) were reduced to fit smaller GPUs, so they are not tuned for optimal performance.
Usage
This model can be used for text generation tasks, particularly those involving conversational AI, by applying the ChatML template for prompt formatting. An example Python code snippet is provided in the model card for integration with the transformers library.
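The ChatML template wraps each conversation turn in <|im_start|> and <|im_end|> markers and ends with an assistant header that cues the model to respond. A minimal formatter is sketched below; the helper name is illustrative, and in practice the tokenizer's built-in chat template from the transformers library can do this for you:

```python
def format_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML prompt."""
    prompt = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    )
    # Trailing assistant header cues the model to generate its reply.
    return prompt + "<|im_start|>assistant\n"

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Direct Preference Optimization?"},
])
```

The resulting string can be passed to the model's tokenizer for generation with the transformers library.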