HenryJJ/dolphin-2.6-mistral-7b-dpo-orca-v1
HenryJJ/dolphin-2.6-mistral-7b-dpo-orca-v1 is a 7 billion parameter auto-regressive language model, fine-tuned by HenryJJ using DPO from cognitivecomputations/dolphin-2.6-mistral-7b. It was trained on the Intel/orca_dpo_pairs dataset for 1200 steps with a 1024-token context window. The model targets instruction following and conversational tasks and is based on the Mistral architecture.
Model Overview
HenryJJ/dolphin-2.6-mistral-7b-dpo-orca-v1 is a 7 billion parameter auto-regressive language model developed by HenryJJ. It is built upon the Mistral architecture, specifically fine-tuned from the cognitivecomputations/dolphin-2.6-mistral-7b base model. The fine-tuning process utilized Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset, training for 1200 steps with a 1024-token context window.
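To make the DPO objective mentioned above concrete, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is not the author's training code: the function name `dpo_loss`, the default `beta=0.1`, and the log-probability values in the usage lines are all assumptions chosen for illustration. Each preference pair (such as one from Intel/orca_dpo_pairs) contributes a chosen and a rejected response, and the loss compares how much the policy and a frozen reference model prefer one over the other.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    All inputs are summed log-probabilities of a full response.
    beta=0.1 is an illustrative assumption, not a value from the model card.
    """
    # How much more (in log space) the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written as a numerically stable softplus(-x).
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Hypothetical log-probabilities: the policy favors the chosen response
# more strongly than the reference does, so the loss falls below log(2).
loss_good = dpo_loss(-10.0, -12.0, -11.0, -11.0)
# Reversed preference: the loss rises above log(2).
loss_bad = dpo_loss(-12.0, -10.0, -11.0, -11.0)
print(loss_good, loss_bad)
```

When the policy and the reference agree exactly, the loss equals log(2); training pushes it lower by widening the margin in favor of the chosen response.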
Key Characteristics
- Base Model: Fine-tuned from cognitivecomputations/dolphin-2.6-mistral-7b, which is based on the Mistral 7B architecture.
- Fine-tuning: Employs DPO using the Intel/orca_dpo_pairs dataset, enhancing its ability to follow instructions and generate helpful responses.
- Language: Primarily English.
- Prompt Format: Uses the ChatML format, with <|im_end|> mapping to token ID 2, ensuring compatibility with applications expecting this EOS token behavior.
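As an illustration of the ChatML prompt format listed above, the following sketch assembles a prompt string by hand. The helper name `build_chatml_prompt` and the example messages are hypothetical; the layout follows the commonly used ChatML convention of wrapping each turn in `<|im_start|>` and `<|im_end|>` markers. In practice the tokenizer shipped with the model may provide its own chat template, which should take precedence if it differs.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; not part of the model's repository.
    """
    parts = []
    for message in messages:
        # Each turn: start marker, role, newline, content, end marker.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Name three primary colors."},
]
prompt = build_chatml_prompt(example_messages)
print(prompt)
```

Because `<|im_end|>` maps to token ID 2, a generation loop that stops on the EOS token should end the reply at the close of the assistant turn without a custom stop sequence.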
Intended Use Cases
This model is well-suited for applications requiring:
- Instruction Following: Generating responses that adhere closely to user prompts and instructions.
- Conversational AI: Engaging in dialogue and providing coherent, contextually relevant answers.
- Assistant-like Functions: Serving as a helpful AI assistant in various interactive scenarios.