HenryJJ/dolphin-2.6-mistral-7b-dpo-orca-v3
HenryJJ/dolphin-2.6-mistral-7b-dpo-orca-v3 is a 7-billion-parameter auto-regressive language model, fine-tuned by HenryJJ with DPO from cognitivecomputations/dolphin-2.6-mistral-7b. It was trained for 1200 steps on the Intel/orca_dpo_pairs dataset with a 1024-token context window. The model targets instruction following and conversational tasks and uses the ChatML prompt format.
Model Overview
HenryJJ/dolphin-2.6-mistral-7b-dpo-orca-v3 is a 7-billion-parameter English language model fine-tuned by HenryJJ. It is an auto-regressive transformer built on cognitivecomputations/dolphin-2.6-mistral-7b, a Mistral 7B derivative with a Llama-style decoder architecture, and was fine-tuned from that base with Direct Preference Optimization (DPO). Training ran for 1200 steps on the Intel/orca_dpo_pairs dataset with a 1024-token context window.
Key Capabilities
- Instruction Following: DPO training on the Intel/orca_dpo_pairs dataset is intended to improve how well the model understands and carries out user instructions.
- Conversational AI: Designed to be a helpful AI assistant, responding to user prompts effectively.
- ChatML Format: Employs the ChatML prompt format, ensuring compatibility with systems expecting this structure, with <|im_end|> mapping to token_id 2 for EOS.
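To make the ChatML format concrete, here is a minimal sketch that assembles a ChatML prompt string by hand. The helper name build_chatml_prompt and the example messages are hypothetical. In practice a tokenizer's chat template would normally do this, assuming the model repository ships one.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"  # per the card, this maps to token_id 2 (EOS)

def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt.

    Hypothetical helper for illustration; not part of any library.
    """
    parts = []
    for message in messages:
        # Each turn: <|im_start|>role\ncontent<|im_end|>\n
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)

# Example messages are placeholders, not taken from the model card.
prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DPO in one sentence."},
])
print(prompt)
```

Generation should stop when the model emits <|im_end|>, so any inference setup needs that token configured as a stop or EOS token.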
Good For
- Applications requiring a responsive and compliant AI assistant.
- Scenarios where clear instruction following is critical.
- Developers working with ChatML-formatted prompts and seeking a 7B parameter model for English language tasks.