Xenon1/Zenith-7B-dpo-v1
Zenith-7B-dpo-v1 is a 7-billion-parameter language model developed by Xenon1 and fine-tuned from Mistral-7B-v0.1 using Direct Preference Optimization (DPO) on the Ultrafeedback dataset. It inherits the base model's architectural features, including Grouped-Query Attention and Sliding-Window Attention. The model is optimized for instruction-following tasks.
Zenith-7B-dpo-v1: Instruction-Tuned Mistral Model
Zenith-7B-dpo-v1 is a 7 billion parameter language model developed by Xenon1, built upon the Mistral-7B-v0.1 architecture. This model has been fine-tuned using Direct Preference Optimization (DPO) on the Ultrafeedback dataset, a technique inspired by the "Self-Rewarding Language Models" paper. Its core architecture includes advanced features such as Grouped-Query Attention and Sliding-Window Attention, alongside a Byte-fallback BPE tokenizer.
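For readers unfamiliar with DPO, the sketch below computes the standard per-example DPO loss from summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response. It illustrates the general objective only; the function name dpo_loss, the argument names, and the beta value of 0.1 are assumptions for illustration and are not taken from Xenon1's training setup.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_margin - rejected_margin))).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an assumed, commonly used default, not a documented setting.
    """
    # How much more (in log space) the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # Numerically stable -log(sigmoid(logits)).
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Hypothetical log-probabilities: the policy favors the chosen response
# more than the reference does, so the loss falls below log(2).
example_loss = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(example_loss)
```

When the policy and the reference assign identical log-probabilities, the loss equals log(2); it shrinks as the policy raises the chosen response relative to the rejected one.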
Key Capabilities
- Instruction Following: Optimized for understanding and responding to user instructions, making it suitable for conversational AI and task-oriented applications.
- Chat Template Support: Uses an instruction format that wraps user turns in [INST] and [/INST] tokens, and is compatible with Hugging Face's apply_chat_template() method.
- Efficient Architecture: Inherits Mistral-7B-v0.1's transformer architecture, including Grouped-Query Attention and Sliding-Window Attention.
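The instruction format mentioned above can be sketched in plain Python. This is a minimal illustration that assumes the Mistral-style layout, with "<s>" and "</s>" as the beginning- and end-of-sequence tokens; the helper name build_inst_prompt is hypothetical, and the exact whitespace is an assumption. The template shipped with the model's tokenizer is the authoritative version.

```python
from typing import Dict, List

# Assumed Mistral-style special tokens; check the model's tokenizer config.
BOS, EOS = "<s>", "</s>"


def build_inst_prompt(messages: List[Dict[str, str]]) -> str:
    """Render alternating user/assistant turns in an [INST] ... [/INST] layout."""
    parts = [BOS]
    for i, msg in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if msg["role"] != expected:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if msg["role"] == "user":
            # User turns are wrapped in the instruction markers.
            parts.append(f"[INST] {msg['content']} [/INST]")
        else:
            # Assistant turns are closed with the end-of-sequence token.
            parts.append(f"{msg['content']}{EOS}")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What is DPO?"},
    {"role": "assistant", "content": "A preference-tuning method."},
    {"role": "user", "content": "Summarize it in one line."},
]
prompt = build_inst_prompt(messages)
print(prompt)
```

In practice, loading the tokenizer with Hugging Face Transformers and calling tokenizer.apply_chat_template(messages, tokenize=False) on the same list of role/content dictionaries should produce the prompt from the model's own template, which is preferable to hand-formatting.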
Good For
- Conversational Agents: Developing chatbots or virtual assistants that require strong instruction adherence.
- General Purpose Text Generation: Generating human-like text based on explicit prompts.
- Research in DPO: Exploring the practical application and performance of models fine-tuned with Direct Preference Optimization.