Xenon1/Zenith-7B-dpo-v1

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 8k | Published: Feb 14, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Zenith-7B-dpo-v1 is a 7 billion parameter language model developed by Xenon1, fine-tuned from Mistral-7B-v0.1 with Direct Preference Optimization (DPO) on the Ultrafeedback dataset. It keeps the base model's architectural features, including Grouped-Query Attention and Sliding-Window Attention. The model is tuned for instruction-following tasks.


Zenith-7B-dpo-v1: Instruction-Tuned Mistral Model

Zenith-7B-dpo-v1 is a 7 billion parameter language model developed by Xenon1, built on Mistral-7B-v0.1. It has been fine-tuned with Direct Preference Optimization (DPO) on the Ultrafeedback dataset, an approach inspired by the "Self-Rewarding Language Models" paper. Its architecture includes Grouped-Query Attention and Sliding-Window Attention, alongside a Byte-fallback BPE tokenizer.
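
As a rough illustration of what DPO optimizes, the sketch below computes the standard DPO loss for a single preference pair from summed log-probabilities. It is not Xenon1's training code: the function names, the example log-probabilities, and the beta value of 0.1 are all assumptions made for illustration.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the source does not state it
) -> float:
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response
    under the policy being trained or the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals softplus(-margin)
    return softplus(-margin)


# Hypothetical log-probabilities, chosen only to exercise the function.
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy equals reference
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)   # policy prefers the chosen response more
```

The loss falls as the policy raises the chosen response's likelihood relative to the reference while lowering the rejected one's, which is the behaviour the preference data is meant to encourage.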

Key Capabilities

  • Instruction Following: Optimized for understanding and responding to user instructions, making it suitable for conversational AI and task-oriented applications.
  • Chat Template Support: Uses an instruction format that wraps each user turn in [INST] and [/INST] tokens, and is compatible with Hugging Face's apply_chat_template() method.
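  • Efficient Architecture: Inherits Mistral-7B-v0.1's transformer architecture. Grouped-Query Attention shrinks the key-value cache during inference, and Sliding-Window Attention restricts each layer's attention to a fixed window of recent tokens.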
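
The [INST] / [/INST] format mentioned above can be sketched in plain Python. Only the two instruction tokens come from this page; the `<s>` and `</s>` markers, the exact whitespace, and the helper name `build_prompt` are assumptions modeled on common Mistral-style templates. In practice the tokenizer's own apply_chat_template() output should be treated as authoritative.

```python
def build_prompt(messages, bos="<s>", eos="</s>"):
    """Render alternating user/assistant messages in an [INST] style format.

    Hypothetical helper: spacing and the bos/eos markers are assumptions,
    not taken from the model's shipped chat template.
    """
    parts = [bos]
    for index, message in enumerate(messages):
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError(
                f"message {index} has role {message['role']!r}, "
                f"expected {expected_role!r}"
            )
        if expected_role == "user":
            parts.append(f"[INST] {message['content']} [/INST]")
        else:
            # Assistant turns are closed with the end-of-sequence marker.
            parts.append(f" {message['content']}{eos}")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello"},
    {"role": "user", "content": "Bye"},
]
prompt = build_prompt(conversation)
```

With the Hugging Face tokenizer loaded, the same list of message dictionaries can be passed to apply_chat_template() instead, which avoids hand-maintaining the format.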

Good For

  • Conversational Agents: Developing chatbots or virtual assistants that require strong instruction adherence.
  • General Purpose Text Generation: Generating human-like text based on explicit prompts.
  • Research in DPO: Exploring the practical application and performance of models fine-tuned with Direct Preference Optimization.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Each configuration sets the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
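
The seven sampler parameters named above could be grouped into one validated settings object before being sent with a generation request. This is a minimal sketch: the class name, the neutral default values, and the accepted ranges are assumptions based on common inference-server conventions, and none of the values are the Featherless users' actual settings, which this page does not list.

```python
from dataclasses import dataclass, asdict


@dataclass
class SamplerSettings:
    """Hypothetical container for the sampler parameters listed above.

    Defaults are neutral placeholders (sampling left unmodified),
    not recommended or observed settings.
    """
    temperature: float = 1.0
    top_p: float = 1.0
    top_k: int = -1  # assumed convention: -1 disables top-k filtering
    frequency_penalty: float = 0.0
    presence_penalty: float = 0.0
    repetition_penalty: float = 1.0
    min_p: float = 0.0

    def validate(self) -> None:
        # Range checks reflect typical conventions and are assumptions.
        if self.temperature < 0:
            raise ValueError("temperature must be non-negative")
        if not 0 < self.top_p <= 1:
            raise ValueError("top_p must be in (0, 1]")
        if self.top_k == 0 or self.top_k < -1:
            raise ValueError("top_k must be -1 (disabled) or a positive integer")
        if self.repetition_penalty <= 0:
            raise ValueError("repetition_penalty must be positive")
        if not 0 <= self.min_p <= 1:
            raise ValueError("min_p must be in [0, 1]")

    def to_payload(self) -> dict:
        """Return the settings as a plain dict after validation."""
        self.validate()
        return asdict(self)


# Placeholder values for illustration only.
payload = SamplerSettings(temperature=0.7, top_p=0.9).to_payload()
```

Validating before building the payload surfaces out-of-range values locally rather than as a server-side error.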