bardsai/jaskier-7b-dpo-v5.6

TEXT GENERATIONConcurrency Cost:1Model Size:7BQuant:FP8Ctx Length:8kPublished:Feb 16, 2024License:cc-by-4.0Architecture:Transformer0.0K Open Weights Cold

Jaskier-7b-dpo-v5.6 is a 7 billion parameter language model developed by bards.ai, based on the Mistral7B-derived paulml/OGNO-7B architecture. It has been fine-tuned using Direct Preference Optimization (DPO) on the argilla/distilabel-math-preference-dpo dataset. This model is designed for conversational AI tasks, particularly demonstrating capabilities in understanding and generating responses to complex queries, as shown in its example output regarding the distinction between a "bard" and an "ML engineer."

Loading preview...

Jaskier-7b-dpo-v5.6 Overview

Jaskier-7b-dpo-v5.6 is a 7 billion parameter language model developed by bards.ai. It is built upon paulml/OGNO-7B, which is a downstream version of the Mistral7B architecture. The model has been fine-tuned using Direct Preference Optimization (DPO), leveraging the argilla/distilabel-math-preference-dpo dataset.

Key Capabilities

  • Conversational AI: Designed to engage in conversations and provide informative responses.
  • Preference Optimization: Utilizes DPO for improved response quality and alignment.
  • Question Answering: Capable of understanding and generating distinctions for complex or nuanced questions, as demonstrated by its ability to differentiate between a "bard" and an "ML engineer."

Good For

  • Exploratory Conversational Applications: Suitable for developers experimenting with DPO-tuned models for dialogue systems.
  • Understanding Nuanced Queries: Can be used in scenarios requiring the model to clarify or distinguish between concepts.

Note: This model is currently a work-in-progress and may not be ready for production use. For potential improvements or to address issues like "INST" character chains in output, users are encouraged to try the newer bardsai/jaskier-7b-dpo-v6.1 or re-task their prompts.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p