bardsai/jaskier-7b-dpo

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Jan 10, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

bardsai/jaskier-7b-dpo is a 7 billion parameter language model developed by bards.ai, fine-tuned from mindy-labs/mindy-7b-v2 (a Mistral-7B derivative) using Direct Preference Optimization (DPO). It was trained on the Intel/orca_dpo_pairs dataset to improve conversational quality and alignment. With an 8192-token context length, it is designed for general conversational applications, particularly those benefiting from DPO-tuned responses.


Jaskier 7b DPO: An Aligned Conversational Model

bardsai/jaskier-7b-dpo is a 7 billion parameter language model developed by bards.ai, built on mindy-labs/mindy-7b-v2, itself a downstream derivative of Mistral-7B. Its defining feature is its fine-tuning methodology: Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset.

Key Capabilities

  • DPO-tuned Responses: Leverages Direct Preference Optimization for improved conversational quality and alignment, aiming for more helpful and harmless outputs.
  • Mistral-7B Base: Benefits from the strong foundational capabilities of the Mistral-7B architecture.
  • Conversational AI: Designed for interactive dialogue systems; the upstream model card demonstrates this with a pipeline-based sentiment analysis example.
  • 8192-token Context: Supports processing and generating longer conversational turns.
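The capabilities above can be exercised locally with the Hugging Face `transformers` library. The sketch below is a minimal, unverified recipe: it assumes the tokenizer ships a chat template (`apply_chat_template` requires transformers ≥ 4.34) and that enough GPU or CPU memory is available for the 7B weights; check the model card on Hugging Face for the exact prompt format.

```python
MODEL_ID = "bardsai/jaskier-7b-dpo"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion from jaskier-7b-dpo (downloads the full weights on first call)."""
    # Heavy import kept local so merely defining this function stays cheap.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Assumes the tokenizer's configured chat template; adjust if the
    # checkpoint documents a different prompt format.
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=True)
    # Strip the prompt tokens and decode only the newly generated text.
    return tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

The 8192-token context leaves room for several conversational turns in `messages` before truncation becomes a concern.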

Good For

  • Chatbot Development: Ideal for creating engaging and aligned conversational agents.
  • Instruction Following: The DPO tuning on orca_dpo_pairs suggests improved adherence to user instructions and preferences.
  • Exploratory AI Applications: While noted as a "work-in-progress," it serves as a valuable model for experimentation and development in conversational AI.

Note: This model is currently a work-in-progress and may not be suitable for production environments without further evaluation.

Popular Sampler Settings

Featherless users most commonly tune the following sampling parameters for this model:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
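These parameters are typically passed together as a single sampling configuration. The dictionary below is an illustrative sketch only; the values are placeholder defaults, not the actual combinations used by Featherless users.

```python
# Illustrative sampler configuration for an OpenAI-compatible completion API.
# Every value here is a placeholder assumption, not a recorded user setting.
sampler_config = {
    "temperature": 0.7,         # softens the output distribution
    "top_p": 0.9,               # nucleus sampling cutoff
    "top_k": 40,                # restrict to the 40 most likely tokens
    "frequency_penalty": 0.0,   # penalize tokens by how often they appeared
    "presence_penalty": 0.0,    # penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative repetition damping
    "min_p": 0.05,              # drop tokens below 5% of the top token's probability
}
```

Lower `temperature` and higher penalties generally trade creativity for consistency; for a DPO-tuned chat model, moderate values like these are a common starting point.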