kaist-ai/janus-dpo-7b

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Apr 25, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights · Warm

kaist-ai/janus-dpo-7b is a 7 billion parameter language model based on Mistral-7B-v0.2, developed by KAIST AI. It is fine-tuned with Direct Preference Optimization (DPO) on the Multifaceted Collection dataset, which contains 196k unique system messages. The model aligns to diverse human preferences expressed through system messages, producing personalized, helpful, and harmless responses, which makes it suitable for applications that need adaptable and controllable text generation.


kaist-ai/janus-dpo-7b: Personalized Response Generation

Janus-DPO-7B is a 7 billion parameter language model developed by KAIST AI, built upon the Mistral-7B-v0.2 architecture. Its core innovation lies in its training methodology: it was fine-tuned using Direct Preference Optimization (DPO) on the extensive Multifaceted-Collection-DPO dataset. This dataset comprises 196,000 unique system messages, specifically designed to align LLMs with a wide array of human preferences.
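To make the training objective concrete, the DPO loss for a single preference pair can be sketched in plain Python. This is a minimal illustration of the standard DPO formulation, not KAIST AI's training code; the function name and the scalar log-probability inputs are assumptions for the sketch.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) response pair.

    Inputs are summed log-probabilities of each response under the
    trainable policy and the frozen reference model.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response over the rejected one, relative to the reference.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Negative log-sigmoid of the margin: the loss decreases as the
    # policy widens the preference gap beyond the reference model's.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Example with illustrative (made-up) log-probabilities:
loss = dpo_loss(-10.0, -30.0, -12.0, -25.0)
```

When the margin is zero (policy and reference agree), the loss sits at log(2) ≈ 0.693; training on the 196k-system-message preference pairs pushes it lower by strengthening the policy's preference for the chosen response.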

Key Capabilities

  • Personalized Response Generation: Janus-DPO-7B is adept at producing responses tailored to specific user preferences, guided by diverse system messages.
  • Helpful and Harmless Alignment: The model is trained to produce outputs preferred as both helpful and harmless.
  • System Message Generalization: Users can control the model's output by inputting desired system messages, allowing for flexible and context-aware text generation.
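In practice, steering the model means prepending a persona-defining system message to the user turn. The sketch below uses a generic Mistral-style `[INST]` wrapper as an assumption; consult the model card for the exact chat template Janus-DPO-7B expects.

```python
def build_prompt(system_message: str, user_message: str) -> str:
    """Combine a preference-defining system message with a user turn.

    The [INST] ... [/INST] wrapper is an assumed Mistral-style format,
    not necessarily the template this checkpoint was trained with.
    """
    return f"[INST] {system_message}\n{user_message} [/INST]"

# A multifaceted system message steering tone and audience:
system = ("You are a patient tutor who explains concepts with "
          "everyday analogies and avoids technical jargon.")
prompt = build_prompt(system, "Explain what DPO fine-tuning does.")
```

The resulting string would then be tokenized and passed to the model; swapping the system message is the whole control surface, with no retraining required.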

Good for

  • Applications requiring highly customizable and preference-aligned text outputs.
  • Scenarios where controlling model behavior through detailed system prompts is crucial.
  • Research and development in aligning LLMs to diverse human preferences, as detailed in its research paper.