Undi95/Borealis-10.7B-DPO

Text Generation | Concurrency Cost: 1 | Model Size: 10.7B | Quant: FP8 | Ctx Length: 4k | Published: Jan 20, 2024 | License: cc-by-nc-4.0 | Architecture: Transformer | Open Weights

Undi95/Borealis-10.7B-DPO is a 10.7 billion parameter conversational language model built from 48 Mistral 7B layers and fine-tuned with Axolotl using a Llama2 configuration. It is tuned for roleplay, ERP, and general conversation rather than for benchmark performance, and it received an additional DPO (Direct Preference Optimization) training phase after the initial fine-tune. Its intended output is engaging, natural dialogue for conversational applications.

Borealis-10.7B-DPO: A Conversational Powerhouse

Borealis-10.7B-DPO is a 10.7 billion parameter model, constructed from 48 Mistral 7B layers and fine-tuned for over 70 hours on a substantial roleplay and conversational dataset. This variant distinguishes itself through an additional Direct Preference Optimization (DPO) training phase, further refining its interactive qualities.
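As a rough sanity check on the "48 layers, 10.7 billion parameters" figure, the count can be estimated from layer dimensions. The sketch below assumes the dimensions of the published Mistral-7B-v0.1 configuration (hidden size 4096, MLP size 14336, 32 query heads, 8 key/value heads, vocabulary 32000) and assumes Borealis keeps them unchanged with an untied output head. The card itself does not state these values, nor which layers were stacked, so treat this as an estimate rather than the model's documented layout.

```python
# Dimensions from the published Mistral-7B-v0.1 config.
# ASSUMPTION: Borealis keeps these unchanged; the card does not say so.
HIDDEN = 4096
INTERMEDIATE = 14336
N_HEADS = 32
N_KV_HEADS = 8
VOCAB = 32000


def decoder_layer_params():
    """Weights in one decoder layer (no biases, grouped-query attention)."""
    head_dim = HIDDEN // N_HEADS
    kv_dim = N_KV_HEADS * head_dim
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q, o + k, v
    mlp = 3 * HIDDEN * INTERMEDIATE                        # gate, up, down
    norms = 2 * HIDDEN                                     # two RMSNorm weights
    return attention + mlp + norms


def total_params(n_layers):
    """Layers plus input embedding, untied output head, and final norm."""
    embeddings = 2 * VOCAB * HIDDEN
    final_norm = HIDDEN
    return n_layers * decoder_layer_params() + embeddings + final_norm


for n_layers in (32, 48):
    print(n_layers, f"{total_params(n_layers) / 1e9:.2f}B")
# 32 7.24B
# 48 10.73B
```

Under these assumptions, 32 layers reproduce the size of the base Mistral 7B and 48 layers land at roughly 10.7B, which is consistent with the figure quoted above.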

Key Capabilities

  • Optimized for Conversation: Unlike models focused solely on benchmark scores, Borealis-10.7B-DPO is specifically engineered for high-quality roleplay (RP), ERP, and general conversational interactions.
  • Extensive Training Data: Trained on a diverse range of datasets including NobodyExistsOnTheInternet/ToxicQAFinal, teknium/openhermes, unalignment/spicy-3.1, Doctor-Shotgun/no-robots-sharegpt, Undi95/toxic-dpo-v0.1-sharegpt, various Aesir sets, lemonilia/LimaRP, Squish42/bluemoon-fandom-1-1-rp-cleaned, and Undi95/ConversationChronicles-sharegpt-SHARDED.
  • DPO Enhanced: The DPO training utilized datasets like Intel/orca_dpo_pairs, NobodyExistsOnTheInternet/ToxicDPOqa, and Undi95/toxic-dpo-v0.1-NoWarning to improve conversational flow and preference alignment.
  • NsChatml Prompt Format: Uses the <|im_system|>{sysprompt}<|im_end|><|im_user|>{input}<|im_end|><|im_bot|>{output}<|im_end|> format for clear instruction and response generation.
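The NsChatml template above can be assembled with a small helper. This is a minimal sketch, not code shipped with the model: `build_prompt` and its `joiner` parameter are hypothetical names. The special tokens are copied verbatim from the template. The template as shown here has no separators between segments, so `joiner` defaults to an empty string; if the upstream card places line breaks between segments, pass `joiner="\n"` instead.

```python
# Special tokens copied verbatim from the template on this page.
SYSTEM = "<|im_system|>"
USER = "<|im_user|>"
BOT = "<|im_bot|>"
END = "<|im_end|>"


def build_prompt(sysprompt, turns, next_input, joiner=""):
    """Build an NsChatml prompt that ends where the model should reply.

    turns: list of (user_text, bot_text) pairs from earlier in the chat.
    joiner: text placed between segments (ASSUMPTION: empty, as displayed).
    """
    parts = [f"{SYSTEM}{sysprompt}{END}"]
    for user_text, bot_text in turns:
        parts.append(f"{USER}{user_text}{END}")
        parts.append(f"{BOT}{bot_text}{END}")
    parts.append(f"{USER}{next_input}{END}")
    parts.append(BOT)  # left open so generation continues as the bot
    return joiner.join(parts)


prompt = build_prompt("You are Aria, a ship's navigator.", [], "Where are we headed?")
print(prompt)
```

When generating, stopping on `<|im_end|>` keeps the model from running on into a new user turn.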

Good For

  • Roleplay and ERP applications: Designed to excel in generating dynamic and engaging roleplay scenarios.
  • General Conversational Agents: Ideal for chatbots and interactive AI systems where natural dialogue is paramount.
  • Interactive Storytelling: Can be leveraged for creating rich, character-driven narratives.