Gryphe/Pantheon-RP-1.6-12b-Nemo-KTO

Parameters: 12B
Precision: FP8
Context length: 32768 tokens
License: apache-2.0
Model page: Hugging Face
Overview

Gryphe's Pantheon-RP-1.6-12b-Nemo-KTO is a 12-billion-parameter model designed to enhance roleplay experiences by introducing a diverse collection of summonable personas. This version incorporates KTO preference training to refine and diversify responses, aiming to better capture the personality traits, accents, and mannerisms that LLMs often struggle with. It supports a 32768-token context length.

Key Capabilities

  • Diverse Personas: Features a rebuilt and expanded Pantheon of characters, each with its own system prompt and activation phrase (see the usage sketch after this list).
  • Flexible Roleplay Styles: Supports both Markdown and novel-style roleplay, with the model adapting to the user's greeting style.
  • Dedicated Assistant: Includes "Lyra the Assistant," an uncensored AI companion with an adjustable personality, intended for general dialogue, coding help, and summarization.
  • Multi-stage Finetuning: Trained in multiple stages, including a remade first finetune on a deduplicated Sonnet 3.5 SlimOrca dataset and a rebuilt Pantheon Roleplay dataset.
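
As a rough illustration of how a summonable persona is typically invoked, the sketch below sends a persona-flavored system prompt through the chat template bundled with the tokenizer, using Hugging Face transformers. This is a minimal sketch, not the model card's prescribed usage: the system prompt and user message are placeholders, not the official Pantheon persona prompts or activation phrases.

```python
# A minimal sketch, assuming the transformers library and a GPU with enough
# memory; the system prompt below is a placeholder, not the official
# Pantheon persona prompt for Lyra.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gryphe/Pantheon-RP-1.6-12b-Nemo-KTO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    # Placeholder persona prompt -- substitute the persona's own system
    # prompt and activation phrase from the model card.
    {"role": "system", "content": "You are Lyra the Assistant, an uncensored AI companion."},
    {"role": "user", "content": "Lyra, could you summarize what KTO preference training does?"},
]

# apply_chat_template uses whatever prompt format ships with the tokenizer,
# so the template itself does not need to be hard-coded here.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```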

Noteworthy Features

  • KTO Edition: Applies KTO preference training for more refined and varied outputs, though this is noted as experimental.
  • Expanded Persona Roster: New personas like Clover (Southern centaur), Raza (nerdy raptor), and Stella Sabre (anthro batpony with a Northern Equestrian Mountain accent) have been added.
  • Improved Style Handling: Addresses previous weaknesses by splitting the finetune data evenly between Markdown and novel-style roleplay.

Considerations

  • The KTO training introduced some "unwanted behaviors" due to the story-writing samples in its data; a V2 is planned to address this.
  • Optimal inference may require experimenting with temperature settings; a suggested starting preset is temperature: 0.8, repetition_penalty: 1.05, and min_p: 0.025 (see the sketch below).
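
The snippet below is one way to apply that suggested preset, here with vLLM's offline API; vLLM itself is an assumption rather than something the model card prescribes, and the prompt string is illustrative only.

```python
# A minimal sketch of the suggested sampling preset, assuming vLLM is
# installed and the model fits in GPU memory.
from vllm import LLM, SamplingParams

llm = LLM(model="Gryphe/Pantheon-RP-1.6-12b-Nemo-KTO", max_model_len=32768)

params = SamplingParams(
    temperature=0.8,          # suggested starting point; experiment as needed
    repetition_penalty=1.05,
    min_p=0.025,
    max_tokens=512,
)

# Illustrative prompt only -- in practice, use a persona's system prompt
# and the chat format the model expects.
outputs = llm.generate(["<your roleplay prompt here>"], params)
print(outputs[0].outputs[0].text)
```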