Greytechai/QwQ-32B-ArliAI-RpR-v3
QwQ-32B-ArliAI-RpR-v3 is a 32.8-billion-parameter model from ArliAI, fine-tuned from the QwQ-32B base. It is designed for creative writing and roleplay, with reasoning capabilities tuned for long, multi-turn conversations. The model uses a distinctive dataset-curation and training methodology to minimize cross-context repetition and encourage creativity, and supports a context length of 32768 tokens.
Overview
QwQ-32B-ArliAI-RpR-v3 is the third iteration in ArliAI's RpR (RolePlay with Reasoning) series, building on the RPMax dataset and training methodology. This 32.8-billion-parameter model, based on QwQ-32B, is engineered to excel in creative writing and roleplay, particularly in long, multi-turn chats. It addresses issues found in earlier versions, such as dissociated thoughts, random refusals, and nonsense words in the dataset, through a re-processed RpR dataset generation pipeline.
Key Capabilities
- Enhanced Reasoning: Integrates reasoning abilities into roleplay, allowing for more coherent and consistent outputs over extended conversations.
- High Creativity & Reduced Repetition: Utilizes a unique dataset curation method to minimize "cross-context repetition" (repeating phrases/tropes across different situations), promoting diverse and creative writing.
- Optimized for Multi-Turn RP: Specifically trained to maintain coherence and quality in long, multi-turn roleplay chats, a common challenge for reasoning models.
- Robust Training: Employs the Rex learning-rate scheduler and a single-epoch training run at a higher learning rate, so the model learns nuances from the entire dataset without overfitting to specific examples.
Good For
- Creative Writing: Generating diverse and imaginative narratives without falling into repetitive patterns.
- Roleplay (RP): Engaging in extended, coherent, and reasoning-driven roleplay sessions.
- Interactive Storytelling: Applications requiring dynamic and non-repetitive conversational outputs.
Users should configure their inference settings carefully: ensure the `<think>` and `</think>` tokens are handled so the reasoning block is parsed correctly, and disable character-name inclusion in prompts, which can conflict with the model's reasoning mechanism.
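As a minimal illustration of the parsing step above, the sketch below separates the `<think>...</think>` reasoning block from the visible reply in a raw completion. The function name and fallback behavior are this example's own choices, not part of the model's official tooling; it only assumes the reasoning-block convention described in this card.

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a raw model completion into (reasoning, reply).

    Assumes the model wraps its reasoning in <think>...</think>, as this
    model card describes. If no block is found, the whole text is
    treated as the reply and the reasoning is returned empty.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Remove the reasoning block so only the in-character reply remains.
    reply = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, reply

# Usage: strip the reasoning before showing the reply to the end user.
raw = '<think>The character is annoyed, so answer curtly.</think>\n"Fine. Have it your way."'
reasoning, reply = split_reasoning(raw)
```

Frontends such as SillyTavern perform this parsing automatically when the reasoning prefix/suffix are configured to `<think>` and `</think>`; the sketch is only needed when driving the model through a raw completion API.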