ArliAI/QwQ-32B-ArliAI-RpR-v4 Overview
QwQ-32B-ArliAI-RpR-v4 is a 32-billion-parameter model in ArliAI's RpR (RolePlay with Reasoning) series, fine-tuned from the QwQ-32B base model. It applies the dataset-curation methodology originally developed for the RPMax series to improve creative-writing and roleplay performance. Its key innovation is a reasoning-focused RP dataset, built by re-processing the RPMax dataset with the QwQ Instruct model itself, which enables coherent, engaging outputs in long multi-turn roleplay chats while preserving reasoning ability.
Key Capabilities & Features
- Optimized for Creative Writing & Roleplay: Designed to produce highly creative and varied outputs, minimizing cross-context repetition and generic tropes.
- Enhanced Reasoning: Trained with a method that lets the model reason on each turn without earlier reasoning blocks appearing in its context, yielding more consistent and logical responses in complex scenarios (see the sketch after this list).
- Reduced Repetition & Impersonation: Utilizes advanced filtering during training to mitigate common LLM issues like repetitive phrases and speaking for the user.
- Extended Context Awareness: Trained with a sequence length of 16K, supporting a native context length of 32K tokens, which aids in memory and awareness over longer conversations.
- Unique Training Methodology: Uses a single-epoch training run with a higher learning rate and low gradient accumulation to prevent overfitting and encourage diverse response generation (a hedged config sketch also follows this list).
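To make the reasoning point above concrete, here is a minimal sketch of how a client might strip prior-turn reasoning before sending history back to the model, matching the described behavior that old reasoning blocks are not kept in context. This is an illustrative helper, not ArliAI's implementation; the message format and function name are assumptions.

```python
import re

# Matches a <think>...</think> reasoning block plus trailing whitespace.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_reasoning(history: list[dict]) -> list[dict]:
    """Remove reasoning blocks from prior assistant turns so the model
    never sees old reasoning in its context; it reasons fresh each turn."""
    cleaned = []
    for msg in history:
        if msg["role"] == "assistant":
            msg = {**msg, "content": THINK_BLOCK.sub("", msg["content"]).strip()}
        cleaned.append(msg)
    return cleaned

# Example: the second user turn is sent without the first turn's reasoning.
history = [
    {"role": "user", "content": "You are Mira, a wandering cartographer."},
    {"role": "assistant",
     "content": "<think>Mira would be curious but wary.</think>Mira unrolls her map..."},
    {"role": "user", "content": "Ask her about the northern pass."},
]
prompt_messages = strip_reasoning(history)
```

And for the training-methodology bullet, the following sketch approximates the single-epoch, higher-learning-rate, low-gradient-accumulation recipe in Hugging Face `TrainingArguments` terms. The exact hyperparameter values ArliAI used are not published here; these numbers are placeholders.

```python
from transformers import TrainingArguments

# Illustrative placeholders only, approximating the recipe described above.
args = TrainingArguments(
    output_dir="rpr-finetune",
    num_train_epochs=1,             # single pass over the data to avoid overfitting
    learning_rate=1e-5,             # higher than a typical multi-epoch LR (placeholder)
    gradient_accumulation_steps=1,  # low accumulation -> noisier, more diverse updates
    per_device_train_batch_size=1,
    bf16=True,
)
```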
Good For
- Long Multi-Turn Roleplay: Excels in maintaining coherence and creativity across extended interactive roleplay sessions.
- Creative Writing Applications: Ideal for generating varied and imaginative text, stories, and character interactions.
- Applications Requiring Reasoning in Conversational Contexts: Suitable for scenarios where logical progression and consistent character behavior are crucial over many turns.
Usage Notes
- Sampler Settings: Use simple sampler settings (e.g., Temperature: 1.0, MinP: 0.02, TopK: 40) and allow a generous response-token budget (2048+) for optimal performance; a request example follows this list.
- Reasoning Block Configuration: Requires the reasoning prefix/suffix to be set (e.g., <think> and </think>) in reasoning-aware interfaces such as SillyTavern to function correctly.
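The following sketch applies the recommended sampler settings from the note above, assuming an OpenAI-compatible server (for example, a local vLLM instance); the base URL and API key are placeholders. MinP and TopK are not part of the core OpenAI API, so they are passed via `extra_body`, which vLLM-style backends accept.

```python
from openai import OpenAI

# Hypothetical local OpenAI-compatible endpoint; adjust for your setup.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

response = client.chat.completions.create(
    model="ArliAI/QwQ-32B-ArliAI-RpR-v4",
    messages=[{"role": "user", "content": "Continue the scene as Mira."}],
    temperature=1.0,      # recommended simple sampling
    max_tokens=2048,      # leave room for the reasoning block plus the reply
    extra_body={          # server-side extensions (e.g., vLLM)
        "min_p": 0.02,
        "top_k": 40,
    },
)

text = response.choices[0].message.content
# Split the reasoning block from the visible reply, using the </think>
# suffix this model emits.
reasoning, _, reply = text.partition("</think>")
print(reply.strip())
```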