ArliAI/QwQ-32B-ArliAI-RpR-v3

  • Status: Warm
  • Visibility: Public
  • Parameters: 32B
  • Quantization: FP8
  • Context length: 32768
  • Released: Apr 27, 2025
  • License: apache-2.0
  • Source: Hugging Face
Overview

ArliAI/QwQ-32B-ArliAI-RpR-v3: Roleplay with Reasoning

QwQ-32B-ArliAI-RpR-v3 is the latest 32-billion-parameter model from ArliAI, building on the dataset curation and training methods of the RPMax series. Based on QwQ-32B, this version brings significant improvements to roleplay and creative writing, particularly in maintaining reasoning ability across long, multi-turn conversations.

Key Differentiators & Improvements (v3):

  • Enhanced Creativity & Out-of-the-Box Thinking: Tuned for extreme creativity and unconventional outputs, moving past the limitations of previously used base models.
  • Refined Reasoning: The RpR dataset generation was re-run to ensure thinking tokens consistently match model responses, addressing prior "dissociated thoughts."
  • Eliminated Refusals & Nonsense Words: Dataset generation now uses a QwQ-abliterated model to avoid refusals, and nonsense words introduced by misplaced censoring attempts in the source open datasets have been cleaned up.
  • Optimized Training: Utilizes the Rex scheduler for improved learning nuances by maintaining a higher learning rate for longer.
  • Unique RP Dataset: Processes the RPMax dataset into a reasoning dataset using the base QwQ Instruct model to create reasoning processes for each turn, ensuring coherent multi-turn RP.
  • Context-Aware Training: Reasoning blocks are excluded from the model's context during training, mirroring how the model is used at inference, for consistent performance (see the sketch after this list).
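
Because reasoning blocks are excluded from the training context, prior-turn reasoning should also be stripped from the chat history at inference time. Below is a minimal sketch, assuming the model emits its reasoning in QwQ-style `<think>...</think>` tags; the helper name is illustrative, not part of any official API.

```python
import re

# Matches a QwQ-style reasoning block plus any trailing whitespace.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_reasoning(messages):
    """Remove <think>...</think> blocks from prior assistant turns so the chat
    history contains responses only, matching what the model saw in training."""
    cleaned = []
    for msg in messages:
        if msg["role"] == "assistant":
            msg = {**msg, "content": THINK_BLOCK.sub("", msg["content"]).strip()}
        cleaned.append(msg)
    return cleaned
```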

Specs & Training:

  • Base Model: QwQ-32B
  • Parameters: 32B
  • Max Context Length: 128K (realistically ~32K)
  • Fine-tuning Method: RS-QLORA+ (Rank-Stabilized LoRA + LoRA Plus 8x)
  • Training Philosophy: Employs a single-epoch, high-learning-rate approach to maximize learning from individual examples and prevent overfitting to specific tropes, fostering higher creativity and reducing cross-context repetition (see the configuration sketch below).
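
For illustration, here is a minimal sketch of what an RS-LoRA + LoRA+ (8x) setup with 4-bit QLoRA-style loading might look like using the Hugging Face transformers and peft libraries. The rank, target modules, and learning rate are placeholders rather than ArliAI's actual training values, and the Rex scheduler is not shown.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_loraplus_optimizer

# Load the base model in 4-bit (QLoRA-style).
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/QwQ-32B",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

# Rank-stabilized LoRA adapter (illustrative rank/alpha).
lora_cfg = LoraConfig(
    r=64,
    lora_alpha=64,
    use_rslora=True,              # rank-stabilized LoRA scaling
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)

# LoRA+: give the LoRA B matrices a higher learning rate (8x, per the card).
optimizer = create_loraplus_optimizer(
    model=model,
    optimizer_cls=torch.optim.AdamW,
    lr=1e-5,                      # single-epoch, relatively high LR (illustrative)
    loraplus_lr_ratio=8,
)
```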

When to Use This Model:

  • Long-form Roleplay: Excels in multi-turn, complex narrative interactions where consistent reasoning is crucial (a minimal inference sketch follows below).
  • Creative Writing: Ideal for generating highly creative and varied outputs without falling into repetitive patterns.
  • Applications Requiring Coherent Reasoning: Suitable for scenarios where the model needs to maintain logical thought processes throughout extended dialogues.
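
As a usage illustration, here is a minimal multi-turn inference sketch with transformers. The sampling settings are placeholders rather than recommended values, and prior assistant turns should contain final responses only, with reasoning stripped as described above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ArliAI/QwQ-32B-ArliAI-RpR-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Multi-turn chat history: prior assistant turns hold responses only, no <think> blocks.
messages = [
    {"role": "system", "content": "You are the narrator of an ongoing fantasy roleplay."},
    {"role": "user", "content": "The party reaches the gates of the ruined city. What do they see?"},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=1024, do_sample=True,
                        temperature=0.7, top_p=0.95)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```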