ArliAI/QwQ-32B-ArliAI-RpR-v3

  • Status: Warm
  • Visibility: Public
  • Parameters: 32B
  • Quantization: FP8
  • Context length: 32768
  • Released: Apr 27, 2025
  • License: apache-2.0
  • Source: Hugging Face
Overview

ArliAI/QwQ-32B-ArliAI-RpR-v3: Roleplay with Reasoning

QwQ-32B-ArliAI-RpR-v3 is the latest 32-billion-parameter model from ArliAI, building on the dataset curation and training methods of the RPMax series. Based on QwQ-32B, this version brings significant improvements to roleplay and creative writing, particularly in maintaining reasoning ability across long, multi-turn conversations.

Key Differentiators & Improvements (v3):

  • Enhanced Creativity & Out-of-the-Box Thinking: Tuned for extreme creativity and unconventional outputs, moving past the limitations of previously used base models.
  • Refined Reasoning: The RpR dataset generation was re-run to ensure thinking tokens consistently match model responses, addressing prior "dissociated thoughts."
  • Eliminated Refusals & Nonsense Words: Dataset generation now uses a QwQ-abliterated model to avoid refusals, and nonsense words introduced by misplaced censoring attempts in the source open datasets have been cleaned up.
  • Optimized Training: Utilizes the Rex scheduler for improved learning nuances by maintaining a higher learning rate for longer.
  • Unique RP Dataset: Processes the RPMax dataset into a reasoning dataset using the base QwQ Instruct model to create reasoning processes for each turn, ensuring coherent multi-turn RP.
  • Context-Aware Training: Reasoning blocks are excluded from the model's context during training, mirroring how the model is used at inference, for consistent performance (see the sketch after this list).
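
Because reasoning blocks are excluded from the training context, prior-turn reasoning should also be stripped from the chat history at inference time. Below is a minimal sketch, assuming the model emits its reasoning in QwQ-style `<think>...</think>` tags; the helper name is illustrative, not part of any official API.

```python
import re

# Matches a QwQ-style reasoning block plus any trailing whitespace.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_reasoning(messages):
    """Remove <think>...</think> blocks from prior assistant turns so the chat
    history contains responses only, matching what the model saw in training."""
    cleaned = []
    for msg in messages:
        if msg["role"] == "assistant":
            msg = {**msg, "content": THINK_BLOCK.sub("", msg["content"]).strip()}
        cleaned.append(msg)
    return cleaned
```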

Specs & Training:

  • Base Model: QwQ-32B
  • Parameters: 32B
  • Max Context Length: 128K (realistically ~32K)
  • Fine-tuning Method: RS-QLORA+ (Rank-Stabilized LoRA + LoRA Plus 8x)
  • Training Philosophy: Employs a single-epoch, high-learning-rate approach to maximize learning from individual examples and prevent overfitting to specific tropes, fostering higher creativity and reducing cross-context repetition (see the configuration sketch below).
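
For illustration, here is a minimal sketch of what an RS-LoRA + LoRA+ (8x) setup with 4-bit QLoRA-style loading might look like using the Hugging Face transformers and peft libraries. The rank, target modules, and learning rate are placeholders rather than ArliAI's actual training values, and the Rex scheduler is not shown.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_loraplus_optimizer

# Load the base model in 4-bit (QLoRA-style).
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/QwQ-32B",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

# Rank-stabilized LoRA adapter (illustrative rank/alpha).
lora_cfg = LoraConfig(
    r=64,
    lora_alpha=64,
    use_rslora=True,              # rank-stabilized LoRA scaling
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)

# LoRA+: give the LoRA B matrices a higher learning rate (8x, per the card).
optimizer = create_loraplus_optimizer(
    model=model,
    optimizer_cls=torch.optim.AdamW,
    lr=1e-5,                      # single-epoch, relatively high LR (illustrative)
    loraplus_lr_ratio=8,
)
```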

When to Use This Model:

  • Long-form Roleplay: Excels in multi-turn, complex narrative interactions where consistent reasoning is crucial (a minimal inference sketch follows below).
  • Creative Writing: Ideal for generating highly creative and varied outputs without falling into repetitive patterns.
  • Applications Requiring Coherent Reasoning: Suitable for scenarios where the model needs to maintain logical thought processes throughout extended dialogues.
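
As a usage illustration, here is a minimal multi-turn inference sketch with transformers. The sampling settings are placeholders rather than recommended values, and prior assistant turns should contain final responses only, with reasoning stripped as described above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ArliAI/QwQ-32B-ArliAI-RpR-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Multi-turn chat history: prior assistant turns hold responses only, no <think> blocks.
messages = [
    {"role": "system", "content": "You are the narrator of an ongoing fantasy roleplay."},
    {"role": "user", "content": "The party reaches the gates of the ruined city. What do they see?"},
]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=1024, do_sample=True,
                        temperature=0.7, top_p=0.95)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```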