ArliAI/QwQ-32B-ArliAI-RpR-v4
Hugging Face
TEXT GENERATION · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Ctx Length: 32K · Published: May 22, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

QwQ-32B-ArliAI-RpR-v4 is a 32-billion parameter language model developed by ArliAI, fine-tuned on the QwQ-32B base model. This model is part of the RpR (RolePlay with Reasoning) series, building on the RPMax dataset curation methodology. It is specifically optimized for creative writing and multi-turn roleplay chats, featuring enhanced reasoning capabilities and reduced repetition over long contexts up to 32768 tokens.

ArliAI/QwQ-32B-ArliAI-RpR-v4 Overview

QwQ-32B-ArliAI-RpR-v4 is a 32-billion parameter model from ArliAI's RpR (RolePlay with Reasoning) series, fine-tuned on the QwQ-32B base model. It leverages an advanced dataset curation methodology, originally developed for the RPMax series, to enhance creative writing and roleplay performance. A key innovation is the creation of a reasoning-focused RP dataset, processed from the RPMax dataset using the QwQ Instruct model itself, to enable coherent and interesting outputs in long, multi-turn roleplay chats while maintaining reasoning abilities.

Key Capabilities & Features

  • Optimized for Creative Writing & Roleplay: Designed to produce highly creative and varied outputs, minimizing cross-context repetition and generic tropes.
  • Enhanced Reasoning: Incorporates a unique training method that allows the model to perform reasoning without seeing reasoning blocks in its context during inference, leading to more consistent and logical responses in complex scenarios.
  • Reduced Repetition & Impersonation: Utilizes advanced filtering during training to mitigate common LLM issues like repetitive phrases and speaking for the user.
  • Extended Context Awareness: Trained with a sequence length of 16K, supporting a native context length of 32K tokens, which aids in memory and awareness over longer conversations.
  • Unique Training Methodology: Employs a single-epoch training approach with a higher learning rate and low gradient accumulation to prevent overfitting and encourage diverse response generation.
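The "reasoning without seeing reasoning blocks" feature above implies that, at inference time, `<think>` blocks from earlier turns should be stripped from the chat history before the next generation. A minimal sketch of that preprocessing step (the `strip_reasoning` helper is hypothetical, not part of any official ArliAI tooling):

```python
import re

# Matches a <think>...</think> reasoning block, including any trailing
# whitespace, across newlines.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_reasoning(messages):
    """Return a copy of the chat history with <think> blocks removed from
    assistant messages, so the model reasons fresh on each new turn instead
    of conditioning on its old reasoning."""
    cleaned = []
    for msg in messages:
        if msg["role"] == "assistant":
            msg = {**msg, "content": THINK_BLOCK.sub("", msg["content"]).strip()}
        cleaned.append(msg)
    return cleaned

history = [
    {"role": "user", "content": "Describe the tavern."},
    {"role": "assistant",
     "content": "<think>Keep it moody.</think>The tavern is dim and smoky."},
]
print(strip_reasoning(history)[1]["content"])  # The tavern is dim and smoky.
```

Frontends such as SillyTavern perform an equivalent cleanup automatically once the reasoning prefix/suffix is configured.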

Good For

  • Long Multi-Turn Roleplay: Excels in maintaining coherence and creativity across extended interactive roleplay sessions.
  • Creative Writing Applications: Ideal for generating varied and imaginative text, stories, and character interactions.
  • Applications Requiring Reasoning in Conversational Contexts: Suitable for scenarios where logical progression and consistent character behavior are crucial over many turns.

Usage Notes

  • Sampler Settings: Use simple sampler settings (e.g., Temperature: 1.0, MinP: 0.02, TopK: 40) and allow a high response-token limit (2048+) for best performance.
  • Reasoning Block Configuration: Set the reasoning prefix and suffix to <think> and </think> in interfaces such as SillyTavern so the model's reasoning blocks are parsed correctly.
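The recommended settings above can be collected into a generation-config sketch. Key names here follow common inference backends (e.g., llama.cpp-style servers); the exact parameter names vary by API, and `build_request` is a hypothetical helper for illustration:

```python
# Recommended sampler settings from the model card above.
GENERATION_CONFIG = {
    "temperature": 1.0,   # simple, neutral temperature
    "min_p": 0.02,        # light MinP filtering
    "top_k": 40,          # recommended TopK
    "max_tokens": 2048,   # allow long responses (2048+)
}

# Reasoning-block delimiters expected by this model.
REASONING_PREFIX = "<think>"
REASONING_SUFFIX = "</think>"

def build_request(prompt: str) -> dict:
    """Assemble a hypothetical request payload combining a prompt with the
    recommended sampler settings."""
    return {"prompt": prompt, **GENERATION_CONFIG}

request = build_request("Continue the scene.")
print(request["temperature"], request["top_k"])  # 1.0 40
```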

Popular Sampler Settings

The sampler parameters most commonly tuned by Featherless users for this model are:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p