ArliAI/DS-R1-Distill-70B-ArliAI-RpR-v4-Large

Parameters: 70B
Quantization: FP8
Context length: 32768 tokens
Jun 6, 2025
License: llama3.3
Hugging Face
Overview

This model is the 70-billion parameter "Large" variant in ArliAI's RpR (RolePlay with Reasoning) v4 series, built on the deepseek-ai/DeepSeek-R1-Distill-Llama-70B base. It extends the successful RPMax dataset curation and training methodologies, focusing on high creativity and reduced repetition in roleplay and creative writing scenarios. A key innovation is the integration of reasoning capabilities into long, multi-turn chats, addressing the limitations of single-response reasoning datasets.

Key Capabilities & Differentiators

  • Enhanced Roleplay and Creative Writing: Fine-tuned on a curated, deduplicated dataset to minimize cross-context repetition and impersonation, promoting a unique and varied writing style.
  • Integrated Reasoning: Utilizes a novel approach to create a reasoning-enabled RP dataset, allowing the model to maintain coherence and logical progression in extended multi-turn conversations.
  • Optimized for Long Chats: The training sequence length was increased to 16K tokens, improving context awareness and memory over longer interactions.
  • Unique Training Methodology: Uses single-epoch training with a higher-than-typical learning rate and low gradient accumulation, a setup designed to prevent overfitting and encourage creative, non-repetitive outputs.
  • Specific Sampler Recommendations: Best results come from simple sampler settings (e.g., Temperature 1.0, MinP 0.02, TopK 40) and a high response-token limit (2048+), while avoiding repetition-penalty samplers; see the request sketch after this list.
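
As a concrete illustration, the sketch below sends a request with these sampler settings to an OpenAI-compatible endpoint. The endpoint URL, model identifier, and messages are placeholders, and parameter names such as min_p and top_k follow common backends (e.g., vLLM); other servers may expect different field names.

```python
import requests

# Hypothetical OpenAI-compatible chat endpoint; replace the URL and model name
# with whatever your inference server actually exposes.
API_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "ArliAI/DS-R1-Distill-70B-ArliAI-RpR-v4-Large",
    "messages": [
        {"role": "system", "content": "You are a creative roleplay partner."},
        {"role": "user", "content": "Continue the scene in the tavern."},
    ],
    # Sampler settings recommended above; parameter names (min_p, top_k) follow
    # common OpenAI-compatible backends such as vLLM and may differ elsewhere.
    "temperature": 1.0,
    "min_p": 0.02,
    "top_k": 40,
    "max_tokens": 2048,  # generous limit so the reasoning and the reply both fit
    # No repetition-penalty parameters, per the recommendation above.
}

response = requests.post(API_URL, json=payload, timeout=300)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```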

When to Use This Model

This model is particularly well-suited for applications requiring:

  • Complex Roleplay Scenarios: Where consistent character portrayal and logical progression over many turns are crucial.
  • Creative Story Generation: For generating varied and imaginative narratives without falling into repetitive tropes.
  • Interactive Fiction and Chatbots: That need to maintain context and reasoning in extended, dynamic conversations.

Users should configure reasoning tokens (<think> and </think>) correctly for optimal performance, as detailed in the model's usage instructions.
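
As one common way to handle this in multi-turn use, the minimal sketch below strips earlier <think>...</think> blocks from assistant turns before the history is sent back to the model, so only the visible replies remain in context. This is an illustrative helper, not the model's official template; whether your frontend should also prefill <think> at the start of each response depends on the model card's usage instructions.

```python
import re

# Matches a complete <think>...</think> reasoning block, including newlines.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)

def strip_reasoning(text: str) -> str:
    """Remove reasoning blocks so only the visible reply remains."""
    return THINK_BLOCK.sub("", text).strip()

def prepare_history(messages: list[dict]) -> list[dict]:
    """Return a copy of the chat history with reasoning stripped from assistant
    turns, keeping earlier <think> blocks out of the prompt on later turns."""
    cleaned = []
    for msg in messages:
        if msg["role"] == "assistant":
            cleaned.append({"role": "assistant", "content": strip_reasoning(msg["content"])})
        else:
            cleaned.append(dict(msg))
    return cleaned
```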