ArliAI/DS-R1-Distill-70B-ArliAI-RpR-v4-Large

Hugging Face
TEXT GENERATIONConcurrency Cost:4Model Size:70BQuant:FP8Ctx Length:32kPublished:Jun 6, 2025License:llama3.3Architecture:Transformer0.0K Warm

DS-R1-Distill-70B-ArliAI-RpR-v4-Large is a 70-billion parameter language model developed by ArliAI, built upon the deepseek-ai/DeepSeek-R1-Distill-Llama-70B base model with a 32K context length. This model is fine-tuned using the RpR (RolePlay with Reasoning) v4 dataset, specifically designed to enhance creative writing and roleplay capabilities while integrating reasoning abilities for coherent, multi-turn conversations. It focuses on reducing cross-context repetition and impersonation, offering a unique, non-repetitive writing style for complex narrative interactions.

Loading preview...

ArliAI DS-R1-Distill-70B-ArliAI-RpR-v4-Large Overview

This model is the 70-billion parameter "Large" variant in ArliAI's RpR (RolePlay with Reasoning) v4 series, built on the deepseek-ai/DeepSeek-R1-Distill-Llama-70B base. It extends the successful RPMax dataset curation and training methodologies, focusing on high creativity and reduced repetition in roleplay and creative writing scenarios. A key innovation is the integration of reasoning capabilities into long, multi-turn chats, addressing the limitations of single-response reasoning datasets.

Key Capabilities & Differentiators

  • Enhanced Roleplay and Creative Writing: Fine-tuned on a curated, deduplicated dataset to minimize cross-context repetition and impersonation, promoting a unique and varied writing style.
  • Integrated Reasoning: Utilizes a novel approach to create a reasoning-enabled RP dataset, allowing the model to maintain coherence and logical progression in extended multi-turn conversations.
  • Optimized for Long Chats: Increased training sequence length to 16K tokens improves awareness and memory over longer interactions.
  • Unique Training Methodology: Employs a single-epoch training with a higher learning rate and low gradient accumulation, designed to prevent overfitting and encourage creative, non-repetitive outputs.
  • Specific Sampler Recommendations: Best performance is achieved with simple sampler settings (e.g., Temperature 1.0, MinP 0.02, TopK 40) and high response tokens (2048+), avoiding repetition penalty samplers.

When to Use This Model

This model is particularly well-suited for applications requiring:

  • Complex Roleplay Scenarios: Where consistent character portrayal and logical progression over many turns are crucial.
  • Creative Story Generation: For generating varied and imaginative narratives without falling into repetitive tropes.
  • Interactive Fiction and Chatbots: That need to maintain context and reasoning in extended, dynamic conversations.

Users should configure reasoning tokens (<think> and </think>) correctly for optimal performance, as detailed in the model's usage instructions.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p