Aratako/Qwen3-8B-RP-v0.1

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:May 1, 2025License:mitArchitecture:Transformer0.0K Open Weights Cold

Aratako/Qwen3-8B-RP-v0.1 is an 8 billion parameter causal language model, fine-tuned by Aratako from the Qwen/Qwen3-8B base model. Optimized specifically for role-playing scenarios, this model excels at generating character-driven dialogues and narratives based on detailed system prompts. It supports a context length of 32768 tokens, making it suitable for extended role-play interactions.

Loading preview...

Overview

Aratako/Qwen3-8B-RP-v0.1 is an 8 billion parameter language model developed by Aratako, fine-tuned from the Qwen/Qwen3-8B base model. Its primary specialization is role-playing, designed to generate character-specific responses and narratives based on detailed system prompts. The model is released under the MIT License.

Key Capabilities

  • Role-Play Generation: Excels at adopting specific character personas, dialogue styles, and situational contexts provided in the system prompt.
  • Contextual Understanding: Capable of handling detailed world-building, character backstories, and scene descriptions to maintain consistent role-play.
  • Dialogue Formatting: Can generate responses in specified formats, including character names and actions, as demonstrated in the examples.
  • Extended Interactions: Supports a substantial context length of 32768 tokens, allowing for longer and more complex role-playing sessions.

Usage

This model is intended for use in applications requiring dynamic and immersive role-play. Users can define character settings, dialogue situations, and desired response formats within the system prompt. Examples are provided for both ollama (GGUF version) and transformers library implementations, showcasing how to set up detailed role-play scenarios.

Training Details

The model was fine-tuned with specific hyperparameters, including a learning rate of 1e-5, a cosine learning rate scheduler, a global batch size of 128, and a maximum sequence length of 8192. The optimizer used was paged_adamw_8bit.