ReXeeD/Luminus-1.5B-Roleplay

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 15, 2026 · License: apache-2.0 · Architecture: Transformer · 0.0K · Open Weights · Cold

ReXeeD/Luminus-1.5B-Roleplay is a 1.5 billion parameter model based on Qwen2.5-1.5B, optimized for immersive roleplay, character consistency, and long-context understanding. It uses Chain-of-Thought (CoT) Distillation, Instruction-Following Difficulty (IFD) Filtering, and Direct Preference Optimization (DPO), aiming for quality comparable to larger 3B-4B models. Its context length is expanded to 128K tokens via YaRN RoPE scaling to support extended roleplaying sessions. The model is designed for creative writing and character-driven storytelling, even on modest local hardware.


Luminus-1.5B-Roleplay: Advanced Small-Parameter Roleplay Model

Luminus-1.5B-Roleplay is a 1.5 billion parameter model built on Qwen2.5-1.5B, engineered to deliver high-quality, immersive roleplay experiences. It aims to match the character consistency and long-context understanding typically found in larger 3B-4B models, making it suitable for local deployment.

Key Innovations & Capabilities

  • Chain-of-Thought (CoT) Distillation: Trained with <think> blocks, enabling the model to generate internal reasoning before producing dialogue, enhancing character depth.
  • Direct Preference Optimization (DPO): Aligned to prefer deep, sensory-rich responses over generic AI-like text, ensuring immersive storytelling.
  • Expanded Context: Features a 128,000-token context window through YaRN RoPE scaling, supporting very long roleplaying sessions without losing consistency.
  • High-Quality Data: Uses Instruction-Following Difficulty (IFD) filtering to keep only the top 70% of training data, so that training concentrates on the more challenging exchanges.
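The IFD filtering step can be illustrated with a minimal sketch. It assumes the commonly used definition of the IFD score (the mean loss on the answer given the instruction, divided by the mean loss on the answer alone) and assumes that "top 70%" means the 70% of samples with the highest score. The card does not publish its filtering code; the sample data, field names, and helper names below are hypothetical.

```python
def ifd_score(loss_with_instruction: float, loss_without_instruction: float) -> float:
    """Assumed IFD definition: conditioned answer loss / unconditioned answer loss.

    A higher score means the instruction helped the model less, i.e. the
    sample is harder to follow.
    """
    return loss_with_instruction / loss_without_instruction


def keep_top_percent(samples: list[dict], keep_percent: int = 70) -> list[dict]:
    """Return the keep_percent% of samples with the highest IFD score."""
    ranked = sorted(
        samples,
        key=lambda s: ifd_score(s["loss_cond"], s["loss_uncond"]),
        reverse=True,
    )
    # Integer arithmetic avoids floating-point surprises in the cutoff.
    keep_count = len(ranked) * keep_percent // 100
    return ranked[:keep_count]


# Hypothetical per-sample losses, as if precomputed with the base model.
cond_losses = [1.9, 0.4, 1.5, 1.0, 0.2, 1.7, 1.2, 0.6, 1.4, 1.1]
samples = [
    {"id": f"s{i}", "loss_cond": loss, "loss_uncond": 2.0}
    for i, loss in enumerate(cond_losses)
]

kept = keep_top_percent(samples, keep_percent=70)
kept_ids = [s["id"] for s in kept]
```

With ten samples, seven are kept and the three with the lowest scores are dropped. In practice the two losses would come from forward passes of a language model over each sample, which is omitted here.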

Training & Alignment

The model underwent a multi-stage training pipeline, including Supervised Fine-Tuning (SFT) on a custom roleplay dataset and DPO alignment. The training data incorporated both standard roleplay and CoT examples, generated by a large teacher model (Qwen 3.5 32B / gpt-oss-120b) acting as a Narrative Architect. This process instilled a preference for detailed, character-driven responses.
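For readers unfamiliar with the DPO stage, the per-pair objective can be sketched as follows. This is the standard DPO loss, not the author's training code; the `beta` value of 0.1 is an assumed, commonly used default, and the log-probabilities in the usage lines are made-up numbers.

```python
import math


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the card does not state it
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return _softplus(-margin)


# Made-up log-probabilities: the policy already favors the chosen reply.
loss_good = dpo_loss(-8.0, -12.0, -10.0, -10.0)
# Same numbers with the preference reversed.
loss_bad = dpo_loss(-12.0, -8.0, -10.0, -10.0)
```

The loss falls below log 2 when the policy ranks the chosen (here, the detailed and sensory-rich) response above the rejected one relative to the reference model, and rises above log 2 otherwise.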

Recommended Usage

For optimal performance, users should employ a specific system prompt that leverages the model's <think> block training. Inference settings should include a mild repetition penalty and a stopping criterion on the <|im_end|> token to prevent runaway generations and keep output concise and grounded.
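A minimal sketch of these settings, assuming the ChatML prompt layout used by Qwen2.5-based models. The system prompt text is a placeholder (the recommended prompt is not reproduced here), the repetition penalty of 1.1 is an assumed reading of "mild", and the setting names vary by inference backend.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(system_prompt: str, user_message: str) -> str:
    """Format one system + user turn in ChatML and open the assistant turn."""
    return (
        f"{IM_START}system\n{system_prompt}{IM_END}\n"
        f"{IM_START}user\n{user_message}{IM_END}\n"
        f"{IM_START}assistant\n"
    )


def truncate_at_stop(generated: str, stop: str = IM_END) -> str:
    """Cut the generated text at the first stop token, if one is present."""
    return generated.split(stop, 1)[0]


# Assumed values and key names; check what your backend actually accepts.
generation_settings = {
    "repetition_penalty": 1.1,  # "mild" is not quantified in the card
    "stop": [IM_END],
}

# Placeholder system prompt, not the one recommended by the model author.
prompt = build_chatml_prompt(
    "You are Mira, a wary innkeeper. Think inside <think> tags before replying.",
    "I push open the door and shake the rain from my cloak.",
)
reply = truncate_at_stop("<think>A stranger, late.</think>\"We're closed.\"<|im_end|>extra")
```

Truncating on the client side is a fallback for backends that do not honor a stop sequence; where the backend supports one, setting it there also saves the wasted tokens.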

Limitations

While highly capable for its size, the model may still struggle with complex multi-character plots or massive world-building tasks that larger models handle more comfortably. Users should also be aware of potential biases inherited from the base Qwen2.5-1.5B model, and should make sure their frontend handles <think> tags properly (for example, by hiding them from the displayed reply).
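One way a frontend might handle the reasoning blocks is to separate them from the visible reply before display. The sketch below assumes the blocks are closed with a matching </think> tag, which the card does not state explicitly; the function name is hypothetical.

```python
import re

# Assumes reasoning is wrapped as <think>...</think>.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>\s*", re.DOTALL)
# An opening tag with no closing tag, e.g. when generation was cut off.
_UNCLOSED_THINK = re.compile(r"<think>.*\Z", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, list[str]]:
    """Return (visible_reply, reasoning_blocks) for one raw model output."""
    reasoning = [block.strip() for block in _THINK_BLOCK.findall(raw)]
    visible = _THINK_BLOCK.sub("", raw)
    # Drop a trailing unterminated block so it never leaks into the chat.
    visible = _UNCLOSED_THINK.sub("", visible)
    return visible.strip(), reasoning


visible, reasoning = split_reasoning(
    "<think>She is wary of strangers.</think>\n\"Who goes there?\""
)
```

Here `visible` holds only the in-character line and `reasoning` holds the hidden thoughts, which a frontend could discard or show in a collapsible panel.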