millennium-qu/DirtyKing

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 15, 2026Architecture:Transformer0.0K Cold

millennium-qu/DirtyKing is a 4 billion parameter Qwen3-based instruction-tuned model, post-trained using Direct Preference Optimization (DPO) for specific Chinese conversational preferences. It specializes in generating "rude" or "dirty" language, maintaining strong language capabilities while enhancing human-like responses and expressive power in particular scenarios. This model is designed for applications requiring a distinct, unfiltered conversational style.

Loading preview...

Overview

DirtyKing is a 4 billion parameter model developed by millennium-qu, built upon the Qwen/Qwen3-4B-Instruct-2507 base model. It has undergone Direct Preference Optimization (DPO) post-training to align with a specific, unfiltered Chinese conversational style. The model is explicitly designed to generate "rude" or "dirty" language, while retaining the base model's strong language generation capabilities.

Key Capabilities

  • Specialized Conversational Style: Optimized for generating responses with a "rude" and "unfiltered" tone in Chinese.
  • Enhanced Human-like Interaction: Improves the naturalness and expressive power of replies in specific conversational contexts.
  • DPO Alignment: Utilizes Direct Preference Optimization with datasets like Karsh-CAI/btfChinese-DPO-small and dpo_mix_zh for fine-tuning.
  • Efficient Training: Trained using LLaMA-Factory on 4x NVIDIA RTX 4090 GPUs with BF16 precision.

Good For

  • Applications requiring a model that can generate intentionally "rude" or "blunt" Chinese dialogue.
  • Simulating specific character personas in conversational AI where an unfiltered communication style is desired.
  • Research into preference alignment for niche conversational styles.