Overview

This model, Qwen3-1.7B-Yukari-DPO, is a language model with roughly 1.7 billion parameters, derived from the Qwen/Qwen3-1.7B base. It is fine-tuned for role-playing the character Yukari Yakumo from the Touhou Project and builds on the Qwen3-1.7B-Yukari-SFT-v2 version.

Key Differentiators

Character Role-play: Optimized for emulating Yukari Yakumo, a character from the Touhou Project. The model is a fan-made derivative work.
DPO Alignment: Utilizes Direct Preference Optimization (DPO) to align the model's outputs with desired preferences, specifically to:
- Eliminate offensive content.
- Reduce overly verbose responses.
- Minimize formulaic or generic generation patterns.
Emotional Control: Supports input tags based on an 8-dimensional Plutchik emotion vector (joy, anger, sadness, fear, disgust, surprise, trust, anticipation), allowing users to control the emotional tone of Yukari's responses.
Training Method: Fine-tuned with QLoRA (4-bit NF4 quantization) and DPO with sigmoid loss over 3 epochs, reaching a reported DPO accuracy of 87.5%.
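
For readers unfamiliar with the sigmoid variant of DPO, the following is a minimal sketch of the standard per-pair loss, not the training code used for this model. The function name, the argument names, and the value beta=0.1 are assumptions for illustration; the source does not state the beta used. The second return value shows one common way a "DPO accuracy" is computed (the fraction of pairs where the chosen response gets the higher implicit reward), which may or may not match the metric reported above.

```python
import math


def dpo_sigmoid_loss(policy_chosen_logp, policy_rejected_logp,
                     ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO sigmoid loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy and the frozen reference model.
    beta=0.1 is an assumed value, not taken from the model card.
    """
    # Implicit rewards: scaled log-ratio of policy to reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        loss = math.log1p(math.exp(-margin))
    else:
        loss = -margin + math.log1p(math.exp(margin))

    # The pair counts as "correct" when the chosen reward is higher.
    return loss, margin > 0


if __name__ == "__main__":
    loss, correct = dpo_sigmoid_loss(-1.0, -3.0, -2.0, -2.0)
    print(f"loss={loss:.4f} correct={correct}")
```

Averaging the boolean over an evaluation set of preference pairs gives an accuracy figure of the kind quoted above.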

Usage

The model expects a specific input format: [emotion tags]\nUser input. Emotion tags are structured as [<|emotion_level_emotion_name|>...], where emotion_level ranges from "none" to "extremely strong" for each of the eight emotions. Inference is supported through both the transformers library and llama.cpp (GGUF format).
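
The input format described above can be assembled with a small helper. This is a sketch under stated assumptions: build_prompt is a hypothetical name, the level string is inserted verbatim and joined to the emotion name with an underscore, and the eight tags are emitted in the order the emotions are listed in this card. The exact token spelling (for example, how "extremely strong" is written inside a tag) and the intermediate level names are not given here, so check the full model card before relying on this.

```python
# The eight Plutchik emotions, in the order listed in this card.
EMOTIONS = ("joy", "anger", "sadness", "fear",
            "disgust", "surprise", "trust", "anticipation")


def build_prompt(user_input, levels=None, default_level="none"):
    """Build '[<|level_emotion|>...]\\nUser input' (hypothetical helper).

    levels maps an emotion name to a level string; emotions that are
    not mentioned fall back to default_level. The tag spelling is an
    assumption based on the pattern <|emotion_level_emotion_name|>.
    """
    levels = levels or {}
    unknown = set(levels) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotions: {sorted(unknown)}")

    tags = "".join(
        f"<|{levels.get(name, default_level)}_{name}|>" for name in EMOTIONS
    )
    return f"[{tags}]\n{user_input}"


if __name__ == "__main__":
    print(build_prompt("Good evening, Yukari.", {"joy": "extremely strong"}))
```

The resulting string would then be passed as the user message to whichever runtime is used, transformers or llama.cpp.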

Licensing

The model weights are released under the Apache 2.0 license.
The character "Yukari Yakumo" is copyrighted by Team Shanghai Alice / ZUN, and this model follows the Touhou Project's guidelines for fan-made derivative works.
