Model Overview
spow12/ChatWaifu_72B_v2.2 is a 72.7 billion parameter causal language model developed by spow12, built upon the Qwen2.5-72B-Instruct base model. It is a merged model, combining several specialized models like Nexusflow/Athene-V2-Chat, Nexusflow/Athene-V2-Agent, and various Qwen2.5-72B instruction-tuned models, using the model_stock merge method.
Key Capabilities
- Persona Maintenance: Specifically designed to create agent systems that can consistently maintain a given 'waifu' persona.
- Roleplay Generation: Fine-tuned on a diverse set of roleplay datasets, including private Japanese visual novel scripts and public roleplay datasets, enabling high-quality character-driven interactions.
- Multilingual Support: Supports both Japanese and English languages, with training data reflecting this linguistic diversity.
- Instruction Following: Incorporates instruction-tuned models and datasets to enhance its ability to follow user directives.
Training Details
The model underwent Supervised Fine-Tuning (SFT) using a comprehensive dataset that includes private visual novel dialogues (Riddle Joker, Café Stella, Senren*Banka) and various public instruction and roleplay datasets such as roleplay4fun/aesir-v1.1, kalomaze/Opus_Instruct_3k, Gryphe/Sonnet3.5-SlimOrcaDedupCleaned, and several Aratako synthetic datasets focusing on Japanese roleplay, translation, and coding.
Usage and Licensing
This model is intended for non-commercial and research purposes only. Users are encouraged to use it responsibly, contributing to the open-source community and research efforts, particularly for applications involving character-based AI.