Noromaid-7B-0.4-DPO: A Collaborative Finetune for Enhanced Roleplay
Noromaid-7B-0.4-DPO is a 7-billion-parameter model, a collaboration between IkariDev and Undi, designed to deliver more human-like, nuanced text generation. It is a full finetune, distinguished by its training methodology and data selection.
Key Capabilities & Training
- Enhanced Human Behavior: The model was trained on the no_robots dataset, specifically aimed at fostering more natural, human-like conversational patterns.
- Novel Roleplay Data: It incorporates new, never-before-used private datasets, including the "Aesir Private RP dataset" and "Another private Aesir dataset," providing fresh and diverse roleplay scenarios without relying on common, overused sources like LimaRP.
- DPO Optimization: Further refined with Direct Preference Optimization (DPO) using the Intel/orca_dpo_pairs and NobodyExistsOnTheInternet/ToxicDPOqa datasets, which helps align the model's outputs with desired preferences and reduce unwanted responses.
- ChatML Format: Uses the ChatML prompt format, making it straightforward to integrate into existing chat-based applications.
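ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` tokens, tagged with a role. A minimal sketch of assembling such a prompt by hand (the helper name and system message are illustrative, not part of the model's release):

```python
def build_chatml_prompt(system, turns):
    """Assemble a ChatML prompt string from a system message and
    a list of (role, content) conversation turns."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, content in turns:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    # Open an assistant turn so the model generates the reply from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt(
    "You are Noromaid, a creative roleplay assistant.",  # example persona
    [("user", "Describe the tavern we just entered.")],
)
print(prompt)
```

In practice, most inference frontends (and tokenizers with a chat template) apply this format automatically; the sketch only shows what the final string looks like.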
Good For
- Creative Roleplay: Excels in generating dynamic and engaging roleplay scenarios and character interactions.
- Conversational AI: Suitable for applications requiring models that can produce natural, human-like dialogue.
- Text Generation: Ideal for tasks demanding nuanced and contextually rich text outputs.