Nitral-AI/Captain-Eris_Violet-GRPO-v0.420

Warm
Public
12B
FP8
32768
1
Feb 17, 2025
License: other
Hugging Face
Overview

Model Overview

Nitral-AI/Captain-Eris_Violet-GRPO-v0.420 is a specialized language model developed by Nitral-AI, focusing on enhanced interactive narrative capabilities. Its development involved a sophisticated training regimen combining multi-stage supervised fine-tuning, pre-trained QLoRA adapters, and multi-stage Reinforcement Learning from Human Feedback (RLHF) optimized with GRPO (Generalized Reinforcement Learning with Policy Optimization). The final model is a merge of promising candidates identified during this iterative process.

Key Capabilities

  • Optimized for Character Interaction: Specifically designed for use in applications like SillyTavern, facilitating rich and engaging character-driven narratives.
  • Advanced Fine-tuning: Leverages a unique blend of SFT, QLoRA, and GRPO-optimized RLHF for superior performance in its target domain.
  • SillyTavern Integration: Provides examples and presets for seamless integration, including reasoning block parsing and Mistral formatting, to enhance user experience.
  • Model Merging: Utilizes a slerp merge method, combining Nitral-AI/Captain-Eris_Violet-0.420-Rebased and Nitral-AI/Captain-Eris_Violet-GRPO-Rebased to achieve its distinct characteristics.

Good For

  • Interactive Storytelling: Ideal for generating dynamic and context-aware responses in roleplaying and conversational AI applications.
  • SillyTavern Users: Directly supports and enhances the experience for users of SillyTavern with provided character cards and formatting examples.
  • Developers Exploring RLHF: Offers an example of a model developed using advanced multi-stage RLHF and GRPO techniques.