Overview
Sao10K/MN-12B-Lyra-v1 is an experimental 12-billion-parameter model designed for general roleplaying. It is a merge of two distinct models: one trained on roleplay and creative writing data, the other focused on instruction following and general reasoning ability. The merged model uses the base Nemo 12B tokenizer, which avoids token conflicts.
Key Capabilities
- General Roleplaying: Optimized for interactive and creative narrative generation.
- Emotional Intelligence: Achieved a score of 77.41 on EQ-Bench, indicating strong performance in understanding and generating emotionally nuanced text.
- Flexible Prompting: Supports both [INST] and ChatML prompting formats due to its merged training data.
- Merge Method: Developed using the della_linear merge method, which was found to be optimal for this specific model combination.
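As an illustration of the two prompting formats named above, here is a minimal sketch that renders the same conversation as ChatML and as [INST]-style text. The exact templates the model expects (whitespace, BOS/EOS handling, system prompt placement) are assumptions based on the commonly used ChatML and Mistral-style layouts, and the helper names `to_chatml` and `to_inst` are hypothetical; the tokenizer's own chat template, if one is shipped, should take precedence.

```python
from typing import Dict, List

Message = Dict[str, str]  # {"role": "user" | "assistant" | "system", "content": "..."}


def to_chatml(messages: List[Message]) -> str:
    """Render messages in the common ChatML layout (assumed, not model-verified)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    # Leave an open assistant turn so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def to_inst(messages: List[Message]) -> str:
    """Render messages in a Mistral-style [INST] layout (assumed, not model-verified)."""
    out = []
    for m in messages:
        if m["role"] == "user":
            out.append(f"[INST] {m['content']} [/INST]")
        elif m["role"] == "assistant":
            out.append(f" {m['content']}</s>")
        else:
            # This sketch does not guess where a system prompt should go.
            raise ValueError(f"unsupported role for [INST] format: {m['role']}")
    return "".join(out)


chat = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello"},
    {"role": "user", "content": "Go on"},
]
chatml_prompt = to_chatml(chat)
inst_prompt = to_inst(chat)
```

Either string can then be passed to whatever inference backend is in use; which format gives better results for a given character or scenario is something to check empirically.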
Training Insights
- The base Nemo architecture, while capable, was found to be "dry" and needed multi-stage fine-tuning on creative and varied data.
- Effective context length is around 16K tokens, despite attempts to train with longer contexts. This is considered sufficient for roleplaying scenarios.
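One practical consequence of the roughly 16K-token effective context is that long roleplay histories may need trimming before each request. The sketch below keeps a leading system message (for example a character card) and as many of the most recent turns as fit a budget. The function names, the 16,000 default, and the four-characters-per-token estimate are all assumptions for illustration; a real setup would count tokens with the model's tokenizer.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}


def approx_tokens(text: str) -> int:
    """Crude token estimate (about 4 characters per token); an assumption, not a tokenizer."""
    return max(1, len(text) // 4)


def trim_history(
    messages: List[Message],
    budget: int = 16_000,
    count: Callable[[str], int] = approx_tokens,
) -> List[Message]:
    """Keep a leading system message plus the newest turns that fit within `budget`."""
    pinned = messages[:1] if messages and messages[0]["role"] == "system" else []
    rest = messages[len(pinned):]
    used = sum(count(m["content"]) for m in pinned)

    kept: List[Message] = []
    # Walk backwards from the newest turn and stop at the first one that does not fit.
    for m in reversed(rest):
        cost = count(m["content"])
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return pinned + kept[::-1]


history = [
    {"role": "system", "content": "a" * 40},
    {"role": "user", "content": "b" * 40},
    {"role": "assistant", "content": "c" * 40},
    {"role": "user", "content": "d" * 40},
]
trimmed = trim_history(history, budget=25)
untrimmed = trim_history(history)
```

Dropping the oldest turns first is the simplest policy; summarizing older turns instead is a common alternative when long-term story details matter.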
Good For
- Creative Writing: Generating engaging and imaginative text.
- Interactive Narratives: Developing chatbots or applications requiring dynamic character interactions.
- Roleplaying Scenarios: Creating detailed and responsive roleplay experiences.