Sao10K/MN-12B-Lyra-v1
Sao10K/MN-12B-Lyra-v1 is an experimental 12-billion-parameter general roleplaying model based on the Mistral-Nemo architecture. It is a merge of two models, one focused on roleplay and creative writing and the other on instruction following and general intelligence. It scores 77.41 on EQ-Bench, an emotional intelligence benchmark, which suits it to creative text generation and interactive narrative applications.
Overview
Sao10K/MN-12B-Lyra-v1 is an experimental 12-billion-parameter model designed for general roleplaying. It is a merge of two distinct models, one trained on roleplay and creative writing data and the other focused on instruction following and general intelligence. Both use the base Nemo 12B tokenizer, so the merge introduces no token conflicts.
Key Capabilities
- General Roleplaying: Optimized for interactive and creative narrative generation.
- Emotional Intelligence: Achieved a score of 77.41 on EQ-Bench, indicating strong performance in understanding and generating emotionally nuanced text.
- Flexible Prompting: Supports both [INST] and ChatML prompting formats due to its merged training data.
- Merge Method: Developed using the della_linear merge method, which was found to be optimal for this specific model combination.
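As an illustration of the two prompting formats mentioned above, here is a minimal sketch that renders the same conversation both ways. The exact templates are assumptions based on the common Mistral-style [INST] and ChatML conventions, not something this page specifies, and the function names format_inst and format_chatml are hypothetical helpers.

```python
# Sketch only: the templates below follow the common Mistral-style [INST]
# and ChatML conventions. They are assumptions, not taken from the model card.
# The BOS token is omitted on the assumption that the tokenizer adds it.

def format_inst(messages):
    """Render (role, text) turns in a Mistral-style [INST] layout."""
    parts = []
    for role, text in messages:
        if role == "user":
            parts.append(f"[INST] {text} [/INST]")
        elif role == "assistant":
            # A completed assistant turn is closed with the end-of-sequence tag.
            parts.append(f" {text}</s>")
        else:
            raise ValueError(f"unsupported role for [INST] format: {role}")
    return "".join(parts)


def format_chatml(messages, add_generation_prompt=True):
    """Render (role, text) turns in ChatML layout."""
    parts = [f"<|im_start|>{role}\n{text}<|im_end|>\n" for role, text in messages]
    if add_generation_prompt:
        # Leave an open assistant header so the model writes the next reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


conversation = [("user", "Describe the tavern we just walked into.")]
print(format_inst(conversation))
print(format_chatml(conversation))
```

If the hosted tokenizer ships its own chat template, that template should take precedence over these hand-written strings.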
Training Insights
- The base Nemo architecture, while capable, was found to be "dry" and required multi-stage fine-tuning for creative and varied data.
- Effective context length is around 16K tokens, despite attempts to train with longer contexts. This is considered sufficient for roleplaying scenarios.
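One practical consequence of the roughly 16K effective context is that long roleplay sessions may need their history trimmed before each request. The sketch below keeps only the most recent turns that fit a token budget. The budget of 16,000, the trim_history helper, and the whitespace-based count_tokens stand-in are all assumptions for illustration; a real application would count tokens with the model's own tokenizer.

```python
# Sketch only: all names here are hypothetical, and the whitespace counter
# is a crude stand-in for the model's real tokenizer.

def count_tokens(text):
    """Approximate a token count by splitting on whitespace."""
    return len(text.split())


def trim_history(turns, budget=16_000, count=count_tokens):
    """Keep the most recent turns whose combined token count fits the budget."""
    kept = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for turn in reversed(turns):
        cost = count(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # Restore chronological order.
    return kept


history = ["first turn of the scene", "second turn", "latest reply"]
print(trim_history(history, budget=4))
```

Dropping the oldest turns is only one option. Summarizing them into a short recap is a common alternative when early plot details matter.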
Good For
- Creative Writing: Generating engaging and imaginative text.
- Interactive Narratives: Developing chatbots or applications requiring dynamic character interactions.
- Roleplaying Scenarios: Creating detailed and responsive roleplay experiences.