Overview
Mahou-1.5-mistral-nemo-12B Overview
Mahou-1.5-mistral-nemo-12B is a 12 billion parameter language model developed by flammenai, specifically engineered for conversational AI. Its primary focus is on generating short, engaging messages within casual conversation and character roleplay scenarios. The model was fine-tuned using the ORPO method over 3 epochs, utilizing 4x H100 GPUs.
Key Capabilities
- Casual Conversation: Designed to handle general conversational exchanges effectively.
- Character Roleplay: Excels at adopting and maintaining character personas, supporting specific formatting for actions (e.g.,
*leans against wall cooly*) and speech without quotes. - ChatML Format: Trained to use the ChatML format, ensuring compatibility with common chat interfaces.
Performance & Training
While optimized for conversational flow, the model's general reasoning capabilities, as indicated by Open LLM Leaderboard evaluations, show an average score of 26.28. Specific metrics include 67.51 for IFEval (0-Shot) and 36.26 for BBH (3-Shot). Users of SillyTavern can leverage provided presets for optimal performance, including a ChatML Instruct preset and a Sampler preset.
Good For
- Applications requiring engaging, short-form conversational responses.
- Interactive character roleplay experiences.
- Integration into platforms like SillyTavern for enhanced chat and roleplay functionalities.