KaidenRp2400_12b_v1: A Merged 12B Language Model
KaidenRp2400_12b_v1 is a 12 billion parameter language model developed by kainatq, built on mistralai/Mistral-Nemo-Base-2407 as its foundational architecture. The model is notable for its multi-stage merging process, which uses the DARE TIES method to combine the strengths of several specialized models.
Key Capabilities and Merging Strategy
The model's development involved two primary intermediate merges, which were then combined with an additional model to form the final version. This approach aims to integrate diverse functionalities and improve overall performance, particularly in areas where the constituent models excel.
- Base Architecture: Built on mistralai/Mistral-Nemo-Base-2407, providing a robust foundation.
- Multi-Stage Merging: The model is the result of merging kainatq/KaidenRp2400_12b_v1_m1 and kainatq/KaidenRp2400_12b_v1_m2, which are themselves merges of other models.
- Constituent Models: The first merge (m1) incorporated Gryphe/Pantheon-RP-1.5-12b-Nemo, kainatq/RP-king-12b-II, and elinas/Chronos-Gold-12B-1.0. The second merge (m2) included mergekit-community/MN-Sappho-g2-12B, nbeerbower/Nemoties-ChatML-12B, and pbevan11/Mistral-Nemo-Baseline-SFT.
- Final Composition: The final model combines kainatq/KaidenRp2400_12b_v1_m1, kainatq/KaidenRp2400_12b_v1_m2, and mergekit-community/MN-Sappho-j-12B.
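DARE TIES merges of this kind are typically produced with mergekit. As an illustration only, a configuration for the final-stage merge might look like the sketch below; the density and weight values are assumptions for demonstration, not the author's actual settings, which are not published in this card.

```yaml
# Hypothetical mergekit config for the final-stage merge.
# density/weight values are illustrative assumptions.
merge_method: dare_ties
base_model: mistralai/Mistral-Nemo-Base-2407
dtype: bfloat16
models:
  - model: kainatq/KaidenRp2400_12b_v1_m1
    parameters:
      density: 0.5   # fraction of delta weights retained (assumed)
      weight: 0.4    # contribution to the merged model (assumed)
  - model: kainatq/KaidenRp2400_12b_v1_m2
    parameters:
      density: 0.5
      weight: 0.4
  - model: mergekit-community/MN-Sappho-j-12B
    parameters:
      density: 0.5
      weight: 0.2
```

In DARE TIES, `density` controls how sparsely each model's delta from the base is sampled before sign-consensus merging, and `weight` scales its contribution.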
Usage Guidelines
Users should interact with the model using the ChatML prompt format, following the usage instructions for the mistralai/Mistral-Nemo-Base-2407 foundation. The merging strategy suggests an optimization for nuanced conversational and role-playing scenarios, drawing on the specialized capabilities of its merged components.
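As a minimal sketch of the ChatML convention mentioned above: the helper below assembles a prompt by hand so the format is visible. The function name is hypothetical; in practice, `tokenizer.apply_chat_template` from the `transformers` library handles this formatting for models whose tokenizer ships a chat template.

```python
# Illustrative ChatML prompt assembly; this follows the general ChatML
# convention, not code published by the model author.
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts into a ChatML string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave the assistant turn open so the model generates from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful roleplay assistant."},
    {"role": "user", "content": "Describe the tavern we just entered."},
])
print(prompt)
```

The resulting string can be tokenized and passed to the model for generation as with any causal language model.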