djuna/MN-Chinofun-12B-2
djuna/MN-Chinofun-12B-2 is a 12-billion-parameter language model created by djuna with the Model Stock merge method, using ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2 as the base model. The merge integrates several pre-trained models and supports a 32,768-token context length, combining their diverse strengths for general language-generation tasks.
Model Overview
djuna/MN-Chinofun-12B-2 was assembled with the Model Stock merge method, which combines the weights of several fine-tuned models around a common base, here ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2. Merging in this way aims to retain the individual strengths of each constituent model in a single 12B checkpoint rather than training a new model from scratch.
Merged Components
This model integrates five distinct pre-trained models:
- grimjim/magnum-consolidatum-v1-12b
- spow12/ChatWaifu_v1.4
- GalrionSoftworks/Canidori-12B-v1
- Nohobby/MN-12B-Siskin-v0.2
- RozGrov/NemoDori-v0.2.2-12B-MN-ties
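A Model Stock merge of the components above is typically expressed as a mergekit configuration. The recipe below is an illustrative sketch following mergekit's `model_stock` schema, not the model's published config, so the actual recipe (dtype, options) may differ:

```yaml
# Hypothetical mergekit recipe for a model_stock merge of the listed
# components; the model's actual configuration may differ.
merge_method: model_stock
base_model: ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
models:
  - model: grimjim/magnum-consolidatum-v1-12b
  - model: spow12/ChatWaifu_v1.4
  - model: GalrionSoftworks/Canidori-12B-v1
  - model: Nohobby/MN-12B-Siskin-v0.2
  - model: RozGrov/NemoDori-v0.2.2-12B-MN-ties
dtype: bfloat16
```

Model Stock computes merged weights by interpolating each fine-tuned model toward the base, which is why the base model is listed separately from the merged components.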
Performance Metrics
Evaluations on the Open LLM Leaderboard report an average score of 25.37 across the leaderboard's full benchmark suite (which includes benchmarks beyond those listed below). Individually reported scores include:
- IFEval (0-Shot): 61.71
- BBH (3-Shot): 29.53
- MATH Lvl 5 (4-Shot): 11.18
- MMLU-PRO (5-shot): 29.06
These results sketch the model's profile across reasoning and knowledge tasks: relatively strong instruction following (IFEval) alongside weaker multi-step math performance (MATH Lvl 5). The model supports a context length of 32,768 tokens.
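The 32,768-token context window must hold both the prompt and any generated tokens. A minimal sketch of the budgeting arithmetic, using a hypothetical helper (token counts here would come from the model's actual tokenizer in practice):

```python
# Context length of MN-Chinofun-12B-2, per the model card above.
MAX_CONTEXT = 32768


def max_new_tokens(prompt_tokens: int, context: int = MAX_CONTEXT) -> int:
    """Return how many tokens remain for generation once the prompt
    occupies part of the context window (never negative)."""
    return max(context - prompt_tokens, 0)


# A 2,000-token prompt leaves 30,768 tokens of generation headroom.
print(max_new_tokens(2000))
```

Passing a `max_new_tokens` value larger than this budget to a generation call would overflow the context, so clamping against it is a common safeguard.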