djuna/MN-Chinofun-12B-2

Text Generation

  • Model Size: 12B
  • Quant: FP8
  • Context Length: 32k
  • Concurrency Cost: 1
  • Published: Oct 23, 2024
  • Architecture: Transformer

djuna/MN-Chinofun-12B-2 is a 12-billion-parameter language model created by djuna with the Model Stock merge method, using ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2 as the base. The merge integrates several pre-trained models and supports a 32768-token context length, aiming to combine its constituents' diverse strengths for general language generation tasks.


Model Overview

djuna/MN-Chinofun-12B-2 was created with the Model Stock merge method, which selects an interpolation between a base model and the average of several fine-tuned checkpoints. Here the base is ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2; merging in additional pre-trained models is intended to improve overall performance and versatility without further training.
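The core idea of Model Stock can be illustrated with a toy sketch. This is not the actual merge pipeline used for this model (which operates per-tensor over full 12B checkpoints, typically via mergekit); it is a simplified illustration on plain lists of floats, with the interpolation ratio derived from how well the fine-tuned models' deltas agree:

```python
import math

def model_stock_merge(base, finetuned, eps=1e-12):
    """Toy sketch of the Model Stock idea: (1) take each fine-tuned
    model's task vector (its offset from the base), (2) average the
    task vectors, and (3) interpolate toward the base with a ratio t
    derived from the average pairwise cosine similarity of the task
    vectors -- the more the fine-tunes agree, the closer t is to 1."""
    k = len(finetuned)
    # Task vectors: each fine-tuned model's offset from the base weights.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def cos(u, v):
        return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)) + eps)

    # Average pairwise cosine similarity between task vectors.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    avg_cos = sum(cos(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)

    # Geometric interpolation ratio: t = k*cos / (1 + (k-1)*cos).
    # Identical deltas give t = 1; orthogonal deltas give t = 0.
    t = k * avg_cos / (1 + (k - 1) * avg_cos)

    avg_delta = [sum(d[i] for d in deltas) / k for i in range(len(base))]
    return [b + t * d for b, d in zip(base, avg_delta)]
```

When the fine-tuned models pull the weights in the same direction, the merge moves fully onto their average; when they disagree (near-orthogonal deltas), it stays close to the base, which is what makes the method robust to mixing diverse checkpoints.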

Merged Components

This model integrates five distinct pre-trained models, merged onto the ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2 base via Model Stock.

Performance Metrics

Evaluations on the Open LLM Leaderboard indicate an average score of 25.37. Specific scores include:

  • IFEval (0-shot): 61.71
  • BBH (3-shot): 29.53
  • MATH Lvl 5 (4-shot): 11.18
  • MMLU-PRO (5-shot): 29.06

These results provide insight into its capabilities across various reasoning and knowledge-based tasks; note that the leaderboard average is computed over the full benchmark suite, of which the four scores above are a subset. The model supports a context length of 32768 tokens.
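The 32768-token window is a hard budget shared between the prompt and the generated output. A rough pre-check like the one below can flag oversized requests before invoking the model; the ~4-characters-per-token heuristic is an assumption for English text, and exact counts require the model's own tokenizer (e.g. `transformers.AutoTokenizer`):

```python
import math

def fits_in_context(prompt: str, max_new_tokens: int,
                    ctx_len: int = 32768,
                    chars_per_token: float = 4.0) -> bool:
    """Rough check that a prompt plus the requested generation budget
    fits in the model's context window. chars_per_token ~= 4 is a
    heuristic for English text, not an exact tokenizer count."""
    est_prompt_tokens = math.ceil(len(prompt) / chars_per_token)
    return est_prompt_tokens + max_new_tokens <= ctx_len
```

For example, a 4000-character prompt with a 512-token generation budget fits comfortably, while a 200000-character prompt does not.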