bamec66557/VICIOUS_MESH-12B-GAMMA

Text generation · Concurrency cost: 1 · Model size: 12B · Quant: FP8 · Context length: 32k · Published: Dec 20, 2024 · License: apache-2.0 · Architecture: Transformer

bamec66557/VICIOUS_MESH-12B-GAMMA is a 12-billion-parameter language model created by bamec66557 using the TIES merge method. It is based on Khetterman/DarkAtom-12B-v3 and incorporates Infermatic/MN-12B-Inferor-v0.1 and bamec66557/VICIOUS_MESH-12B-BETA. The merged model is designed for general language tasks and achieves an average score of 26.77 on the Open LLM Leaderboard evaluations, which include IFEval and MMLU-PRO.


Model Overview

bamec66557/VICIOUS_MESH-12B-GAMMA is a 12-billion-parameter language model developed by bamec66557. It was constructed using the TIES merge method, which combines several pre-trained models to leverage their respective strengths. The base model for the merge was Khetterman/DarkAtom-12B-v3, with contributions from Infermatic/MN-12B-Inferor-v0.1 and bamec66557/VICIOUS_MESH-12B-BETA.
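At a high level, TIES merging trims each model's low-magnitude parameter deltas, elects a per-parameter sign, and averages only the deltas that agree with that sign. A minimal illustrative sketch over flat parameter lists (the function name and simplifications are mine; real merges operate on full model tensors via mergekit):

```python
def ties_merge(base, task_weights, density=0.5, lam=1.0):
    """Merge fine-tuned weight vectors into `base` with the TIES recipe:
    trim low-magnitude deltas, elect a per-parameter sign, average agreers."""
    # 1. Task vectors: each model's delta from the base weights.
    deltas = [[w - b for w, b in zip(tw, base)] for tw in task_weights]

    # 2. Trim: keep only the top `density` fraction of each delta by magnitude.
    for d in deltas:
        k = int(len(d) * density)
        thresh = sorted((abs(x) for x in d), reverse=True)[k - 1] if k else float("inf")
        for i, x in enumerate(d):
            if abs(x) < thresh:
                d[i] = 0.0

    # 3. Elect sign per parameter, then average only the agreeing deltas.
    merged = []
    for i, b in enumerate(base):
        vals = [d[i] for d in deltas]
        sign = 1.0 if sum(vals) >= 0 else -1.0
        agree = [v for v in vals if v * sign > 0]
        merged.append(b + lam * (sum(agree) / len(agree) if agree else 0.0))
    return merged


# Toy example with two "models" diverging from a zero base:
print(ties_merge([0.0, 0.0, 0.0, 0.0],
                 [[1.0, -1.0, 0.1, 0.2],
                  [1.0, 1.0, 0.1, -2.0]],
                 density=0.5))
# → [1.0, 1.0, 0.0, -2.0]
```

Note how the third parameter ends at 0.0 (both small deltas were trimmed) and the second keeps only the positive delta after sign election, which is exactly how TIES suppresses interference between contributing models.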

Performance Highlights

Evaluated on the Open LLM Leaderboard, VICIOUS_MESH-12B-GAMMA achieved an average score of 26.77. Key metric scores include:

  • IFEval (0-shot): 63.62
  • BBH (3-shot): 31.49
  • MMLU-PRO (5-shot): 29.62

Detailed evaluation results are available on the Open LLM Leaderboard dataset page.

Merge Configuration

The model was created with mergekit using a YAML configuration that applies a density and weight of 0.5 to each contributing model during the TIES merge. The final model is stored in the bfloat16 data type.
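A mergekit configuration of this shape would express those settings (reconstructed from the parameters stated above, not the author's exact file):

```yaml
# Hypothetical mergekit config matching the described TIES merge
models:
  - model: Infermatic/MN-12B-Inferor-v0.1
    parameters:
      density: 0.5
      weight: 0.5
  - model: bamec66557/VICIOUS_MESH-12B-BETA
    parameters:
      density: 0.5
      weight: 0.5
merge_method: ties
base_model: Khetterman/DarkAtom-12B-v3
dtype: bfloat16
```

Here `density` controls what fraction of each task vector survives trimming, and `weight` scales each model's contribution before the sign election and averaging steps.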