seyf1elislam/KuTrix-7b

TEXT GENERATION · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Feb 14, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer

KuTrix-7b is a 7-billion-parameter language model by seyf1elislam, created by merging mistralai/Mistral-7B-v0.1 with SanjiWatsuki/Kunoichi-DPO-v2-7B and CultriX/NeuralTrix-7B-dpo using the DARE TIES method. It achieves an average score of 74.42 on the Open LLM Leaderboard, indicating strong performance across reasoning and language-understanding tasks. With a 4096-token context length, it is suited to general-purpose text generation and conversational AI applications.


KuTrix-7b: A Merged 7B Language Model

KuTrix-7b is a 7-billion-parameter language model developed by seyf1elislam, constructed by merging pre-trained models with the DARE TIES method, using mistralai/Mistral-7B-v0.1 as the base architecture. The merge incorporates SanjiWatsuki/Kunoichi-DPO-v2-7B and CultriX/NeuralTrix-7B-dpo, two DPO-tuned Mistral derivatives; a sketch of what such a merge configuration looks like follows.
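The card does not reproduce the exact merge recipe, but DARE TIES merges are typically built with mergekit (https://github.com/arcee-ai/mergekit). Below is a minimal Python sketch that writes an illustrative mergekit config for the three models named above; the density and weight values are placeholder assumptions for illustration, not the author's published settings.

```python
# Illustrative DARE TIES merge config for mergekit.
# Only the model names and merge method come from the card;
# the density/weight values are assumed placeholders.
from pathlib import Path

config = """\
models:
  - model: mistralai/Mistral-7B-v0.1
    # base model: supplies the architecture, no merge parameters
  - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
    parameters:
      density: 0.5  # fraction of delta weights kept (assumed)
      weight: 0.5   # relative contribution (assumed)
  - model: CultriX/NeuralTrix-7B-dpo
    parameters:
      density: 0.5  # assumed
      weight: 0.5   # assumed
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16
"""

Path("kutrix-merge.yml").write_text(config)
# Run the merge from a shell:
#   mergekit-yaml kutrix-merge.yml ./KuTrix-7b
```

In a DARE TIES merge, a fraction of each fine-tuned model's weight deltas is randomly dropped (controlled by density) and the survivors rescaled, then sign conflicts between models are resolved TIES-style before the deltas are added onto the base weights.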

Key Capabilities & Performance

This model demonstrates robust performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard:

  • Average Score: 74.42
  • AI2 Reasoning Challenge (25-shot): 70.48
  • HellaSwag (10-shot): 87.94
  • MMLU (5-shot): 65.28
  • TruthfulQA (0-shot): 70.85
  • Winogrande (5-shot): 81.93
  • GSM8k (5-shot): 70.05

These scores indicate proficiency in scientific reasoning (ARC), common-sense inference (HellaSwag, Winogrande), broad knowledge (MMLU), truthfulness (TruthfulQA), and grade-school math (GSM8k). The model supports a context length of 4096 tokens; a quick way to check prompt length against that limit is sketched below.
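Because the context window is 4096 tokens, long prompts should be counted and, if necessary, truncated before generation. A minimal sketch, assuming the weights are available on the Hugging Face Hub under seyf1elislam/KuTrix-7b:

```python
# Check that a prompt fits the 4096-token context window,
# leaving headroom for the completion.
from transformers import AutoTokenizer

MAX_CTX = 4096
MAX_NEW_TOKENS = 256  # headroom reserved for the generated text

tokenizer = AutoTokenizer.from_pretrained("seyf1elislam/KuTrix-7b")
prompt = "Explain the DARE TIES merge method in two sentences."

budget = MAX_CTX - MAX_NEW_TOKENS
ids = tokenizer(prompt)["input_ids"]
if len(ids) > budget:
    # Keep only the first `budget` tokens of the prompt.
    ids = tokenizer(prompt, truncation=True, max_length=budget)["input_ids"]
    prompt = tokenizer.decode(ids, skip_special_tokens=True)
```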

Good For

  • General-purpose text generation and completion tasks.
  • Applications requiring strong reasoning and language comprehension.
  • Developers looking for a capable 7B model built on a Mistral base with enhanced DPO-tuned components.
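For the use cases above, the model can be loaded with Hugging Face transformers. A minimal sketch, assuming the repo id seyf1elislam/KuTrix-7b from the card title and a GPU with roughly 16 GB of memory for fp16 weights:

```python
# Load KuTrix-7b and generate a completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "seyf1elislam/KuTrix-7b"  # repo id assumed from the card title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~14 GB for a 7B model in fp16
    device_map="auto",
)

inputs = tokenizer("The key idea behind model merging is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```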