Jagan666/7B-merge-champion

Text generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Concurrency cost: 1 · Published: Apr 26, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

Jagan666/7B-merge-champion is a 7-billion-parameter language model created by Jagan666 as a linear merge of four Qwen2.5-7B fine-tunes. It was developed as a learning project to build an end-to-end merge and evaluation pipeline. The model performs competently across a range of benchmarks, with particular strength in mathematical reasoning, though instruction following shows some dilution compared to the individual fine-tunes.


Model Overview

Jagan666/7B-merge-champion is a 7 billion parameter model based on the Qwen2.5 architecture, created by Jagan666. It is a linear merge of four distinct Qwen2.5-7B fine-tunes: Xiaojian9992024/Qwen2.5-Dyanka-7B-Preview, Xiaojian9992024/Qwen2.5-THREADRIPPER-Small, suayptalha/Clarus-7B-v0.3, and gz987/qwen2.5-7b-cabs-v0.3. The merge was performed using mergekit, with mixing weights determined by random search over a small held-out evaluation set.
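
The card does not publish the merge configuration or search code, so the sketch below is only a conceptual reconstruction: it averages Hugging Face `state_dict` tensors directly instead of going through mergekit, and the trial budget of 20, the output directory names, the `heldout_score` helper, and the probe texts in `HELDOUT` are hypothetical stand-ins for the author's held-out evaluation set.

```python
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# The four fine-tunes named in this card.
SOURCES = [
    "Xiaojian9992024/Qwen2.5-Dyanka-7B-Preview",
    "Xiaojian9992024/Qwen2.5-THREADRIPPER-Small",
    "suayptalha/Clarus-7B-v0.3",
    "gz987/qwen2.5-7b-cabs-v0.3",
]

def linear_merge(weights, out_dir):
    """Write a model whose parameters are the weighted average of SOURCES."""
    merged = AutoModelForCausalLM.from_pretrained(SOURCES[0], torch_dtype=torch.bfloat16)
    sd = merged.state_dict()
    for k in sd:
        sd[k] = sd[k].float() * weights[0]
    # Fold in the remaining models one at a time to bound peak memory.
    for name, w in zip(SOURCES[1:], weights[1:]):
        other = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)
        for k, v in other.state_dict().items():
            sd[k] += w * v.float()
        del other
    merged.load_state_dict({k: v.to(torch.bfloat16) for k, v in sd.items()})
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(SOURCES[0]).save_pretrained(out_dir)

def random_simplex(n):
    """Sample mixing weights uniformly from the simplex (non-negative, sum to 1)."""
    cuts = sorted(random.random() for _ in range(n - 1))
    return [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]

@torch.no_grad()
def heldout_score(model_dir, texts):
    """Toy held-out objective: negative mean LM loss on a few probe texts."""
    tok = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=torch.bfloat16)
    losses = []
    for t in texts:
        ids = tok(t, return_tensors="pt").input_ids
        losses.append(model(input_ids=ids, labels=ids).loss.item())
    return -sum(losses) / len(losses)

HELDOUT = ["Solve for x: 2x + 3 = 11.", "Summarize the water cycle in one sentence."]

best_weights, best_score = None, float("-inf")
for trial in range(20):  # "random search over a small held-out set" per the card
    w = random_simplex(len(SOURCES))
    linear_merge(w, out_dir=f"merge-trial-{trial}")
    score = heldout_score(f"merge-trial-{trial}", HELDOUT)
    if score > best_score:
        best_weights, best_score = w, score
print("best mixing weights:", best_weights)
```

The actual merge was produced with mergekit's linear method, which handles sharded checkpoints and dtype bookkeeping that the sketch glosses over; only the weighted-average arithmetic is the same.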

Key Capabilities & Performance

This model was developed as a learning exercise to build an end-to-end merge and evaluation pipeline. Evaluation using lm-evaluation-harness on the Open LLM Leaderboard v2 task suite shows the following (a sketch of such a run appears after this list):

  • Strong performance in mathematical reasoning: Achieves 36.93% on MATH-Lvl-5 (hard) and 63.5% on algebra-hard, likely inheriting these strengths from source models such as Clarus and qwen2.5-7b-cabs.
  • Average performance: Scores in the typical mid-range for 7B models on benchmarks such as BBH (55.55% acc_norm), MMLU-Pro (44.92% acc), GPQA (32.30% acc_norm), and MuSR (44.58% acc_norm).
  • Limitations in instruction following: Exhibits weaker performance on IFEval (38.63% prompt_level_strict_acc), suggesting that linear merging can dilute instruction-following behaviors when source models have differing response formatting.
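
The exact harness invocation is not given in the card. Assuming a recent lm-evaluation-harness (v0.4+), where the Leaderboard v2 tasks are registered under `leaderboard_*` names, a run could look roughly like the sketch below; the task list and dtype are illustrative choices, not the author's settings.

```python
import lm_eval

# Score the merged model on a few Open LLM Leaderboard v2 tasks.
# Exact task identifiers vary by harness version; consult the task
# registry of your installed version for the names it exposes.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Jagan666/7B-merge-champion,dtype=bfloat16",
    tasks=["leaderboard_ifeval", "leaderboard_bbh", "leaderboard_math_hard"],
    batch_size="auto",
)

for task, metrics in results["results"].items():
    print(task, metrics)
```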

When to Consider This Model

This model is a reasonable choice for use cases that call for a general-purpose 7B model with particular strength in mathematical problem-solving. Developers exploring linear merging techniques, or those focused on quantitative tasks, may find this merge valuable. For applications that depend heavily on strict instruction following, however, the individual source fine-tunes may perform better.