Sakalti/ultiima-72B: A Merged Qwen2.5 Model
Sakalti/ultiima-72B is a 72.7-billion-parameter language model built on the Qwen2.5 architecture. It was created with the TIES merge method, combining the strengths of the base model Qwen/Qwen2.5-72B with its instruction-tuned variant, Qwen/Qwen2.5-72B-Instruct.
Key Capabilities & Performance
This model demonstrates strong performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Its average score is 46.58, with notable results in specific areas:
- IFEval (0-shot): 71.40
- BBH (3-shot): 61.10
- MATH Lvl 5 (4-shot): 52.42
- MMLU-PRO (5-shot): 54.51
With a substantial context length of 131,072 tokens, ultiima-72B is well-suited to tasks requiring extensive contextual understanding and generation.
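The model can be loaded and queried with the Hugging Face transformers library like any other Qwen2.5 checkpoint. The minimal sketch below assumes the repository ships a Qwen2.5-style chat template; the prompt and generation settings are illustrative, not recommendations.

```python
# Minimal sketch: load Sakalti/ultiima-72B and run one chat turn.
# Assumes `transformers` and `accelerate` are installed and enough
# GPU memory is available for a 72B model (or add quantization).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sakalti/ultiima-72B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native dtype
    device_map="auto",    # shard across available GPUs
)

messages = [{"role": "user", "content": "Explain model merging in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Strip the prompt tokens before decoding the reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```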
Merge Details
The model was constructed using mergekit, specifically employing the TIES merging technique (TrIm, Elect Sign & Merge), which trims small parameter deltas and resolves sign disagreements between models before merging, reducing interference between them. The primary component in this merge was Qwen/Qwen2.5-72B-Instruct, with Qwen/Qwen2.5-72B serving as the base model. This approach aims to consolidate and enhance the capabilities of its constituent models.
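A merge like this is typically declared in a mergekit YAML config and executed with the mergekit-yaml command. The sketch below reconstructs a plausible config from the components named above; the weight and density values are assumptions for illustration, not the author's actual settings.

```python
# Hypothetical reconstruction of the merge, not the author's exact recipe.
# Requires `pip install mergekit` plus enough disk and RAM for 72B weights.
import subprocess
import textwrap

config = textwrap.dedent("""\
    models:
      - model: Qwen/Qwen2.5-72B-Instruct
        parameters:
          weight: 1.0   # assumed: relative contribution of this model
          density: 1.0  # assumed: fraction of parameter deltas kept after trimming
    merge_method: ties
    base_model: Qwen/Qwen2.5-72B
    dtype: bfloat16
""")

with open("ultiima-72B.yaml", "w") as f:
    f.write(config)

# mergekit's CLI entry point: reads the config and writes the merged model.
subprocess.run(["mergekit-yaml", "ultiima-72B.yaml", "./ultiima-72B"], check=True)
```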
Good For
- Applications requiring a large-scale, general-purpose language model.
- Tasks benefiting from a long context window.
- Scenarios where strong instruction following and reasoning capabilities are important.