ZYH-LLM-Qwen2.5-14B-V4 Overview
ZYH-LLM-Qwen2.5-14B-V4 is a 14.8-billion-parameter language model from YOYO-AI built on the Qwen2.5 architecture. It is the fourth iteration in the ZYH-LLM series and focuses on improving calculation accuracy and reasoning ability through a multi-stage model-merging strategy. The model supports a context length of 32,768 tokens.
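As a Qwen2.5-based checkpoint, the model can be loaded with the Hugging Face `transformers` library in the usual way. The sketch below is illustrative: the Hub repository id is assumed from the model name, and the generation settings are not prescribed by the model card.

```python
# Minimal usage sketch with Hugging Face transformers.
# The repository id below is assumed from the model name and may differ.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "YOYO-AI/ZYH-LLM-Qwen2.5-14B-V4"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint config
    device_map="auto",    # shard across available GPUs
)

# Qwen2.5-style chat formatting via the tokenizer's chat template.
messages = [
    {"role": "user", "content": "What is 17 * 23? Show your reasoning."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```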
Key Capabilities & Development
This model was developed through a multi-stage merging process that combines instruction-tuned and reasoning-tuned models, including those based on Qwen/Qwen2.5-14B-Instruct, arcee-ai/Virtuoso-Small-v2, arcee-ai/SuperNova-Medius, and Azure99/Blossom-V6-14B. A key change in this iteration is an increased proportion of the R1 distillation model in the merging recipe, which biases the model toward stronger reasoning. The merging template is designed to improve calculation accuracy and reasoning without compromising the general capabilities of the instruction model, and the recipe also folds in a base model with a 1-million-token context window.
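The model card does not publish the exact recipe, but the core idea behind a weighted merge can be shown with a minimal sketch: same-shaped tensors from architecture-compatible checkpoints are averaged, with a larger weight on the reasoning-oriented model. The model list, weights, and `linear_merge` helper below are illustrative assumptions, not the actual V4 recipe, which YOYO-AI built with a multi-stage process and more sophisticated merge methods than a plain linear average.

```python
# Illustrative sketch of a linear (weighted-average) model merge.
# Real merges (e.g. with mergekit) use more advanced methods, but the
# basic principle is the same: combine same-shaped tensors from
# architecture-compatible checkpoints. Note: loading several 14B models
# in fp32 is memory-heavy; this is conceptual, not production code.
import torch
from transformers import AutoModelForCausalLM

def linear_merge(model_ids, weights):
    """Weighted average of same-architecture checkpoints (illustrative only)."""
    assert abs(sum(weights) - 1.0) < 1e-6, "merge weights should sum to 1"
    models = [AutoModelForCausalLM.from_pretrained(m, torch_dtype=torch.float32)
              for m in model_ids]
    states = [m.state_dict() for m in models]
    with torch.no_grad():
        for name, tensor in states[0].items():
            tensor.mul_(weights[0])                 # scale the first model in place
            for w, state in zip(weights[1:], states[1:]):
                tensor.add_(state[name], alpha=w)   # accumulate the others
    return models[0]  # its parameters now hold the merged weights

# Hypothetical weights: a larger share for an R1-distilled reasoning model,
# echoing the V4 notes; the rest goes to instruction-tuned bases.
merged = linear_merge(
    ["Qwen/Qwen2.5-14B-Instruct",
     "arcee-ai/Virtuoso-Small-v2",
     "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"],
    [0.4, 0.2, 0.4],
)
merged.save_pretrained("zyh-style-merge-sketch")
```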
Performance & Use Cases
On the Open LLM Leaderboard, the model averages 43.14, with standout scores of 83.65 on IFEval (0-shot) and 53.93 on MATH Lvl 5 (4-shot), indicating strong instruction following and mathematical reasoning. It is well suited to applications that demand robust instruction adherence, complex calculation, and general-purpose language understanding, benefiting from its carefully balanced merge of diverse base models.
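Readers who want to reproduce scores like these can run the EleutherAI lm-evaluation-harness, the framework behind the Open LLM Leaderboard. The sketch below uses the harness's `leaderboard_*` task group; the repository id and the exact harness version and settings the leaderboard used are assumptions, so local numbers may differ slightly.

```python
# Sketch: leaderboard-style evaluation with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). Task names follow the
# harness's "leaderboard" group; the model id is an assumed Hub id.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=YOYO-AI/ZYH-LLM-Qwen2.5-14B-V4,dtype=bfloat16",
    tasks=["leaderboard_ifeval", "leaderboard_math_hard"],
    batch_size="auto",
)
print(results["results"])
```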