YOYO-AI/Qwen2.5-14B-it-restore

Text Generation · Model Size: 14.8B · Quantization: FP8 · Context Length: 32k · Published: Mar 9, 2025 · License: apache-2.0 · Architecture: Transformer · Concurrency Cost: 1 · Open Weights

YOYO-AI/Qwen2.5-14B-it-restore is a 14.8 billion parameter language model based on the Qwen2.5 architecture, developed by YOYO-AI. It is specifically engineered to restore and enhance instruction-following and mathematical ability by combining the 'della', 'ties', and 'model stock' merging methods, with the aim of overcoming the degradation in these capabilities that often occurs when a single merging technique is used alone. This makes the model suitable for tasks requiring robust instruction adherence and numerical reasoning.


Model Overview

YOYO-AI/Qwen2.5-14B-it-restore is a 14.8 billion parameter model built upon the Qwen2.5 architecture, designed to address specific challenges in instruction-following and mathematical reasoning. This model distinguishes itself by employing a novel combination of three merging methods: della, ties, and model stock. The primary goal of this multi-method approach is to mitigate the decline in instruction-following and mathematical capabilities that can occur when models are merged using only one of these techniques.
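The exact merge recipe behind this model is not published here, but one of the named methods can be illustrated conceptually. The sketch below is a simplified, hypothetical illustration of TIES-style merging for a single weight tensor (trim low-magnitude task-vector entries, elect a per-parameter sign, then average only the entries that agree with it); it is not the author's actual pipeline, which also involves 'della' and 'model stock' passes, and the `density` value is an arbitrary example.

```python
import numpy as np

def ties_merge_tensor(base, finetuned_list, density=0.2):
    """Simplified TIES-style merge of one weight tensor (illustrative only).

    base:           pretrained weight tensor
    finetuned_list: fine-tuned weight tensors with the same shape as `base`
    density:        fraction of highest-magnitude task-vector entries to keep
    """
    # 1. Task vectors: how far each fine-tune moved away from the base weights.
    task_vectors = [ft - base for ft in finetuned_list]

    # 2. Trim: keep only the top-`density` fraction of entries by magnitude.
    trimmed = []
    for tv in task_vectors:
        k = max(1, int(density * tv.size))
        threshold = np.sort(np.abs(tv), axis=None)[-k]
        trimmed.append(np.where(np.abs(tv) >= threshold, tv, 0.0))

    # 3. Elect a per-parameter sign from the summed trimmed task vectors.
    stacked = np.stack(trimmed)               # shape: (num_models, *tensor_shape)
    elected_sign = np.sign(stacked.sum(axis=0))

    # 4. Disjoint merge: average only entries whose sign matches the elected one.
    agree = (np.sign(stacked) == elected_sign) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0) / counts

    # 5. Add the merged task vector back onto the base weights.
    return base + merged_delta
```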

Key Capabilities

  • Enhanced Instruction Following: Specifically engineered to improve the model's ability to accurately understand and execute complex instructions.
  • Restored Mathematical Ability: Focuses on preventing the degradation of mathematical reasoning skills, which is a common issue in merged models.
  • Advanced Merging Strategy: Utilizes a unique combination of 'della', 'ties', and 'model stock' methods to achieve a more balanced and capable instruction-tuned model.

Good For

  • Applications requiring high fidelity in following user instructions.
  • Tasks that demand reliable mathematical and logical reasoning.
  • Scenarios where maintaining strong core LLM capabilities after merging is critical.
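Usage

A minimal inference sketch with the Hugging Face transformers library is shown below. The repository id comes from the model name above; the prompt, generation settings, and use of the instruct-style chat template are illustrative assumptions rather than a documented example from the model's authors.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "YOYO-AI/Qwen2.5-14B-it-restore"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick an appropriate dtype for the available hardware
    device_map="auto",    # spread the 14.8B parameters across available devices
)

# Instruct-style Qwen2.5 models ship a chat template; apply it to a simple
# instruction-plus-arithmetic prompt to exercise the restored capabilities.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve step by step: what is 17 * 24?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```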