YOYO-AI/ZYH-LLM-Qwen2.5-14B-V3
ZYH-LLM-Qwen2.5-14B-V3 is a 14.8-billion-parameter language model from YOYO-AI, built on the Qwen2.5 architecture with a 131,072-token context length. The third generation of the ZYH-LLM series, it is assembled through extensive model merging to produce a powerful, unified base. It is optimized for instruction following and complex reasoning, and held the highest IFEval score among 14B models as of February 25, 2025, making it well suited to tasks that demand precise instruction adherence and advanced problem-solving.
ZYH-LLM-Qwen2.5-14B-V3 Overview
ZYH-LLM-Qwen2.5-14B-V3 is the third iteration in YOYO-AI's ZYH-LLM series: a 14.8-billion-parameter model based on the Qwen2.5 architecture. It is distinguished by a multi-stage model merging recipe designed to produce a unified, powerful foundation for further fine-tuning and development, and its 131,072-token context length lets it process and reason over very long inputs.
Key Capabilities
- Superior Instruction Following: As of February 25, 2025, it holds the highest IFEval (0-Shot) score (85.78) among 14B models, indicating exceptional ability to understand and execute complex instructions.
- Robust Reasoning: Demonstrates strong performance in reasoning tasks, with a BBH (3-Shot) score of 48.18 and MATH Lvl 5 (4-Shot) score of 52.72.
- Unified Architecture: Built through a multi-stage model merging process that integrates various Qwen2.5-14B variants alongside models such as EVA-Qwen2.5-14B-v0.2, arcee-ai/Virtuoso-Small-v2, and Azure99/Blossom-V6-14B.
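The merging mentioned above can be illustrated with a minimal sketch of linear weight averaging, the simplest form of model merging. This is not the model's actual recipe (real multi-stage merges operate on full checkpoints, typically via tools like mergekit, and may use more elaborate methods); plain Python lists stand in for parameter tensors here, and `merge_state_dicts` is a hypothetical helper name.

```python
def merge_state_dicts(state_dicts, weights):
    """Return the weighted average of several parameter dictionaries.

    Each state dict maps a parameter name to a list of floats standing in
    for a weight tensor; `weights` are the per-model merge coefficients.
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "merge weights should sum to 1"
    merged = {}
    for name in state_dicts[0]:
        merged[name] = [
            sum(w * sd[name][i] for sd, w in zip(state_dicts, weights))
            for i in range(len(state_dicts[0][name]))
        ]
    return merged

# Two toy "checkpoints" with one parameter tensor each.
model_a = {"layer.weight": [1.0, 2.0, 3.0]}
model_b = {"layer.weight": [3.0, 4.0, 5.0]}
merged = merge_state_dicts([model_a, model_b], [0.5, 0.5])
print(merged["layer.weight"])  # [2.0, 3.0, 4.0]
```

Averaging only works when the source models share an architecture and parameter shapes, which is why merges like this one draw on models from the same Qwen2.5-14B family.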
Good for
- Applications requiring high-fidelity instruction adherence and complex task execution.
- Use cases demanding strong reasoning capabilities across diverse domains.
- Developers looking for a powerful and unified 14B base model for further specialized fine-tuning or merging experiments.
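For the instruction-following use cases above, prompts for Qwen2.5-based models are formatted with ChatML, and this model presumably inherits that template from its Qwen2.5 base (an assumption; in practice, let the model's tokenizer apply its own chat template, e.g. via `tokenizer.apply_chat_template` in Hugging Face Transformers). A minimal sketch of the format, with `build_chatml_prompt` as a hypothetical helper:

```python
def build_chatml_prompt(system, user):
    """Format a single-turn conversation in ChatML, ready for generation."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # the model completes from here
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Summarize the key capabilities of ZYH-LLM-Qwen2.5-14B-V3.",
)
print(prompt)
```

The trailing `<|im_start|>assistant\n` cues the model to generate the assistant turn; decoding stops at the next `<|im_end|>`.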