CultriX/Qwen2.5-14B-Hyperionv5

Text Generation · Concurrency Cost: 1 · Model Size: 14.8B · Quant: FP8 · Ctx Length: 32K · Published: Jan 19, 2025 · Architecture: Transformer

CultriX/Qwen2.5-14B-Hyperionv5 is a 14.8 billion parameter language model based on the Qwen2.5 architecture, developed by CultriX. This model is a DARE TIES merge of nine distinct Qwen2.5-14B models, specifically optimized for enhanced performance across a range of complex reasoning and instruction-following tasks. It features a 32K context length and is particularly strong in mathematical reasoning, multi-domain knowledge, and factual question answering.


CultriX/Qwen2.5-14B-Hyperionv5 Overview

CultriX/Qwen2.5-14B-Hyperionv5 is a 14.8 billion parameter merged language model built upon the Qwen2.5 architecture. Developed by CultriX, it uses the DARE TIES merge method to combine nine different Qwen2.5-14B models, with CultriX/Qwen2.5-14B-Wernickev3 as the primary backbone. The merge was configured with adaptive parameters that assign per-task weights, tuning the result toward specific benchmark categories.
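For intuition, the sketch below shows in simplified form what a DARE TIES merge does to a single parameter tensor: each fine-tune's task vector (its delta from the shared base) is randomly sparsified and rescaled (DARE), the dominant sign of each parameter is elected across models (TIES), and only the deltas that agree with that sign are combined. This is an illustrative toy, not CultriX's actual merge pipeline: the function names, the density value, and the per-model weights are assumptions, and a real merge (e.g. via mergekit) runs this over every tensor of all nine checkpoints.

```python
import torch

def dare_sparsify(delta: torch.Tensor, density: float, gen: torch.Generator) -> torch.Tensor:
    """DARE step: randomly drop (1 - density) of the task-vector entries and rescale the rest."""
    mask = (torch.rand(delta.shape, generator=gen) < density).to(delta.dtype)
    return delta * mask / density

def dare_ties_merge(base: torch.Tensor,
                    finetuned: list[torch.Tensor],
                    weights: list[float],
                    density: float = 0.6,
                    seed: int = 0) -> torch.Tensor:
    """Merge one parameter tensor from several fine-tunes back into the base (toy version)."""
    gen = torch.Generator().manual_seed(seed)
    # 1. Task vectors: how each fine-tune moved away from the shared base, sparsified by DARE
    #    and scaled by a per-model weight (weights here are illustrative, not CultriX's values).
    deltas = [w * dare_sparsify(ft - base, density, gen) for ft, w in zip(finetuned, weights)]
    stacked = torch.stack(deltas)                      # shape: (n_models, *param_shape)
    # 2. TIES sign election: pick the dominant sign per parameter from the summed deltas.
    elected_sign = torch.sign(stacked.sum(dim=0))
    # 3. Disjoint merge: average only the deltas whose sign agrees with the elected sign.
    agree = (torch.sign(stacked) == elected_sign) & (stacked != 0)
    merged_delta = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged_delta

# Example: merge a single weight matrix from three hypothetical fine-tunes.
# merged_W = dare_ties_merge(base_W, [ft1_W, ft2_W, ft3_W], weights=[1.0, 0.8, 0.6])
```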

Key Capabilities & Optimizations

This model is specifically tuned to excel in several critical areas, as indicated by its task-weighted merging configuration:

  • Mathematical Reasoning (MATH): Highest priority with a weight of 2.8, making it highly capable for complex mathematical problems.
  • Instruction Following (IFEval): Prioritized with a weight of 2.5, ensuring strong adherence to given instructions.
  • Complex Reasoning (BBH, MuSR): High emphasis on Big-Bench Hard and MuSR multistep reasoning tasks (weight 2.2).
  • Factual Question Answering (tinyTruthfulQA): Strong performance in factual accuracy and truthful responses (weight 2.2).
  • Multi-domain Knowledge (tinyMMLU, MMLU-Pro): Optimized for broad knowledge across various subjects (weights 1.8 and 2.0, respectively).

When to Use This Model

CultriX/Qwen2.5-14B-Hyperionv5 is particularly well-suited for applications requiring robust performance in:

  • Advanced mathematical problem-solving.
  • Strict instruction adherence and complex task execution.
  • General reasoning and multi-step logical deduction.
  • Factual information retrieval and accurate question answering.
  • Tasks demanding broad, multi-domain knowledge.

Its 32K context length further supports handling longer and more intricate prompts.
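As a quick start, the following is a minimal sketch of running the model locally with Hugging Face Transformers. The model ID comes from this card; the prompt, sampling settings, and dtype/device choices are illustrative assumptions rather than recommended defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Qwen2.5-14B-Hyperionv5"  # model ID from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Illustrative prompt playing to the card's stated strengths (math reasoning, instruction following).
messages = [
    {"role": "system", "content": "You are a careful, step-by-step math assistant."},
    {"role": "user", "content": "What is the sum of the first 50 odd numbers? Show your reasoning."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```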