CultriX/Qwen2.5-14B-Hyperionv3

TEXT GENERATIONConcurrency Cost:1Model Size:14.8BQuant:FP8Ctx Length:32kPublished:Jan 10, 2025Architecture:Transformer0.0K Cold

CultriX/Qwen2.5-14B-Hyperionv3 is a 14.8 billion parameter language model developed by CultriX, created by merging multiple Qwen2.5-14B-based models using the della_linear method. It features a 32768 token context length and is specifically optimized for complex reasoning tasks, including mathematical problem-solving, instruction following, and multi-step reasoning, as indicated by its adaptive merge parameters. This model is designed for applications requiring enhanced logical and factual accuracy across various benchmarks.

Loading preview...

CultriX/Qwen2.5-14B-Hyperionv3 Overview

CultriX/Qwen2.5-14B-Hyperionv3 is a 14.8 billion parameter language model developed by CultriX, built upon the Qwen2.5-14B architecture. This model was created using the della_linear merge method via mergekit, integrating several specialized Qwen2.5-14B-based models. Its configuration emphasizes enhanced performance in complex reasoning and mathematical tasks, with a notable context length of 32768 tokens.

Key Capabilities

  • Advanced Reasoning: Prioritizes logical reasoning, contextual understanding, and multi-step problem-solving, with specific weighting for tasks like tinyArc, tinyWinogrande, BBH, and MUSR.
  • Mathematical Excellence: Features the highest priority for mathematical tasks (MATH), indicating strong capabilities in numerical and quantitative reasoning.
  • Instruction Following: Enhanced for instruction-following tasks (IFEval), suggesting robust performance in adhering to complex prompts and directives.
  • Factual Accuracy: Strengthened for accurate factual reasoning and question answering, with increased focus on tinyTruthfulQA and GPQA.
  • Multitask Performance: Integrates contributions from various models to maintain balanced strengths across a wide range of generalist and domain-specific benchmarks.

Good for

  • Applications requiring strong mathematical problem-solving.
  • Use cases demanding precise instruction following and complex reasoning.
  • Scenarios where high factual accuracy and contextual understanding are critical.
  • Developers seeking a merged model optimized for a blend of advanced reasoning and general language tasks.