CultriX/Qwen2.5-14B-Hyperionv3
CultriX/Qwen2.5-14B-Hyperionv3 is a 14.8 billion parameter language model developed by CultriX, created by merging multiple Qwen2.5-14B-based models using the della_linear method. It features a 32768 token context length and is specifically optimized for complex reasoning tasks, including mathematical problem-solving, instruction following, and multi-step reasoning, as indicated by its adaptive merge parameters. This model is designed for applications requiring enhanced logical and factual accuracy across various benchmarks.
Loading preview...
CultriX/Qwen2.5-14B-Hyperionv3 Overview
CultriX/Qwen2.5-14B-Hyperionv3 is a 14.8 billion parameter language model developed by CultriX, built upon the Qwen2.5-14B architecture. This model was created using the della_linear merge method via mergekit, integrating several specialized Qwen2.5-14B-based models. Its configuration emphasizes enhanced performance in complex reasoning and mathematical tasks, with a notable context length of 32768 tokens.
Key Capabilities
- Advanced Reasoning: Prioritizes logical reasoning, contextual understanding, and multi-step problem-solving, with specific weighting for tasks like tinyArc, tinyWinogrande, BBH, and MUSR.
- Mathematical Excellence: Features the highest priority for mathematical tasks (MATH), indicating strong capabilities in numerical and quantitative reasoning.
- Instruction Following: Enhanced for instruction-following tasks (IFEval), suggesting robust performance in adhering to complex prompts and directives.
- Factual Accuracy: Strengthened for accurate factual reasoning and question answering, with increased focus on tinyTruthfulQA and GPQA.
- Multitask Performance: Integrates contributions from various models to maintain balanced strengths across a wide range of generalist and domain-specific benchmarks.
Good for
- Applications requiring strong mathematical problem-solving.
- Use cases demanding precise instruction following and complex reasoning.
- Scenarios where high factual accuracy and contextual understanding are critical.
- Developers seeking a merged model optimized for a blend of advanced reasoning and general language tasks.