CultriX/Qwen2.5-14B-ReasoningMerge

Text generation · Concurrency cost: 1 · Model size: 14.8B · Quantization: FP8 · Context length: 32k · Published: Feb 18, 2025 · License: apache-2.0 · Architecture: Transformer

CultriX/Qwen2.5-14B-ReasoningMerge is a 14.8-billion-parameter language model created by CultriX through a SLERP merge of Sakalti/Saka-14B and RDson/WomboCombo-R1-Coder-14B-Preview. The merge is configured to enhance reasoning capabilities, with per-component weighting applied to the self-attention and MLP layers of its constituent models. It is designed for tasks that require robust logical processing, and potentially for coding-related applications, and supports a 32,768-token context length.


CultriX/Qwen2.5-14B-ReasoningMerge Overview

This model, developed by CultriX, is a 14.8 billion parameter language model created through a SLERP merge using mergekit. It combines the strengths of two distinct base models: Sakalti/Saka-14B and RDson/WomboCombo-R1-Coder-14B-Preview.

Key Merge Details

The merge process weighted different components of the two base models to optimize for particular characteristics; a hypothetical reconstruction of the configuration is sketched after this list:

  • Self-attention layers: weighted toward RDson/WomboCombo-R1-Coder-14B-Preview, with an interpolation value of 0.4.
  • MLP layers: weighted toward Sakalti/Saka-14B, with an interpolation value of 0.7.
  • All other parameters: set to a balanced value of 0.5.
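
The card does not reproduce the exact mergekit configuration, but the values above map naturally onto mergekit's SLERP `t` parameter with per-tensor filters. The sketch below is a hypothetical reconstruction under that assumption: the base-model choice, layer ranges, and dtype are guesses, while the filter values, source models, tokenizer source, and chat template come from this card.

```python
# Hypothetical reconstruction of the SLERP merge config (NOT the author's
# published file). Only the filter values (0.4 / 0.7 / 0.5), the two source
# models, the tokenizer source, and the chatml template come from this card;
# base_model, layer_range, and dtype are assumptions.
import pathlib

merge_config = """\
slices:
  - sources:
      - model: RDson/WomboCombo-R1-Coder-14B-Preview
        layer_range: [0, 48]   # assumed: Qwen2.5-14B uses 48 decoder layers
      - model: Sakalti/Saka-14B
        layer_range: [0, 48]
merge_method: slerp
base_model: RDson/WomboCombo-R1-Coder-14B-Preview  # assumption: t interpolates toward Saka-14B
parameters:
  t:
    - filter: self_attn
      value: 0.4               # closer to the base -> attention leans toward WomboCombo
    - filter: mlp
      value: 0.7               # closer to 1 -> MLP leans toward Saka-14B
    - value: 0.5               # everything else blended evenly
dtype: bfloat16                # assumption
tokenizer_source: Sakalti/Saka-14B
chat_template: chatml
"""

# The merge itself would be run with mergekit's CLI, e.g.:
#   mergekit-yaml config.yaml ./Qwen2.5-14B-ReasoningMerge
pathlib.Path("config.yaml").write_text(merge_config)
```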

This configuration suggests an intentional design that blends the reasoning- and coding-oriented strengths of the constituent models. The model uses Sakalti/Saka-14B as its tokenizer source and employs the chatml chat template.
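
Because the merge pins the chatml template, prompts are expected in ChatML form. The minimal sketch below, assuming the tokenizer published in this repository, shows how `apply_chat_template` renders a conversation into that format.

```python
# A minimal sketch, assuming the tokenizer published in this repository:
# apply_chat_template renders a conversation into the ChatML format
# (<|im_start|>role ... <|im_end|>) that the merged model expects.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CultriX/Qwen2.5-14B-ReasoningMerge")

messages = [
    {"role": "system", "content": "You are a careful, step-by-step reasoner."},
    {"role": "user", "content": "A train leaves at 09:40 and the trip takes 95 minutes. When does it arrive?"},
]

# tokenize=False returns the formatted prompt string rather than token IDs.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```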

Potential Use Cases

Given its merged architecture and specific parameter weighting, this model is likely suitable for:

  • Reasoning-intensive tasks: Benefiting from the combined strengths of its base models.
  • Coding assistance: Potentially leveraging the WomboCombo-R1-Coder component.
  • Applications requiring a balanced blend of general language understanding and specialized processing (see the usage sketch below).
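
As a rough illustration of these use cases, the following sketch loads the model with Hugging Face transformers and prompts it with a small coding task; the dtype, device mapping, and sampling settings are illustrative choices, not recommendations from this card.

```python
# Hedged end-to-end sketch: load the merged model with Hugging Face
# transformers and prompt it with a small coding task. The dtype,
# device_map, and sampling settings are illustrative, not recommendations
# from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Qwen2.5-14B-ReasoningMerge"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user",
     "content": "Write a Python function that returns the n-th Fibonacci number iteratively."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```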