CultriX/Qwen2.5-14B-ReasoningMerge
CultriX/Qwen2.5-14B-ReasoningMerge is a 14.8-billion-parameter language model created by CultriX through a SLERP merge of Sakalti/Saka-14B and RDson/WomboCombo-R1-Coder-14B-Preview. The merge is configured to enhance reasoning capabilities by weighting the self-attention and MLP layers of its constituent models differently. It is designed for tasks requiring robust logical processing, and potentially coding-related applications, and supports a 32,768-token context length.
Overview
This model, developed by CultriX, is a 14.8-billion-parameter language model produced through a SLERP merge using mergekit. It combines the strengths of two distinct base models: Sakalti/Saka-14B and RDson/WomboCombo-R1-Coder-14B-Preview.
Key Merge Details
The merge process specifically weighted different components of the base models to optimize for particular characteristics (a configuration sketch follows the list):
- Self-attention layers: Favored RDson/WomboCombo-R1-Coder-14B-Preview with a value of 0.4.
- MLP layers: Favored Sakalti/Saka-14B with a value of 0.7.
- All other parameters: Set to a balanced value of 0.5.
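For reference, the weighting above corresponds to a mergekit SLERP configuration along the following lines. This is a reconstruction, not the published config: the layer range (48 layers for Qwen2.5-14B), the choice of base model, and the slice layout are assumptions; only the filter values and the tokenizer source come from the details stated here.

```yaml
# Hypothetical mergekit SLERP config matching the described weighting.
# layer_range and base_model are assumptions, not the published file.
slices:
  - sources:
      - model: RDson/WomboCombo-R1-Coder-14B-Preview
        layer_range: [0, 48]
      - model: Sakalti/Saka-14B
        layer_range: [0, 48]
merge_method: slerp
base_model: RDson/WomboCombo-R1-Coder-14B-Preview
parameters:
  t:
    # t=0 keeps the base model, t=1 keeps the other model.
    - filter: self_attn
      value: 0.4   # lower t favors the base (WomboCombo) for attention
    - filter: mlp
      value: 0.7   # higher t favors Saka-14B for MLP layers
    - value: 0.5   # balanced default for everything else
dtype: bfloat16
tokenizer_source: Sakalti/Saka-14B
```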
This configuration suggests an intentional design that blends the reasoning and potentially coding-oriented capabilities of its constituent models. The model uses Sakalti/Saka-14B as its tokenizer source and employs the ChatML chat template.
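Because the tokenizer ships with a ChatML template, the model can be queried through the standard transformers chat-template workflow. The snippet below is an illustrative sketch rather than an official quickstart; the prompt and generation settings are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CultriX/Qwen2.5-14B-ReasoningMerge"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # 14.8B parameters: expect roughly 30 GB in bf16
    device_map="auto",
)

# apply_chat_template uses the bundled ChatML template, producing
# <|im_start|>/<|im_end|>-delimited turns.
messages = [
    {"role": "user", "content": "Walk through this problem step by step: ..."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```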
Potential Use Cases
Given its merged architecture and specific parameter weighting, this model is likely suitable for:
- Reasoning-intensive tasks: Benefiting from the combined strengths of its base models.
- Coding assistance: Potentially leveraging the WomboCombo-R1-Coder component.
- Applications requiring a balanced blend of general language understanding and specialized processing.