CultriX/SeQwence-14B-EvolMergev1
CultriX/SeQwence-14B-EvolMergev1 is a 14.8-billion-parameter language model with a 32,768-token context window, created by CultriX using the DARE TIES merge method. It merges SeQwence-14Bv1, Qwen2.5-14B-Wernicke, and Qwen2.5-14B-MegaMerge-pt2, aiming to combine the strengths of its constituent models.
Model Overview
The model has 14.8 billion parameters and a 32,768-token context window. It was built by CultriX with the mergekit framework, using the DARE TIES merge method.
Merge Details
This model combines three models, aiming to leverage their individual strengths:
- Base Model: SeQwence-14Bv1
- Merged Models: Qwen2.5-14B-Wernicke and Qwen2.5-14B-MegaMerge-pt2
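A merge like this is typically defined in a mergekit YAML file. The sketch below shows the general shape of a `dare_ties` configuration; the density and weight values (and the full repository paths) are placeholders for illustration, not the actual parameters used for this model.

```yaml
# Illustrative mergekit config; density/weight values and repo paths
# are placeholders, not the actual parameters of this merge.
merge_method: dare_ties
base_model: CultriX/SeQwence-14Bv1
models:
  - model: CultriX/Qwen2.5-14B-Wernicke
    parameters:
      density: 0.5   # placeholder: fraction of task-vector entries kept
      weight: 0.5    # placeholder: contribution to the merged weights
  - model: CultriX/Qwen2.5-14B-MegaMerge-pt2
    parameters:
      density: 0.5
      weight: 0.5
dtype: bfloat16
```

With a file like this, running `mergekit-yaml config.yml ./output-dir` produces the merged checkpoint.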
The DARE TIES method combines DARE (Drop And REscale) with TIES (TrIm, Elect Sign, and Merge): each model's task vector (its delta from the base model) is randomly sparsified and rescaled, then the deltas are merged with per-parameter sign election to reduce interference. The merging process applied specific density and weight parameters to each contributing model, indicating a fine-tuned approach to combining their learned representations.
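The drop-rescale-and-elect procedure above can be sketched on toy tensors. This is a minimal illustration of the idea, not mergekit's implementation; the function names and the uniform per-model density/weight values are assumptions for the example.

```python
import numpy as np

def dare(delta, density, rng):
    """Drop And REscale: keep each entry with probability `density`,
    rescaling survivors by 1/density so the expected value is preserved."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def dare_ties_merge(base, finetuned, densities, weights, rng):
    """Toy DARE-TIES merge: sparsify each task vector with DARE, elect a
    majority sign per parameter, then average only the contributions
    whose sign agrees with the elected one."""
    deltas = [dare(m - base, d, rng) * w
              for m, d, w in zip(finetuned, densities, weights)]
    stacked = np.stack(deltas)
    elected = np.sign(stacked.sum(axis=0))       # per-parameter sign election
    agree = np.sign(stacked) == elected          # keep agreeing entries only
    kept = np.where(agree, stacked, 0.0)
    counts = np.maximum(agree.sum(axis=0), 1)    # avoid division by zero
    return base + kept.sum(axis=0) / counts

rng = np.random.default_rng(0)
base = np.zeros((4, 4))
finetuned = [base + rng.normal(size=(4, 4)) for _ in range(2)]
merged = dare_ties_merge(base, finetuned,
                         densities=[0.5, 0.5], weights=[1.0, 1.0], rng=rng)
print(merged.shape)  # → (4, 4)
```

The rescaling step is what lets DARE drop a large fraction of each task vector without shifting its expected contribution, which is why aggressive densities still merge well in practice.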
Key Characteristics
- Architecture: Merged model based on Qwen2.5-14B variants.
- Parameter Count: 14.8 billion.
- Context Length: 32,768 tokens.
- Merge Method: DARE TIES, which sparsifies each model's task vector and resolves sign conflicts between models to reduce interference.
Potential Use Cases
Given its long context window and merged heritage, this model may be suitable for applications requiring:
- Complex reasoning and understanding over long texts.
- Tasks where its constituent models individually perform well.
- General-purpose language generation and comprehension.