Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v7
Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v7 is a 14.8-billion-parameter language model based on the Qwen2.5 architecture, created by Lunzima by merging multiple pre-trained models. Built with the SCE merge method, it combines several Qwen2.5-14B bases with Lamarckvergence-14B. It supports a 32,768-token context length and is intended as a general-purpose merged model for diverse language generation tasks.
Model Overview
Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v7 is a 14.8-billion-parameter language model developed by Lunzima, produced by merging several pre-trained Qwen2.5-14B base models with suayptalha/Lamarckvergence-14B.
Merge Details
This model was created with the SCE merge method, as described in the SCE paper. The merge was performed with mergekit and combined the following models:
- /root/LLM/NQLSG-Qwen2.5-14B-Base3
- /root/LLM/NQLSG-Qwen2.5-14B-Base1
- Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v6
- suayptalha/Lamarckvergence-14B
The base model for the merge was /root/LLM/NQLSG-Qwen2.5-14B-Base2. The configuration specified a bfloat16 dtype and int8_mask set to 1.0, with every merged model contributing across layers 0 through 48.
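Based on the details above, the mergekit configuration may have looked roughly like the sketch below. The original file was not published, so everything beyond the stated method, base model, source models, dtype, int8_mask, and layer range is an assumption:

```yaml
# Hypothetical reconstruction of the mergekit config for this merge.
# Only merge_method, base_model, the model list, dtype, and int8_mask
# are taken from this card; the rest is illustrative.
merge_method: sce
base_model: /root/LLM/NQLSG-Qwen2.5-14B-Base2
models:
  - model: /root/LLM/NQLSG-Qwen2.5-14B-Base3
  - model: /root/LLM/NQLSG-Qwen2.5-14B-Base1
  - model: Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v6
  - model: suayptalha/Lamarckvergence-14B
dtype: bfloat16
parameters:
  int8_mask: 1.0
```

A config like this would be run with `mergekit-yaml config.yaml ./output-dir` to produce the merged checkpoint.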
Key Characteristics
- Architecture: Qwen2.5-based, enhanced through a multi-model fusion.
- Parameter Count: 14.8 billion parameters.
- Context Length: 32,768-token context window.
- Merge Method: Employs the SCE method for combining model strengths.
Intended Use
This model is suitable for general language generation and understanding tasks, benefiting from the combined capabilities of its constituent models. Its large context window makes it applicable for tasks requiring extensive input processing.
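As a sketch of that intended use, the model can presumably be loaded with Hugging Face transformers like any other Qwen2.5 checkpoint. The system prompt, generation settings, and helper names below are illustrative assumptions, not documented by this card:

```python
# Hypothetical usage sketch; MODEL_ID is from this card, everything else
# (system prompt, max_new_tokens, helper names) is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v7"

def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format used by Qwen2.5 models."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_prompt},
    ]

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Render the chat messages into the model's prompt template.
    text = tokenizer.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(generate("Summarize the SCE merge method in one sentence."))
```

Note that a 14.8B model in bfloat16 needs roughly 30 GB of accelerator memory; `device_map="auto"` lets transformers spread the weights across available devices.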