astom-M/matsuo-llm-advanced-phase-f4b
The astom-M/matsuo-llm-advanced-phase-f4b is a 7.6 billion parameter language model, merged using the DARE TIES method with Qwen/Qwen2.5-7B-Instruct as its base. It combines two fine-tuned checkpoints, weighting `phase_d` at 70% and `phase_e2b` at 30%, and aims to leverage the strengths of both constituents, offering a balanced performance profile for general language tasks with a 32K context length.
Model Overview
This model, astom-M/matsuo-llm-advanced-phase-f4b, is a 7.6 billion parameter language model created through a sophisticated merging process. It utilizes the DARE TIES merge method, known for its effectiveness in combining pre-trained models while preserving their capabilities.
Key Characteristics
- Base Model: Built upon the robust foundation of Qwen/Qwen2.5-7B-Instruct, ensuring strong general language understanding and generation.
- Merge Composition: A blend of two distinct fine-tuned checkpoints: `./outputs/phase_d/merged_model` (70% weight) and `./outputs/phase_e2b/merged_model` (30% weight). This weighting aims for a conservative integration, with `phase_d` being dominant.
- Merge Method: Employs the DARE TIES technique, which merges models efficiently by pruning and re-scaling delta weights.
- Configuration: The merge used parameters such as `normalize: true` and `int8_mask: true`, and was performed in `bfloat16` dtype.
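To make the pruning and re-scaling concrete, the core arithmetic of a DARE TIES merge can be sketched in NumPy on a single weight tensor. This is an illustrative simplification, not the actual mergekit implementation: the function name, `drop_rate` value, and seed are assumptions, and real merges apply this per-tensor across every parameter of the model.

```python
import numpy as np

def dare_ties_merge(base, finetuned, weights, drop_rate=0.5, seed=0):
    """Illustrative DARE TIES merge of one weight tensor.

    DARE step: randomly drop a fraction of each model's delta
    (finetuned - base) and rescale survivors by 1 / (1 - drop_rate).
    TIES step: elect a majority sign per parameter and discard
    deltas that disagree with it before summing.
    """
    rng = np.random.default_rng(seed)
    deltas = []
    for ft, w in zip(finetuned, weights):
        delta = ft - base
        keep = rng.random(delta.shape) >= drop_rate              # DARE: random drop
        delta = np.where(keep, delta, 0.0) / (1.0 - drop_rate)   # rescale survivors
        deltas.append(w * delta)                                 # apply merge weight

    stacked = np.stack(deltas)
    # TIES sign election: the sign with the larger total magnitude wins.
    elected = np.sign(stacked.sum(axis=0))
    agree = np.sign(stacked) == elected
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0)
    return base + merged_delta

# With drop_rate=0.0 every delta survives, so the result reduces to a
# plain sign-filtered weighted sum of the two checkpoints' deltas.
base = np.zeros(4)
merged = dare_ties_merge(base, [np.ones(4), 2 * np.ones(4)],
                         [0.7, 0.3], drop_rate=0.0)
```

The random drop is what lets DARE prune aggressively without losing capability, and the sign election is what keeps the two checkpoints from cancelling each other where their deltas point in opposite directions.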
Potential Use Cases
Given its merge-based architecture and foundation on Qwen2.5-7B-Instruct, this model is suitable for a variety of applications requiring a capable 7B-class LLM, including:
- General text generation and completion.
- Instruction following and conversational AI.
- Summarization and question answering tasks.