aloobun/Synch-Qwen1.5-1.8B
Text Generation | Concurrency Cost: 1 | Model Size: 1.8B | Quant: BF16 | Ctx Length: 32k | Published: Mar 22, 2024 | License: tongyi-qianwen-research | Architecture: Transformer
aloobun/Synch-Qwen1.5-1.8B is an experimental 1.8-billion-parameter language model created by aloobun and based on Qwen/Qwen1.5-1.8B. It was produced by merging the embedding weights of aloobun/Reyna-Mini-1.8B-v0.2 and aloobun/Reyna-Mini-1.8B-v0.1 with the TIES merging method. It has a 32,768-token context length and is intended for exploring how targeted weight merging affects model performance.
Overview
aloobun/Synch-Qwen1.5-1.8B is an experimental 1.8-billion-parameter language model developed by aloobun. It is built on the Qwen/Qwen1.5-1.8B base model and is distinguished by a merge that targets only the embedding weights rather than the whole network.
Key Characteristics
- TIES Merging Method: This model was created using the TIES merging technique, specifically applied only to the embedding weights.
- Merged Models: It incorporates characteristics from two prior models: aloobun/Reyna-Mini-1.8B-v0.2 and aloobun/Reyna-Mini-1.8B-v0.1.
- Targeted Weight Application: The merge configuration applied different weights and densities to the embed_tokens layer of the constituent models, indicating a focused approach to modifying the model's input representation.
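To make the TIES steps named above concrete (trim each model's delta from the base, elect a sign per parameter, then average only the entries that agree with that sign), here is a minimal NumPy sketch on toy arrays. It is not the configuration or tooling used to build this model: the function names, the toy values, the densities and weights, and the choice to normalize by the sum of agreeing weights are all illustrative assumptions.

```python
import numpy as np

def trim(delta, density):
    """Keep roughly the top `density` fraction of entries by magnitude; zero the rest."""
    k = int(round(density * delta.size))
    if k <= 0:
        return np.zeros_like(delta)
    if k >= delta.size:
        return delta.copy()
    flat = np.abs(delta).ravel()
    # Magnitude of the k-th largest entry (ties may keep a few extra entries).
    threshold = np.partition(flat, delta.size - k)[delta.size - k]
    return np.where(np.abs(delta) >= threshold, delta, 0.0)

def ties_merge(base, finetuned, densities, weights, lam=1.0):
    """Simplified TIES merge of several fine-tuned tensors onto one base tensor.

    Assumes positive weights. `lam` scales the merged delta before it is
    added back to the base.
    """
    # 1. Task vectors (fine-tuned minus base), trimmed and weighted.
    deltas = np.stack([
        w * trim(ft - base, d)
        for ft, d, w in zip(finetuned, densities, weights)
    ])
    # 2. Elect a sign per parameter from the summed deltas.
    elected = np.sign(deltas.sum(axis=0))
    # 3. Keep only nonzero entries whose sign matches the elected sign.
    agree = (np.sign(deltas) == elected) & (deltas != 0)
    w_stack = np.stack([np.full(base.shape, float(w)) for w in weights])
    denom = (agree * w_stack).sum(axis=0)
    numer = (deltas * agree).sum(axis=0)
    merged = np.where(denom > 0, numer / np.where(denom > 0, denom, 1.0), 0.0)
    return base + lam * merged

# Toy stand-ins for one embedding row from a base model and two fine-tunes.
base = np.zeros(4)
model_a = np.array([1.0, -2.0, 0.1, 0.0])
model_b = np.array([0.5, 3.0, -0.2, 0.0])
merged = ties_merge(base, [model_a, model_b], densities=[0.5, 0.5], weights=[1.0, 1.0])
# merged holds the values 0.75, 3.0, 0.0, 0.0
```

In the toy run, the second entry shows the sign election at work: the two models disagree (-2.0 versus 3.0), the positive direction wins, and only the agreeing value is kept. In a real merge the same procedure would run over the full embedding matrix.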
Potential Use Cases
- Research into Model Merging: Ideal for researchers studying the impact of specific weight merging strategies, particularly the TIES method and its application to embedding layers.
- Exploration of Qwen1.5-1.8B Variants: Developers interested in how targeted modifications can alter the behavior of the Qwen1.5-1.8B base model may find this useful.
- Small-Scale Experimentation: Its 1.8 billion parameter size makes it suitable for rapid prototyping and experimentation where larger models might be computationally prohibitive.
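As a rough illustration of why a model of this size suits small-scale experimentation, the sketch below estimates the memory needed just to hold the weights. It assumes the nominal 1.8 billion parameter count and 2 bytes per parameter for BF16; the helper name is hypothetical, and the figure excludes activations, the KV cache, and framework overhead.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

# Nominal parameter count from the model card; BF16 stores 2 bytes per value.
bf16_gib = weight_memory_gib(1.8e9, 2)
print(f"BF16 weights: about {bf16_gib:.2f} GiB")
```

Under these assumptions the weights come to a little over 3 GiB. Actual usage will be higher once context length and batch size are taken into account.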