aloobun/Synch-Qwen1.5-1.8B

Text Generation

  • Concurrency Cost: 1
  • Model Size: 1.8B
  • Quant: BF16
  • Ctx Length: 32k
  • Published: Mar 22, 2024
  • License: tongyi-qianwen-research
  • Architecture: Transformer

aloobun/Synch-Qwen1.5-1.8B is an experimental 1.8-billion-parameter language model based on Qwen/Qwen1.5-1.8B, created by aloobun. It was produced by merging the embedding weights of aloobun/Reyna-Mini-1.8B-v0.2 and aloobun/Reyna-Mini-1.8B-v0.1 using the TIES merging method. It has a 32,768-token context length and is intended for exploring how targeted weight merging affects model performance.


Overview

aloobun/Synch-Qwen1.5-1.8B is an experimental 1.8-billion-parameter language model developed by aloobun. It is built on the Qwen/Qwen1.5-1.8B base model, and its distinguishing feature is a TIES merge restricted to the embedding weights.

Key Characteristics

  • TIES Merging Method: The model was created with the TIES merging technique, applied only to the embedding weights.
  • Merged Models: It incorporates characteristics from two prior models: aloobun/Reyna-Mini-1.8B-v0.2 and aloobun/Reyna-Mini-1.8B-v0.1.
  • Targeted Weight Application: The merge configuration assigns different weights and densities to the embed_tokens layer of each constituent model, so the merge targets the model's input representation rather than the whole network.
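
To make the characteristics above concrete, here is a minimal NumPy sketch of a TIES-style merge applied to a single weight matrix such as embed_tokens. It is an illustration of the general technique (trim each task vector by density, elect a sign per entry, average only the entries that agree with that sign), not the author's actual merge code. The function names `trim` and `ties_merge`, the weighted-average normalization, and all numeric values are assumptions; the page does not state the weights or densities that were actually used.

```python
import numpy as np


def trim(task_vector, density):
    """Keep roughly the top `density` fraction of entries by magnitude.

    Entries tied with the threshold magnitude are also kept.
    """
    k = int(round(density * task_vector.size))
    if k <= 0:
        return np.zeros_like(task_vector)
    if k >= task_vector.size:
        return task_vector.copy()
    flat = np.abs(task_vector).ravel()
    # Magnitude of the k-th largest entry.
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return np.where(np.abs(task_vector) >= threshold, task_vector, 0.0)


def ties_merge(base, finetuned, weights, densities):
    """Simplified TIES-style merge of one tensor (assumes positive weights)."""
    # 1. Task vectors (difference from the base), trimmed and weighted.
    deltas = np.stack([
        w * trim(ft - base, d)
        for ft, w, d in zip(finetuned, weights, densities)
    ])
    # 2. Elect a sign per entry from the summed weighted deltas.
    elected = np.sign(deltas.sum(axis=0))
    # 3. Keep only nonzero entries whose sign matches the elected sign.
    agree = (np.sign(deltas) == elected) & (deltas != 0)
    w_stack = np.stack([np.full(base.shape, w) for w in weights])
    denom = (w_stack * agree).sum(axis=0)
    numer = (deltas * agree).sum(axis=0)
    safe_denom = np.where(denom > 0, denom, 1.0)
    merged_delta = np.where(denom > 0, numer / safe_denom, 0.0)
    return base + merged_delta


# Toy stand-ins for an embedding matrix; real embed_tokens tensors are far larger.
base = np.zeros(4)
model_a = np.array([1.0, -2.0, 0.1, 0.0])
model_b = np.array([3.0, 1.0, 0.2, 0.0])
merged = ties_merge(base, [model_a, model_b],
                    weights=[0.5, 0.5], densities=[1.0, 1.0])
```

In the toy example, the second entry has conflicting signs across the two models, so only the value that matches the elected sign contributes to the result. Lowering a density below 1.0 drops that model's smallest-magnitude changes before the sign election.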

Potential Use Cases

  • Research into Model Merging: Suited to researchers studying the impact of specific weight merging strategies, particularly the TIES method applied to embedding layers.
  • Exploration of Qwen1.5-1.8B Variants: Developers interested in how targeted modifications can alter the behavior of the Qwen1.5-1.8B base model may find this useful.
  • Small-Scale Experimentation: Its 1.8 billion parameter size makes it suitable for rapid prototyping and experimentation where larger models might be computationally prohibitive.
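
As a rough guide to the small-scale claim above, the following sketch estimates the memory needed for the weights alone. It assumes the nominal 1.8 billion parameter count and the BF16 format listed on this page (2 bytes per parameter); the exact parameter count may differ slightly, and activations, the KV cache, and framework overhead are not included. The helper name `weight_memory_bytes` is hypothetical.

```python
def weight_memory_bytes(num_params, bytes_per_param=2):
    """Memory for model weights only; BF16 stores each parameter in 2 bytes."""
    return num_params * bytes_per_param


NOMINAL_PARAMS = 1_800_000_000  # nominal size from the listing, not an exact count
total = weight_memory_bytes(NOMINAL_PARAMS)
print(f"{total / 1e9:.1f} GB ({total / 2**30:.2f} GiB) for weights alone")
# prints "3.6 GB (3.35 GiB) for weights alone"
```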