amshunath/qwen-medical-dare-optimal
Model Overview
amshunath/qwen-medical-dare-optimal is a 1.5 billion parameter language model built upon the Qwen2.5-1.5B-Instruct base architecture. It was created by amshunath using the Linear DARE (Drop And REscale) merge method, which combines multiple pre-trained models in weight space to potentially improve capabilities without additional training.
Key Characteristics
- Base Model: Built on Qwen2.5-1.5B-Instruct as its foundation.
- Merge Method: Employs the Linear DARE technique, which randomly drops a fraction of the fine-tuned model's parameter deltas and rescales the remainder before adding them back to the base weights.
- Parameter Count: Features 1.5 billion parameters, offering a balance between performance and computational efficiency.
- Context Length: Supports a context window of 32768 tokens, suitable for processing longer inputs.
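The Linear DARE merge named above can be illustrated with a small numerical sketch. This is not the author's actual merge code (the model was most likely produced with a merge toolkit); it is a minimal numpy illustration of the drop-and-rescale step applied to a single weight tensor, with all names chosen here for clarity.

```python
import numpy as np

def dare_linear_merge(base, finetuned, density=0.7, weight=1.0, seed=0):
    """Illustrative Linear DARE merge for one weight tensor.

    DARE (Drop And REscale): randomly drop a fraction (1 - density) of the
    fine-tuned model's delta parameters, rescale the survivors by 1/density
    so the expected delta is preserved, then add the weighted delta back
    onto the base weights.
    """
    rng = np.random.default_rng(seed)
    delta = finetuned - base                       # task vector (fine-tune minus base)
    mask = rng.random(delta.shape) < density       # keep roughly `density` of the entries
    rescaled = np.where(mask, delta / density, 0)  # rescale kept entries, zero the rest
    return base + weight * rescaled

# Toy tensors: the merged result is either the base value (dropped entry)
# or base + delta/density (kept entry).
base = np.zeros((2, 3))
finetuned = np.ones((2, 3))
merged = dare_linear_merge(base, finetuned, density=0.7, weight=1.0)
```

With `density=1.0` no entries are dropped and the merge reduces to a plain weighted sum of the delta, which is why DARE is often described as a sparsified generalization of linear merging.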
Merge Details
The model was constructed by merging Qwen/Qwen2.5-1.5B-Instruct with an additional local model, ./model_sft_merged_local. The merge configuration applied a density of 0.7 (retaining roughly 70% of the local model's parameter deltas) and a weight of 1.0 during the DARE merge process. This approach aims to integrate the local model's fine-tuned behavior into the Qwen2.5 base.
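The merge parameters above correspond to the kind of configuration accepted by the mergekit toolkit's `dare_linear` method. The author's actual configuration file is not published, so the following is a hypothetical reconstruction from the stated values; the `dtype` line is an assumption.

```yaml
# Hypothetical mergekit config reconstructed from the stated merge details.
models:
  - model: ./model_sft_merged_local
    parameters:
      density: 0.7   # retain ~70% of the delta parameters
      weight: 1.0    # full contribution of the rescaled delta
merge_method: dare_linear
base_model: Qwen/Qwen2.5-1.5B-Instruct
dtype: bfloat16      # assumed; not stated in the model card
```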