amshunath/qwen-medical-dare-optimal

Text Generation · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 5, 2026 · Architecture: Transformer

amshunath/qwen-medical-dare-optimal is a 1.5 billion parameter language model based on the Qwen2.5-1.5B-Instruct architecture, created by amshunath. The model was produced with the Linear DARE merge method, combining the base Qwen2.5-1.5B-Instruct with a locally fine-tuned model. It is intended for general language tasks, with the merge meant to fold the local model's fine-tuning into the Qwen2.5 base without additional training.


Model Overview

amshunath/qwen-medical-dare-optimal is a 1.5 billion parameter language model built upon the Qwen2.5-1.5B-Instruct base architecture. It was created by amshunath using the Linear DARE merge method, which combines multiple pre-trained models to potentially improve capabilities without extensive retraining.

Key Characteristics

  • Base Model: Utilizes the robust Qwen2.5-1.5B-Instruct as its foundation.
  • Merge Method: Employs the Linear DARE technique (Drop And REscale), which randomly drops a fraction of each fine-tuned model's delta parameters, rescales the survivors, and linearly combines the result with the base model.
  • Parameter Count: Features 1.5 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a context window of 32768 tokens, suitable for processing longer inputs.
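The DARE step mentioned above can be illustrated with a small numeric sketch. This is not the model's actual merge code, just a minimal NumPy illustration of the drop-and-rescale idea, using the density (0.7) and weight (1.0) values reported for this merge:

```python
import numpy as np

def dare(delta, density, rng):
    """Drop And REscale: keep each delta parameter with probability
    `density`, then rescale survivors by 1/density so the expected
    contribution of the delta is unchanged."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

rng = np.random.default_rng(0)
base = np.array([0.5, -1.0, 2.0, 0.0])       # toy base-model weights
finetuned = np.array([0.6, -1.2, 2.1, 0.3])  # toy fine-tuned weights
delta = finetuned - base                     # task-specific delta

sparse_delta = dare(delta, density=0.7, rng=rng)
merged = base + 1.0 * sparse_delta           # weight 1.0, as in this merge
```

Each surviving delta entry is scaled by 1/0.7, so in expectation the merged weights match a plain linear merge while most redundant delta parameters are zeroed out.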

Merge Details

The model was constructed by merging Qwen/Qwen2.5-1.5B-Instruct with an additional local model, ./model_sft_merged_local. The merge configuration applied a density of 0.7 (roughly 70% of the local model's delta parameters are retained after random dropping, with survivors rescaled) and a weight of 1.0 to the local model during the DARE merge process. This approach aims to integrate the local model's fine-tuned behavior into the Qwen2.5 base.
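A merge like the one described is typically expressed as a mergekit configuration. The following is a hypothetical reconstruction from the details above (method, base model, density, weight), not the author's actual config file:

```yaml
# Hypothetical mergekit config matching the described DARE merge
merge_method: dare_linear
base_model: Qwen/Qwen2.5-1.5B-Instruct
models:
  - model: ./model_sft_merged_local
    parameters:
      density: 0.7   # fraction of delta parameters kept
      weight: 1.0    # linear weight of this model's delta
dtype: bfloat16
```

Running `mergekit-yaml config.yml ./output` with a config of this shape would produce a merged checkpoint in `./output`.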