thrnn/qwen2.5-1.5b-medical-sft-dare

TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Apr 2, 2026Architecture:Transformer Cold

The thrnn/qwen2.5-1.5b-medical-sft-dare model is a 1.5 billion parameter language model based on the Qwen2.5-1.5B-Instruct architecture, fine-tuned for medical applications. It was created using the Linear DARE merge method, combining the base Qwen model with a specialized medical SFT LoRA. This model is designed for tasks requiring medical domain knowledge, leveraging its 32768 token context length for comprehensive analysis.

Loading preview...

Model Overview

The thrnn/qwen2.5-1.5b-medical-sft-dare is a 1.5 billion parameter language model built upon the Qwen/Qwen2.5-1.5B-Instruct base. It was developed using the Linear DARE merge method, which combines the foundational Qwen model with a specialized outputs/model_sft_lora component. This merging strategy aims to integrate specific fine-tuning for medical applications into the robust Qwen2.5 architecture.

Key Characteristics

  • Base Model: Qwen2.5-1.5B-Instruct, providing a strong general language understanding foundation.
  • Merge Method: Utilizes the Linear DARE (DARE_p0.5) technique, as described in the arXiv paper, to effectively blend model weights.
  • Specialization: The inclusion of outputs/model_sft_lora indicates a focus on a specific domain, likely medical, given the model's name.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling processing of longer texts relevant to medical documents or patient histories.

Intended Use Cases

This model is particularly suited for applications requiring domain-specific knowledge in the medical field. Its fine-tuned nature suggests improved performance on tasks such as:

  • Medical text summarization.
  • Answering medical questions.
  • Extracting information from clinical notes.
  • Assisting with medical report generation.