allout2726/model_sft_dare
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 2, 2026 · Architecture: Transformer · Cold

allout2726/model_sft_dare is a merged language model based on Qwen/Qwen2.5-1.5B-Instruct, created with the DARE TIES merge method. It folds a supervised fine-tuned (SFT) component into the instruction-tuned base, aiming to combine the base model's general instruction-following with the fine-tune's task-specific behavior in a single compact checkpoint derived from Qwen 2.5.


Model Overview

The allout2726/model_sft_dare is a language model created by allout2726 through a merging process using mergekit. It is built upon the Qwen/Qwen2.5-1.5B-Instruct base model, indicating its foundation in the Qwen 2.5 architecture, known for its instruction-following capabilities.

Key Characteristics

  • Merge Method: The model uses the DARE TIES merge method, which combines DARE (randomly drop a fraction of each task vector's entries and rescale the survivors; arXiv:2311.03099) with TIES-Merging's sign-consensus combination of weights (arXiv:2306.01708). The approach merges the strengths of multiple pre-trained models without additional training.
  • Base Model: The merging process started with Qwen/Qwen2.5-1.5B-Instruct, a 1.5 billion parameter instruction-tuned model from the Qwen family.
  • Merged Components: The model incorporates a component identified as /kaggle/working/temp_sft_full, suggesting the integration of a specific fine-tuned (SFT) model into the base.
  • Configuration: The merge used a density of 0.30 (the fraction of each task vector's entries retained after DARE's random drop) and a weight of 1.0 for the additional model, i.e. the rescaled task vector was added to the base at full strength.
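The density and weight parameters above can be made concrete with a minimal NumPy sketch of the DARE step alone (the TIES sign-consensus step only comes into play when several task vectors are merged; here there is a single SFT component). The function name and shapes are illustrative, not from the model card; the numbers mirror the reported density=0.30 and weight=1.0:

```python
import numpy as np

def dare_merge(base, sft, density=0.30, weight=1.0, seed=0):
    """Sketch of the DARE step: drop a random (1 - density) fraction of the
    task vector (sft - base), rescale the survivors by 1/density so the
    expected update is unchanged, then add the result back onto the base."""
    rng = np.random.default_rng(seed)
    delta = sft - base                             # task vector from SFT fine-tuning
    mask = rng.random(delta.shape) < density       # keep ~30% of the entries
    delta = np.where(mask, delta / density, 0.0)   # rescale kept entries, zero the rest
    return base + weight * delta                   # weight=1.0: add at full strength
```

Because of the rescaling, the merged weights match the full task vector in expectation even though 70% of its entries are zeroed, which is why such aggressive sparsification tends to preserve the fine-tune's behavior.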

Potential Use Cases

Given its foundation in an instruction-tuned Qwen 2.5 model and the application of the DARE TIES merge method, this model is likely suitable for:

  • Instruction Following: Leveraging the capabilities inherited from its Qwen 2.5 Instruct base.
  • Specific Domain Tasks: If the /kaggle/working/temp_sft_full component was fine-tuned on a particular dataset, the merged model should retain much of that domain-specific behavior.
  • Experimentation with Merged Architectures: Exploring the performance trade-offs of DARE TIES merging on compact (1.5B-parameter) models.
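Assuming the model is published under the repo id allout2726/model_sft_dare on the Hugging Face Hub and inherits the Qwen2.5 chat template from its base, it can be tried with the transformers library as sketched below. The `generate_reply` helper is illustrative and not part of the model card; the first call downloads the weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "allout2726/model_sft_dare"  # assumed Hugging Face repo id

def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    """Run one chat turn through the merged model in BF16 (matching the listed quant)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

As with any merged model, it is worth spot-checking instruction-following against the Qwen2.5-1.5B-Instruct base to confirm the merge did not degrade general capabilities.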