ARAVIND8179986644/model_sft_dare_resta

TEXT GENERATION | Concurrency Cost: 1 | Model Size: 1.5B | Quant: BF16 | Ctx Length: 32k | Published: Apr 5, 2026 | Architecture: Transformer | Warm

ARAVIND8179986644/model_sft_dare_resta is a 1.5-billion-parameter language model with a 32,768-token context length, created by ARAVIND8179986644 using the Task Arithmetic merge method. It is based on Qwen/Qwen2.5-1.5B-Instruct and merges two models into that base: ARAVIND8179986644/model_sft_dare, added with a positive weight, and a local model named 'harmful_full', applied with a negative weight.


Model Overview

ARAVIND8179986644/model_sft_dare_resta is a 1.5-billion-parameter language model built on the Qwen/Qwen2.5-1.5B-Instruct base model. It has a 32,768-token context window, which makes it usable with longer inputs and outputs; the window covers the prompt and the generated tokens combined.
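As a rough illustration of the context window, the sketch below computes how many tokens remain for generation once a prompt of a given length has been counted. The helper name `max_new_tokens_for` is hypothetical, and the prompt length is assumed to have been measured with the model's own tokenizer.

```python
# Context length as stated on this page (32768 tokens).
CONTEXT_LENGTH = 32768


def max_new_tokens_for(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Hypothetical helper: assumes prompt and generated tokens share one window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room for generation.
    return max(context_length - prompt_tokens, 0)


print(max_new_tokens_for(30000))  # 2768
```

A prompt that already exceeds the window returns 0, signalling that it would need to be truncated or shortened first.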

Merge Details

This model was constructed using the Task Arithmetic merge method, which treats each fine-tuned model as a task vector (its weights minus the base model's weights) and adds a weighted sum of those vectors to the base. The merge involved:

  • Base Model: Qwen/Qwen2.5-1.5B-Instruct
  • Included Models:
    • ARAVIND8179986644/model_sft_dare (with a weight of 1.0)
    • A local model identified as ./harmful_full (with a weight of -1.0)

The negative weight on ./harmful_full subtracts that model's task vector (its weight difference from the base) from the result. This suggests an intent to suppress the behaviors that ./harmful_full encodes, potentially to remove undesirable traits from the merged model.
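With the weights listed above, the basic task arithmetic formulation reduces to: merged = base + 1.0 × (model_sft_dare − base) − 1.0 × (harmful_full − base). The following is a minimal sketch of that arithmetic on toy values, not the merge tool's actual implementation. The function names and the three-element "weights" are hypothetical stand-ins for real model tensors, and any normalization or other options the author's tooling may apply are not modeled.

```python
from typing import Dict, List, Tuple

# Toy stand-in for a model state dict: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def task_vector(finetuned: StateDict, base: StateDict) -> StateDict:
    """Per-parameter difference between a fine-tuned model and the base."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def task_arithmetic_merge(
    base: StateDict, models_and_weights: List[Tuple[StateDict, float]]
) -> StateDict:
    """Add each model's weighted task vector to a copy of the base weights."""
    merged = {name: list(values) for name, values in base.items()}
    for model, weight in models_and_weights:
        delta = task_vector(model, base)
        for name in merged:
            merged[name] = [m + weight * d for m, d in zip(merged[name], delta[name])]
    return merged


# Hypothetical toy values; real models hold many large tensors.
base = {"w": [1.0, 2.0, 3.0]}
sft_dare = {"w": [1.5, 2.0, 2.0]}
harmful_full = {"w": [1.0, 3.0, 3.5]}

# Weights taken from the merge details: +1.0 and -1.0.
merged = task_arithmetic_merge(base, [(sft_dare, 1.0), (harmful_full, -1.0)])
print(merged)  # {'w': [1.5, 1.0, 1.5]}
```

In the toy output, each value moves toward model_sft_dare wherever that model differs from the base and away from harmful_full wherever that model differs from the base.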

Potential Use Cases

Given its architecture and the specific merge method, this model could be explored for applications requiring:

  • Refined Instruction Following: Building on the Qwen2.5-1.5B-Instruct base, it likely retains strong instruction-following capabilities.
  • Specific Task Adaptation: The Task Arithmetic merge allows for fine-grained control over how different model components contribute, potentially making it adaptable for niche tasks where specific behaviors need to be enhanced or suppressed.