itsmepv/model_sft_dare_resta
Text generation · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Published: Apr 1, 2026 · Architecture: Transformer

itsmepv/model_sft_dare_resta is a 1.5-billion-parameter language model produced with the Task Arithmetic merge method, using Qwen/Qwen2.5-1.5B-Instruct as its base. The merge combines itsmepv/model_sft_dare (positive weight) with fused_harmful_full (negative weight), yielding a specialized variant that retains the first model's behavior while subtracting characteristics of the second.


Overview

itsmepv/model_sft_dare_resta is a 1.5-billion-parameter language model created by itsmepv. It was built with the Task Arithmetic merge method, using Qwen/Qwen2.5-1.5B-Instruct as the foundational base model. Task Arithmetic combines models by adding and subtracting weighted "task vectors" — the parameter differences between each fine-tuned model and the shared base — so that specific learned behaviors can be amplified or removed.

Merge Details

The model incorporates two distinct components:

  • itsmepv/model_sft_dare: This model was included with a weight of 1.0.
  • ./fused_harmful_full: This component was integrated with a negative weight of -1.0, suggesting an intent to subtract or mitigate certain characteristics present in this model.

This configuration, defined in a YAML file, reflects a deliberate effort to shape the model's behavior: the first component's influence is added in full while the second's is subtracted. The model's 32,768-token context length provides ample capacity for processing longer sequences.
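As a rough illustration (not the model's actual merge script), Task Arithmetic with these weights can be sketched in a few lines of NumPy. The arrays below are toy stand-ins for flattened parameter tensors, and the variable names `base`, `sft_dare`, and `harmful` are hypothetical labels for the three checkpoints:

```python
import numpy as np

# Toy stand-ins for flattened model parameters (hypothetical values).
base = np.array([0.10, 0.20, 0.30])      # Qwen/Qwen2.5-1.5B-Instruct
sft_dare = np.array([0.15, 0.25, 0.28])  # itsmepv/model_sft_dare
harmful = np.array([0.12, 0.18, 0.35])   # ./fused_harmful_full

# Task Arithmetic: add weighted "task vectors" (deltas from the base).
weights = {"sft_dare": 1.0, "harmful": -1.0}
merged = (base
          + weights["sft_dare"] * (sft_dare - base)
          + weights["harmful"] * (harmful - base))

# With weights +1.0 and -1.0 this reduces to: base + sft_dare - harmful,
# i.e. the harmful model's delta is subtracted from the merged model.
```

With these particular weights the merge simplifies algebraically to `base + sft_dare - harmful`, which is why a negative weight acts as a subtraction of the unwanted model's learned behavior.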

Potential Use Cases

Given its specialized merge, this model is likely suitable for:

  • Targeted applications: Where the specific blend of itsmepv/model_sft_dare and the inverse of fused_harmful_full is beneficial.
  • Research into model merging: For understanding the effects of Task Arithmetic with both positive and negative weighting.
  • Fine-tuning for niche tasks: Where a base model's general capabilities need to be precisely adjusted by incorporating or removing specific learned behaviors.