allout2726/model_sft_dare_resta is a 1.5 billion parameter language model created by allout2726, merged with the Task Arithmetic method using Qwen/Qwen2.5-1.5B-Instruct as its base. It combines 'allout2726/model_sft_dare' with a negatively weighted local "harmful" model, suggesting a focus on safety or content-filtering applications. With a 32768-token context length, it is designed for tasks requiring extensive contextual understanding, potentially in content moderation or specialized instruction following.
Model Overview
allout2726/model_sft_dare_resta is a 1.5 billion parameter language model developed by allout2726, built upon the Qwen/Qwen2.5-1.5B-Instruct base model. It was created using the Task Arithmetic merge method, a technique described in the paper "Editing Models with Task Arithmetic" (arXiv:2212.04089), which combines the capabilities of multiple fine-tuned models by adding and subtracting their weight differences relative to a shared base.
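The core idea can be sketched in a few lines. A "task vector" is the element-wise difference between a fine-tuned checkpoint and its base; merging adds weighted task vectors back onto the base. This is a minimal toy illustration (plain Python scalars stand in for full weight tensors; the real merge operates tensor-by-tensor over the whole state dict):

```python
# Toy task arithmetic (Ilharco et al., arXiv:2212.04089).
# task vector = finetuned - base; merged = base + sum(w_i * tv_i)

def task_vector(finetuned, base):
    """Element-wise difference between a fine-tuned model and its base."""
    return [f - b for f, b in zip(finetuned, base)]

def merge(base, task_vectors, weights):
    """Add each task vector onto the base, scaled by its merge weight."""
    merged = list(base)
    for tv, w in zip(task_vectors, weights):
        merged = [m + w * t for m, t in zip(merged, tv)]
    return merged

# Made-up example weights for three parameters of a tiny "model"
base    = [0.5, -0.2, 1.0]   # shared base checkpoint
sft     = [0.7, -0.1, 1.3]   # desirable fine-tune
harmful = [0.6, -0.5, 1.1]   # undesirable fine-tune

# Weight 1.0 adds the desirable behaviour; -1.0 subtracts the harmful one
merged = merge(
    base,
    [task_vector(sft, base), task_vector(harmful, base)],
    [1.0, -1.0],
)
# merged is approximately [0.6, 0.2, 1.2]
```

The negative weight is what makes subtraction possible: behaviour learned only by the "harmful" fine-tune is pushed out of the merged model, while behaviour shared with the desirable fine-tune largely cancels and survives.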
Key Merge Details
This model is a composite of two distinct components:
- allout2726/model_sft_dare: integrated with a weight of 1.0.
- /kaggle/working/temp_harmful_full: integrated with a negative weight of -1.0.
The use of a negative weight for the "harmful full" component suggests an intentional effort to subtract or mitigate specific characteristics associated with that model, likely related to safety, bias, or undesirable content generation. This makes model_sft_dare_resta particularly interesting for use cases where fine-grained control over model behavior, especially in terms of content filtering or safety alignment, is critical.
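A merge like this is typically expressed as a declarative recipe for a merging tool such as mergekit. The author's actual configuration is not published, so the following is a hypothetical reconstruction based on the details listed above (component models, weights, base model, and dtype):

```yaml
# Hypothetical mergekit recipe reconstructing the described merge
merge_method: task_arithmetic
base_model: Qwen/Qwen2.5-1.5B-Instruct
models:
  - model: allout2726/model_sft_dare
    parameters:
      weight: 1.0
  - model: /kaggle/working/temp_harmful_full
    parameters:
      weight: -1.0
dtype: float16
```

The -1.0 weight on the second entry is the "subtraction" step: its task vector is removed from, rather than added to, the base model.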
Technical Specifications
- Base Model: Qwen/Qwen2.5-1.5B-Instruct
- Parameter Count: 1.5 billion
- Context Length: 32768 tokens
- Merge Method: Task Arithmetic
- Data Type: float16
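The parameter count and data type together determine the checkpoint's rough memory footprint, which matters for deployment on smaller GPUs. A quick back-of-envelope calculation:

```python
# Approximate weight-storage size: 1.5 B parameters in float16 (2 bytes each).
# Excludes activation memory and KV cache, which grow with context length.
params = 1_500_000_000
bytes_per_param = 2  # float16
size_gib = params * bytes_per_param / 2**30
# roughly 2.8 GiB of weights
```

Note that serving the full 32768-token context adds a substantial KV cache on top of this, so practical memory requirements are higher than the weight size alone.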
Potential Use Cases
Given its unique merge configuration, this model is likely optimized for:
- Content Moderation: Filtering or identifying specific types of content.
- Safety Alignment: Developing models with enhanced safety features.
- Specialized Instruction Following: Tasks where certain behaviors or outputs need to be suppressed or amplified through model merging.