Name: anirvankrishna/model_sft_resta API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: anirvankrishna

Model Overview

anirvankrishna/model_sft_resta is a 1.5-billion-parameter language model based on Qwen2.5-1.5B-Instruct, with a 32768-token context length. It was created with the mergekit tool using the Task Arithmetic merge method.

Merge Details

The model is a composite of two distinct components:

Base Model: Qwen/Qwen2.5-1.5B-Instruct
Merged Component: anirvankrishna/model_harmful_lora_fused

The merge applies a negative weight (-1.0) to layers 0 to 28 of the anirvankrishna/model_harmful_lora_fused component, relative to the base model. Under Task Arithmetic, a negative weight subtracts the component's difference from the base model (its task vector) instead of adding it, so this configuration appears to be an experiment in removing specific learned behaviors or characteristics from the base model.
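As a rough illustration of what a negative weight does under Task Arithmetic, the sketch below merges toy NumPy arrays rather than real checkpoints. The function name task_arithmetic, the parameter name layer.0.weight, and all numeric values are hypothetical. It shows only the core arithmetic (base plus weighted task vectors) and is not mergekit's actual implementation, which has further options not modeled here.

```python
import numpy as np


def task_arithmetic(base, components):
    """Merge toy state dicts: base + sum(weight * (tuned - base)).

    base:       dict mapping parameter name -> np.ndarray
    components: list of (state_dict, weight) pairs
    """
    merged = {}
    for name, base_w in base.items():
        delta = np.zeros_like(base_w)
        for tuned, weight in components:
            # The task vector is the component's offset from the base model.
            delta += weight * (tuned[name] - base_w)
        merged[name] = base_w + delta
    return merged


# Hypothetical toy tensors standing in for one parameter of each model.
base = {"layer.0.weight": np.array([1.0, 2.0, 3.0])}
harmful = {"layer.0.weight": np.array([1.5, 2.0, 2.0])}

# A weight of -1.0 subtracts the task vector: merged = 2 * base - harmful.
merged = task_arithmetic(base, [(harmful, -1.0)])
print(merged["layer.0.weight"])
```

With a single component at weight -1.0, each merged parameter moves away from the component by exactly the amount the component had moved away from the base.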

Potential Use Cases

Research into Model Merging: Ideal for researchers studying the effects of Task Arithmetic, particularly with negative weighting, on model capabilities and biases.
Behavioral Modification: Can be used to explore how subtracting a LoRA-fused model alters the base model's responses or mitigates certain characteristics.
Experimental Fine-tuning: Provides a foundation for further fine-tuning or analysis of models created through complex merging strategies.
