krishdebroy/model_sft_dare_resta
Text generation · Model size: 1.5B · Quant: BF16 · Context length: 32k · Concurrency cost: 1 · Architecture: Transformer · Published: Apr 3, 2026

krishdebroy/model_sft_dare_resta is a 1.5 billion parameter language model merged from Qwen/Qwen2.5-1.5B-Instruct and other models using the Task Arithmetic method. It is configured to integrate, and potentially modify, characteristics from krishdebroy/model_sft_dare and a local harmful LoRA model. With a context length of 32768 tokens, it targets applications that need a blend of capabilities from its constituent models.


Overview

krishdebroy/model_sft_dare_resta was created by krishdebroy using the MergeKit tool and the Task Arithmetic merge method, with Qwen/Qwen2.5-1.5B-Instruct as the base model. This merging approach combines specific characteristics from different pre-trained models into a single checkpoint.
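A Task Arithmetic merge of this kind is typically expressed as a MergeKit YAML config. The sketch below is illustrative only: the actual config for this model is not published here, and the weight values (including the negative weight) are assumptions chosen to show the general shape.

```
merge_method: task_arithmetic
base_model: Qwen/Qwen2.5-1.5B-Instruct
models:
  - model: krishdebroy/model_sft_dare
    parameters:
      weight: 1.0          # hypothetical weight
  - model: /kaggle/working/model_harmful_lora
    parameters:
      weight: -1.0         # hypothetical: a negative weight subtracts this model's task vector
dtype: bfloat16
```

Running `mergekit-yaml config.yml ./output-dir` with such a config would produce the merged checkpoint.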

Key Capabilities

  • Merged Architecture: Integrates features from Qwen/Qwen2.5-1.5B-Instruct with additional models, specifically krishdebroy/model_sft_dare and a local LoRA model (/kaggle/working/model_harmful_lora).
  • Task Arithmetic Method: Utilizes a specific merging technique that allows for weighted combination of model parameters, including negative weighting for certain components.
  • Parameter Count: Operates with 1.5 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a substantial context window of 32768 tokens, suitable for processing longer inputs and maintaining conversational coherence over extended interactions.
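The weighted combination behind Task Arithmetic can be sketched in a few lines: each tuned model contributes a "task vector" (its weights minus the base model's), and the merge adds these vectors to the base with per-model weights, where a negative weight subtracts a model's influence. The snippet below is a toy illustration on lists of floats standing in for parameter tensors; the weight values and stand-in numbers are hypothetical.

```python
# Toy Task Arithmetic merge: merged = base + sum(w_i * (tuned_i - base)),
# applied element-wise. Real merges (e.g. via MergeKit) do this per
# parameter tensor across entire checkpoints.

def task_arithmetic(base, tuned_models, weights):
    merged = list(base)
    for tuned, w in zip(tuned_models, weights):
        for i, (b, t) in enumerate(zip(base, tuned)):
            merged[i] += w * (t - b)  # add the weighted task vector
    return merged

base = [0.10, 0.20, 0.30]      # stand-in for Qwen2.5-1.5B-Instruct weights
sft = [0.15, 0.25, 0.35]       # stand-in for krishdebroy/model_sft_dare
harmful = [0.30, 0.40, 0.50]   # stand-in for the local harmful LoRA model

# Positive weight keeps the SFT model's influence; the negative weight
# steers the merge *away* from the harmful model's task vector.
merged = task_arithmetic(base, [sft, harmful], [1.0, -1.0])
```

Here each merged element is `base + 1.0*(sft - base) - 1.0*(harmful - base)`, so the first element comes out to 0.10 + 0.05 - 0.20 = -0.05.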

Good For

  • Experimental Merging: Ideal for researchers and developers interested in exploring the effects of model merging, particularly with the Task Arithmetic method.
  • Customized Behavior: Potentially useful for creating models with highly specific or modified behaviors by combining and adjusting the influence of different source models.
  • Qwen2.5-1.5B-Instruct-Based Applications: Suitable for tasks where the base model's capabilities are desired, with modifications contributed by the merged components.