Name: anirvankrishna/model_sft_resta_dare API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: anirvankrishna

Model Overview

anirvankrishna/model_sft_resta_dare is a 1.5-billion-parameter language model built on Qwen2.5-1.5B-Instruct. It was produced by anirvankrishna with the mergekit tool, using the Task Arithmetic merge method.

Key Characteristics

Base Model: Qwen/Qwen2.5-1.5B-Instruct, an instruction-tuned model that supplies the language understanding and generation capabilities.
Merge Method: Uses Task Arithmetic, a technique described in the paper "Editing Models with Task Arithmetic" (arXiv:2212.04089), to combine model weights.
Merged Components: Combines anirvankrishna/model_harmful_lora_fused with the Qwen base model. The merge configuration applies a negative weight to the fused model's layers, which under Task Arithmetic subtracts that model's difference from the base instead of adding it.
Context Length: Supports a context window of 32,768 tokens.
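
To make the merge method above concrete, here is a minimal sketch of Task Arithmetic on toy parameters. It is not the author's mergekit configuration: the function name task_arithmetic_merge, the parameter name "layer.weight", the toy values, and the weight of -1.0 are all assumptions chosen for illustration, since the page does not state the actual weight used.

```python
def task_arithmetic_merge(base, finetuned_models, weights):
    """Return base + sum_i weights[i] * (finetuned_i - base) for each parameter.

    Parameters are plain lists of floats keyed by name (a stand-in for tensors).
    """
    merged = {}
    for name, base_values in base.items():
        merged_values = list(base_values)
        for model, weight in zip(finetuned_models, weights):
            for i, (b, f) in enumerate(zip(base_values, model[name])):
                # (f - b) is the task vector entry; the weight scales it.
                merged_values[i] += weight * (f - b)
        merged[name] = merged_values
    return merged


# Toy stand-ins for the base model and the fused LoRA model (hypothetical values).
base = {"layer.weight": [1.0, 2.0, 3.0]}
fused = {"layer.weight": [1.5, 2.0, 2.0]}

# A negative weight (hypothetical -1.0) subtracts the fused model's task vector.
merged = task_arithmetic_merge(base, [fused], [-1.0])
print(merged)
```

With a negative weight, every parameter moves away from the fused model by the scaled difference, and parameters the fused model left unchanged stay at their base values.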

Intended Use Cases

This model is suited to applications that want the behavior produced by its merge: Qwen2.5-1.5B-Instruct with the negatively weighted contribution of anirvankrishna/model_harmful_lora_fused, as opposed to either model on its own. Its 32,768-token context window allows it to process longer inputs and generate longer responses.
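
As a rough illustration of working within the 32,768-token window, the sketch below computes how many tokens remain for generation once a prompt is counted. It assumes that prompt and generated tokens share the same window, and the helper name max_new_tokens and its reserve parameter are hypothetical, not part of any Featherless.ai or model API.

```python
CONTEXT_WINDOW = 32_768  # context length stated on this page


def max_new_tokens(prompt_tokens, reserve=0):
    """Tokens left for generation after the prompt and an optional reserve."""
    if prompt_tokens < 0 or reserve < 0:
        raise ValueError("token counts must be non-negative")
    # Clamp at zero when the prompt alone already fills the window.
    return max(0, CONTEXT_WINDOW - prompt_tokens - reserve)


print(max_new_tokens(30_000))
print(max_new_tokens(40_000))
```

Token counts should come from the model's own tokenizer, since counts from a different tokenizer can differ.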
