wvnvwn/qwen-2.5-7B-Instruct-Resta-lr5e-5-scale0.3

Text generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Apr 29, 2026 · Architecture: Transformer

wvnvwn/qwen-2.5-7B-Instruct-Resta-lr5e-5-scale0.3 is a 7.6-billion-parameter instruction-tuned language model produced by a weighted linear merge of Qwen/Qwen2.5-7B-Instruct and two specialized Qwen 2.5 7B instruction-tuned variants. It uses the Qwen 2.5 architecture and supports a context length of 32,768 tokens. The merge weights were chosen to emphasize the contributions of the specialized components over the base model.


Model Overview

This model, wvnvwn/qwen-2.5-7B-Instruct-Resta-lr5e-5-scale0.3, is a 7.6 billion parameter instruction-tuned language model built upon the Qwen 2.5 architecture. It was created using a linear merge method via mergekit, combining three distinct pre-trained models to achieve its characteristics.

Merge Details

The model is a composite of:

  • Qwen/Qwen2.5-7B-Instruct: The base instruction-tuned model from Qwen.
  • wvnvwn/qwen-2.5-7B-Instruct-SSFT-lr5e-5: A specialized instruction-tuned variant.
  • wvnvwn/qwen-2.5-7B-Instruct-SSFT-gsm8k-lr5e-5: Another specialized instruction-tuned variant, likely with a focus on mathematical reasoning given the 'gsm8k' identifier.

Configuration

The merge utilized a specific weighting scheme:

  • wvnvwn/qwen-2.5-7B-Instruct-SSFT-gsm8k-lr5e-5: weight 1.0.
  • wvnvwn/qwen-2.5-7B-Instruct-SSFT-lr5e-5: weight 0.3.
  • Qwen/Qwen2.5-7B-Instruct: weight -0.3.
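The weighting above can be written as a mergekit linear-merge configuration. The YAML below is a reconstruction from the listed weights, not the published config file; details such as the `dtype` setting are assumptions.

```yaml
# Reconstructed mergekit config (not the published original; dtype is assumed).
merge_method: linear
models:
  - model: wvnvwn/qwen-2.5-7B-Instruct-SSFT-gsm8k-lr5e-5
    parameters:
      weight: 1.0
  - model: wvnvwn/qwen-2.5-7B-Instruct-SSFT-lr5e-5
    parameters:
      weight: 0.3
  - model: Qwen/Qwen2.5-7B-Instruct
    parameters:
      weight: -0.3
dtype: bfloat16
```

With mergekit installed, a config like this would typically be run via `mergekit-yaml config.yml ./merged-output` to produce the merged checkpoint.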

This configuration emphasizes the specialized SSFT models while subtracting a scaled copy of the base Qwen 2.5 Instruct weights, which tends to isolate the changes introduced by the SSFT fine-tuning rather than re-adding the base model's behavior. Developers can explore this model for tasks where a blend of these specific instruction-tuned capabilities is desired.
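Numerically, the paired +0.3 and -0.3 weights make the merge equivalent to taking the gsm8k model and adding the difference (SSFT − base) scaled by 0.3, which is consistent with the "scale0.3" suffix in the model name. A minimal sketch of that equivalence, using tiny stand-in lists rather than real weight tensors (all values here are illustrative):

```python
# Toy 3-element "tensors" standing in for full model weights (illustrative values only).
base  = [1.0, 2.0, 3.0]   # Qwen/Qwen2.5-7B-Instruct
ssft  = [1.5, 2.5, 3.5]   # wvnvwn/qwen-2.5-7B-Instruct-SSFT-lr5e-5
gsm8k = [2.0, 1.0, 4.0]   # wvnvwn/qwen-2.5-7B-Instruct-SSFT-gsm8k-lr5e-5

# Linear merge with the model card's weights: 1.0, 0.3, -0.3.
merged = [1.0 * g + 0.3 * s - 0.3 * b for g, s, b in zip(gsm8k, ssft, base)]

# Identical result: gsm8k plus the scaled "task vector" (ssft - base).
task_vector = [0.3 * (s - b) for s, b in zip(ssft, base)]
merged_alt = [g + t for g, t in zip(gsm8k, task_vector)]

assert all(abs(m - a) < 1e-9 for m, a in zip(merged, merged_alt))
```

In an actual merge the same arithmetic is applied tensor-by-tensor across every parameter of the three checkpoints.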