wvnvwn/qwen-2.5-7B-Resta-lr3e-5-scale0.3
The wvnvwn/qwen-2.5-7B-Resta-lr3e-5-scale0.3 model is a 7.6 billion parameter language model based on the Qwen2.5-7B architecture, created by wvnvwn through a linear merge of several pre-trained models. The merge combines fine-tuned versions of Qwen2.5-7B, including one trained on GSM8K, suggesting a focus on mathematical reasoning alongside general language tasks. With a context length of 32768 tokens, it is designed for applications requiring robust performance across diverse linguistic challenges.
Model Overview
The wvnvwn/qwen-2.5-7B-Resta-lr3e-5-scale0.3 is a 7.6 billion parameter language model built upon the Qwen2.5-7B architecture. It was created by wvnvwn using the linear merge method via mergekit, combining multiple specialized versions of the base model.
Merge Details
This model is a composite of three distinct Qwen2.5-7B variants:
- Qwen/Qwen2.5-7B: The foundational base model.
- wvnvwn/qwen-2.5-7B-SSFT-gsm8k-lr3e-5: A version specifically fine-tuned, likely for mathematical reasoning tasks, given the 'gsm8k' identifier (GSM8K is a dataset for grade school math problems).
- wvnvwn/qwen-2.5-7B-SSFT-lr3e-5: Another fine-tuned variant, contributing to general language capabilities.
The merge configuration assigned a weight of 1.0 to the GSM8K-tuned model, 0.3 to the general fine-tuned model, and -0.3 to the base Qwen2.5-7B, applied uniformly across all 28 transformer layers. Giving the base model a negative weight effectively subtracts a scaled copy of its parameters, so the merge amplifies the changes introduced by fine-tuning rather than averaging them back toward the original model.
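A mergekit configuration for this merge would look roughly like the following sketch. The weights and model names come from the merge details above; the dtype setting is an assumption, and the exact file shipped with the model may differ:

```yaml
# Sketch of a mergekit linear-merge config (dtype is assumed, not confirmed).
merge_method: linear
models:
  - model: wvnvwn/qwen-2.5-7B-SSFT-gsm8k-lr3e-5
    parameters:
      weight: 1.0
  - model: wvnvwn/qwen-2.5-7B-SSFT-lr3e-5
    parameters:
      weight: 0.3
  - model: Qwen/Qwen2.5-7B
    parameters:
      weight: -0.3
dtype: bfloat16
```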
Potential Use Cases
Given its composition, this model is likely well-suited for:
- Mathematical Reasoning: The inclusion of a GSM8K-tuned component suggests improved performance on quantitative and logical problem-solving.
- General Language Understanding and Generation: Benefiting from the Qwen2.5-7B base and additional fine-tuning.
- Applications requiring a balance of general knowledge and specific reasoning skills.
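Under the hood, mergekit's linear method is just a per-parameter weighted sum. The sketch below illustrates that arithmetic on made-up toy values; real merges apply the same formula to full weight tensors of the three source checkpoints:

```python
# Toy illustration of a linear model merge: merged parameters are the
# element-wise weighted sum of each source model's parameters.
# All numeric values below are invented stand-ins for real weight tensors.

def linear_merge(tensors, weights):
    """Element-wise weighted sum of equally-shaped parameter lists."""
    merged = [0.0] * len(tensors[0])
    for tensor, weight in zip(tensors, weights):
        for i, value in enumerate(tensor):
            merged[i] += weight * value
    return merged

# Hypothetical parameter slices from the three source models.
gsm8k_ft   = [0.20, -0.10, 0.05]  # wvnvwn/qwen-2.5-7B-SSFT-gsm8k-lr3e-5
general_ft = [0.18, -0.08, 0.07]  # wvnvwn/qwen-2.5-7B-SSFT-lr3e-5
base       = [0.15, -0.05, 0.06]  # Qwen/Qwen2.5-7B

# Weights from the merge details: 1.0, 0.3, and -0.3 for the base model.
merged = linear_merge([gsm8k_ft, general_ft, base], [1.0, 0.3, -0.3])
print(merged)
```

Because the three weights sum to 1.0, the merged parameters stay on the same scale as a single model while the negative base weight pushes the result away from the untuned checkpoint.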