Name: quangdung/Qwen2.5-7B-Math-Distill-Sens API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: quangdung

Model Overview

quangdung/Qwen2.5-7B-Math-Distill-Sens is a 7.6 billion parameter model developed by quangdung, resulting from the application of Sensitivity-aware Model Merging (Sens Merging) to two base models: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B and Qwen/Qwen2.5-Math-7B. The primary goal of this merge is to create a model that retains the robust mathematical reasoning capabilities of DeepSeek-R1-Distill while drastically reducing the verbosity and token length of its outputs, thereby lowering inference costs.

Key Capabilities & Performance

Optimized Mathematical Reasoning: Achieves an average accuracy of 66.9% across various mathematical benchmarks, including College Math, GSM8K, MATH, Minerva Math, and OlympiadBench.
Reduced Output Verbosity: Produces significantly shorter outputs, with an average of 701 tokens per response. This represents a 75.2% reduction in output tokens compared to the DeepSeek-R1-Distill-Qwen-7B base model.
Cost-Effective Inference: The substantial reduction in output length directly translates to lower inference costs without requiring additional gradient-based fine-tuning.
Competitive Accuracy: Maintains strong reasoning performance, with only a 2.5-point average accuracy drop compared to the more verbose DeepSeek-R1-Distill-Qwen-7B.

When to Use This Model

This model is ideal for applications requiring accurate mathematical problem-solving where inference cost and output length are critical considerations. It offers an attractive trade-off between reasoning quality and efficiency, making it suitable for scenarios where concise, yet correct, mathematical explanations are preferred over lengthy chains of thought.

Overview

Model Overview

Key Capabilities & Performance

When to Use This Model

Full Model Card (README)