Name: modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_5 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: modrill

Overview

This model, modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_5, is a 4-billion-parameter language model built on the Qwen3-4B-Base architecture. It was created with Task Arithmetic, a merging technique that takes the weight difference between a fine-tuned model and its base model (the task vector), scales it, and adds it back to the base weights. Here the fine-tuned model is a math_think_11 Qwen3-4B-Base SFT model and the base is the original Qwen/Qwen3-4B-Base model.

Key Characteristics

Architecture: Qwen3-4B-Base, a 4 billion parameter model.
Merge Method: Utilizes Task Arithmetic, a method for combining model weights. The formula applied is theta = theta_base + scaling * (theta_sft - theta_base).
Source Models: Merges a Supervised Fine-Tuned (SFT) version of Qwen3-4B-Base (math_think_11_qwen3_4b_base_sft) with the foundational Qwen/Qwen3-4B-Base.
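Scaling Coefficient: A scaling factor of 0.5 was applied during the merge. Substituting it into the formula gives theta = 0.5 * theta_base + 0.5 * theta_sft, so each merged weight is the midpoint of the base and SFT weights and the SFT model's task-specific update is applied at half strength.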
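
The merge formula above can be illustrated with a minimal sketch. This is not the author's actual merge script, which is not shown on this page. The function name task_arithmetic_merge, the parameter names, and the toy values are all hypothetical, and plain Python lists stand in for real weight tensors.

```python
def task_arithmetic_merge(theta_base, theta_sft, scaling=0.5):
    """Apply theta = theta_base + scaling * (theta_sft - theta_base) per parameter.

    Both arguments map parameter names to flat lists of floats
    (a stand-in for real checkpoint tensors).
    """
    if theta_base.keys() != theta_sft.keys():
        raise ValueError("base and SFT checkpoints must share parameter names")

    merged = {}
    for name, base_values in theta_base.items():
        sft_values = theta_sft[name]
        if len(base_values) != len(sft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # (s - b) is the task vector entry; it is scaled and added to the base weight.
        merged[name] = [b + scaling * (s - b) for b, s in zip(base_values, sft_values)]
    return merged


# Toy checkpoints with made-up values, for illustration only.
toy_base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
toy_sft = {"layer.weight": [3.0, 2.0], "layer.bias": [1.0]}

merged = task_arithmetic_merge(toy_base, toy_sft, scaling=0.5)
print(merged)  # {'layer.weight': [2.0, 2.0], 'layer.bias': [0.5]}
```

With scaling set to 0.0 the function returns the base weights unchanged, and with 1.0 it returns the SFT weights, so 0.5 sits halfway between the two.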

Potential Use Cases

This model is suited to applications that benefit from combining the general behavior of the base Qwen3 model with the specialization of its SFT variant. The SFT task itself is not detailed here; the merge is presumably intended to carry over part of what the math_think_11 SFT model learned, which its name suggests is related to mathematical reasoning.
