modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_8

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 4B
  • Quant: BF16
  • Ctx Length: 32k
  • Published: May 20, 2026
  • License: cc-by-nc-4.0
  • Architecture: Transformer
  • Open Weights
  • Warm

modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_8 is a 4-billion-parameter language model from modrill, derived from Qwen3-4B-Base. It is the result of a Task Arithmetic merge that combines a fine-tuned Qwen3-4B-Base with the original base model using a scaling coefficient of 0.8. The merge is intended to retain the fine-tuned model's specialized behavior while staying close to the base model's general capabilities.


Model Overview

modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_8 is a 4-billion-parameter language model developed by modrill. It is built with the Task Arithmetic merge method from two models: math_think_11_qwen3_4b_base_sft (a fine-tuned version of Qwen3-4B-Base) and the original Qwen/Qwen3-4B-Base. The merge takes the difference between the fine-tuned and base parameters, scales it by 0.8, and adds the result back to the base parameters.
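
Written out as a formula, the merge described above would look like the following. The symbols are introduced here for illustration and do not come from the model card: \theta_{\text{base}} stands for the Qwen/Qwen3-4B-Base parameters, \theta_{\text{sft}} for the math_think_11_qwen3_4b_base_sft parameters, and \lambda for the scaling coefficient.

```latex
\theta_{\text{merged}}
  = \theta_{\text{base}} + \lambda \left( \theta_{\text{sft}} - \theta_{\text{base}} \right)
  = (1 - \lambda)\,\theta_{\text{base}} + \lambda\,\theta_{\text{sft}},
  \qquad \lambda = 0.8
```

With a single fine-tuned model, this reduces to a per-parameter linear interpolation: 0.2 of the base value plus 0.8 of the fine-tuned value.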

Key Characteristics

  • Architecture: Based on the Qwen3-4B-Base model.
  • Parameter Count: 4 billion.
  • Merging Method: Task Arithmetic, which adds a scaled difference between fine-tuned and base parameters to the base model.
  • Scaling Coefficient: 0.8, meaning 80% of the parameter change introduced by fine-tuning is applied on top of the base model.
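
To make the merging method and scaling coefficient concrete, here is a minimal sketch of a Task Arithmetic merge on toy data. It is not the script used to produce this model: the function name task_arithmetic_merge, the StateDict alias, and the tiny two-entry "models" are all hypothetical, and plain Python lists stand in for real weight tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def task_arithmetic_merge(
    base: StateDict, finetuned: StateDict, scaling: float = 0.8
) -> StateDict:
    """Return base + scaling * (finetuned - base), applied per parameter."""
    if base.keys() != finetuned.keys():
        raise ValueError("base and fine-tuned models must share parameter names")

    merged: StateDict = {}
    for name, base_values in base.items():
        ft_values = finetuned[name]
        if len(base_values) != len(ft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # The difference (f - b) is the change introduced by fine-tuning;
        # only a `scaling` fraction of it is added back to the base value.
        merged[name] = [b + scaling * (f - b) for b, f in zip(base_values, ft_values)]
    return merged


# Toy values chosen for illustration only.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
finetuned = {"layer.weight": [2.0, 4.0], "layer.bias": [1.0]}

merged = task_arithmetic_merge(base, finetuned, scaling=0.8)
print(merged)
```

In this sketch, a scaling of 0.0 returns the base values unchanged and 1.0 returns the fine-tuned values, so 0.8 lands four fifths of the way from base to fine-tuned.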

Intended Use Cases

This model suits applications that need both the general capabilities of Qwen3-4B-Base and the specialized behavior learned in the math_think_11_qwen3_4b_base_sft fine-tune. The Task Arithmetic approach aims to improve performance on the fine-tuned task while retaining the base model's broader foundational knowledge.