modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_8
modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_8 is a 4-billion-parameter language model derived from Qwen3-4B-Base and created by modrill. It is the result of a Task Arithmetic merge that combines a fine-tuned Qwen3-4B-Base with the original base model using a scaling coefficient of 0.8. The merge is intended to retain the strengths of the base model while adding those of the fine-tuned version in its target domain.
Model Overview
modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_8 is a 4-billion-parameter language model developed by modrill. It is built with the Task Arithmetic merge method from two models: math_think_11_qwen3_4b_base_sft (a fine-tuned version of Qwen3-4B-Base) and the original Qwen/Qwen3-4B-Base model. The merge takes the difference between the fine-tuned and base model parameters, multiplies it by a scaling coefficient of 0.8, and adds the scaled difference back to the base model parameters.
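Written out, the merge described above amounts to the following update. The symbols are notation introduced here for illustration and do not come from the model card: \theta_{\text{base}} stands for the Qwen/Qwen3-4B-Base parameters, \theta_{\text{sft}} for the math_think_11_qwen3_4b_base_sft parameters, and \lambda for the scaling coefficient.

```latex
\theta_{\text{merged}}
  = \theta_{\text{base}} + \lambda \left( \theta_{\text{sft}} - \theta_{\text{base}} \right)
  = (1 - \lambda)\, \theta_{\text{base}} + \lambda\, \theta_{\text{sft}},
\qquad \lambda = 0.8
```

With a single fine-tuned model, this form is equivalent to a linear interpolation that weights the fine-tuned parameters at 0.8 and the base parameters at 0.2.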
Key Characteristics
- Architecture: Based on the Qwen3-4B-Base model.
- Parameter Count: 4 billion parameters.
- Merging Method: Utilizes Task Arithmetic, a technique for combining the learned capabilities of different models.
- Scaling Coefficient: A scaling factor of 0.8 was applied during the merge process, indicating a weighted contribution from the fine-tuned model.
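As a rough illustration of the merging method and scaling coefficient listed above, here is a minimal Python sketch. It is not the tooling modrill used, which the card does not specify. The function name `task_arithmetic_merge`, the toy parameter names, and the use of plain lists of floats in place of real tensors are all assumptions made to keep the example self-contained.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat list of values.
# Real merges operate on tensors, but the arithmetic per element is the same.
StateDict = Dict[str, List[float]]


def task_arithmetic_merge(
    base: StateDict, finetuned: StateDict, scaling: float = 0.8
) -> StateDict:
    """Return base + scaling * (finetuned - base), computed parameter by parameter."""
    if base.keys() != finetuned.keys():
        raise ValueError("base and fine-tuned models must share parameter names")

    merged: StateDict = {}
    for name, base_values in base.items():
        ft_values = finetuned[name]
        if len(base_values) != len(ft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # The task vector is the difference (fine-tuned - base); it is scaled
        # and then added back onto the base parameters.
        merged[name] = [b + scaling * (f - b) for b, f in zip(base_values, ft_values)]
    return merged


# Toy parameters, chosen only to make the arithmetic easy to follow.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
finetuned = {"layer.weight": [2.0, 4.0], "layer.bias": [1.0]}
merged = task_arithmetic_merge(base, finetuned, scaling=0.8)
```

A scaling value of 1.0 would reproduce the fine-tuned parameters exactly and 0.0 would return the base parameters unchanged, so 0.8 places the merged model closer to the fine-tuned one.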
Intended Use Cases
This model suits applications that benefit from both the original base model's general capabilities and the specialized knowledge gained in the math_think_11_qwen3_4b_base_sft fine-tuning. The Task Arithmetic approach aims to improve performance on the fine-tuned task while retaining the broader foundational knowledge of the base model.