modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_6
modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_6 is a 4-billion-parameter language model based on the Qwen3-4B-Base architecture. It is a task arithmetic merge that combines a fine-tuned Qwen3-4B-Base model with the original base model, with the aim of enhancing mathematical reasoning. The merge uses a scaling coefficient of 0.6 to balance the influence of the fine-tuned and base weights, and the model targets arithmetic and mathematical reasoning tasks.
Overview
This model, modrill/math_think_11_qwen3_4b_base_task_arithmetic_scaling_0_6, is a 4-billion-parameter language model built on the Qwen3-4B-Base architecture. It is created with a task arithmetic merge, which adds a scaled difference between the fine-tuned weights and the base weights back onto the base model. The merge specifically targets mathematical reasoning.
Key Capabilities
- Mathematical Reasoning: The model is the result of merging a fine-tuned Qwen3-4B-Base model (specifically math_think_11_qwen3_4b_base_sft) with the original Qwen/Qwen3-4B-Base.
- Task Arithmetic: It uses the task arithmetic formula theta = theta_base + scaling * (theta_sft - theta_base) with a scaling coefficient of 0.6 to integrate the specialized mathematical capabilities.
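The task arithmetic formula above can be illustrated with a minimal sketch. It is not the script used to build this model: the function name task_arithmetic_merge, the toy parameter names, and the use of plain Python lists in place of real weight tensors are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def task_arithmetic_merge(theta_base: Weights, theta_sft: Weights,
                          scaling: float = 0.6) -> Weights:
    """Apply theta = theta_base + scaling * (theta_sft - theta_base) per parameter."""
    if theta_base.keys() != theta_sft.keys():
        raise ValueError("base and fine-tuned weights must have the same parameter names")

    merged: Weights = {}
    for name, base_values in theta_base.items():
        sft_values = theta_sft[name]
        if len(base_values) != len(sft_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # (s - b) is the task vector entry; it is scaled and added back to the base value.
        merged[name] = [b + scaling * (s - b) for b, s in zip(base_values, sft_values)]
    return merged


# Toy weights (made-up numbers, not taken from any real checkpoint).
toy_base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
toy_sft = {"layer.weight": [2.0, 4.0], "layer.bias": [0.5]}

toy_merged = task_arithmetic_merge(toy_base, toy_sft, scaling=0.6)
print(toy_merged)
```

With scaling set to 0 the result equals the base weights, and with scaling set to 1 it equals the fine-tuned weights, so a coefficient of 0.6 places the merged model between the two, closer to the fine-tuned side.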
Good For
- Arithmetic Tasks: This model is particularly suited for applications requiring improved performance on arithmetic and mathematical reasoning problems, due to its specialized merging approach.
- Leveraging Qwen3-4B-Base: Users already familiar with or utilizing the Qwen3-4B-Base architecture can benefit from this version's enhanced mathematical focus.