SamsungSAILMontreal/Qwen3-4B-Instruct-2507-Math
SamsungSAILMontreal/Qwen3-4B-Instruct-2507-Math is a 4 billion parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507. This model is specifically optimized for mathematical reasoning tasks, leveraging the gsm8k dataset for training. It is designed to excel in solving grade school math problems, making it suitable for applications requiring numerical problem-solving capabilities.
Loading preview...
Qwen3-4B-Instruct-2507-Math Overview
This model is a specialized 4 billion parameter instruction-tuned variant, derived from the Qwen/Qwen3-4B-Instruct-2507 base model. Its primary differentiation lies in its fine-tuning on the gsm8k dataset, which focuses on grade school mathematical reasoning problems. This targeted training enhances its ability to process and solve numerical questions.
Key Capabilities
- Mathematical Reasoning: Optimized for solving arithmetic and word problems commonly found in grade school mathematics.
- Instruction Following: Retains the instruction-following capabilities of its base Qwen model, adapted for mathematical contexts.
- Efficient Fine-tuning: Developed using the TRL library with SFT/full-rank options, demonstrating a practical approach to domain-specific adaptation.
Performance and Use Cases
While the fine-tuning process resulted in a slight decrease in direct gsm8k score compared to the pre-fine-tuned Qwen3-4B-Instruct-2507 (76.8 vs 80.4), this model is specifically intended for research and applications where a dedicated mathematical problem-solving focus is beneficial. It serves as an experimental model for exploring meta-merge techniques, as detailed in the associated blog post. Developers can leverage this model for tasks requiring focused mathematical understanding and generation, particularly within educational technology or automated problem-solving systems.