michaelbzhu/Qwen2.5-Math-1.5B-GSM8K-SFT
The michaelbzhu/Qwen2.5-Math-1.5B-GSM8K-SFT model is a 1.5 billion parameter language model based on the Qwen2.5 architecture and fine-tuned for mathematical reasoning. It was trained with Supervised Fine-Tuning (SFT) on the GSM8K dataset to generate a structured thinking process followed by a final answer for grade school math word problems. It has a context length of 32768 tokens.
michaelbzhu/Qwen2.5-Math-1.5B-GSM8K-SFT Overview
This model is a 1.5 billion parameter variant of the Qwen2.5 architecture, developed by michaelbzhu, and has undergone Supervised Fine-Tuning (SFT) specifically on the GSM8K dataset. Its primary focus is on enhancing mathematical reasoning capabilities, particularly for grade school level arithmetic problems.
Key Capabilities & Training Details
- Mathematical Reasoning: The model is fine-tuned to process and solve mathematical word problems from the GSM8K dataset.
- Structured Output: It is trained to generate responses in a specific format: thinking steps (<think>{thinking tokens}</think>) followed by the final answer (<answer>{final answer}</answer>). This structured output aims to provide transparent reasoning.
- Performance on GSM8K: After 2 epochs of fine-tuning, the model achieved a "correct format" rate of 1260/1319 (about 95.5%) and a "correct reward" rate of 515/1319 (about 39.0%) on the GSM8K test set. It follows the desired output structure in most cases, while fewer than half of its responses earned the "correct reward".
- Context Length: Supports a substantial context length of 32768 tokens, allowing for processing longer problem descriptions or multi-step reasoning.
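The structured format and the two reported rates can be illustrated with a minimal sketch. It assumes a strict layout of exactly one <think> block followed by one <answer> block, and it compares answers after stripping commas, dollar signs, and whitespace. The function names (parse_response, is_correct, rate) and the normalization rule are hypothetical and chosen for illustration; the model card does not describe how the original evaluation parsed or scored responses.

```python
import re
from typing import Optional, Tuple

# Assumed strict format: one <think> block, then one <answer> block,
# with nothing but whitespace around or between them.
_PATTERN = re.compile(
    r"^\s*<think>(?P<think>.*?)</think>\s*<answer>(?P<answer>.*?)</answer>\s*$",
    re.DOTALL,
)


def parse_response(text: str) -> Optional[Tuple[str, str]]:
    """Return (thinking, answer) if the text follows the format, else None."""
    match = _PATTERN.match(text)
    if match is None:
        return None
    return match.group("think").strip(), match.group("answer").strip()


def _normalize(answer: str) -> str:
    # Hypothetical normalization: drop thousands separators and dollar signs.
    return answer.replace(",", "").replace("$", "").strip()


def is_correct(text: str, reference: str) -> bool:
    """True only if the format is valid and the answer matches the reference."""
    parsed = parse_response(text)
    if parsed is None:
        return False
    return _normalize(parsed[1]) == _normalize(reference)


def rate(count: int, total: int) -> float:
    """Express a count out of a total as a percentage."""
    return 100.0 * count / total


if __name__ == "__main__":
    sample = "<think>3 + 4 = 7</think><answer>7</answer>"
    print(parse_response(sample))
    print(is_correct(sample, "7"))
    # Counts reported in the model card for the GSM8K test set.
    print(f"correct format: {rate(1260, 1319):.1f}%")
    print(f"correct reward: {rate(515, 1319):.1f}%")
```

A response that omits the <think> block or adds text after </answer> is rejected by this sketch, so a looser pattern may be preferable if the downstream application only needs the final answer.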
When to Use This Model
- Mathematical Problem Solving: Intended for applications that solve arithmetic and grade school level math word problems.
- Reasoning Transparency: Suitable for use cases where the step-by-step thought process matters as well as the answer, since the structured output format exposes both.
- Educational Tools: Can be integrated into educational platforms for generating solutions or explanations for math problems.