Model Overview
tripathysagar/Qwen2.5-0.5B-GSM8K-SFT is a specialized language model built on the Qwen/Qwen2.5-0.5B base model. It has been fine-tuned with Supervised Fine-Tuning (SFT) using LoRA on the GSM8K dataset, specifically targeting grade-school mathematical reasoning tasks.
Key Capabilities
- Mathematical Reasoning: Excels at solving grade-school level math word problems.
- Structured Output: Designed to provide answers in a consistent format: "The answer is: {number}".
- Step-by-Step Solutions: Follows instructions to generate detailed solution steps before presenting the final numerical answer.
Training Details
The model underwent a single epoch of SFT using LoRA (r=32, alpha=16) applied to all linear layers. It was trained on 1,024 examples with a learning rate of 2e-4 and a per-device batch size of 8 with gradient accumulation over 4 steps (effective batch size 32), using bf16 precision.
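The training script itself is not published here, but the adapter setup described above can be expressed with the Hugging Face peft library roughly as follows (a sketch, assuming a standard peft-based SFT pipeline):

```python
from peft import LoraConfig

# LoRA hyperparameters as stated in the training details above.
# The surrounding trainer/optimizer code is an assumption, not the
# author's actual script.
lora_config = LoraConfig(
    r=32,                          # LoRA rank
    lora_alpha=16,                 # LoRA scaling factor
    target_modules="all-linear",   # adapt every linear layer
    task_type="CAUSAL_LM",
)
```

This config would then be passed to get_peft_model (or a trainer such as trl's SFTTrainer) together with the Qwen/Qwen2.5-0.5B base model.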
Ideal Use Cases
This model is particularly well-suited for applications requiring:
- Automated math problem-solving.
- Educational tools for generating math solutions.
- Integration into systems where precise, numerically formatted answers to arithmetic problems are needed.
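For integration, the model can be queried with the transformers library. This is a minimal sketch; the model ID is the one above, but the prompt wording is an assumption and should be adjusted to match the format used during fine-tuning:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tripathysagar/Qwen2.5-0.5B-GSM8K-SFT"

def build_prompt(question: str) -> str:
    # Instruction wording is hypothetical; mirror the actual SFT prompt format.
    return (
        "Solve the following math problem step by step.\n\n"
        f"Question: {question}\nAnswer:"
    )

def solve(question: str, max_new_tokens: int = 256) -> str:
    """Generate a step-by-step solution ending in 'The answer is: {number}'."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

The returned string can then be post-processed to pull out the final number from the "The answer is: {number}" line.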