od2961/Qwen2.5-1.5B-Open-R1-SFT is a 1.5 billion parameter language model, fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. This model specializes in mathematical reasoning and problem-solving, having been trained on the OpenR1-Math-220k dataset. It is optimized for tasks requiring numerical and logical computation, making it suitable for applications in quantitative analysis and educational tools.
Model Overview
This model, od2961/Qwen2.5-1.5B-Open-R1-SFT, is a specialized 1.5 billion parameter language model. It is a fine-tuned version of the base Qwen/Qwen2.5-1.5B-Instruct model, specifically enhanced for mathematical tasks.
Key Capabilities
- Mathematical Reasoning: The model has undergone Supervised Fine-Tuning (SFT) on the open-r1/OpenR1-Math-220k dataset, which focuses on mathematical problems.
- Instruction Following: Inherits instruction-following capabilities from its base model, Qwen2.5-1.5B-Instruct.
- Efficient Performance: With 1.5 billion parameters, it offers a balance between performance and computational efficiency for specialized tasks.
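The checkpoint can be loaded like any other Qwen2.5 chat model. The sketch below is a minimal, hypothetical usage example with the Hugging Face `transformers` library; the model ID comes from this card, but the prompt and generation settings are illustrative assumptions rather than recommended defaults.

```python
# Minimal inference sketch (assumed settings, not tuned defaults).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "od2961/Qwen2.5-1.5B-Open-R1-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a chat-formatted prompt for a simple math question.
messages = [
    {"role": "user", "content": "What is the sum of the first 10 positive integers?"}
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(
    tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
)
```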
Training Details
The model was trained with the TRL (Transformer Reinforcement Learning) library using Supervised Fine-Tuning (SFT). Training used TRL 0.16.0, Transformers 4.50.0, and PyTorch 2.6.0+cu124.
Good For
- Mathematical Problem Solving: Ideal for applications requiring the generation or understanding of mathematical solutions.
- Educational Tools: Can be integrated into systems for tutoring or generating math-related content.
- Quantitative Analysis: Suitable for tasks involving numerical reasoning and data interpretation.