zyl2023/Qwen2.5-1.5B-Open-R1-Distill is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B-Instruct on the OpenR1-Math-220k dataset using the TRL framework. The model is optimized for mathematical reasoning and problem-solving, and its primary strength is handling mathematical queries and generating relevant responses.
Model Overview
zyl2023/Qwen2.5-1.5B-Open-R1-Distill is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. The fine-tuning was performed with the TRL (Transformer Reinforcement Learning) framework.
Key Capabilities
- Mathematical Reasoning: The model's core strength lies in its enhanced ability to process and respond to mathematical problems, thanks to its training on the OpenR1-Math-220k dataset.
- Instruction Following: Inherits instruction-following capabilities from its base Qwen2.5-1.5B-Instruct model.
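The model can be loaded with the standard Hugging Face transformers APIs. The sketch below is a minimal, illustrative example (the sample question and generation settings are assumptions, not part of the model card):

```python
# Minimal inference sketch for zyl2023/Qwen2.5-1.5B-Open-R1-Distill
# using the Hugging Face transformers library. The sample question and
# max_new_tokens value are illustrative choices.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "zyl2023/Qwen2.5-1.5B-Open-R1-Distill"


def build_chat(question: str) -> list:
    """Wrap a math question in the chat-message format the model expects."""
    return [{"role": "user", "content": question}]


def answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the model and generate a response to a single question."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # apply_chat_template inserts the Qwen2.5 chat markers inherited
    # from the instruct base model.
    inputs = tokenizer.apply_chat_template(
        build_chat(question), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(answer("Solve for x: 2x + 3 = 11"))
```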
Training Details
This model was trained with Supervised Fine-Tuning (SFT) on the OpenR1-Math-220k dataset. The training run used TRL 0.16.0, Transformers 4.50.0, and PyTorch 2.5.1+cu121; further details can be viewed on Weights & Biases.
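A training setup like the one described above can be sketched with TRL's `SFTTrainer`. Note that the batch size, epoch count, and other hyperparameters below are placeholder assumptions, not the actual configuration of this run:

```python
# Hypothetical SFT sketch with TRL's SFTTrainer, mirroring the setup the
# card describes (base model, OpenR1-Math-220k, W&B logging). All
# hyperparameters are illustrative assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# OpenR1-Math-220k as published on the Hugging Face Hub.
dataset = load_dataset("open-r1/OpenR1-Math-220k", split="train")

training_args = SFTConfig(
    output_dir="Qwen2.5-1.5B-Open-R1-Distill",
    per_device_train_batch_size=4,  # assumed value
    num_train_epochs=1,             # assumed value
    report_to="wandb",              # the run was logged to Weights & Biases
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # base model named in the card
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```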
Good For
- Applications requiring mathematical problem-solving.
- Educational tools focused on math.
- Generating responses to quantitative questions.