kong3125/Qwen2.5-MATH-1.5B-BASE-RLOO-EP3-LR2e06

Text generation · Concurrency cost: 1 · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Published: Dec 16, 2025 · Architecture: Transformer

The kong3125/Qwen2.5-MATH-1.5B-BASE-RLOO-EP3-LR2e06 model is a fine-tuned version of Qwen's Qwen2.5-Math-1.5B base model, optimized for mathematical reasoning. As the name indicates, it was trained with the RLOO (REINFORCE Leave-One-Out) method on the jhn9803/hendrycks-math-with-answers dataset, following the reinforcement-learning-with-verifiable-rewards approach popularized by work such as DeepSeekMath. The model is intended for solving complex mathematical problems.


Model Overview

This model, kong3125/Qwen2.5-MATH-1.5B-BASE-RLOO-EP3-LR2e06, is a specialized language model derived from Qwen's Qwen2.5-Math-1.5B base model. It has been fine-tuned with reinforcement learning to substantially strengthen its mathematical reasoning.
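
A minimal generation sketch with the Hugging Face transformers library is shown below. The prompt text and sampling settings are illustrative choices, not taken from the model card, and the sketch assumes the checkpoint loads with AutoModelForCausalLM; since this is a base (non-chat) model, it uses plain completion-style prompting.

```python
# Minimal inference sketch (assumptions: checkpoint loads via AutoModelForCausalLM;
# prompt text and decoding settings are illustrative, not from the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kong3125/Qwen2.5-MATH-1.5B-BASE-RLOO-EP3-LR2e06"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # mirrors the BF16 listing above
    device_map="auto",
)

# Base model: state the problem and let the model continue the text.
prompt = (
    "Problem: What is the remainder when 2^10 is divided by 7?\n"
    "Solution:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,  # greedy decoding; a choice for the example, not from the card
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Loading in bfloat16 mirrors the BF16 listing above; on hardware without BF16 support, float32 is the safer default.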

Key Differentiators

  • Specialized for mathematical reasoning rather than general-purpose text generation.
  • Compact 1.5B-parameter model with a 32k context length, published in BF16.
  • Reinforcement-learning fine-tune (RLOO over 3 epochs at a learning rate of 2e-6, per the model name) on a Hendrycks MATH dataset that includes reference answers.

Use Cases

This model is particularly well-suited for applications requiring strong mathematical reasoning, such as:

  • Automated problem-solving in mathematics.
  • Educational tools for math assistance.
  • Research in AI for mathematical understanding and generation.

Training Details

The model was trained with the following framework versions:

  • TRL: 0.18.0
  • Transformers: 4.52.3
  • PyTorch: 2.6.0
  • Datasets: 2.17.0
  • Tokenizers: 0.21.4
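
As a hedged illustration of how a dataset of math problems with reference answers can back a verifiable-reward RL run like this one, the sketch below loads the named dataset with the datasets library and scores a model completion against a reference boxed answer. The split name, column layout, and the extract_boxed_answer/answer_reward helpers are assumptions made for this example; the actual reward used to train this model is not documented here.

```python
# Illustrative sketch only. Assumptions: the dataset exposes a "train" split, and the
# helper functions below are made up for this example, not part of any library.
import re
from datasets import load_dataset

dataset = load_dataset("jhn9803/hendrycks-math-with-answers", split="train")
print(dataset.column_names)  # inspect the actual schema before relying on it

def extract_boxed_answer(text: str) -> str | None:
    """Return the contents of the last \\boxed{...} in a solution, if any (no nested braces)."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def answer_reward(completion: str, reference_answer: str) -> float:
    """Simple verifiable reward: 1.0 if the boxed answer matches the reference, else 0.0."""
    predicted = extract_boxed_answer(completion)
    return 1.0 if predicted is not None and predicted == reference_answer.strip() else 0.0

# Example check against a hypothetical completion.
print(answer_reward("... so the answer is \\boxed{4}.", "4"))  # 1.0
```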