daviddavidlu/DAPO-with-prompt-augmentation-step2720
Text Generation | Concurrency Cost: 1 | Model Size: 1.5B | Quant: BF16 | Ctx Length: 32k | Published: Feb 5, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm
daviddavidlu/DAPO-with-prompt-augmentation-step2720 is a 1.5-billion-parameter model based on Qwen2.5-Math, fine-tuned with DAPO and prompt augmentation on the MATH Level-3-to-5 dataset. It is designed for mathematical reasoning: prompt augmentation is applied during reinforcement learning training to increase the diversity of reasoning traces and to stabilize training. The model is intended for multi-step mathematical problem solving, where it generates varied reasoning steps for complex problems.
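The description does not specify how prompt augmentation is implemented, so the following is only a minimal sketch of one plausible reading: each rollout for the same problem is wrapped in a randomly chosen instruction template, so the policy sees varied phrasings during training. The templates and the names `TEMPLATES`, `augment_prompt`, and `build_rollout_prompts` are hypothetical and are not taken from the model's training code.

```python
import random

# Hypothetical instruction templates (an assumption for illustration;
# the templates actually used in training are not given in the description).
TEMPLATES = [
    "Solve the following problem step by step and put the final answer in \\boxed{{}}.\n\n{problem}",
    "{problem}\n\nLet's think step by step and give the final answer in \\boxed{{}}.",
    "Problem: {problem}\n\nReason carefully, then state the final answer in \\boxed{{}}.",
]


def augment_prompt(problem: str, rng: random.Random) -> str:
    """Wrap a problem in one randomly selected instruction template."""
    template = rng.choice(TEMPLATES)
    return template.format(problem=problem)


def build_rollout_prompts(problem: str, num_rollouts: int, seed: int = 0) -> list[str]:
    """Build one augmented prompt per rollout for a single problem.

    A seeded generator keeps the selection reproducible across runs.
    """
    rng = random.Random(seed)
    return [augment_prompt(problem, rng) for _ in range(num_rollouts)]


if __name__ == "__main__":
    for prompt in build_rollout_prompts("What is the sum of the first 10 positive integers?", 4):
        print(prompt)
        print("-" * 40)
```

In a setup like this, the responses sampled from the differently phrased prompts would be scored against the same reference answer, which is one way varied prompts could lead to more varied reasoning traces.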