daviddavidlu/DAPO-with-prompt-augmentation-step2820
Text Generation | Concurrency Cost: 1 | Model Size: 1.5B | Quant: BF16 | Ctx Length: 32k | Published: Feb 3, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

The daviddavidlu/DAPO-with-prompt-augmentation-step2820 model is a Qwen2.5-Math-1.5B checkpoint trained with DAPO (without dynamic sampling) and prompt augmentation on the MATH Level-3-to-5 dataset. Developed by daviddavidlu, it is intended for mathematical reasoning tasks. Prompt augmentation is used during reinforcement learning to increase rollout diversity and training stability, and the model targets multi-step mathematical problem solving.
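As a rough illustration of how a checkpoint like this might be queried, the sketch below assumes the weights can be loaded through the Hugging Face transformers library under the repository id shown in the title. The prompt wording and the helper names build_prompt, extract_boxed, and solve are hypothetical: the card does not state which prompt templates were used in training, so treat the template here as a placeholder rather than the author's format.

```python
MODEL_ID = "daviddavidlu/DAPO-with-prompt-augmentation-step2820"


def build_prompt(problem: str) -> str:
    """Hypothetical plain-text prompt; the training template is not given in the card."""
    return (
        f"{problem.strip()}\n"
        "Please reason step by step, and put your final answer within \\boxed{}."
    )


def extract_boxed(text: str):
    """Return the content of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never closed


def solve(problem: str, max_new_tokens: int = 512) -> str:
    """Greedy-decode one solution. Assumes the checkpoint loads via transformers."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed in the card metadata.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    inputs = tokenizer(build_prompt(problem), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling extract_boxed(solve("What is the sum of the first 10 positive integers?")) would then return whatever the model places inside its final \boxed{...}, if it produces one. Greedy decoding is used here only to keep the sketch deterministic; sampling settings are a separate choice.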
