asparius/Qwen2.5-1.5B-SPO-1ep-iter2
Text generation · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Dec 24, 2025 · Architecture: Transformer

asparius/Qwen2.5-1.5B-SPO-1ep-iter2 is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B. It was trained on the DigitalLearningGmbH/MATH-lighteval dataset using the GRPO method, which is designed to enhance mathematical reasoning. The model is intended for tasks that demand strong mathematical problem-solving, and its 131072-token maximum context leaves room for long, multi-step solutions.
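A minimal usage sketch with Hugging Face `transformers` is shown below. The model id comes from this card; the prompt format is an assumption (the card does not document a chat template), so treat `build_prompt` as a hypothetical helper rather than the model's official input format.

```python
MODEL_ID = "asparius/Qwen2.5-1.5B-SPO-1ep-iter2"


def build_prompt(problem: str) -> str:
    """Wrap a math problem in a simple instruction prompt (assumed format,
    not an official template from the model card)."""
    return (
        "Solve the following problem step by step.\n\n"
        f"Problem: {problem}\n"
        "Solution:"
    )


def main() -> None:
    # Heavy imports kept local so the helper above stays importable
    # without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    inputs = tokenizer(build_prompt("What is 12 * 13?"), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

Generation parameters (sampling temperature, `max_new_tokens`, etc.) are left at illustrative defaults; tune them for your workload.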
