leonMW/DeepSeek-R1-Distill-Qwen-1.5B-GSPO-Basic
Text generation · Concurrency cost: 1 · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Sep 1, 2025 · Architecture: Transformer

leonMW/DeepSeek-R1-Distill-Qwen-1.5B-GSPO-Basic is a 1.5-billion-parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, with a 32,768-token context length. It was trained with the GRPO method, introduced in the DeepSeekMath paper, to strengthen its mathematical reasoning. The model is intended for tasks that demand robust reasoning, particularly in mathematical contexts.
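The core idea behind GRPO is to replace a learned value network with a group-relative baseline: several responses are sampled per prompt, and each response's reward is normalized against its group's mean and standard deviation. A minimal sketch of that normalization step, assuming simple 0/1 correctness rewards from a hypothetical rule-based checker (not part of this model card):

```python
# Sketch of the group-relative advantage used in GRPO
# (introduced in the DeepSeekMath paper). Each response's reward
# is normalized against the mean and std of its sampled group,
# so no separate value network (critic) is needed.

from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of sampled responses."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps guards against division by zero when all rewards are equal
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical example: four sampled answers to one math problem,
# scored 0/1 for correctness by a rule-based checker.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)  # correct answers get positive advantage, incorrect negative
```

Responses that beat their group's average receive a positive advantage and are reinforced; the advantages within a group sum to zero by construction.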
