Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p5_1p0_grpo_dr_grpo_42_rule
Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Mar 23, 2026 · Architecture: Transformer

Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p5_1p0_grpo_dr_grpo_42_rule is a 1.7-billion-parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained with GRPO, the reinforcement-learning method introduced in the DeepSeekMath paper, to strengthen mathematical reasoning. The model is intended for tasks that demand robust logical and mathematical processing, and its 32,768-token context length allows it to work over long inputs.
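A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under the repo id above and loadable through the standard `transformers` AutoModel API. The prompt format below is an illustrative assumption (the checkpoint derives from a base model, so no chat template is documented), not part of the model card:

```python
# Illustrative usage sketch for the model described above. The repo id is
# taken from the card; the prompt format and generation settings are
# assumptions, not a documented interface.

MODEL_ID = "Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p5_1p0_grpo_dr_grpo_42_rule"


def build_prompt(problem: str) -> str:
    """Wrap a math problem in a plain completion-style prompt.

    The model is fine-tuned from a *base* checkpoint, so this format is a
    guess rather than an official template.
    """
    return f"Problem: {problem}\nSolution:"


def generate(problem: str, max_new_tokens: int = 512) -> str:
    """Load the model and complete the prompt (requires transformers + torch)."""
    # Imported lazily so build_prompt() can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed in the card's metadata.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    inputs = tokenizer(build_prompt(problem), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


# Usage (downloads the checkpoint on first call):
# print(generate("What is 17 * 24?"))
```

The helper separates prompt construction from model loading so the expensive download happens only when `generate` is actually called.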
