Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p0_1p0_grpo_dr_grpo_42_rule
Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Mar 23, 2026 · Architecture: Transformer
Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p0_1p0_grpo_dr_grpo_42_rule is a 1.7-billion-parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath paper for enhancing mathematical reasoning. The model is optimized for tasks that demand robust mathematical and logical processing, making it well suited to specialized applications in that domain, and supports a context length of 32,768 tokens.