Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p2_1p0_grpo_dr_grpo_42_rule
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:Mar 23, 2026Architecture:Transformer Warm

Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p2_1p0_grpo_dr_grpo_42_rule is a 1.7 billion parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. This model was trained using the GRPO method, which is designed to enhance mathematical reasoning capabilities in large language models. It is optimized for tasks requiring robust mathematical and logical processing, making it suitable for applications where precise reasoning is critical. The model supports a context length of 32768 tokens.

Loading preview...