Kazuki1450/Qwen3-1.7B-Base_csum_3_10_tok_Sure_1p0_0p0_1p0_grpo_42_rule
Text generation · Concurrency cost: 1 · Model size: 2B · Quant: BF16 · Context length: 32k · Published: Mar 18, 2026 · Architecture: Transformer · Cold

Kazuki1450/Qwen3-1.7B-Base_csum_3_10_tok_Sure_1p0_0p0_1p0_grpo_42_rule is a 1.7-billion-parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath paper, to enhance mathematical reasoning. The model is intended for tasks that require robust logical and mathematical problem solving.
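A minimal usage sketch with the Hugging Face `transformers` library follows; the prompt format and generation settings are assumptions (the card does not specify them), so check the base model's documentation before relying on them.

```python
# Hedged sketch: load the model and generate a completion.
# Assumes `transformers` and `torch` are installed; prompt format
# and decoding settings are illustrative assumptions, not from the card.

MODEL_ID = "Kazuki1450/Qwen3-1.7B-Base_csum_3_10_tok_Sure_1p0_0p0_1p0_grpo_42_rule"

def solve(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a greedy-decoded completion for a reasoning prompt."""
    # Lazy import so the module can be inspected without the heavy deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

if __name__ == "__main__":
    print(solve("Question: What is 17 * 24? Answer step by step."))
```

Greedy decoding (`do_sample=False`) is a common default for math evaluation, where reproducible answers matter more than diversity.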
