Kazuki1450/Qwen3-1.7B-Base_csum_6_10_1p0_0p0_1p0_grpo_42_rule
Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Mar 11, 2026 | Architecture: Transformer | Cold
Kazuki1450/Qwen3-1.7B-Base_csum_6_10_1p0_0p0_1p0_grpo_42_rule is a fine-tuned version of Qwen/Qwen3-1.7B-Base, developed by Kazuki1450. It was trained with the TRL library using GRPO (Group Relative Policy Optimization), a reinforcement learning method introduced in the DeepSeekMath paper to improve mathematical reasoning. The model targets mathematical problem-solving tasks and keeps the Qwen3-1.7B-Base architecture.
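As a minimal sketch of how a checkpoint like this is typically loaded, the snippet below uses the Hugging Face `transformers` API. It assumes the weights are available on the Hugging Face Hub under the model ID above and that `transformers` and `torch` are installed. The `build_prompt` helper and its "Question/Answer" layout are hypothetical: the card does not document a prompt template, so treat the format as a placeholder.

```python
MODEL_ID = "Kazuki1450/Qwen3-1.7B-Base_csum_6_10_1p0_0p0_1p0_grpo_42_rule"


def build_prompt(question: str) -> str:
    """Hypothetical plain-text prompt; the card does not specify a template."""
    return f"Question: {question.strip()}\nAnswer:"


def generate(question: str, max_new_tokens: int = 256) -> str:
    """Load the model and return its continuation for one question."""
    # Imported lazily so build_prompt works without transformers/torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed in the card metadata.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("What is 3 + 4 + 5?")` downloads the weights on first use and returns the model's continuation. Because this is a base model rather than a chat model, a plain completion-style prompt is used here instead of a chat template.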