Kazuki1450/Qwen3-0.6B_csum_6_10_clean_1p0_0p0_1p0_grpo_42_rule
Text Generation · Concurrency Cost: 1 · Model Size: 0.8B · Quant: BF16 · Ctx Length: 32k · Published: Mar 16, 2026 · Architecture: Transformer

Kazuki1450/Qwen3-0.6B_csum_6_10_clean_1p0_0p0_1p0_grpo_42_rule is a 0.8-billion-parameter language model fine-tuned from Qwen/Qwen3-0.6B. It was trained with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath paper to enhance mathematical reasoning. Building on the Qwen3 architecture, the model is suited to tasks that benefit from improved reasoning, particularly in mathematical contexts.
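The core idea of GRPO is to score each sampled response relative to the other responses in its group, rather than against a learned value function. A minimal sketch of that group-relative advantage, following the normalization described in the DeepSeekMath paper (the function name and epsilon guard are ours, for illustration):

```python
def group_relative_advantages(rewards, eps=1e-8):
    """For one prompt, GRPO samples a group of responses and normalizes
    each response's reward against the group:
        A_i = (r_i - mean(rewards)) / std(rewards)
    No critic/value model is needed, which is what makes GRPO cheap."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    # eps guards against a group where every response got the same reward
    return [(r - mean) / (std + eps) for r in rewards]


# Example: a rule-based reward of 1.0 for a correct answer, 0.0 otherwise
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
```

Correct responses in the group receive positive advantages and incorrect ones negative, so the policy update pushes probability mass toward the group's better answers.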
