Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p8_1p0_grpo_42_rule
Text generation
- Concurrency Cost: 1
- Model Size: 2B
- Quant: BF16
- Ctx Length: 32k
- Published: Mar 24, 2026
- Architecture: Transformer

Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p8_1p0_grpo_42_rule is a 1.7-billion-parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained with GRPO (Group Relative Policy Optimization), a reinforcement learning method designed to enhance mathematical reasoning in large language models. Building on the Qwen3 base architecture, it is suited to tasks that require stronger logical and mathematical problem-solving.
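Assuming the checkpoint is published in the standard Hugging Face format (as Qwen3 fine-tunes typically are), the following is a minimal sketch of loading and prompting the model with the transformers library. The math prompt and generation settings are illustrative, not taken from the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p8_1p0_grpo_42_rule"

# Load tokenizer and model; BF16 matches the quantization listed above.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Illustrative prompt; the model is tuned for mathematical reasoning.
prompt = "If 3x + 5 = 20, what is the value of x? Show your reasoning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the generated continuation, dropping the prompt tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```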
