Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p1_1p0_grpo_dr_grpo_42_rule
Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Mar 23, 2026 | Architecture: Transformer | Warm

Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_1p0_0p1_1p0_grpo_dr_grpo_42_rule is a language model of about 1.7 billion parameters (listed as 2B), fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained with GRPO (Group Relative Policy Optimization), a reinforcement learning method introduced to improve mathematical reasoning in large language models. The model supports a context length of 32,768 tokens, and its fine-tuning targets multi-step logical and mathematical reasoning tasks.
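As a rough illustration of the group-relative idea behind GRPO, the sketch below normalizes the rewards of several completions sampled for the same prompt against the group's own mean and standard deviation. It is not this model's training code: the function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the sample rewards are all assumptions made for illustration, and the reward function and exact GRPO variant used for this model are not stated here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Return one advantage per completion in a sampled group.

    Each reward is centered on the group mean and scaled by the group
    standard deviation, so completions are scored relative to their
    siblings rather than by a separate learned value model.
    `eps` (an assumed constant) avoids division by zero when all
    rewards in the group are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rule-based rewards for four completions of one prompt:
# 1.0 for a correct final answer, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group average receive positive advantages and are reinforced; those below receive negative advantages. If every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.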
