Kazuki1450/Qwen3-1.7B-Base_csum_3_10_1p0_0p0_1p0_grpo_42_rule
Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Mar 18, 2026 | Architecture: Transformer | Cold

Kazuki1450/Qwen3-1.7B-Base_csum_3_10_1p0_0p0_1p0_grpo_42_rule is a language model fine-tuned from Qwen/Qwen3-1.7B-Base, with roughly 2 billion parameters (listed as 2B). It was trained with the TRL framework using GRPO (Group Relative Policy Optimization), a reinforcement learning method introduced to improve mathematical reasoning. It is intended for tasks that call for logical and mathematical problem-solving, and it keeps the base Qwen3-1.7B architecture.
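The reward function and training hyperparameters used for this model are not stated here. As a rough illustration of the core idea behind GRPO, the sketch below normalizes rewards within a group of completions sampled for the same prompt, so each completion is scored relative to its siblings. The function name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the 0/1 rewards are all assumptions made for this example, not details of the actual training run.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper for illustration only: real GRPO implementations may
    use a different standard deviation estimator or skip the scaling step.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    # eps avoids division by zero when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt
# (1.0 = judged correct, 0.0 = judged incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat the group average receive a positive advantage and the rest a negative one. When all completions in a group score the same, every advantage is zero and that group contributes no learning signal.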
