Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_rel_1e2_1p0_0p0_1p0_grpo_42_rule
Text generation · Concurrency cost: 1 · Model size: 2B · Quantization: BF16 · Context length: 32k · Published: Mar 24, 2026 · Architecture: Transformer

Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_rel_1e2_1p0_0p0_1p0_grpo_42_rule is a 1.7-billion-parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath work to strengthen mathematical reasoning in language models. The model is optimized for tasks that demand robust mathematical problem-solving, and its 32,768-token context length makes it suitable for long inputs in that domain.
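A minimal usage sketch with the Hugging Face `transformers` library is shown below. The model id comes from this card; the prompt, dtype choice, and generation settings are illustrative assumptions, not values specified by the author.

```python
MODEL_ID = "Kazuki1450/Qwen3-1.7B-Base_dsum_3_6_rel_1e2_1p0_0p0_1p0_grpo_42_rule"


def generate(prompt: str, max_new_tokens: int = 512) -> str:
    """Generate a completion from the model for a single prompt.

    Downloads the checkpoint on first call; the bfloat16 dtype matches
    the BF16 quantization listed on this card.
    """
    # Lazy import so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Hypothetical math prompt, in line with the model's stated specialization.
    print(generate("Solve step by step: what is the sum of the first 100 positive integers?"))
```

Since this is a base-style fine-tune rather than a chat model, plain text prompts (as above) are the safer assumption; a chat template may not be defined.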
