thangvip/qwen3-1.7b-dspo-no-sft-sgd-linear-6500
Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Feb 7, 2026 | Architecture: Transformer | Warm
The thangvip/qwen3-1.7b-dspo-no-sft-sgd-linear-6500 model is a fine-tuned version of Qwen/Qwen3-1.7B, developed by thangvip. This 1.7-billion-parameter model was trained with GRPO (Group Relative Policy Optimization), a reinforcement learning method, with the aim of improving mathematical reasoning. It is best suited to mathematical problem-solving tasks and keeps the base Qwen3 architecture.
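To illustrate the core idea behind GRPO, the sketch below computes group-relative advantages: several answers are sampled for one prompt, and each answer's reward is normalized against the mean and standard deviation of its own group. This is a minimal illustration, not this model's training code. The function name, the reward values, the epsilon, and the use of the population standard deviation are all assumptions made for the example; real implementations may differ in these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (GRPO-style).

    `eps` is an assumed small constant that avoids division by zero
    when every answer in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one math prompt:
# 1.0 = correct final answer, 0.0 = incorrect.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that score above their group's average receive a positive advantage and are reinforced, while below-average answers receive a negative one. If every answer in a group gets the same reward, all advantages are zero and that group contributes no learning signal.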