sergiopaniego/wordle-grpo-Qwen3-1.7B
Text generation · Model size: 1.7B parameters · Precision: BF16 · Context length: 32k · Architecture: Transformer · Published: Jan 29, 2026

sergiopaniego/wordle-grpo-Qwen3-1.7B is a 1.7-billion-parameter language model fine-tuned by sergiopaniego from the Qwen/Qwen3-1.7B base model. It was trained with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath paper for improving mathematical reasoning. The fine-tune aims to strengthen the model's reasoning capabilities, particularly for mathematical problem-solving; as the repository name suggests, the training task is likely the word game Wordle.
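To give a sense of what GRPO does, the sketch below illustrates its core step: for each prompt, a group of completions is sampled, and each completion's reward is normalized against the group's mean and standard deviation to obtain its advantage. This is a minimal illustration of the idea from the DeepSeekMath paper, not this model's actual training code; the function name and example rewards are invented for the example.

```python
# Sketch of GRPO's group-relative advantage computation (illustrative only).
# For one prompt, several completions are sampled and scored by a reward
# function; each reward is standardized within the group, and the resulting
# value is used as the advantage for that completion.
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Standardize per-completion rewards within one sampled group."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four completions for one prompt, with hypothetical reward scores.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Because the advantages are centered within each group, completions that beat their group's average are reinforced and the rest are penalized, without needing a separate learned value function.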
