shaikabdulfahad/wordle-qwen2-mini
Text generation · Open weights · Warm
- Model size: 0.5B
- Quantization: BF16
- Context length: 32k
- Concurrency cost: 1
- Published: Mar 22, 2026
- License: apache-2.0
- Architecture: Transformer

shaikabdulfahad/wordle-qwen2-mini is a 0.5-billion-parameter Qwen2-Instruct model fine-tuned by Shaik Abdul Fahad with Group Relative Policy Optimization (GRPO), a reinforcement learning method. The model is designed specifically to play Wordle, learning its strategy purely from reward signals rather than from supervised examples. It exhibits strategic guessing behavior, such as opening with vowel-rich words and using feedback from earlier guesses effectively, and supports a context length of 32,768 tokens.
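To make the feedback signal concrete, here is a minimal sketch of standard Wordle scoring (green / yellow / gray), the kind of environment feedback such a model consumes each turn. The function name and the `G`/`Y`/`-` letter codes are illustrative choices, not taken from the model's actual training code.

```python
from collections import Counter

def wordle_feedback(guess: str, answer: str) -> str:
    """Score a guess against the answer: G = right letter, right spot;
    Y = right letter, wrong spot; - = letter absent."""
    feedback = ["-"] * len(guess)
    remaining = Counter()
    # First pass: mark exact matches (green) and count unmatched answer letters.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            feedback[i] = "G"
        else:
            remaining[a] += 1
    # Second pass: mark misplaced letters (yellow), respecting letter counts
    # so duplicate letters in the guess are not over-credited.
    for i, g in enumerate(guess):
        if feedback[i] == "-" and remaining[g] > 0:
            feedback[i] = "Y"
            remaining[g] -= 1
    return "".join(feedback)

print(wordle_feedback("crane", "cigar"))  # GYY--
```

In a GRPO setup, a scalar reward would typically be derived from this kind of per-turn feedback (for example, rewarding greens and penalizing wasted guesses), and the policy is updated from relative rewards across a group of sampled guesses.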
