yasserrmd/Coder-GRPO-3B
Text generation · Concurrency cost: 1 · Model size: 3.1B · Quant: BF16 · Context length: 32k · Published: Feb 8, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

Coder-GRPO-3B is a 3-billion-parameter instruction-tuned causal language model developed by yasserrmd, based on Qwen/Qwen2.5-3B-Instruct. It was fine-tuned with Group Relative Policy Optimization (GRPO) on the glaiveai/glaive-code-assistant dataset. The model is geared toward code reasoning and generation, producing short, correct programs with concise explanations, and its primary strength lies in focused code tasks such as writing, refactoring, explaining, and fixing code.
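
Because the model is derived from Qwen/Qwen2.5-3B-Instruct, it can be used through the standard Hugging Face transformers chat workflow. The sketch below assumes the Hub checkpoint ships the usual Qwen chat template in its tokenizer config; the example prompt and generation settings are illustrative and not taken from the model card.

```python
# Minimal sketch: running Coder-GRPO-3B with Hugging Face transformers.
# Assumes the checkpoint's tokenizer provides a chat template (as Qwen2.5-Instruct does).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yasserrmd/Coder-GRPO-3B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BF16
    device_map="auto",
)

# Example coding prompt (illustrative, not from the model card)
messages = [
    {"role": "user", "content": "Write a Python function that reverses a singly linked list."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Strip the prompt tokens and print only the generated completion
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```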
