CM/Qwen2.5-1.5B-Open-R1-Code-GRPO
Text Generation | Concurrency Cost: 1 | Model Size: 1.5B | Quant: BF16 | Ctx Length: 32k | Published: Feb 21, 2025 | Architecture: Transformer | Warm
CM/Qwen2.5-1.5B-Open-R1-Code-GRPO is a 1.5-billion-parameter language model developed by CM and fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. It is optimized for code generation and problem-solving using the GRPO (Group Relative Policy Optimization) training method. The model targets verifiable coding tasks, where generated code can be checked automatically, and supports a context length of 32,768 tokens.
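To give a feel for what GRPO does with verifiable rewards, here is a minimal sketch of its group-relative advantage step: several completions are sampled for the same prompt, each gets a reward, and each reward is normalized against the group's mean and standard deviation. This is an illustration only, not the training code used for this model. The function name `group_relative_advantages`, the `eps` constant, the choice of population standard deviation, and the pass/fail reward values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    eps (an assumed small constant) avoids division by zero when
    every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards: 1.0 if a sampled solution passed its tests, else 0.0.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat the group average get a positive advantage and are reinforced, while those below it get a negative one. When all completions in a group score the same, every advantage is zero and that group contributes no learning signal.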