ViratChauhan/Qwen3-4B-RL
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Mar 26, 2026 | Architecture: Transformer | Warm

ViratChauhan/Qwen3-4B-RL is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B by ViratChauhan. It was trained with the TRL framework using GRPO (Group Relative Policy Optimization), the method introduced in the DeepSeekMath paper. The model is intended for general text generation, including conversational responses, and may also show some mathematical reasoning ability given its training method. Its context length is 32,768 tokens.
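
To make the GRPO reference concrete, here is a minimal sketch of the group-relative advantage that the DeepSeekMath paper describes: several completions are sampled for one prompt, each gets a scalar reward, and each reward is normalized against the group's mean and standard deviation. This is an illustration only, not this model's actual training code. The function name `group_relative_advantages`, the `eps` stabilizer, the use of the population standard deviation, and the example rewards are all assumptions made for the sketch.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize per-completion rewards within one sampled group.

    Hypothetical helper for illustration. `eps` (an assumed value) keeps
    the division finite when every completion gets the same reward.
    """
    if not rewards:
        raise ValueError("rewards must contain at least one value")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions of the same prompt
# (e.g. 1.0 = correct final answer, 0.0 = incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value network is needed to form the baseline.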
