NathanRoll/writing-rlvr-qwen2.5-1.5b
Text generation · Model size: 1.5B · Quant: BF16 · Context length: 32k · Architecture: Transformer · Published: Feb 23, 2026

NathanRoll/writing-rlvr-qwen2.5-1.5b is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B-Instruct using GRPO (Group Relative Policy Optimization), a reinforcement-learning method designed to enhance mathematical reasoning capabilities. The model is optimized for tasks that require advanced reasoning, particularly in mathematical contexts, making it suitable for specialized applications in that domain. It supports a context length of 32,768 tokens.
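Since the base model is a Qwen2.5 Instruct variant, it can presumably be loaded through the standard Hugging Face Transformers chat interface. The following is a minimal sketch, not an official usage snippet from the model author; the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch: loading the model with Hugging Face Transformers.
# The model ID comes from the card above; the dtype follows the BF16
# quant listed in the metadata. Generation settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "NathanRoll/writing-rlvr-qwen2.5-1.5b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Qwen2.5 Instruct derivatives use a chat template, so format the
# prompt as a message list rather than raw text.
messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the card advertises a 32k context window, long prompts should fit without truncation, though memory usage grows with sequence length.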
