hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-hotpot
Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quantization: FP8 · Context Length: 32k · Published: Mar 25, 2026 · Architecture: Transformer

The hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-hotpot model is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B. Developed by hector-gr, it was trained with GRPO (Group Relative Policy Optimization), a reinforcement-learning method designed to enhance mathematical reasoning. The model is optimized for tasks requiring advanced reasoning, building on the foundation of the Qwen2.5 architecture.
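As a sketch, the model can be loaded like any Hugging Face Hub checkpoint with the `transformers` library; the helper below is illustrative, and the dtype and device settings are assumptions, not settings documented by the author.

```python
MODEL_ID = "hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-hotpot"

def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model from the Hugging Face Hub.

    Requires `pip install transformers torch`; downloads weights on first call.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # assumption: let transformers pick the checkpoint dtype
        device_map="auto",    # assumption: place weights on available GPU(s)/CPU
    )
    return tokenizer, model
```

For generation, the usual `model.generate(**tokenizer(prompt, return_tensors="pt"))` flow applies; since this is a fine-tune of Qwen2.5-7B, the base model's chat template should be used when prompting.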
