hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-ece10-hotpot
Text generation · Concurrency cost: 1 · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Mar 25, 2026 · Architecture: Transformer

hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-ece10-hotpot is a 7.6-billion-parameter language model published by hector-gr and fine-tuned from Qwen/Qwen2.5-7B. It was trained with GRPO, the reinforcement-learning method introduced in the DeepSeekMath paper, with a focus on mathematical reasoning, and is intended for tasks that require advanced reasoning on top of a strong base model.
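The card does not include a usage snippet. Below is a minimal sketch of loading the model with the Hugging Face `transformers` library, assuming the standard `AutoModelForCausalLM` path works for this Qwen2.5-7B derivative (the prompt shown is only an illustration):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-ece10-hotpot"


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return a completion for `prompt`.

    Note: this downloads ~7.6B parameters of weights on first use;
    `device_map="auto"` places them on GPU if one is available.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # use the checkpoint's native dtype
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example (hypothetical prompt):
# answer = generate("What is the sum of the first 100 positive integers?")
```

Since the model was fine-tuned for reasoning, longer `max_new_tokens` budgets may be needed to let it complete a full chain of reasoning before answering.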
