hector-gr/RLCR-v4-ks-highcov-batch-hotpot
Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Mar 28, 2026 | Architecture: Transformer | Cold

The hector-gr/RLCR-v4-ks-highcov-batch-hotpot model is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B by hector-gr. It was trained with GRPO (Group Relative Policy Optimization), the method introduced in the DeepSeekMath paper, to strengthen its mathematical reasoning. With a context length of 32,768 tokens, the model targets tasks that require multi-step reasoning and problem solving, particularly those that benefit from strong mathematical understanding.
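For readers unfamiliar with GRPO, the sketch below illustrates the group-relative normalization at its core: several completions are sampled for one prompt, and each completion's reward is scored against the mean and standard deviation of its own group. This is a minimal illustration, not the training code used for this model. The function name `group_relative_advantages`, the `eps` stabilizer, the use of the population standard deviation, and the sample rewards are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group of sampled completions.

    Hypothetical helper for illustration: returns (r - group mean) / (group std + eps)
    for every reward in the group.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions sampled for a single prompt
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average receive a positive advantage and the rest a negative one, so no separate value model is needed to provide a baseline. When every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.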
