hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-noece-noaurc-scaletrue-batchcov-cold-math
Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Apr 5, 2026 | Architecture: Transformer | Cold

The hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-noece-noaurc-scaletrue-batchcov-cold-math model is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B. It was trained with GRPO (Group Relative Policy Optimization), the reinforcement learning method introduced in the DeepSeekMath paper on mathematical reasoning. The model targets tasks that require advanced mathematical problem-solving and logical deduction.
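This card gives no training details beyond naming GRPO. As a rough illustration of the idea behind the method, the sketch below computes group-relative advantages: each sampled completion's reward is normalized against the other completions for the same prompt. The function name, the reward values, and the use of the population standard deviation are assumptions made for this example. It is not the training code used for this model.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions sampled for a single prompt.

    Hypothetical helper for illustration; eps guards against division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one math prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to provide a baseline.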
