hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-ece10-cold-math
Text generation · Concurrency cost: 1 · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Mar 25, 2026 · Architecture: Transformer · Cold
The hector-gr/RLCR-v4-ks-uniqueness-cov0-entropy100-ece10-cold-math model is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B. It was trained with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath paper for enhancing mathematical reasoning. The model is optimized for complex mathematical tasks and logical problem-solving, and its 32,768-token context length allows it to work through long, detailed derivations. Its training methodology aims to push the limits of mathematical reasoning in open language models.
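The key idea of GRPO is that it drops the learned value network used by PPO: for each prompt it samples a group of completions and normalizes each completion's reward against the group's own mean and standard deviation to obtain advantages. A minimal sketch of that advantage computation (illustrative only; function and variable names are not from this model's training code):

```python
from statistics import mean, pstdev

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages for one prompt's sampled completions.

    Each completion's reward is normalized against the group's mean and
    standard deviation, so no separate value network is needed.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps guards against division by zero when all rewards are identical.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four sampled solutions to one math problem, scored 0/1 for
# correctness (with one partially-credited answer).
advs = grpo_advantages([1.0, 0.0, 1.0, 0.5])
```

Completions scoring above the group mean receive positive advantages and are reinforced; those below the mean are penalized, which is what drives the policy toward more reliable mathematical reasoning.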