hector-gr/RLCR-v4-ks-highcov-volume-cold-math
Text generation · Concurrency cost: 1 · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Published: Mar 28, 2026 · Architecture: Transformer · Cold

hector-gr/RLCR-v4-ks-highcov-volume-cold-math is a 7.6-billion-parameter language model fine-tuned by hector-gr from the Qwen/Qwen2.5-7B base model. It was trained with GRPO (Group Relative Policy Optimization), a reinforcement learning method that scores each sampled completion against the other completions drawn for the same prompt, and that is widely used to strengthen mathematical reasoning in large language models. With a context length of 32,768 tokens, the model is suited to tasks requiring advanced mathematical problem-solving and multi-step reasoning.
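As a rough illustration of the group-relative advantage computation at the heart of GRPO, the sketch below normalizes each sampled completion's reward against the group of completions drawn for the same prompt. This is a generic sketch of the technique, not code from this repository; all names and the example rewards are illustrative.

```python
from statistics import mean, pstdev

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages for one prompt's sampled completions.

    GRPO scores each completion relative to the others sampled for the
    same prompt: advantage_i = (r_i - mean(r)) / (std(r) + eps).
    The eps term guards against division by zero when all rewards tie.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Illustrative example: four sampled solutions to one math problem,
# graded 0/1 for correctness by a verifier.
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = grpo_advantages(rewards)
# Correct solutions receive positive advantage, incorrect ones negative,
# so the policy update pushes probability mass toward correct answers.
```

Because the baseline is the mean reward of the group itself, no separate learned value function is needed, which is one reason GRPO is popular for math-reasoning fine-tuning.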
