hector-gr/RLCR-v4-ks-uniqueness-buf5k-cold-math
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Mar 28, 2026Architecture:Transformer Cold

The hector-gr/RLCR-v4-ks-uniqueness-buf5k-cold-math model is a 7.6 billion parameter language model, fine-tuned from Qwen/Qwen2.5-7B. It was trained using the TRL framework and incorporates the GRPO method, specifically optimizing for mathematical reasoning tasks. This model is designed to enhance performance in complex mathematical problem-solving, building upon the capabilities of its base architecture.

Loading preview...