hector-gr/RLCR-v4-ks-uniqueness-hotpot-aliases-qwen35-balanced
Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Apr 8, 2026 · Architecture: Transformer
hector-gr/RLCR-v4-ks-uniqueness-hotpot-aliases-qwen35-balanced is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B by hector-gr. It was trained with GRPO, the reinforcement-learning method introduced in the DeepSeekMath paper, and is optimized for tasks requiring advanced reasoning. With a context length of 32768 tokens, it is suitable for complex conversational and analytical applications.
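Since the model is fine-tuned from Qwen2.5-7B, prompts follow Qwen's ChatML conversation format. In practice, the `transformers` tokenizer's `apply_chat_template` method renders this format for you; the sketch below constructs it by hand purely to illustrate the structure (the example messages are hypothetical):

```python
# Sketch: manually building a ChatML-style prompt as used by Qwen2.5 models.
# Real code should prefer AutoTokenizer.apply_chat_template from transformers;
# this standalone version only shows what that template produces.

def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts into Qwen2.5's ChatML format."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave the assistant turn open so the model generates its reply here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Which novel inspired the film Blade Runner?"},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

The resulting string would then be tokenized and passed to the model's `generate` method, with generation stopping at the `<|im_end|>` token.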