TMLR-Group-HF/GT-Qwen3-8B-Base-MATH
Task: Text generation
Model size: 8B parameters
Quantization: FP8
Context length: 32k
Published: Aug 5, 2025
License: MIT
Architecture: Transformer (open weights)

TMLR-Group-HF/GT-Qwen3-8B-Base-MATH is an 8-billion-parameter Qwen3-Base variant released by TMLR-Group-HF. It was trained with the GRPO Ground Truth method on the MATH training set, as described in the Co-rewarding paper, with the goal of eliciting reasoning in large language models. This makes it well suited to complex mathematical and reasoning tasks, and its 32,768-token context length allows it to handle long problem descriptions.
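As a minimal usage sketch, the model can be loaded with the Hugging Face `transformers` library like any causal LM. The repo id below comes from this card; the prompt format, generation settings, and the `generate_solution` helper are illustrative assumptions, not a documented interface for this model.

```python
MODEL_ID = "TMLR-Group-HF/GT-Qwen3-8B-Base-MATH"


def build_prompt(problem: str) -> str:
    # Simple instruction-style wrapper for a math problem.
    # This format is an assumption; the card does not specify a prompt template.
    return f"Problem: {problem}\nSolution:"


def generate_solution(problem: str, max_new_tokens: int = 512) -> str:
    # Import lazily so the prompt helper above stays usable without
    # transformers installed; loading the 8B weights requires a GPU or
    # substantial RAM.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(problem), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(generate_solution("Compute 12 * 17."))
```

Greedy decoding (`do_sample=False`) is chosen here for reproducible math answers; sampling parameters can be substituted for more varied reasoning traces.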
