stellalisy/rethink_rlvr_reproduce-ground_truth-qwen2.5_math_7b-lr5e-7-kl0.00-step150
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Jun 13, 2025Architecture:Transformer Cold

The stellalisy/rethink_rlvr_reproduce-ground_truth-qwen2.5_math_7b-lr5e-7-kl0.00-step150 is a 7.6 billion parameter language model based on the Qwen2.5 architecture, featuring a 32768 token context length. This model is specifically fine-tuned for mathematical reasoning and problem-solving tasks, aiming to reproduce ground truth results. Its primary strength lies in its specialized training for numerical and logical operations, making it suitable for applications requiring precise mathematical computation.

Loading preview...