Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step9000
Text Generation | Concurrency Cost: 1 | Model Size: 1.5B | Quant: BF16 | Ctx Length: 32k | Published: Mar 24, 2026 | Architecture: Transformer | Warm

Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step9000 is a 1.5 billion parameter language model fine-tuned from Qwen2.5; the name indicates a checkpoint saved at training step 9000. The model is trained for mathematical reasoning, specifically on the GSM8K dataset, with the goal of improving performance on grade school math word problems. Its compact size makes it suitable for applications that need efficient mathematical problem-solving.
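A minimal inference sketch is shown below, assuming the checkpoint loads with the standard Hugging Face `transformers` API and the Qwen2.5 chat template (the card does not confirm this). The model ID is taken from the card; the word problem is an illustrative GSM8K-style example, and generation settings are placeholders.

```python
# Hedged sketch: assumes this checkpoint works with AutoModelForCausalLM /
# AutoTokenizer and a chat template, like upstream Qwen2.5 models.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step9000"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Example GSM8K-style grade school math problem (illustrative only)
question = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?"
)
messages = [{"role": "user", "content": question}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Generate a step-by-step answer; max_new_tokens is an arbitrary choice.
output_ids = model.generate(input_ids, max_new_tokens=256)
answer = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
)
print(answer)
```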
