Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step2000
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Mar 24, 2026 · Architecture: Transformer

Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step2000 is a 1.5 billion parameter language model based on the Qwen2.5 architecture. As the name indicates, it is a checkpoint (step 2000) from fine-tuning on the GSM8K training split, specializing the model for mathematical reasoning. With a context length of 32,768 tokens, it can handle long, multi-step arithmetic and word problems; its primary strength is solving grade-school level math problems of the kind found in GSM8K.
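Below is a minimal usage sketch. It assumes the checkpoint is hosted on the Hugging Face Hub under the model ID above and loads with the standard transformers Auto classes in BF16; the example prompt is a typical GSM8K-style word problem, not taken from this model card.

```python
# Minimal sketch: load the checkpoint and solve a GSM8K-style word problem.
# Assumes the model ID resolves on the Hugging Face Hub and that the
# checkpoint works with plain text prompts (no chat template required).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step2000"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Example grade-school math problem (GSM8K style).
prompt = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether "
    "in April and May? Show your reasoning step by step."
)
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding keeps the arithmetic steps deterministic.
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```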
