Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step3000
Text generation · Concurrency cost: 1 · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Mar 24, 2026 · Architecture: Transformer
Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step3000 is a 1.5 billion parameter language model based on the Qwen2.5 architecture. It is fine-tuned for mathematical reasoning on the GSM8K dataset, a benchmark of grade-school math word problems, so its primary use case is numerical and logical problem solving.