Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step2500
Text generation · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Architecture: Transformer · Published: Mar 24, 2026

Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step2500 is a 1.5-billion-parameter language model, likely based on the Qwen2.5 architecture. It was fine-tuned for 2,500 steps on the GSM8K dataset, indicating a specialization in mathematical reasoning and word-problem solving. Its compact size makes it suitable for on-device deployment or other settings with limited computational budget where numerical accuracy is the priority.
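A minimal usage sketch, assuming the checkpoint loads with the standard Hugging Face `transformers` causal-LM API (the card does not document an inference interface, and the prompt template here is illustrative, not the one used in fine-tuning):

```python
MODEL_ID = "Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step2500"


def build_prompt(question: str) -> str:
    # GSM8K problems are plain-text word problems; this Question/Answer
    # framing is an assumption, since the training prompt format is not
    # documented on the card.
    return f"Question: {question}\nAnswer:"


def main() -> None:
    # Heavy imports and the model download happen only when run directly.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed on the card.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    prompt = build_prompt("If a pen costs 3 dollars, how much do 4 pens cost?")
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```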
