Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step8000
Task: Text generation
Concurrency cost: 1
Model size: 1.5B
Quantization: BF16
Context length: 32k
Published: Mar 24, 2026
Architecture: Transformer (warm)

Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step8000 is a 1.5-billion-parameter Qwen2.5-based model fine-tuned for mathematical reasoning. As the checkpoint name indicates, it was trained on the GSM8K training split, a benchmark of grade-school math word problems. It supports a context length of 32,768 tokens, making it suitable for long, multi-step mathematical queries.
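A minimal usage sketch with the Hugging Face `transformers` library is shown below. The prompt template is an assumption (the card does not document the fine-tuning format), and the example question is illustrative only; loading the model downloads the full BF16 checkpoint.

```python
# Sketch: prompting the checkpoint on a GSM8K-style question.
# The "Question: ... / Answer:" template is an assumption; the actual
# fine-tuning prompt format may differ.

MODEL_ID = "Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step8000"

def build_prompt(question: str) -> str:
    """Wrap a grade-school math question in a simple instruct-style prompt."""
    return f"Question: {question}\nAnswer:"

if __name__ == "__main__":
    # Requires `pip install transformers torch`; downloads the ~3 GB
    # BF16 checkpoint on first run.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )

    prompt = build_prompt(
        "A book costs $4 and a pen costs $2. "
        "How much do 3 books and 2 pens cost?"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The prompt-building helper is kept separate from the model-loading code so the formatting can be tested without downloading weights.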
