Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step1000
Text generation · Concurrency cost: 1 · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Mar 23, 2026 · Architecture: Transformer · Status: Warm
Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step1000 is a 1.5-billion-parameter language model, likely based on the Qwen2.5 architecture and fine-tuned for mathematical reasoning. The name suggests an intermediate checkpoint (step 1000) from training on the GSM8K train split, indicating a focus on grade-school math word problems. It supports a context length of 32,768 tokens. Its primary applications are numerical problem-solving and educational AI tools that require arithmetic and logical deduction.
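A minimal sketch of how such a checkpoint is typically evaluated on GSM8K-style problems. GSM8K reference solutions mark the final answer with a `#### <number>` suffix, so evaluation usually extracts that token from the model's completion. The prompt template below is an assumption (the checkpoint's actual fine-tuning format is not documented here); the snippet only demonstrates the answer-extraction step, which needs no model weights.

```python
import re

# GSM8K solutions end with "#### <number>"; extract that final answer.
ANSWER_RE = re.compile(r"####\s*(-?[\d,]+(?:\.\d+)?)")

def extract_answer(completion: str):
    """Return the final numeric answer after '####', or None if absent."""
    match = ANSWER_RE.search(completion)
    if match is None:
        return None
    return match.group(1).replace(",", "")  # strip thousands separators

def format_prompt(question: str) -> str:
    # Hypothetical zero-shot template; the checkpoint's real template is unknown.
    return f"Question: {question}\nAnswer:"

if __name__ == "__main__":
    sample = "She sold 48 + 24 = 72 clips altogether.\n#### 72"
    print(extract_answer(sample))  # -> 72
```

A full evaluation loop would pass `format_prompt(q)` to the model's generate call, then compare `extract_answer` of the completion against the reference answer.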