Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step1500
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Mar 24, 2026 · Architecture: Transformer · Warm
Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step1500 is a 1.5-billion-parameter language model based on Qwen2.5, fine-tuned for mathematical reasoning and problem solving; as the name indicates, it is a checkpoint taken at training step 1500 of fine-tuning on the GSM8K dataset. With a context length of 32,768 tokens, the model targets arithmetic and logical word problems, making it suitable for applications that need precise numerical and analytical capabilities rather than general-purpose text generation.
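For illustration, below is a minimal usage sketch showing one way the checkpoint could be loaded and queried with the Hugging Face transformers library. The model ID is taken from the listing above, but the assumption that the repository follows the standard Qwen2.5 chat template, as well as the generation settings, are hypothetical and should be verified against the actual model repository.

```python
# Minimal sketch: load the checkpoint and answer a GSM8K-style word problem.
# Assumes the repo is on the Hugging Face Hub and uses the standard Qwen2.5 chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step1500"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",
)

# A GSM8K-style grade-school math problem posed as a single chat turn.
messages = [
    {
        "role": "user",
        "content": (
            "Natalia sold clips to 48 of her friends in April, and then she sold "
            "half as many clips in May. How many clips did Natalia sell altogether?"
        ),
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding keeps the arithmetic deterministic for this kind of task.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```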