Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step4000
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Mar 24, 2026 · Architecture: Transformer

Ilia2003Mah/qwen2.5-1.5b-gsm8k-train-step4000 is a 1.5-billion-parameter language model based on Qwen2.5, with a 32,768-token context window. As its name indicates, it is a fine-tuned checkpoint (training step 4000) trained on the GSM8K dataset for mathematical reasoning, and it is intended for applications that require step-by-step numerical problem solving.
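As a minimal sketch of how such a model might be prompted, the snippet below builds a GSM8K-style prompt and checks it against the card's 32,768-token context limit. The instruction template, the `####` answer marker convention, and the 4-characters-per-token estimate are all illustrative assumptions, not part of the model card; a real deployment would use the actual Qwen2.5 tokenizer to count tokens.

```python
# Hypothetical sketch of preparing a GSM8K-style prompt for this model.
# The 32768-token context limit comes from the model card; the rough
# 4-chars-per-token heuristic below stands in for the real tokenizer.

CTX_LIMIT = 32768
CHARS_PER_TOKEN = 4  # crude estimate, NOT the actual Qwen2.5 tokenizer


def build_prompt(question: str) -> str:
    """Wrap a math word problem in a simple instruction template."""
    return (
        "Solve the following math problem step by step, "
        "and give the final answer after '####'.\n\n"
        f"Question: {question}\nAnswer:"
    )


def fits_context(prompt: str, max_new_tokens: int = 512) -> bool:
    """Rough check that prompt + generation budget stays under the context window."""
    est_prompt_tokens = len(prompt) // CHARS_PER_TOKEN
    return est_prompt_tokens + max_new_tokens <= CTX_LIMIT


prompt = build_prompt(
    "Natalia sold clips to 48 of her friends, and each friend bought 2 clips. "
    "How many clips did she sell in total?"
)
print(fits_context(prompt))  # a short word problem easily fits the 32k window
```

In practice the prompt would be passed to the model through a standard text-generation API (for example, the Hugging Face `transformers` `generate` call), with the reserved `max_new_tokens` budget covering the chain-of-thought solution.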
