rediska0123/qwen2.5-math-1.5b-dpo-gsm8k-v3
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quantization: BF16 · Context Length: 32k · Published: Mar 4, 2026 · Architecture: Transformer

The rediska0123/qwen2.5-math-1.5b-dpo-gsm8k-v3 model is a 1.5-billion-parameter language model based on the Qwen2.5-Math architecture. As its name indicates, it was fine-tuned with Direct Preference Optimization (DPO) on GSM8K data, with the v3 suffix marking the fine-tuning iteration. The model is intended for grade-school-level math word problems and similar quantitative reasoning tasks.
