rediska0123/qwen2.5-math-1.5b-dpo-gsm8k-v2
Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Mar 4, 2026 · Architecture: Transformer · Warm

The rediska0123/qwen2.5-math-1.5b-dpo-gsm8k-v2 model is a 1.5-billion-parameter language model with a 32,768-token context length. Developed by rediska0123, it is fine-tuned for mathematical reasoning using DPO (Direct Preference Optimization) on the GSM8K dataset. Its primary strength is solving grade-school math word problems, making it suitable for applications that require numerical and logical reasoning.
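To make the DPO fine-tuning mentioned above concrete, here is a minimal sketch of the per-pair DPO loss. The function name, the example log-probabilities, and the value of `beta` are all illustrative assumptions, not details from this model's training run; the formula itself is the standard DPO objective, where a "chosen" response would be a correct GSM8K solution and a "rejected" one an incorrect solution.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (hypothetical helper).

    Inputs are per-sequence log-probabilities under the policy being
    trained and under a frozen reference model; beta controls how far
    the policy is allowed to drift from the reference.
    """
    # Implicit rewards: how much more (or less) likely each response
    # is under the policy than under the reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): small when the chosen response is
    # clearly preferred, large when the rejected one still wins.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy already favors the chosen (correct) solution -> low loss:
low = dpo_loss(-10.0, -30.0, -20.0, -20.0)
# Policy still favors the rejected (incorrect) solution -> high loss:
high = dpo_loss(-30.0, -10.0, -20.0, -20.0)
```

Minimizing this loss pushes the policy to assign relatively higher probability to preferred (correct) solutions than the reference model does, which is how preference data from GSM8K can steer a base math model without an explicit reward model.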