rediska0123/qwen2.5-math-1.5b-dpo-gsm8k

Text Generation · Model Size: 1.5B · Quant: BF16 · Context Length: 32k · Published: Mar 3, 2026 · Architecture: Transformer

rediska0123/qwen2.5-math-1.5b-dpo-gsm8k is a 1.5 billion parameter language model based on the Qwen2.5 architecture with a 32768-token context length. It is fine-tuned with Direct Preference Optimization (DPO) on GSM8K, a dataset of grade-school math word problems, which specializes it in mathematical reasoning and step-by-step problem solving.

Model Overview

This model, rediska0123/qwen2.5-math-1.5b-dpo-gsm8k, is a 1.5 billion parameter language model built on the Qwen2.5 architecture. Its 32768-token context window lets it take in long inputs such as multi-part problem statements or extended worked examples.
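As a concrete starting point, here is a minimal loading-and-generation sketch using the Hugging Face transformers library. It assumes the repository follows the standard Qwen2.5 causal-LM layout (config, tokenizer, and BF16 weights); the sample prompt is illustrative, not from the model card.

```python
# Minimal sketch: load the model and run one generation.
# Assumes the repo exposes standard Qwen2.5 causal-LM files; adjust as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rediska0123/qwen2.5-math-1.5b-dpo-gsm8k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the advertised BF16 precision
    device_map="auto",
)

prompt = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```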

Key Characteristics

  • Architecture: Qwen2.5 base model.
  • Parameter Count: 1.5 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports up to 32768 tokens, enabling handling of long-form content and complex problem descriptions.
  • Fine-tuning: Utilizes Direct Preference Optimization (DPO) on the GSM8K dataset (a training sketch follows this list).
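For readers curious what a DPO fine-tune of this shape looks like in practice, below is a hypothetical sketch using the TRL library's DPOTrainer. DPO trains the policy to rank a preferred ("chosen") answer above a dispreferred ("rejected") one relative to a frozen reference model. The base checkpoint, dataset file, column format, and hyperparameters here are all assumptions for illustration; the model card does not disclose the actual training recipe.

```python
# Hypothetical DPO fine-tuning sketch in the style described above.
# Base checkpoint, data file, and hyperparameters are assumptions,
# not the authors' actual recipe.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "Qwen/Qwen2.5-Math-1.5B"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# DPO expects preference pairs: a prompt plus a chosen and a rejected answer.
# We assume a GSM8K-derived preference file with those three columns.
dataset = load_dataset("json", data_files="gsm8k_preferences.jsonl", split="train")

args = DPOConfig(
    output_dir="qwen2.5-math-1.5b-dpo-gsm8k",
    beta=0.1,  # strength of the pull toward the frozen reference model
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```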

Primary Specialization

This model is fine-tuned specifically for mathematical reasoning and problem solving using GSM8K, a benchmark of grade-school math word problems whose reference solutions spell out each reasoning step. This makes it particularly suited to tasks requiring numerical understanding, logical deduction, and explicit step-by-step working.
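GSM8K reference solutions conventionally end with a final line of the form `#### <answer>`. If the model preserves that convention in its outputs (an assumption, since the card shows no sample completions), a small helper can pull out the final numeric answer for scoring or downstream use:

```python
# Extract the final answer from a GSM8K-style solution ending in '#### <answer>'.
# Assumes the model follows the GSM8K output convention.
import re

def extract_final_answer(completion: str) -> str | None:
    match = re.search(r"####\s*(-?[\d,\.]+)", completion)
    if match:
        return match.group(1).replace(",", "")  # strip thousands separators
    return None

solution = "Natalia sold 48 clips in April and 24 in May. 48 + 24 = 72.\n#### 72"
print(extract_final_answer(solution))  # -> 72
```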

Intended Use Cases

  • Mathematical Problem Solving: Ideal for generating solutions or explanations for arithmetic, algebra, and other mathematical challenges.
  • Educational Tools: Can be integrated into platforms for tutoring or generating math exercises (a chat-style call is sketched after this list).
  • Logical Reasoning: Applicable in scenarios demanding structured logical thought processes.
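For the tutoring use case above, a chat-style call is a natural fit. The sketch below assumes the tokenizer ships a Qwen-style chat template (the card does not confirm this) and that a system message can steer the model toward stepwise explanations; the messages are illustrative.

```python
# Sketch of a tutoring-style call, assuming a Qwen-style chat template is present.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rediska0123/qwen2.5-math-1.5b-dpo-gsm8k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a patient math tutor. Explain each step."},
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```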