marioIsGoated/qwen2.5-math-1.5b-dpo-gsm8k

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:May 10, 2026Architecture:Transformer Warm

The marioIsGoated/qwen2.5-math-1.5b-dpo-gsm8k is a 1.5 billion parameter Qwen2.5-based language model developed by marioIsGoated. This model is fine-tuned for mathematical reasoning tasks, specifically targeting performance on the GSM8K dataset. With a context length of 32768 tokens, it is designed to excel in solving complex arithmetic and word problems.

Loading preview...

Model Overview

The marioIsGoated/qwen2.5-math-1.5b-dpo-gsm8k is a compact yet capable language model, built upon the Qwen2.5 architecture with 1.5 billion parameters. Developed by marioIsGoated, this model has been specifically fine-tuned using Direct Preference Optimization (DPO) to enhance its performance on mathematical reasoning tasks, particularly those found in the GSM8K dataset. It supports a substantial context length of 32768 tokens, allowing it to process longer and more complex problem descriptions.

Key Capabilities

  • Mathematical Reasoning: Optimized for solving arithmetic and word problems, making it suitable for educational or analytical applications requiring numerical understanding.
  • Qwen2.5 Architecture: Leverages the robust foundation of the Qwen2.5 series, known for its general language understanding capabilities.
  • Extended Context Window: A 32768-token context length facilitates handling multi-step problems and detailed mathematical descriptions.

Good For

  • Educational Tools: Integrating into platforms that assist students with math homework or problem-solving.
  • Automated Problem Solving: Applications requiring automated solutions or verification for mathematical questions.
  • Research in Mathematical LLMs: As a base model for further experimentation and fine-tuning on specific mathematical domains.