eth-nlped/MathDial-SFT-Qwen2.5-1.5B-Instruct

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Sep 17, 2025Architecture:Transformer Warm

The eth-nlped/MathDial-SFT-Qwen2.5-1.5B-Instruct model is a 1.5 billion parameter supervised fine-tuned (SFT) language model based on the Qwen2.5 architecture, developed by eth-nlped. It is specifically optimized for conversational math problem-solving, excelling at step-by-step reasoning within a dialogue context. This model is designed to provide scaffolding and interactive tutoring for math word problems, making it suitable for educational tools and research in dialogue-based problem solving.

Loading preview...

Overview

This model, developed by eth-nlped, is a supervised fine-tuned (SFT) language model built upon the Qwen2.5-1.5B-Instruct base model. Its core differentiator is its training on the MathDial dataset, which comprises conversational math word problems where a tutor guides a student through solutions step-by-step.

Key Capabilities

  • Conversational Math Problem Solving: Designed to engage in dialogue to solve math word problems.
  • Step-by-Step Reasoning: Excels at breaking down complex problems into manageable steps within a conversation.
  • Scaffolding: Provides guidance and support, mimicking a tutor's approach to help students understand solutions.
  • Context-Aware Responses: Utilizes a sliding window approach during training to ensure responses are relevant to the entire conversation history.

Training Details

The model was fine-tuned using the Hugging Face transformers and trl frameworks over 3 epochs. Each training example included an instruction, student's name, math word problem with solution, and the student's initial approach as input, with the tutor's step-by-step solution as the target output.

Intended Use Cases

  • Interactive math tutoring systems.
  • Research into dialogue-based problem-solving methodologies.
  • Integration into educational software and tools requiring guided mathematical assistance.