eth-nlped/TutorRL-7B
TutorRL-7B is a 7.6-billion-parameter variant of Qwen2.5-7B-Instruct, developed by eth-nlped and fine-tuned with reinforcement learning (GRPO) for pedagogical alignment. The model functions as a math tutor, optimized to scaffold reasoning and guide students through Socratic questioning rather than solving problems for them. It is well suited to interactive math tutoring and research on educational LLM alignment, and supports a 131,072-token context length.
TutorRL-7B: A Pedagogically Aligned Math Tutor
TutorRL-7B, developed by eth-nlped, is a 7.6-billion-parameter model based on Qwen2.5-7B-Instruct, fine-tuned to act as a math tutor rather than a direct problem-solver. Its core contribution is alignment with pedagogical principles via reinforcement learning (GRPO) in a synthetic multi-turn classroom environment, which removes the need for human-labeled tutoring data.
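The group-relative idea behind GRPO can be sketched briefly: several candidate tutor responses are sampled for the same student turn, each is scored, and each score is normalized against its own group's statistics to form an advantage. This is a minimal illustration of that normalization step only; the actual reward design and training loop are not specified in this card, and the example rewards are hypothetical.

```python
# Sketch of the group-relative advantage used in GRPO
# (Group Relative Policy Optimization). The pedagogical reward
# values below are made up for illustration.
from statistics import mean, pstdev

def grpo_advantages(rewards, eps=1e-6):
    """Normalize each sampled completion's reward against its group
    mean and standard deviation (eps avoids division by zero)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four tutor responses to the same student turn, scored by a
# (hypothetical) pedagogical reward: higher = better tutoring.
advs = grpo_advantages([0.2, 0.8, 0.5, 0.5])
```

Because advantages are centered within each group, responses that tutor better than their siblings get positive advantage and are reinforced, without needing an absolute reward scale or a learned value function.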
Key Capabilities
- Socratic Tutoring: Guides users through problem-solving with Socratic questioning, scaffolding reasoning, and withholding direct answers to foster learning.
- Pedagogical Alignment: Optimized for educational interactions, focusing on teaching methodologies over solution provision.
- Annotation-Free Training: Trained with a scalable, annotation-free pipeline, requiring no human-labeled tutoring dialogues.
Good For
- Interactive Math Tutoring: Ideal for applications requiring an AI to teach math concepts and problem-solving.
- Educational Research: A valuable tool for research into the educational alignment of large language models.
- Socratic Dialogue Generation: Capable of generating guided, inquiry-based conversations for learning.
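For the tutoring use cases above, inference follows the standard chat pattern for instruction-tuned Qwen2.5 models. The sketch below assumes the model is hosted on the Hugging Face Hub under the id shown in this card and uses the stock Qwen2.5 chat template; the system-prompt wording is an assumption, not the prompt used in training.

```python
# Minimal inference sketch (assumptions: Hub id as in this card,
# standard chat template; system prompt is illustrative only).
def tutor_reply(messages, model_id="eth-nlped/TutorRL-7B"):
    # Heavyweight imports kept local: calling this downloads the checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens.
    return tokenizer.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True)

# A tutoring exchange: the model is expected to reply with a guiding
# question (e.g. about isolating x), not the final answer.
messages = [
    {"role": "system",
     "content": "You are a patient math tutor. Guide the student "
                "step by step; do not reveal the final answer."},
    {"role": "user", "content": "How do I solve 2x + 3 = 11?"},
]
# reply = tutor_reply(messages)  # requires downloading the 7B checkpoint
```

For multi-turn tutoring, append each student message and tutor reply to `messages` and call the function again, so the model sees the full dialogue history.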
Unlike its variant, TutorRL-7B-think, this model does not emit <think> blocks for explicit planning before responding.