aryan-kolapkar/MathReasoner-Mini-1.5b

Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Nov 20, 2025 · License: apache-2.0 · Architecture: Transformer

MathReasoner-Mini-1.5b by aryan-kolapkar is a 1.5 billion parameter reasoning model, built upon Qwen2.5-Math-1.5B-base and fine-tuned specifically for mathematical reasoning tasks. It excels at solving high school level math problems, achieving approximately 83.7% accuracy on the GSM8K benchmark. The model is optimized for structured outputs, with 99% accuracy in generating reasoning within <think> tags and answers within <answer> tags, and has a context length of 32768 tokens.


MathReasoner-Mini-1.5b: Specialized Mathematical Reasoning Model

MathReasoner-Mini-1.5b is a 1.5 billion parameter language model developed by aryan-kolapkar, specifically engineered for mathematical reasoning. Built on the Qwen2.5-Math-1.5B-base architecture, this model has undergone a rigorous three-stage training process (SFT, DPO, and GRPO) to enhance its ability to solve school-level math problems, particularly those found in the GSM8K dataset.

Key Capabilities & Performance

  • High Mathematical Accuracy: Achieves approximately 83.7% Pass@1 zero-shot accuracy on the GSM8K benchmark, a significant improvement over the base Qwen2.5-Math-1.5B's 54%.
  • Structured Output: Demonstrates 99% accuracy in generating structured outputs, enclosing reasoning within <think> tags and numerical answers within <answer> tags, which is crucial for automated evaluation and clarity; a usage sketch follows this list.
  • Reinforcement Learning Enhanced: Utilizes GRPO (Group Relative Policy Optimization) with a custom reward function focusing on format strictness and correctness, further refining its reasoning capabilities.
  • Context Length: Supports a substantial context length of 32768 tokens.
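
A minimal inference sketch using the Hugging Face transformers library. The plain-question prompt shown here is an assumption, since the card does not document a chat template; check the tokenizer for one before relying on this format:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aryan-kolapkar/MathReasoner-Mini-1.5b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

question = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?"
)
inputs = tokenizer(question, return_tensors="pt").to(model.device)

# Greedy decoding; the model is trained to emit <think>...</think>
# followed by <answer>...</answer>.
output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```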

Training Methodology

The model's performance is a result of a multi-stage training approach:

  1. Supervised Fine-Tuning (SFT): Initial training on a curated GSM8K subset with self-verified generations.
  2. Direct Preference Optimization (DPO): Fine-tuning with ~1,000 preference pairs, emphasizing correct vs. incorrect reasoning and shorter, correct CoT (Chain-of-Thought) samples.
  3. GRPO Reinforcement Learning: Further optimization using GRPO on the GSM8K train split, incorporating a custom reward for format and correctness (a sketch of such a reward follows this list).
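
The card describes the GRPO reward only as targeting format strictness and correctness; the weights, regular expressions, and helper below are illustrative assumptions, not the author's actual implementation:

```python
import re

def reward(completion: str, gold_answer: str) -> float:
    """Hypothetical GRPO reward combining format strictness and correctness.
    The 0.5 / 1.0 weights are assumptions, not the card's actual values."""
    r = 0.0
    # Format strictness: exactly one <think> block, then one <answer> block.
    if re.fullmatch(r"\s*<think>.*?</think>\s*<answer>.*?</answer>\s*",
                    completion, re.DOTALL):
        r += 0.5
    # Correctness: the extracted answer matches the GSM8K gold answer.
    m = re.search(r"<answer>\s*([-\d,.]+)\s*</answer>", completion)
    if m and m.group(1).replace(",", "") == gold_answer.replace(",", ""):
        r += 1.0
    return r
```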

Recommended Use Cases

  • High School Level Math Problems: Ideal for tasks requiring step-by-step mathematical reasoning at a high school curriculum level.
  • Structured Reasoning Output: Particularly effective when applications require clearly delineated reasoning processes and final answers; see the parsing sketch after this list.
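
For application-side handling, a simple sketch that separates the chain of thought from the final answer. The tag format comes from the card; the helper itself is illustrative:

```python
import re

def split_reasoning_and_answer(completion: str) -> tuple[str, str]:
    # Log the <think> reasoning internally; surface only the <answer> to users.
    think = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    return (
        think.group(1).strip() if think else "",
        answer.group(1).strip() if answer else completion.strip(),
    )
```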

Note: The model is designed primarily for mathematical tasks and performs best when questions are posed in English. It is not recommended for general-purpose use.