Model Overview
Harsha901/Qwen3-4B-Inst-Math-Reasoning-SFT is a supervised fine-tuned (SFT) variant of the Qwen3-4B-Instruct model, developed by Harsha901. This 4-billion-parameter model is specifically optimized for mathematical reasoning and step-by-step problem solving, building on the Qwen3 architecture. It was fine-tuned using Unsloth and Hugging Face's TRL library, yielding approximately 2x faster training.
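As an instruct-tuned Qwen3 variant, the model expects chat-formatted prompts. A minimal sketch of the ChatML-style layout Qwen instruct models use is below; in practice you should call `tokenizer.apply_chat_template`, which applies the exact template shipped with the model, so this is illustrative only.

```python
# Illustrative sketch of the ChatML-style prompt layout used by Qwen
# instruct models. Prefer tokenizer.apply_chat_template in real code;
# this only shows the turn structure the template produces.

def build_chatml_prompt(system: str, user: str) -> str:
    """Format a system + user turn and open the assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful math tutor. Reason step by step.",
    "If 3x + 5 = 20, what is x?",
)
print(prompt)
```

The final `<|im_start|>assistant` turn is left open so generation continues as the assistant's step-by-step answer.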
Key Capabilities
- Multi-step mathematical reasoning: Handles complex math problems requiring several logical steps.
- Algebra, arithmetic, and word problems: Proficient in various mathematical domains.
- Chain-of-thought style explanations: Generates clear, logically structured reasoning chains.
- Improved instruction adherence: Follows prompts precisely for consistent outputs.
- More stable reasoning: Offers enhanced reliability compared to its base model.
Training and Evaluation
The model was trained on a curated dataset of instruction-style math prompts and step-by-step solutions, emphasizing logical consistency and clear intermediate steps. While formal benchmark results are planned, qualitative evaluations show improved structured reasoning and more consistent intermediate steps compared to the base model.
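The instruction-plus-solution training format described above can be sketched as a data-mapping step for TRL's `SFTTrainer`, which accepts records in a conversational "messages" layout. The field names (`problem`, `solution`) and the sample record are placeholders, not the actual training data, which is not published here.

```python
# Sketch of converting instruction-style math records into the
# conversational "messages" format accepted by TRL's SFTTrainer.
# Field names ("problem", "solution") are illustrative assumptions.

def to_messages(record: dict) -> dict:
    """Map one instruction/solution record to a chat-style example."""
    return {
        "messages": [
            {"role": "user", "content": record["problem"]},
            {"role": "assistant", "content": record["solution"]},
        ]
    }

example = to_messages({
    "problem": "What is 12 * 8?",
    "solution": "12 * 8 = 96. The answer is 96.",
})
```

A dataset of such records can then be passed to `SFTTrainer` directly; keeping the full step-by-step solution in the assistant turn is what teaches the model its chain-of-thought style.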
Good For
- Math problem solving: Ideal for generating solutions with detailed explanations.
- Educational assistants: Can serve as a tool for teaching and learning mathematics.
- Reasoning benchmarks: Suitable for tasks requiring logical deduction and problem-solving.
- Downstream alignment: A strong foundation for further preference tuning (DPO / RLHF).
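For the downstream alignment use case, TRL's `DPOTrainer` consumes preference data as prompt/chosen/rejected triples. A hedged sketch of shaping one such pair; the example texts are invented, not outputs from this model.

```python
# Sketch of a single preference pair in the prompt/chosen/rejected
# layout that TRL's DPOTrainer consumes. The texts are invented
# examples for illustration only.

def make_preference_pair(prompt: str, chosen: str, rejected: str) -> dict:
    """Bundle one DPO training example."""
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

pair = make_preference_pair(
    prompt="Solve: 7 + 6 * 2",
    chosen="Order of operations: 6 * 2 = 12, then 7 + 12 = 19. Answer: 19.",
    rejected="7 + 6 = 13, times 2 is 26. Answer: 26.",
)
```

Pairs like this, where the chosen completion reasons correctly and the rejected one does not, are the raw material for preference tuning on top of this SFT checkpoint.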
Limitations
The model's outputs are not guaranteed to be mathematically correct in all cases and should be verified for critical applications. Its reasoning-style outputs can be verbose, and it is not optimized for creative or non-technical writing.
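Since outputs should be verified, here is a minimal sketch of one common verification strategy: extract the final number from the model's reasoning trace and compare it to a trusted reference answer. The regex and tolerance are assumptions for illustration, not part of the model.

```python
import re

def last_number(text: str):
    """Extract the final numeric token from a reasoning trace, if any."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None

def answers_match(model_output: str, reference: float, tol: float = 1e-6) -> bool:
    """Check the trace's final number against a trusted reference answer."""
    value = last_number(model_output)
    return value is not None and abs(value - reference) <= tol

# 3x = 15, so x = 5 -> final number 5 matches the reference 5.0
print(answers_match("3x = 15, so x = 5.", 5.0))
```

This only checks the final answer, not the intermediate steps, so it is a cheap first filter rather than a full correctness proof.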