Name: sfewf/qwen3-4b-math-RL API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: sfewf

Qwen3-4b-math-RL: Enhanced for Mathematical Reasoning

The sfewf/qwen3-4b-math-RL model is a specialized version of the Qwen3-4b architecture, distinguished by its post-RL training on extensive mathematical datasets. This optimization aims to significantly improve its performance in complex reasoning and mathematical problem-solving.

Key Capabilities & Features

Max-Thinking Mode: Inspired by DeepSeek V4, this mode encourages an "absolute maximum" reasoning effort. It prompts the model to thoroughly decompose problems, rigorously test logic, and explicitly document its entire deliberation process, including intermediate steps and rejected hypotheses. This ensures comprehensive and verifiable reasoning.
Default Mode with Length-Penalty: When not in Max-Thinking mode, the model is trained with a length-penalty, enabling it to produce shorter, more concise responses while striving to maintain high accuracy.
Improved Reasoning Efficiency: Observations indicate that the RL training has led to more efficient reasoning processes in both default and Max-Thinking modes.

Performance Highlights

Evaluations demonstrate strong performance, particularly with the Max-Thinking mode:

GSM8K: Achieves 0.9172 (standard) and 0.9327 (max-effort) accuracy.
MATH-lighteval: Scores 0.8019 (standard) and 0.8505 (max-effort) accuracy.
BBH: Reaches 0.7963 (standard) and 0.8709 (max-effort) accuracy.
GPQA: Shows 0.2667 (standard) and 0.3125 (max-effort) accuracy.

Ideal Use Cases

This model is particularly well-suited for applications requiring:

Advanced Mathematical Problem Solving: Excels in tasks demanding detailed, step-by-step mathematical reasoning.
Complex Reasoning Tasks: Benefits from the Max-Thinking mode for problems requiring deep logical analysis and comprehensive deliberation.
Educational Tools: Can be used to generate detailed explanations for mathematical solutions.
Automated Code Generation (Math-related): Potentially useful for generating code snippets for mathematical algorithms or proofs.

Overview

Qwen3-4b-math-RL: Enhanced for Mathematical Reasoning

Key Capabilities & Features

Performance Highlights

Ideal Use Cases

Full Model Card (README)