Model Overview
This model, jaygala24/Qwen3-1.7B-ReMax-math-reasoning, is a specialized fine-tuned version of the Qwen3-1.7B base model, featuring approximately 1.7 billion parameters and a 32K context length. Its primary distinction lies in its optimization for mathematical reasoning through the application of the ReMax reinforcement learning algorithm.
Key Capabilities & Training
- Mathematical Reasoning: Specifically fine-tuned to excel at solving mathematical problems, trained on the gsm8k_train and math_train datasets.
- ReMax Algorithm: Utilizes the ReMax RL algorithm, notably without a KL penalty, to refine its problem-solving approach. This involves using a greedy-decoded response's reward as the baseline for advantages during training.
- Efficient Training: Trained with PipelineRL, leveraging DeepSpeed ZeRO Stage 3 for efficient distributed training.
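The ReMax baseline described above can be sketched in a few lines: for each prompt, the reward of a greedy-decoded response is subtracted from each sampled response's reward to form the advantage. This is a minimal, hypothetical illustration (function names are ours, not from the training code), omitting the policy-gradient machinery around it:

```python
def remax_advantages(sampled_rewards: list[float], greedy_reward: float) -> list[float]:
    """ReMax advantage: each sampled response's reward minus the
    reward of the greedy-decoded response for the same prompt.
    No KL-penalty term is included, matching the setup above."""
    return [r - greedy_reward for r in sampled_rewards]

# Example: three sampled responses scored 1.0, 0.0, 1.0; greedy response scored 1.0.
advantages = remax_advantages([1.0, 0.0, 1.0], greedy_reward=1.0)
# Each sampled response's log-probability term is then weighted by its advantage.
```

Using the greedy response as the baseline gives a per-prompt variance reduction without training a separate value model, which is the main appeal of ReMax over PPO-style methods.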
Use Cases
- Mathematical Problem Solving: Ideal for tasks requiring step-by-step reasoning to arrive at a numerical or logical mathematical answer.
- Educational Tools: Can be integrated into systems that assist with or evaluate mathematical exercises.
- Research in RL for Reasoning: Provides a practical example of ReMax application for enhancing reasoning capabilities in LLMs.