Name: SeaFill2025/Qwen3-4B-SFT API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: 96kevinli29

Qwen3-4B-SFT-Math: Specialized for Mathematical Reasoning

Qwen3-4B-SFT-Math is a 4-billion-parameter model from the Sea-Fill Community, fine-tuned from Qwen3-4B-Base using a pure long-think math recipe at a ~45K scale. It addresses the need for reproducible 'warm-start' SFT bases that bridge the gap between base models and RL-trained models, particularly for math-focused applications.

Key Capabilities & Features

Exceptional Math Reasoning: Demonstrates substantial performance gains in mathematical reasoning, with Pass@1 accuracy improvements of +20.62% on AIME 2025, +19.79% on AIME 2026, and +42.81% on AMC 2023 compared to its base model.
Optimized for CoT: Aligned for Chain-of-Thought (CoT) and instruction following, making it suitable for complex problem-solving requiring detailed step-by-step reasoning.
Warm-Start for RL: Designed as a robust SFT-only baseline for reinforcement learning (RL) research, allowing for further alignment studies.
Qwen Chat Template: Trained with the Qwen chat template, so responses are expected to end with <|im_end|>. Users should set eos_token_id to 151645.
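
The end-of-turn setting above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' inference code: the helper names build_generation_kwargs and trim_at_end_marker are hypothetical, and the default budget of 16384 tokens is an assumption based on the 16K cap mentioned in this card. The returned dictionary uses the eos_token_id and max_new_tokens keyword names accepted by Hugging Face transformers' generate method.

```python
# Value stated in the model card: the token id of <|im_end|>.
IM_END_TOKEN_ID = 151645
END_MARKER = "<|im_end|>"


def build_generation_kwargs(max_new_tokens: int = 16384) -> dict:
    """Hypothetical helper: decoding settings that stop generation at <|im_end|>.

    The default budget is an assumption; raise it for harder problems.
    """
    return {
        "eos_token_id": IM_END_TOKEN_ID,
        "max_new_tokens": max_new_tokens,
    }


def trim_at_end_marker(decoded_text: str) -> str:
    """Keep only the text before the first <|im_end|> marker, if one is present."""
    return decoded_text.split(END_MARKER, 1)[0]


if __name__ == "__main__":
    print(build_generation_kwargs(32768))
    print(trim_at_end_marker("The answer is 42.<|im_end|>extra text"))
```

With a loaded model, the dictionary could be unpacked into a call such as model.generate(**inputs, **build_generation_kwargs()); trimming is only needed when the decoded string still contains the marker.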

Use Cases & Limitations

Good for: Pure mathematical reasoning tasks, especially those requiring long-think processes and detailed derivations. Ideal for researchers exploring SFT-to-RL alignment in math domains.
Limitations: This model is trained with math-only SFT and is not optimized for general-domain reasoning, factuality, or instruction following outside of mathematics. It may produce hallucinations or unsafe outputs in non-math contexts. Long rollouts are common, and a significant fraction hit the 16K token cap on hard problems, so a larger token budget (e.g., 32K) is advisable for AIME-level evaluations.
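
One way to check whether a token budget is too small is to measure how many rollouts use the full budget without ever emitting the end token. The sketch below assumes each rollout is available as a list of generated token ids; the function names hit_token_cap and truncation_rate are hypothetical, and this is not the evaluation procedure used by the model's authors.

```python
from typing import List, Sequence

# Value stated in the model card: the token id of <|im_end|>.
IM_END_TOKEN_ID = 151645


def hit_token_cap(token_ids: Sequence[int], cap: int,
                  eos_token_id: int = IM_END_TOKEN_ID) -> bool:
    """True when a rollout filled the budget and never produced the end token."""
    return len(token_ids) >= cap and eos_token_id not in token_ids


def truncation_rate(rollouts: List[Sequence[int]], cap: int) -> float:
    """Fraction of rollouts that were cut off by the token cap."""
    if not rollouts:
        return 0.0
    truncated = sum(hit_token_cap(r, cap) for r in rollouts)
    return truncated / len(rollouts)


if __name__ == "__main__":
    # Toy data with a cap of 4 tokens; real runs would use 16384 or 32768.
    toy_rollouts = [
        [1, 2, 3, IM_END_TOKEN_ID],  # finished exactly at the cap
        [1, 2, 3, 4],                # cut off: no end token
        [5, IM_END_TOKEN_ID],        # finished early
    ]
    print(truncation_rate(toy_rollouts, cap=4))
```

A high rate at 16K on hard problems would be a signal to rerun the evaluation with a larger budget such as 32K.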
