Name: Kwai-Klear/Klear-Reasoner-8B-SFT API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Kwai-Klear

Klear-Reasoner-8B-SFT: Advanced Reasoning for Math and Code

Klear-Reasoner-8B-SFT is an 8-billion-parameter model from Kwai-Klear, designed for long reasoning capabilities, particularly in mathematics and coding. It demonstrates outstanding performance on challenging benchmarks such as AIME 2024/2025 and LiveCodeBench V5/V6, achieving scores up to 90.5% and 66.0% respectively with a 64K inference budget.

Key Innovations:

Quality-centric long CoT SFT: Leverages supervised fine-tuning distilled from DeepSeek-R1-0528.
Gradient-Preserving Clipping Policy Optimization (GPPO): A novel Reinforcement Learning (RL) method that preserves gradients from clipped tokens, significantly boosting exploration and convergence during training.

Performance Highlights:

Achieves competitive results against other 7B-8B models on AIME and LiveCodeBench, with notable improvements when using a 64K inference budget.
The model's training environment utilizes a sandbox for code evaluation (Firejail) and a math verification system (math_verify).

Use Cases:

Complex Mathematical Problem Solving: Excels in advanced math competitions and tasks requiring multi-step reasoning.
Code Generation and Debugging: Strong performance on live coding benchmarks suggests utility in programming assistance and automated code solutions.

Overview

Klear-Reasoner-8B-SFT: Advanced Reasoning for Math and Code

Key Innovations:

Performance Highlights:

Use Cases:

Full Model Card (README)