Overview
MetaStone-S1-32B is a 32.8-billion-parameter reflective generative model developed by MetaStoneTec. It introduces a reflective generative form that unifies "Long-CoT Reinforcement Learning" and "Process Reward Learning," a training methodology that lets the model develop deep reasoning while simultaneously learning to select high-quality reasoning trajectories. Because the policy model and the Process Reward Model (PRM) share a single backbone network, MetaStone-S1-32B cuts PRM inference cost by 99%, leading to faster and higher-quality responses.
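The shared-backbone idea can be sketched in a toy form: one forward pass over the token sequence produces hidden states that feed both the policy head (next-token logits) and a lightweight PRM head (per-step reward scores), so no separate PRM forward pass is needed. All names, dimensions, and the score aggregation below are illustrative assumptions, not the actual MetaStone-S1 architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; the real model is a 32.8B transformer.
d_model, vocab = 16, 50

def backbone(token_ids):
    # Stand-in for the transformer stack: one hidden vector per token.
    # In the real model this is the expensive part that is run only once.
    return rng.standard_normal((len(token_ids), d_model))

W_policy = rng.standard_normal((d_model, vocab))  # full LM head (policy)
w_prm = rng.standard_normal(d_model)              # lightweight PRM head

tokens = [3, 14, 15, 9, 2, 6]
h = backbone(tokens)                  # single shared forward pass

logits = h @ W_policy                 # policy: next-token logits per position
step_scores = h @ w_prm               # PRM: per-step scores from the SAME states

# Trajectory score: mean sigmoid of step scores (aggregation is an assumption).
traj_score = float(np.mean(1.0 / (1.0 + np.exp(-step_scores))))
print(logits.shape, step_scores.shape)
```

Because the PRM head is only a small projection on top of hidden states the policy already computed, scoring a trajectory adds almost nothing to the cost of generating it, which is where the claimed 99% PRM-inference saving comes from.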
Key Capabilities
- Advanced Reasoning: Excels in complex mathematics, coding, and Chinese reasoning tasks.
- Efficient Inference: Cuts PRM inference cost by 99% thanks to the shared backbone architecture.
- Competitive Performance: Demonstrates performance comparable to larger models, including the OpenAI-o3 series, despite its 32.8B parameter size.
- Long Context: Supports a context length of 131,072 (128K) tokens.
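The trajectory-selection capability amounts to best-of-N test-time scaling: sample several candidate reasoning trajectories from the policy, score each with the PRM, and keep the best. A minimal sketch, where `fake_step_score` is a hypothetical stand-in for the real PRM head:

```python
import random

random.seed(42)

def fake_step_score(step):
    # Stand-in for the PRM head; real scores come from the shared backbone.
    return random.random()

def score_trajectory(steps):
    # Average per-step score as the trajectory score (an assumption).
    return sum(fake_step_score(s) for s in steps) / len(steps)

def best_of_n(candidates):
    # Best-of-N selection: keep the highest-scoring trajectory.
    return max(candidates, key=score_trajectory)

candidates = [
    ["set up equation", "solve for x", "check answer"],
    ["guess", "verify"],
    ["draw diagram", "apply theorem", "compute", "check"],
]
best = best_of_n(candidates)
```

Variants such as the 'high' setting reported below plausibly correspond to spending more test-time compute (larger N) on this selection loop, though the exact configuration is not specified here.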
Performance Highlights
MetaStone-S1-32B (specifically the 'high' variant) shows strong benchmark results:
- AIME24: 85.2 (outperforming DeepSeek-R1-671B and OpenAI-o3-mini-medium)
- AIME25: 73.6 (competitive with OpenAI-o3-mini-medium)
- C-EVAL: 89.7 (competitive with DeepSeek-R1-671B)
Good for
- Applications requiring strong mathematical and coding reasoning.
- Tasks demanding high-quality, explainable reasoning trajectories.
- Scenarios where efficient inference for complex reasoning is critical.
- Use cases benefiting from a long context window for detailed problem-solving.