rStar2-Agent-14B: Advanced Agentic Reasoning Model

rStar2-Agent-14B is a 14 billion parameter model developed as part of the rStar2-Agent research, detailed in its technical report. This model demonstrates advanced agentic reasoning capabilities, achieving math reasoning performance comparable to much larger 67B models through pure agentic reinforcement learning.

Key Capabilities

Agentic Reasoning: Excels at planning, reasoning, and autonomous problem-solving.
Tool Use: Capable of efficiently using coding tools (specifically Python code execution) to explore, verify, and reflect during complex tasks.
Math Problem Solving: Optimized for mathematical reasoning, enabling it to tackle intricate math problems by breaking them down and validating solutions.
Reproducible Performance: The model's math evaluation results are designed to be reproducible, with official code and training recipes available on the GitHub repository.

Usage and Integration

The model can be served using SGLang and integrated via an OpenAI-compatible API. It supports dynamic tool calling, allowing it to interact with external functions like execute_python_code_with_standard_io for real-time code execution and result processing. This enables a dynamic problem-solving loop where the model can generate code, execute it, and use the output to refine its reasoning.