Name: sailing-lab/SR2AM-v0.1-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: sailing-lab

SR²AM-v0.1-8B: Self-Regulated Simulative Reasoning Agentic LLM

SR²AM-v0.1-8B is an 8 billion parameter language model developed by sailing-lab, designed to enhance agentic reasoning through a novel three-system decomposition. It integrates reactive execution (System I), simulative reasoning (System II) via an internal world model, and self-regulation (System III) managed by a learned configurator. This architecture allows the model to decide when and how deeply to plan, optimizing its reasoning process.

Key Capabilities and Features

System I + II + III Decomposition: Employs a configurator to dynamically decide planning depth, a simulative planner for constructing future-state-grounded plans, and reactive execution for fine-grained reasoning and tool use.
SFT + RL Training: Utilizes supervised learning on data encoding the self-regulated planning structure, followed by reinforcement learning (GRPO) to optimize for task success.
Agentic Tool Use: Supports web search (SerpAPI), web browsing with LLM summarization, and stateless Python code execution (SandboxFusion).
Compact and Efficient: Achieves an overall Pass@1 of 57.0 across 11 diverse benchmarks, including math, science, tabular analysis, and web information seeking. This performance is competitive with systems ranging from 120B to 355B parameters, while maintaining a compact size and efficient reasoning token usage (averaging 3,698 tokens per trajectory).

When to Use This Model

SR²AM-v0.1-8B is particularly well-suited for applications requiring complex, multi-step reasoning and agentic capabilities, especially where efficiency and performance at a smaller scale are critical. Its strengths lie in tasks that benefit from structured planning, self-regulation, and tool integration, such as:

Automated problem-solving in math and science domains.
Information retrieval and synthesis from the web.
Tasks requiring dynamic decision-making on planning depth.

For more details, refer to the project website and the research paper.

Overview

SR²AM-v0.1-8B: Self-Regulated Simulative Reasoning Agentic LLM

Key Capabilities and Features

When to Use This Model

Full Model Card (README)