Satori-SFT-7B: A Reasoning-Focused 7.6B Parameter Model
Satori-reasoning/Satori-SFT-7B is a 7.6-billion-parameter supervised fine-tuning (SFT) model developed by Satori-reasoning. It serves as the foundational SFT checkpoint for the more advanced Satori-7B-Round2 reinforcement learning (RL) model. The core innovation of Satori-SFT-7B is a small-scale format tuning (FT) stage that teaches the base large language model to internalize the Chain-of-Action-Thought (COAT) reasoning format.
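As a rough illustration only, a COAT-style trace interleaves reasoning steps with meta-action tokens; the exact trace layout below is an assumption inferred from the token names in this card (see the Satori paper for the authoritative format):

```python
# Hypothetical sketch of a COAT-style reasoning trace. The step texts
# and layout are illustrative assumptions, not the official format.
CONTINUE, REFLECT, EXPLORE = "<|continue|>", "<|reflect|>", "<|explore|>"

def build_coat_trace(steps):
    """Join reasoning steps, prefixing each with its meta-action token."""
    return "".join(f"{token} {text}\n" for token, text in steps)

trace = build_coat_trace([
    (CONTINUE, "Let x be the smaller integer, so the larger is x + 2."),
    (REFLECT,  "Check: consecutive even integers differ by 2, so this holds."),
    (EXPLORE,  "Alternatively, write the pair as (n - 1, n + 1) for odd n."),
])
print(trace)
```

Each action token signals a distinct reasoning move: continuing the current line of thought, reflecting on (verifying) a prior step, or exploring an alternative approach.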
Key Capabilities & Features
- COAT Reasoning Format: The model is specifically trained to understand and utilize the Chain-of-Action-Thought reasoning format, which structures problem-solving into explicit steps.
- Foundation for RL Models: It acts as a crucial stepping stone for further reinforcement learning, providing a strong base for more complex reasoning tasks.
- Mathematical Problem Solving: The provided usage example highlights its application in solving mathematical problems efficiently and clearly, emphasizing step-by-step reasoning.
- Special Token Handling: The model's sampling parameters allow special tokens such as "<|continue|>", "<|reflect|>", and "<|explore|>" to be skipped during decoding. These tokens mark the COAT meta-actions (continue reasoning, reflect on a step, explore an alternative), so they structure the model's thought process internally but can be hidden from the final output.
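When the action tokens are not skipped at decode time, they can be removed in post-processing. A minimal sketch of such a cleanup step, assuming the raw generated text still contains the three tokens listed above:

```python
import re

# The three COAT meta-action tokens named in this model card.
COAT_TOKENS = ("<|continue|>", "<|reflect|>", "<|explore|>")

def strip_coat_tokens(text: str) -> str:
    """Remove COAT action tokens and collapse leftover whitespace,
    approximating what skipping special tokens does at decode time."""
    pattern = "|".join(re.escape(tok) for tok in COAT_TOKENS)
    cleaned = re.sub(pattern, "", text)
    return re.sub(r"[ \t]+", " ", cleaned).strip()

raw = "<|continue|> Compute 2 + 3 = 5. <|reflect|> The sum checks out."
print(strip_coat_tokens(raw))  # → "Compute 2 + 3 = 5. The sum checks out."
```

Keeping the tokens instead of stripping them can be useful for research, since they expose where the model reflects or branches during problem-solving.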
Training Data & Resources
Satori-SFT-7B was trained on the full format tuning (FT) dataset of 300,000 unique questions. Further technical details are available in the Satori project's research paper and blog.
Good For
- Developing advanced reasoning models: Ideal as a base for further fine-tuning or reinforcement learning experiments focused on structured reasoning.
- Applications requiring step-by-step problem-solving: Particularly suited for tasks that benefit from explicit, verifiable reasoning paths, such as mathematics or logical puzzles.
- Research into Chain-of-Thought methodologies: Provides a practical implementation of format tuning for reasoning enhancement.