SakanaAI/RLT-7B

Text Generation · Model Size: 7.6B · Quant: FP8 · Context Length: 32k · Published: Jun 21, 2025 · License: apache-2.0 · Architecture: Transformer · Concurrency Cost: 1 · Open Weights

Sakana AI's RLT-7B is a 7.6 billion parameter autoregressive language model with a 131,072-token context length, built with the Reinforcement-Learned Teachers (RLT) pipeline: a 7B teacher model is trained with reinforcement learning to produce high-quality reasoning traces, and this student model is distilled from those traces via supervised fine-tuning with specific hyperparameters and reasoning tags. It is optimized for reasoning tasks and released for research and development as an experimental prototype.


SakanaAI/RLT-7B: Reinforcement-Learned Teacher Student Model

RLT-7B is a 7.6 billion parameter autoregressive language model developed by Sakana AI. This model is a "student" model, distilled from a 7B Reinforcement-Learned Teacher. The core innovation lies in the Reinforcement-Learned Teachers (RLT) pipeline, where the teacher model is explicitly trained to generate high-quality reasoning traces, which are then used to distill knowledge into the student model.
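At its core, the distillation step described above reduces to ordinary next-token cross-entropy on teacher-generated traces: the student is penalized for assigning low probability to the tokens of the teacher's reasoning trace. A toy sketch of that objective (the vocabulary size, logits, and trace tokens here are invented purely for illustration):

```python
import math

def cross_entropy_on_trace(logits_per_step, trace_token_ids):
    """Average negative log-likelihood the student assigns to a teacher trace.

    logits_per_step: one list of unnormalized student logits per trace position.
    trace_token_ids: the teacher's token at each position (the SFT target).
    """
    total = 0.0
    for logits, target in zip(logits_per_step, trace_token_ids):
        # Numerically stable log-sum-exp for the softmax normalizer.
        z = max(logits)
        log_norm = z + math.log(sum(math.exp(l - z) for l in logits))
        # Negative log-probability of the teacher's token under the student.
        total += log_norm - logits[target]
    return total / len(trace_token_ids)

# A student that is uniform over a 4-token vocabulary pays log(4) nats per token.
uniform_logits = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
loss = cross_entropy_on_trace(uniform_logits, [1, 2])
```

In the actual pipeline this loss is computed by a training framework over real tokenized traces; the sketch only shows the shape of the objective.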

Key Characteristics

  • RLT Pipeline: Utilizes a novel Reinforcement-Learned Teachers approach for distillation, focusing on reasoning trace quality.
  • Reasoning Optimization: Trained with supervised fine-tuning, following the hyperparameters and reasoning-tag format of Li et al. (2025), to enhance reasoning capabilities.
  • Context Length: Features a notable context length of 131,072 tokens.
  • Research Prototype: An experimental prototype intended for research and development, not commercial deployment.
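Since the weights are open, the checkpoint can presumably be loaded like any standard Hugging Face causal language model. The model id below comes from this card; the prompt and generation settings are illustrative assumptions, not documented defaults:

```python
MODEL_ID = "SakanaAI/RLT-7B"

def build_generation_kwargs(max_new_tokens=1024, temperature=0.6, top_p=0.95):
    # Hypothetical sampling defaults for illustration; tune on your own evals.
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }

def generate_reasoning(prompt):
    # Heavy imports deferred so the module is importable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, **build_generation_kwargs())
    return tokenizer.decode(output[0], skip_special_tokens=True)

# generate_reasoning("Prove that the sum of two even integers is even.")
# (commented out: downloads the 7.6B checkpoint)
```

Because this is a reasoning-tuned student, longer `max_new_tokens` budgets than usual are likely needed to leave room for the reasoning trace before the final answer.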

Evaluation and Resources

Evaluation of RLT-7B was conducted using the SkyThought library. Further details on the RLT pipeline, training methodology, and results can be found in the associated paper and the Sakana AI RLT repository.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model cover the following samplers:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p