Vinnnf/Thinkless-1.5B-RL-DeepScaleR is a 1.5-billion-parameter language model developed by Gongfan Fang, Xinyin Ma, and Xinchao Wang. It is trained with reinforcement learning using a Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, which enables it to adaptively select between short-form and long-form reasoning. By skipping unnecessary long-chain thinking, the model reduces computational cost while performing strongly on mathematical and reasoning benchmarks such as Minerva Algebra, MATH-500, and GSM8K.
Thinkless: Adaptive Reasoning LLM
Thinkless-1.5B-RL-DeepScaleR is a 1.5-billion-parameter model designed to decide when to engage in detailed, long-form reasoning and when a concise, short-form response suffices. Developed by Gongfan Fang, Xinyin Ma, and Xinchao Wang, the model uses a reinforcement learning framework built around two control tokens: <short> for brief answers and <think> for in-depth reasoning.
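The control-token mechanism can be illustrated with a minimal prompt-building sketch. The token names <short> and <think> come from the model card; the exact chat template, the helper name `build_prompt`, and the idea of pre-filling a token to force a mode are assumptions for illustration, not the official inference API.

```python
# Hypothetical sketch: steering the reasoning mode via control tokens.
# Token names <short>/<think> are from the model card; the prompt format
# and the forcing trick are illustrative assumptions.

THINK_TOKEN = "<think>"
SHORT_TOKEN = "<short>"


def build_prompt(question: str, force_mode: str = None) -> str:
    """Build a prompt; optionally pre-fill a control token to force a mode.

    With force_mode=None, the model itself emits <short> or <think> as the
    first generated token, and the learned policy decides reasoning depth.
    """
    prompt = f"User: {question}\nAssistant: "
    if force_mode == "think":
        prompt += THINK_TOKEN
    elif force_mode == "short":
        prompt += SHORT_TOKEN
    return prompt
```

Pre-filling the assistant turn with one of the two tokens constrains generation to that mode, while leaving it blank lets the policy choose adaptively.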
Key Capabilities
- Adaptive Reasoning: Employs a learnable framework to select optimal reasoning modes based on task complexity and the model's internal assessment.
- Computational Efficiency: Reduces the use of long-chain thinking by 50%–90% on various benchmarks, lowering inference cost compared to conventional Reasoning Language Models.
- Decoupled Learning: Uses a Decoupled Group Relative Policy Optimization (DeGRPO) algorithm that separates the control-token selection objective from the response-accuracy objective, stabilizing training and preventing mode collapse.
- Mathematical Proficiency: Demonstrates strong performance on mathematical and reasoning benchmarks such as Minerva Algebra, MATH-500, and GSM8K.
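The decoupling idea behind DeGRPO can be sketched numerically. This is an illustrative toy, not the authors' implementation: the function name `degrpo_token_weights`, the `alpha` trade-off parameter, and the exact normalization are assumptions. The point it demonstrates is that when a single mode-control token is averaged together with hundreds of response tokens (as plain token-level averaging would do), its gradient signal is drowned out; normalizing the two groups separately keeps the control signal comparable in scale.

```python
# Illustrative sketch of DeGRPO-style decoupled weighting (assumed, not official).
# Standard whole-sequence averaging gives the lone control token a weight of
# 1/(1+N) among N response tokens; decoupling normalizes each group on its own.

def degrpo_token_weights(num_response_tokens: int, alpha: float = 1.0):
    """Return (control_weight, per_response_token_weight).

    The control token is normalized over its own group of size 1 (scaled by
    alpha); response tokens are normalized over their own group, so response
    length no longer suppresses the mode-selection gradient.
    """
    control_weight = alpha
    response_weight = 1.0 / max(num_response_tokens, 1)
    return control_weight, response_weight


# With 100 response tokens, coupled averaging would weight the control token
# at 1/101; decoupled weighting keeps it at alpha regardless of length.
```

The design choice this illustrates is why decoupling stabilizes training: the balance between learning *which* mode to pick and learning to answer accurately stays fixed rather than drifting with response length.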
Good For
- Applications requiring efficient and adaptive reasoning, especially in mathematical problem-solving.
- Scenarios where balancing response conciseness with reasoning depth is crucial.
- Reducing inference costs for reasoning-intensive tasks by avoiding unnecessary complex thought processes.