Vinnnf/Thinkless-1.5B-RL-DeepScaleR is a 1.5-billion-parameter language model developed by Gongfan Fang, Xinyin Ma, and Xinchao Wang. It is trained with reinforcement learning using the Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, which enables it to adaptively choose between short-form and long-form reasoning for each query. By avoiding unnecessary long-chain thinking, the model reduces inference cost while performing strongly on mathematical and reasoning benchmarks such as Minerva Algebra, MATH-500, and GSM8K.
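A minimal usage sketch with the Hugging Face `transformers` library is below. The control-token names `<short>` and `<think>`, and the helper `reasoning_mode`, are assumptions for illustration based on the hybrid short/long reasoning design described above; they are not confirmed by this page.

```python
def reasoning_mode(generated_text: str) -> str:
    """Classify which reasoning mode the model chose from its leading
    control token. The token names "<short>" and "<think>" are assumed,
    not documented on this page."""
    stripped = generated_text.lstrip()
    if stripped.startswith("<think>"):
        return "long"   # full chain-of-thought reasoning
    if stripped.startswith("<short>"):
        return "short"  # concise direct answer
    return "unknown"


def main() -> None:
    # Requires `pip install transformers torch` and downloads the
    # 1.5B-parameter checkpoint on first run.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Vinnnf/Thinkless-1.5B-RL-DeepScaleR"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "What is 12 * 7?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    # Keep special tokens so any control token at the start is visible.
    completion = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=False,
    )
    print(f"mode={reasoning_mode(completion)}")
    print(completion)


if __name__ == "__main__":
    main()
```

The `reasoning_mode` helper is pure string inspection, so it can be reused to measure how often the model routes a dataset to short-form answers versus long-form reasoning.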