agentica-org/DeepScaleR-1.5B-Preview
TEXT GENERATION · Open Weights
- Concurrency cost: 1
- Model size: 1.5B
- Quantization: BF16
- Context length: 32k
- Published: Jan 29, 2025
- License: MIT
- Architecture: Transformer

DeepScaleR-1.5B-Preview is a 1.5-billion-parameter language model developed by agentica-org, fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B using distributed reinforcement learning. It is optimized for mathematical reasoning and problem solving, achieving 43.1% Pass@1 accuracy on AIME 2024 and surpassing much larger models such as OpenAI's o1-preview despite its far smaller parameter count.
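The Pass@1 figure above is typically computed with the standard unbiased Pass@k estimator (Chen et al., 2021); the exact sample counts used for this model's evaluation are not stated here, so the numbers below are illustrative only:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator.

    n: samples drawn per problem
    c: samples that are correct
    k: attempts allowed
    """
    if n - c < k:
        # Every size-k draw contains at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k=1 the estimator reduces to the per-problem success rate c/n.
print(pass_at_k(16, 7, 1))  # → 0.4375 (= 7/16)
```

Per-problem estimates are then averaged across the benchmark to give the reported accuracy.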


Popular Sampler Settings

The three parameter combinations most commonly used by Featherless users for this model.

- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
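These sampler parameters map directly onto the fields of an OpenAI-compatible chat-completions request. A minimal sketch of such a payload, assuming Featherless's OpenAI-compatible endpoint; the specific values shown are placeholders, not the configurations from the tabs above:

```python
import json

# Hypothetical sampler values for illustration only.
payload = {
    "model": "agentica-org/DeepScaleR-1.5B-Preview",
    "messages": [{"role": "user", "content": "Solve: what is 17 * 23?"}],
    "temperature": 0.6,         # softmax temperature
    "top_p": 0.95,              # nucleus-sampling cutoff
    "top_k": 40,                # keep only the k most likely tokens
    "frequency_penalty": 0.0,   # penalize tokens by occurrence count
    "presence_penalty": 0.0,    # penalize tokens already present at all
    "repetition_penalty": 1.0,  # multiplicative repetition penalty
    "min_p": 0.0,               # drop tokens below this probability floor
}

# The payload would be POSTed to the API, e.g. (assumed endpoint):
# requests.post("https://api.featherless.ai/v1/chat/completions",
#               headers={"Authorization": "Bearer <API_KEY>"},
#               json=payload)
print(json.dumps(payload, indent=2))
```

Note that `top_k`, `repetition_penalty`, and `min_p` are extensions beyond the core OpenAI schema; support for them varies by serving backend.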