Overview
DeepCoder-1.5B-Preview: Code Reasoning LLM
DeepCoder-1.5B-Preview is a 1.5-billion-parameter language model developed by agentica-org, designed specifically for code reasoning. It is fine-tuned from DeepSeek-R1-Distill-Qwen-1.5B using distributed reinforcement learning (RL).
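For quick experimentation, here is a minimal sketch of loading and prompting the model with Hugging Face Transformers; the Hub id agentica-org/DeepCoder-1.5B-Preview and the sampling settings are assumptions rather than an official recipe.

```python
# Minimal sketch: loading DeepCoder-1.5B-Preview with Hugging Face Transformers.
# The repo id and sampling settings are assumptions, not an official recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "agentica-org/DeepCoder-1.5B-Preview"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Write a Python function that returns the n-th Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.95
)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```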
Key Differentiators & Capabilities
- Reinforcement Learning (RL) for LLMs (RLLM): Leverages an improved GRPO+ algorithm, incorporating insights from DAPO, for more stable and effective training.
- Iterative Context Lengthening: Achieves a context length of 131,072 tokens (128K), generalizing well to long contexts thanks to techniques such as DAPO's overlong filtering (an illustrative training schedule is sketched after this list).
- Enhanced Code Performance: Significantly outperforms its base model, DeepSeek-R1-Distill-Qwen-1.5B, across coding benchmarks including LiveCodeBench (LCB v5), Codeforces, and HumanEval+:
  - LCB (v5): 25.1 (vs. 16.9)
  - HumanEval+: 73.0 (vs. 58.3)
- Open-Source & Accessible: Released under the MIT License, promoting open AI development and collaboration.
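To make the iterative context lengthening idea concrete, the sketch below models it as a staged cap on training sequence length that is raised as training progresses; the stage lengths and step counts are hypothetical, since the exact schedule is not given here.

```python
# Illustrative sketch of iterative context lengthening: the max sequence length
# used during RL training is raised in stages, so the policy first masters
# short contexts before extending to longer ones. The stage values and
# steps_per_stage are hypothetical placeholders.
CONTEXT_STAGES = [16_384, 32_768, 65_536]  # hypothetical stage lengths

def max_len_for_step(step: int, steps_per_stage: int = 1_000) -> int:
    """Return the training-time context cap for a given optimizer step."""
    stage = min(step // steps_per_stage, len(CONTEXT_STAGES) - 1)
    return CONTEXT_STAGES[stage]

# Example: the cap grows with training progress.
for step in (0, 1_000, 2_500):
    print(step, max_len_for_step(step))

# At evaluation time the model is run well beyond the final training cap
# (up to 131,072 tokens), relying on overlong filtering during training to
# keep long-context reasoning intact.
```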
Training Innovations
The model's training recipe includes several enhancements to the GRPO algorithm:
- Offline Difficulty Filtering: Ensures a suitable difficulty range in the training dataset without runtime overhead.
- No Entropy or KL Loss: Eliminates instability issues often associated with these terms in RL training.
- Overlong Filtering & Clip High: Techniques adopted from DAPO to preserve long-context reasoning and encourage exploration (see the combined loss sketch after this list).
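The sketch below combines these points into a single GRPO+-style policy loss under assumed tensor shapes: asymmetric clipping (the DAPO-style "clip high"), overlong filtering via masking, and no KL or entropy terms. It illustrates the recipe rather than reproducing the authors' implementation; the eps_low/eps_high values are placeholders.

```python
# Sketch of a GRPO+-style policy loss reflecting the recipe above.
# Tensor shapes and clipping values are assumptions, not the authors' code.
import torch

def grpo_plus_loss(
    logp_new: torch.Tensor,    # (B, T) log-probs under the current policy
    logp_old: torch.Tensor,    # (B, T) log-probs under the rollout policy
    advantages: torch.Tensor,  # (B,) group-normalized advantage per response
    token_mask: torch.Tensor,  # (B, T) 1.0 for response tokens, 0.0 for padding
    truncated: torch.Tensor,   # (B,) bool, True if response hit the length cap
    eps_low: float = 0.2,
    eps_high: float = 0.28,    # hypothetical DAPO-style asymmetric "clip high"
) -> torch.Tensor:
    ratio = torch.exp(logp_new - logp_old)                  # importance ratios
    adv = advantages.unsqueeze(1)                           # broadcast over tokens
    clipped = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high)
    per_token = -torch.minimum(ratio * adv, clipped * adv)  # PPO-style surrogate

    # Overlong filtering: drop truncated responses from the loss entirely, so
    # the model is not penalized for reasoning that ran out of context budget.
    mask = token_mask * (~truncated).float().unsqueeze(1)
    return (per_token * mask).sum() / mask.sum().clamp(min=1.0)
    # Note: no KL penalty and no entropy bonus, per the recipe above.
```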
Ideal Use Cases
- Code Generation: Excels at generating functional and complex code solutions.
- Code Problem Solving: Strong performance on competitive programming and coding challenge benchmarks.
- Long-Context Code Analysis: Capable of handling and reasoning over very long codebases or problem descriptions due to its extended context window (see the serving sketch below).
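One way to exercise the long-context use case is to serve the model with vLLM and raise max_model_len to the full window; the settings and the input file name below are illustrative assumptions.

```python
# Minimal sketch: long-context inference with vLLM. max_model_len and the
# sampling values are assumptions chosen to match the 131,072-token window.
from vllm import LLM, SamplingParams

llm = LLM(
    model="agentica-org/DeepCoder-1.5B-Preview",  # assumed Hub id
    max_model_len=131072,
)
params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=4096)

with open("large_codebase_dump.py") as f:  # hypothetical long input file
    code = f.read()

prompt = f"Review the following code and point out bugs:\n\n{code}"
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```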