Name: inclusionAI/DR-Venus-4B-RL API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: inclusionAI

DR-Venus-4B-RL: A Deep Research Agent

DR-Venus-4B-RL is a 4-billion-parameter model developed by inclusionAI for long-horizon web research and evidence-grounded question answering. It is a reinforcement-learned checkpoint built on the inclusionAI/DR-Venus-4B-SFT model, with Qwen/Qwen3-4B-Thinking-2507 as its base.

Key Capabilities & Training

This model's core strength lies in its agentic capabilities, trained with an IGPO-style agentic RL algorithm. Training uses information gain rewards and format-aware turn-level supervision to make execution more reliable over long tool-use trajectories. It uses a maximum rollout horizon of 200 interaction steps, and the model supports a maximum context length of 256K tokens, enabling extensive multi-turn interactions with search and visit tools.
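To make the idea of a turn-level information gain reward concrete, here is a minimal sketch. It assumes one common reading of the term: the reward for a turn is the change in the probability the policy assigns to the correct answer after that turn. The function name and the example probabilities are hypothetical, and the actual DR-Venus training recipe may differ.

```python
from typing import List


def information_gain_rewards(answer_probs: List[float]) -> List[float]:
    """Turn-level rewards under an assumed information-gain definition.

    answer_probs[0] is the probability assigned to the correct answer before
    any tool call; answer_probs[t] is the probability after turn t.
    The reward for turn t is the increase (or decrease) that turn produced.
    """
    if len(answer_probs) < 2:
        return []
    return [after - before for before, after in zip(answer_probs, answer_probs[1:])]


# Hypothetical trajectory: two helpful turns, then one that slightly hurts.
probs = [0.10, 0.15, 0.40, 0.35]
rewards = information_gain_rewards(probs)
for turn, reward in enumerate(rewards, start=1):
    print(f"turn {turn}: reward {reward:+.2f}")
```

Because each reward is a difference between consecutive probabilities, the rewards over a trajectory sum to the total change from the first probability to the last, so a turn that retrieves misleading evidence is penalized even when the final answer is correct.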

Performance Highlights

DR-Venus-4B-RL improves on its SFT counterpart and on other small models (under 9B parameters) across several deep research benchmarks. Reported gains:

BrowseComp: +2.3
BrowseComp-ZH: +2.0
xBench-DS-2505: +5.7
xBench-DS-2510: +5.4
DeepSearchQA: +1.9

These improvements are attributed to better formatting accuracy, more reliable tool use, and enhanced long-horizon execution stability.

Intended Use Cases

Long-horizon deep research with tool-augmented reasoning.
Improving execution reliability in complex, multi-step tasks.
Evidence-grounded answering using search and visit tools.
Deployment within the official DR-Venus inference pipeline.

It is not primarily optimized for plain chat or generic short-context instruction following without tools.
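The tool-augmented usage described above can be sketched as a bounded agent loop. This is an illustration only, not the official DR-Venus inference pipeline: `run_agent`, the `(tool_name, argument)` action format, the stub policy, and the stub `search` and `visit` tools are all hypothetical. Only the 200-step horizon and the tool names come from the description.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_STEPS = 200  # rollout horizon quoted in the model description

# Assumed action format: (tool_name, argument); "answer" ends the run.
Action = Tuple[str, str]


def run_agent(
    question: str,
    policy: Callable[[List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = MAX_STEPS,
) -> Tuple[Optional[str], List[str]]:
    """Alternate policy decisions and tool calls until an answer or the step cap."""
    history = [f"question: {question}"]
    for _ in range(max_steps):
        name, arg = policy(history)
        if name == "answer":
            return arg, history
        if name not in tools:
            # Record malformed calls so the policy can recover on the next turn.
            history.append(f"error: unknown tool {name!r}")
            continue
        history.append(f"{name}({arg!r}) -> {tools[name](arg)}")
    return None, history  # horizon exhausted without an answer


# Stub policy and tools standing in for the model and real web access.
def stub_policy(history: List[str]) -> Action:
    step = len(history) - 1
    if step == 0:
        return ("search", "DR-Venus-4B-RL base model")
    if step == 1:
        return ("visit", "https://example.com/model-card")
    return ("answer", "Qwen/Qwen3-4B-Thinking-2507")


stub_tools = {
    "search": lambda query: "1 result: https://example.com/model-card",
    "visit": lambda url: "base model: Qwen/Qwen3-4B-Thinking-2507",
}

answer, trace = run_agent(
    "Which base model does DR-Venus-4B-RL use?", stub_policy, stub_tools
)
print(answer)
```

In a real deployment the policy would be the model itself and the tools would issue live search and page-fetch requests; the step cap keeps a run from looping indefinitely when no answer is produced.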
