Name: rl-research/DR-Tulu-SFT-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: rl-research

DR-Tulu-SFT-8B: A Tool-Use Agent for Deep Research

DR-Tulu-SFT-8B is an 8 billion parameter model developed by rl-research, serving as the Supervised Fine-Tuning (SFT) checkpoint of the DR Tulu deep research agent. Built upon the Qwen3-8B architecture, this model is specifically designed and trained for advanced tool-use capabilities using the dr-agent-lib framework.

Key Capabilities & Differentiators

Specialized for Tool-Use: Unlike general-purpose LLMs, DR-Tulu-SFT-8B is explicitly trained to integrate and utilize external tools, making it highly effective for complex, multi-step research tasks.
Enhanced Research Performance: The model significantly outperforms its base model, Qwen3-8B, across various research-focused benchmarks. For instance, it achieves 72.3 on SQAv2, 38.1 on HealthBench, and 39.0 on DeepResearch Bench, demonstrating superior performance in tasks requiring deep information retrieval and synthesis.
SFT Training: It has undergone supervised fine-tuning on a dedicated dataset (rl-research/dr-tulu-sft-data) to optimize its agentic behavior and tool interaction.
Open Deep Research Agent: Positioned as an open research agent, it aims to facilitate advanced research applications.

Intended Use Cases

Deep Research: Ideal for applications requiring comprehensive information gathering, analysis, and synthesis from various sources.
Agentic Systems: Best utilized within the dr-agent-lib framework for building intelligent agents that can interact with tools to solve complex problems.
Question Answering: Excels in challenging question-answering scenarios, particularly those requiring external knowledge access and reasoning.

Note: This model is optimized for the dr-agent-lib framework; direct inference with standard HuggingFace or vLLM setups may not yield optimal results. Refer to the DR Tulu GitHub repository for proper usage and integration.

Overview

DR-Tulu-SFT-8B: A Tool-Use Agent for Deep Research

Key Capabilities & Differentiators

Intended Use Cases

Full Model Card (README)