rl-research/DR-Tulu-SFT-8B
DR-Tulu-SFT-8B by rl-research is an 8 billion parameter SFT (Supervised Fine-Tuning) checkpoint of DR Tulu, an open deep research agent built on Qwen3-8B. This model is specifically trained for tool-use within the dr-agent-lib framework, excelling in complex research-oriented question answering and information retrieval tasks. It demonstrates significant performance improvements over its base model in benchmarks like SQAv2, HealthBench, and DeepResearch Bench, making it suitable for advanced research applications requiring agentic capabilities.
Loading preview...
DR-Tulu-SFT-8B: A Tool-Use Agent for Deep Research
DR-Tulu-SFT-8B is an 8 billion parameter model developed by rl-research, serving as the Supervised Fine-Tuning (SFT) checkpoint of the DR Tulu deep research agent. Built upon the Qwen3-8B architecture, this model is specifically designed and trained for advanced tool-use capabilities using the dr-agent-lib framework.
Key Capabilities & Differentiators
- Specialized for Tool-Use: Unlike general-purpose LLMs, DR-Tulu-SFT-8B is explicitly trained to integrate and utilize external tools, making it highly effective for complex, multi-step research tasks.
- Enhanced Research Performance: The model significantly outperforms its base model, Qwen3-8B, across various research-focused benchmarks. For instance, it achieves 72.3 on SQAv2, 38.1 on HealthBench, and 39.0 on DeepResearch Bench, demonstrating superior performance in tasks requiring deep information retrieval and synthesis.
- SFT Training: It has undergone supervised fine-tuning on a dedicated dataset (
rl-research/dr-tulu-sft-data) to optimize its agentic behavior and tool interaction. - Open Deep Research Agent: Positioned as an open research agent, it aims to facilitate advanced research applications.
Intended Use Cases
- Deep Research: Ideal for applications requiring comprehensive information gathering, analysis, and synthesis from various sources.
- Agentic Systems: Best utilized within the
dr-agent-libframework for building intelligent agents that can interact with tools to solve complex problems. - Question Answering: Excels in challenging question-answering scenarios, particularly those requiring external knowledge access and reasoning.
Note: This model is optimized for the dr-agent-lib framework; direct inference with standard HuggingFace or vLLM setups may not yield optimal results. Refer to the DR Tulu GitHub repository for proper usage and integration.