rl-research/DR-Tulu-No-RLER-8B
Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Feb 24, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

DR-Tulu-No-RLER-8B is an 8-billion-parameter deep research agent developed by rl-research, built on rl-research/DR-Tulu-SFT-8B. The model was trained with reinforcement learning (RL) for tool use within the dr-agent-lib framework. It is an ablation model: it was deliberately trained without Reinforcement Learning with Evolving Rubrics (RLER), so that comparing it against the RLER-trained variant isolates RLER's contribution to performance. Its primary use case is research and educational exploration of RL training effects in deep research agents.