xxwu/Agent-STAR-RL-1.5B
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Mar 23, 2026License:mitArchitecture:Transformer Open Weights Warm

The xxwu/Agent-STAR-RL-1.5B is a 1.5 billion parameter language model, built on the Qwen2.5-1.5B-Instruct backbone, developed by xxwu. It is specifically fine-tuned using Reinforcement Learning (RL) within the STAR pipeline (Data Synthesis → SFT → RL) for long-horizon tool orchestration and planning tasks. This model excels at handling complex, multi-turn environments, particularly benefiting from scale-aware RL recipes for smaller models. Its primary application is in enabling agents to effectively use tools and plan over extended interactions.

Loading preview...