swordli/Qwen2.5-3B-Base-SAPO
Text Generation | Concurrency Cost: 1 | Model Size: 3.1B | Quant: BF16 | Ctx Length: 32k | Published: Mar 6, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm
swordli/Qwen2.5-3B-Base-SAPO is a 3.1-billion-parameter model based on the Qwen2.5 architecture, developed by Jian Li et al. It implements SAPO, a policy optimization method designed to stabilize post-training for autonomous multi-turn search agents. By enforcing token-level distributional constraints during training, the model aims to improve search agent performance on complex, real-world question-answering tasks.
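As a rough illustration of how the weights might be tried locally, here is a minimal sketch using the Hugging Face `transformers` library. It assumes the identifier `swordli/Qwen2.5-3B-Base-SAPO` resolves on the Hugging Face Hub and that the checkpoint loads as a standard causal language model. The helper names `build_prompt` and `generate_answer` and the prompt layout are hypothetical: this page does not state the prompt or tool-calling format used during training.

```python
MODEL_ID = "swordli/Qwen2.5-3B-Base-SAPO"  # identifier from this page; assumed to resolve on the Hub


def build_prompt(question: str) -> str:
    """Hypothetical plain-text prompt; the training-time format is not documented here."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 128) -> str:
    """Load the model in BF16 and greedily decode a continuation for one question."""
    # Heavy imports stay inside the function so this module imports without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("Who wrote Dune?")` downloads the weights on first use. Note that this sketch does not wire up any search tool, so the model would answer from its parameters alone rather than act as the multi-turn search agent it was trained to be.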