Alibaba-NLP/WebSailor-3B

Text Generation | Concurrency Cost: 1 | Model Size: 3.1B | Quant: BF16 | Ctx Length: 32k | Published: Jul 10, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights

WebSailor-3B by Alibaba-NLP is a language model designed for complex web navigation and information-seeking tasks. It uses a novel post-training methodology to teach the reasoning needed to work through high uncertainty in very large information spaces. The model targets challenging information-seeking problems and outperforms larger open-source agents on benchmarks such as BrowseComp-en and BrowseComp-zh.


WebSailor-3B: Advanced Web Navigation and Information Seeking

WebSailor-3B, developed by Alibaba-NLP, introduces a comprehensive post-training methodology to equip LLM agents with sophisticated reasoning capabilities for complex web navigation and information-seeking tasks. This model is specifically engineered to tackle scenarios characterized by extreme uncertainty and non-linear solution paths, a domain where previous open-source models often struggled compared to proprietary systems.

Key Capabilities and Innovations

  • Sophisticated Reasoning for Web Tasks: WebSailor is trained to handle information-seeking tasks across three difficulty levels, including "Level 3" problems that involve high uncertainty and intricate, non-linear solution paths.
  • SailorFog-QA Data Synthesis: The challenging training tasks come from SailorFog-QA, a novel data synthesis pipeline that constructs complex knowledge graphs and applies information obfuscation, producing questions that demand creative exploration.
  • Efficient Training Methodology: The training process involves generating expert trajectories and reconstructing reasoning for concise, action-oriented supervision. It begins with a "cold start" using rejection sampling fine-tuning (RFT) on high-quality examples, followed by an efficient agentic reinforcement learning stage powered by the Duplicating Sampling Policy Optimization (DUPO) algorithm.
  • State-of-the-Art Open-Source Agent: WebSailor-3B sets a new bar for open-source agents, with strong results on difficult benchmarks such as BrowseComp-en and BrowseComp-zh. Notably, its performance is comparable to proprietary systems like Doubao-Search, despite a smaller parameter count than many competing models.
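
The "cold start" step above can be pictured as a filter over sampled trajectories: generate many attempts, then keep only the ones that pass a quality check before fine-tuning on them. The sketch below is a minimal illustration of that idea, not the WebSailor implementation. The `Trajectory` type, the exact-match answer check, and the `max_steps` cutoff are all assumptions made for the example; the source only states that rejection sampling fine-tuning is applied to high-quality examples.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trajectory:
    """Hypothetical container for one sampled agent rollout."""
    question: str
    steps: List[str]      # thought / action / observation strings, in order
    final_answer: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def rejection_sample(
    trajectories: List[Trajectory],
    gold_answers: Dict[str, str],
    max_steps: int = 20,
) -> List[Trajectory]:
    """Keep rollouts that reach the reference answer within a step budget.

    Both acceptance criteria are assumptions chosen for illustration.
    """
    kept = []
    for traj in trajectories:
        gold = gold_answers.get(traj.question)
        if gold is None:
            continue  # no reference answer, so the rollout cannot be verified
        if normalize(traj.final_answer) != normalize(gold):
            continue  # wrong answer: rejected
        if len(traj.steps) > max_steps:
            continue  # correct but too long to be concise supervision
        kept.append(traj)
    return kept


samples = [
    Trajectory("q1", ["search", "visit"], "Paris"),
    Trajectory("q1", ["search"], "Lyon"),
    Trajectory("q1", ["search"] * 30, "paris"),
]
accepted = rejection_sample(samples, {"q1": "paris"})
print(len(accepted))
```

The accepted rollouts would then serve as supervised fine-tuning data before the reinforcement learning stage; how the real pipeline scores or reconstructs trajectories is not specified here.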

Ideal Use Cases

  • Complex Web Navigation: Automating tasks that require navigating intricate websites and extracting specific information.
  • Advanced Information Retrieval: Solving information-seeking queries where initial uncertainty is high and direct answers are not readily available.
  • Agentic Applications: Developing intelligent agents that can explore and reason within vast, unstructured web environments.
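
For the agentic use cases above, a model like this is typically driven by a loop that alternates between model decisions and tool calls. The following is a rough sketch of such a loop under stated assumptions: `fake_search`, `scripted_policy`, and `run_agent` are hypothetical names, the search tool is a hard-coded lookup, and the policy is a scripted stand-in where a real application would call the model. It does not reflect the tool interface WebSailor was trained with.

```python
from typing import Callable, Dict, List, Optional, Tuple

History = List[Tuple[str, str]]


def fake_search(query: str) -> str:
    """Stand-in for a web search tool (hypothetical, returns canned results)."""
    pages = {"capital of france": "Paris"}
    return pages.get(query.lower(), "no results")


def scripted_policy(history: History) -> Tuple[str, str, str]:
    """Stand-in for the model: search once, then answer with what was found.

    Returns (kind, name, argument), where kind is "action" or "answer".
    """
    observations = [text for role, text in history if role == "observation"]
    if not observations:
        question = history[0][1]
        return ("action", "search", question)
    return ("answer", "", observations[-1])


def run_agent(
    question: str,
    policy: Callable[[History], Tuple[str, str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_turns: int = 8,
) -> Tuple[Optional[str], History]:
    """Alternate policy decisions and tool calls until an answer or the turn limit."""
    history: History = [("question", question)]
    for _ in range(max_turns):
        kind, name, arg = policy(history)
        if kind == "answer":
            return arg, history
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool: {name}"
        history.append(("action", f"{name}({arg})"))
        history.append(("observation", observation))
    return None, history  # turn budget exhausted without an answer


answer, trace = run_agent("capital of France", scripted_policy, {"search": fake_search})
print(answer)
```

Swapping `scripted_policy` for a function that prompts the model and parses its output is the main change needed for real use; the turn limit guards against rollouts that never converge.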