WebDancer-32B: Autonomous Information Seeking Agent
WebDancer-32B is a 32 billion parameter model developed by Alibaba-NLP, specifically engineered for autonomous information seeking and reasoning. It operates as a native agentic search model, leveraging the ReAct framework to enable deep research-like capabilities.
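A ReAct agent interleaves Thought, Action, and Observation steps until it emits a final answer. A minimal sketch of that control flow, with a stubbed model and a stubbed search tool standing in for WebDancer-32B and a real search API (both are hypothetical placeholders, not part of the WebDancer release):

```python
import re

def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for WebDancer-32B generation; a real agent
    # would call the model here (e.g. via vLLM or transformers).
    if "Observation:" not in prompt:
        return "Thought: I should search.\nAction: search[WebDancer model]"
    return "Thought: I have enough context.\nFinal Answer: WebDancer is an agentic search model."

def stub_search(query: str) -> str:
    # Placeholder for a web-search tool call.
    return f"Top result for '{query}': an autonomous information-seeking agent."

def react_loop(question: str, max_steps: int = 5) -> str:
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        step = stub_model(prompt)          # Thought + Action (or Final Answer)
        prompt += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action:\s*search\[(.+?)\]", step)
        if match:                          # execute the tool, append Observation
            prompt += f"Observation: {stub_search(match.group(1))}\n"
    return "No answer within step budget."

print(react_loop("What is WebDancer?"))
```

Swapping the stubs for real model and tool calls preserves the same loop; the model's output format (Thought/Action/Observation markers) is what drives the control flow.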
Key Capabilities
- Autonomous Search and Reasoning: The model is trained to autonomously acquire and apply search and reasoning skills, making it suitable for complex, multi-step information retrieval tasks.
- Four-Stage Training Paradigm: Trained via a four-stage pipeline: browsing data construction, trajectory sampling, supervised fine-tuning for an effective cold start, and reinforcement learning for improved generalization.
- Data-Centric Approach: Integrates trajectory-level supervised fine-tuning and reinforcement learning (DAPO) to create a scalable pipeline for training agentic systems.
- Strong Benchmark Performance: Achieves a Pass@3 score of 61.1% on GAIA and 54.6% on WebWalkerQA, indicating its proficiency in handling challenging web-based question answering and task execution.
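Pass@3 counts a task as solved if at least one of three sampled rollouts succeeds. When more than k rollouts per task are available, a common unbiased estimator is used rather than naive subsampling (whether the WebDancer evaluation used this exact estimator is an assumption; the formula itself is standard):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    # n = total rollouts sampled, c = successful rollouts, k = attempt budget.
    # P(at least one success among k draws without replacement)
    #   = 1 - C(n - c, k) / C(n, k)
    if n - c < k:
        return 1.0  # too few failures to fill k draws: a success is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 rollouts per task, 4 succeeded, budget of 3 attempts.
print(round(pass_at_k(10, 4, 3), 4))  # → 0.8333
```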
Good For
- Autonomous Agents: Ideal for building agents that need to independently search for and process information from the web.
- Complex Information Retrieval: Suited for tasks requiring multi-step reasoning and interaction with web environments.
- Research and Analysis Automation: Can be applied to automate aspects of research by autonomously seeking and synthesizing information.
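For the agent use cases above, the model needs a system prompt that declares the available tools and the ReAct output format. A hypothetical template builder (tool names and wording are illustrative, not taken from the WebDancer release):

```python
# Illustrative tool registry; real deployments would describe their own tools.
TOOLS = {
    "search": "search[query] - issue a web search and return top results",
    "visit": "visit[url] - fetch and summarize the page at url",
}

def build_system_prompt(tools: dict) -> str:
    # Assemble a ReAct-style instruction block listing the available actions.
    lines = [
        "You are an autonomous information-seeking agent.",
        "Interleave Thought, Action, and Observation steps.",
        "Available actions:",
    ]
    lines += [f"- {spec}" for spec in tools.values()]
    lines.append("Finish with 'Final Answer: <answer>'.")
    return "\n".join(lines)

print(build_system_prompt(TOOLS))
```

Keeping the tool registry as data makes it easy to add or remove actions without touching the prompt-assembly logic.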