Qwen/WebWorld-14B
Qwen/WebWorld-14B is a 14 billion parameter open-web world model developed by Qwen, designed for training and evaluating web agents. It is trained on over 1 million real-world web interaction trajectories, enabling long-horizon simulation and multi-format state representations. This model excels at predicting web page states and agent actions, demonstrating strong cross-domain generalization for tasks like code, GUI, and game environments.
WebWorld-14B: A World Model for Web Agents
WebWorld-14B is a 14 billion parameter model from the Qwen WebWorld series, specifically engineered as an open-web world model for training and evaluating web agents. It is built upon the Qwen3-14B base model and trained on an extensive dataset of over 1 million real-world web interaction trajectories.
Key Capabilities & Features
- Long-horizon simulation: Supports web interaction simulations spanning 30+ steps.
- Multi-format state representations: Handles various web state formats including A11y Tree, HTML, XML, Markdown, and natural language.
- CoT-activated reasoning: Incorporates Chain-of-Thought reasoning for accurate transition prediction.
- Unified action space: Supports a comprehensive set of Python-style function calls for web actions (e.g., click, fill, goto, scroll).
- Cross-domain generalization: Demonstrates significant performance gains in diverse environments such as API services, code, games, and GUI desktops.
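Because actions are expressed as Python-style function calls, they can be parsed safely with the standard library rather than evaluated. The sketch below is illustrative only: the four action names come from the examples above, while the argument shapes (element IDs as strings, scroll offsets as numbers) and the `parse_action` helper are assumptions, not the model's documented schema.

```python
import ast

# Action names taken from the examples in this card; the real action space
# may contain more. Argument shapes used below are assumptions.
KNOWN_ACTIONS = {"click", "fill", "goto", "scroll"}


def parse_action(action_str):
    """Parse a call such as fill('34', 'hello') into (name, args, kwargs)."""
    tree = ast.parse(action_str.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"not a simple function call: {action_str!r}")
    name = call.func.id
    if name not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {name}")
    # literal_eval only accepts literals, so nothing in the string is executed.
    args = [ast.literal_eval(arg) for arg in call.args]
    kwargs = {}
    for kw in call.keywords:
        if kw.arg is None:
            raise ValueError("**kwargs unpacking is not supported")
        kwargs[kw.arg] = ast.literal_eval(kw.value)
    return name, args, kwargs


print(parse_action("fill('34', 'hello world')"))
```

Parsing into a `(name, args, kwargs)` tuple keeps the action string inert, which matters when the string was generated by a model rather than written by hand.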
Performance Highlights
WebWorld-14B shows strong performance in both intrinsic and extrinsic evaluations:
- Intrinsic Evaluation (WebWorld-Bench): Achieves 70.7% average factuality and 44.7% average Web Turing Score, comparable to larger proprietary models.
- Extrinsic Evaluation (Agent Training): When integrated with Qwen3-14B for agent training, it improves success rates by 8.3% on MiniWob++ and 9.2% on WebArena.
Ideal Use Cases
WebWorld-14B is recommended for:
- High-fidelity web simulation: For scenarios requiring accurate and robust long-horizon web environment predictions.
- Web agent development: As a core component for training and evaluating autonomous web agents.
- Data synthesis: For generating realistic web interaction trajectories.
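The use cases above share one pattern: an agent policy proposes an action, and the world model predicts the next page state in place of a live browser. A minimal sketch of that loop follows. The `rollout` function, the `Trajectory` class, and the toy policy and world model are all hypothetical stand-ins; in practice `world_model` would wrap a call to WebWorld-14B, whose prompt format is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Trajectory:
    """Synthesized (state, action) steps plus the final predicted state."""
    steps: List[Tuple[str, str]] = field(default_factory=list)
    final_state: str = ""


def rollout(initial_state, policy, world_model, max_steps=30):
    """Alternate policy and world-model calls for up to max_steps steps.

    policy(state) returns an action string, or None to stop.
    world_model(state, action) returns the predicted next state.
    Both callables are assumed interfaces, not the model's actual API.
    """
    state = initial_state
    traj = Trajectory()
    for _ in range(max_steps):
        action = policy(state)
        if action is None:
            break
        traj.steps.append((state, action))
        state = world_model(state, action)
    traj.final_state = state
    return traj


# Toy stand-ins so the sketch runs without a model.
def toy_policy(state):
    # Stop once the toy page reaches version 3.
    return None if state.endswith("v3") else "click('next')"


def toy_world_model(state, action):
    version = int(state.rsplit("v", 1)[1])
    return f"page v{version + 1}"


traj = rollout("page v0", toy_policy, toy_world_model)
print(len(traj.steps), traj.final_state)
```

The default of 30 steps mirrors the long-horizon figure quoted earlier; the recorded `(state, action)` pairs are the kind of trajectory that could feed agent training or evaluation.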
Limitations
Users should be aware of several limitations: the model may show sycophancy or optimism bias, it is not focused on the fidelity of long-form content generation, and it is text-only, with no visual or pixel-level simulation.