mPLUG/GUI-Owl-1.5-32B-Instruct

Vision | Concurrency Cost: 2 | Model Size: 33.4B | Quant: FP8 | Ctx Length: 32k | Published: Feb 15, 2026 | License: mit | Architecture: Transformer | Open Weights

GUI-Owl 1.5-32B-Instruct is a 33.4-billion-parameter instruction-tuned model developed by X-PLUG, built on Qwen3-VL, and designed for multi-platform GUI automation. It has a 32,768-token context length and targets native GUI agent tasks across desktops, mobile devices, and browsers, with state-of-the-art results reported on benchmarks such as OSWorld-Verified and Mobile-World. The model features native tool and MCP calling and long-horizon memory, and it can run as a standalone agent or within multi-agent frameworks.


GUI-Owl 1.5-32B-Instruct Overview

GUI-Owl 1.5-32B-Instruct is a 33.4-billion-parameter model from the next-generation GUI agent family, developed by X-PLUG and based on Qwen3-VL. It is engineered for multi-platform GUI automation and supports interactions across desktops, mobile devices, and web browsers. Its capabilities draw on a scalable hybrid data flywheel and multi-platform environment RL (MRPO).

Key Capabilities

  • State-of-the-art performance: Achieves leading results on various GUI benchmarks, including OSWorld-Verified (56.5%), AndroidWorld (69.4%), OSWorld-MCP (47.6%), and Mobile-World (46.8%).
  • Native Tool & MCP Calling: Supports direct invocation of external tools and coordination with MCP servers, demonstrating strong performance on OSWorld-MCP and Mobile-World.
  • Long-Horizon Memory: Incorporates built-in memory capabilities, outperforming other native agent models on MemGUI-Bench without requiring external workflow orchestration.
  • Multi-Agent Ready: Can operate as a complete end-to-end agent or fulfill specialized roles (planner, executor, verifier, notetaker) within the Mobile-Agent-v3.5 framework.
  • High Context Length: Provides a 32,768-token context window, which allows long, multi-step GUI interaction sequences to be processed in a single prompt.
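
Because the context window is fixed at 32,768 tokens, a long interaction history eventually has to be trimmed before it is sent to the model. The following is a minimal sketch of one way to do that, not the model's own memory mechanism. The `Step` structure, the `trim_history` helper, the 4-characters-per-token estimate, and the 4,096-token reserve are all assumptions made for illustration; a real setup would count tokens with the model's tokenizer and would also account for screenshot (image) tokens, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import List

CONTEXT_LIMIT = 32768  # context length stated on this card


@dataclass
class Step:
    """One GUI interaction step (hypothetical structure, text only)."""
    observation: str
    action: str


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    # Replace with the model's tokenizer for real counts.
    return max(1, len(text) // 4)


def trim_history(steps: List[Step], reserved: int = 4096,
                 limit: int = CONTEXT_LIMIT) -> List[Step]:
    """Keep the most recent steps whose estimated size fits in limit - reserved.

    `reserved` leaves room for the system prompt and the model's reply.
    """
    budget = limit - reserved
    kept: List[Step] = []
    used = 0
    # Walk backwards so the newest steps are kept first.
    for step in reversed(steps):
        cost = estimate_tokens(step.observation) + estimate_tokens(step.action)
        if used + cost > budget:
            break
        kept.append(step)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Dropping the oldest steps first is the simplest policy; summarizing them instead would preserve more information at the cost of an extra model call.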

Use Cases

This model is ideal for developers building automated solutions for:

  • Cross-platform GUI testing and automation.
  • Intelligent agents for desktop and mobile applications.
  • Complex task execution requiring tool use and memory in GUI environments.
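
To make the planner, executor, verifier, and notetaker roles listed under Key Capabilities more concrete, here is a hedged sketch of how such roles might be chained in a control loop. It does not reproduce the Mobile-Agent-v3.5 framework, whose actual interfaces are not described on this page. The `run_task` function and the `demo_*` stubs are hypothetical; in a real system each role would be a call to the model with a role-specific prompt.

```python
from typing import Callable, List


def run_task(goal: str,
             planner: Callable[[str, List[str]], str],
             executor: Callable[[str], str],
             verifier: Callable[[str, str], bool],
             notetaker: Callable[[str, str], str],
             max_steps: int = 10) -> List[str]:
    """Loop plan -> execute -> note -> verify until accepted or out of steps."""
    notes: List[str] = []
    for _ in range(max_steps):
        subgoal = planner(goal, notes)            # choose the next step from goal and notes
        result = executor(subgoal)                # perform the GUI action (stubbed here)
        notes.append(notetaker(subgoal, result))  # record what happened
        if verifier(goal, result):                # stop once the goal is judged complete
            break
    return notes


# Stub roles for illustration only; none of these call a model.
def demo_planner(goal: str, notes: List[str]) -> str:
    return f"step {len(notes) + 1} of: {goal}"


def demo_executor(subgoal: str) -> str:
    return f"done: {subgoal}"


def demo_verifier(goal: str, result: str) -> bool:
    return "step 3" in result


def demo_notetaker(subgoal: str, result: str) -> str:
    return f"{subgoal} -> {result}"


demo_notes = run_task("open settings", demo_planner, demo_executor,
                      demo_verifier, demo_notetaker)
```

Passing the roles in as plain callables keeps the loop independent of how each role is served, so a single model instance or several specialized ones could fill them.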

For more details, refer to the paper and GitHub repository.