mPLUG/GUI-Owl-1.5-2B-Instruct

VISION | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Feb 14, 2026 | License: MIT | Architecture: Transformer | Open Weights

GUI-Owl 1.5-2B-Instruct, developed by mPLUG, is a 2 billion parameter instruction-tuned model built on Qwen3-VL, designed for multi-platform GUI automation. It excels as a native GUI agent across desktops, mobile devices, and browsers, supporting tool invocation and long-horizon memory. This model is optimized for fast inference and edge deployment in GUI agent applications, achieving strong performance on benchmarks like OSWorld-Verified and AndroidWorld.

GUI-Owl 1.5-2B-Instruct: Multi-Platform GUI Agent

GUI-Owl 1.5-2B-Instruct is a 2 billion parameter model from the GUI-Owl 1.5 family, built upon Qwen3-VL, specifically designed for native GUI automation across diverse platforms including desktops, mobile devices, and browsers. It leverages a hybrid data flywheel, unified agent capability enhancements, and multi-platform environment RL (MRPO) to deliver robust performance.

Key Capabilities

  • Multi-Platform GUI Automation: Supports automation across various operating systems and environments.
  • Tool & MCP Calling: Natively supports external tool invocation and coordination with Model Context Protocol (MCP) servers.
  • Long-Horizon Memory: Features built-in memory capabilities, eliminating the need for external workflow orchestration for complex tasks.
  • Multi-Agent Ready: Can function as a standalone end-to-end agent or as specialized roles (planner, executor, verifier, notetaker) within the Mobile-Agent-v3.5 framework.
  • Optimized for Inference: As an 'Instruct' variant, it is designed for fast inference and suitability for edge deployments.
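
The role split named above (planner, executor, verifier, notetaker) can be pictured with a minimal sketch. This is not the Mobile-Agent-v3.5 API: the `RolePipeline` class, the `RoleFn` signature, and the stub functions are all hypothetical, and each role would in practice be a call to the model with a role-specific prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signature: a role receives some text plus the notes so far
# and returns text. In a real setup each role would wrap a model call.
RoleFn = Callable[[str, List[str]], str]


@dataclass
class RolePipeline:
    """Illustrative plan -> execute -> note -> verify loop (assumed design)."""
    planner: RoleFn
    executor: RoleFn
    verifier: RoleFn
    notetaker: RoleFn
    notes: List[str] = field(default_factory=list)

    def run(self, task: str, max_steps: int = 5) -> List[str]:
        for _ in range(max_steps):
            plan = self.planner(task, self.notes)        # decide the next step
            result = self.executor(plan, self.notes)     # carry it out
            self.notes.append(self.notetaker(result, self.notes))  # record it
            if self.verifier(task, self.notes) == "done":  # check completion
                break
        return self.notes


# Stub roles so the sketch runs without a model; all behavior is made up.
def stub_planner(task: str, notes: List[str]) -> str:
    return f"step {len(notes) + 1} for: {task}"

def stub_executor(plan: str, notes: List[str]) -> str:
    return f"executed {plan}"

def stub_notetaker(result: str, notes: List[str]) -> str:
    return f"note: {result}"

def stub_verifier(task: str, notes: List[str]) -> str:
    return "done" if len(notes) >= 2 else "continue"


pipeline = RolePipeline(stub_planner, stub_executor, stub_verifier, stub_notetaker)
history = pipeline.run("open settings")
```

Keeping the notes list inside the pipeline mirrors the idea of built-in memory: every role sees the same running record instead of relying on an external orchestrator to pass state around.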

Performance Highlights

This model demonstrates strong performance on various end-to-end online benchmarks, including:

  • OSWorld-Verified: Achieves 43.5
  • AndroidWorld: Achieves 67.9
  • OSWorld-MCP: Achieves 33.0
  • Mobile-World: Achieves 31.3
  • WindowsAA: Achieves 25.8

Good For

  • Developing native GUI automation solutions for desktop, mobile, and web applications.
  • Applications requiring efficient, instruction-tuned agents for GUI interaction.
  • Edge deployment scenarios where fast inference is critical for GUI automation tasks.
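
For edge deployment planning, a rough weight-memory estimate follows from the listed model size (2B) and precision (BF16, 2 bytes per parameter). The sketch below assumes a nominal count of exactly 2e9 parameters and covers weights only; activations, the KV cache, and runtime overhead are not included, so real usage will be higher. The helper name `weight_memory_gib` is illustrative.

```python
# Bytes needed to store one parameter in common precisions.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "bf16") -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 3)


# Nominal 2B parameters in BF16 (assumption: exactly 2e9 parameters).
estimate = weight_memory_gib(2e9, "bf16")
print(f"Approximate weight memory: {estimate:.2f} GiB")
```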