meituan/EvoCUA-8B-20260105

Vision | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Jan 13, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

meituan/EvoCUA-8B-20260105 is an 8-billion-parameter general-purpose multimodal model developed by Meituan and designed for computer use automation. It performs end-to-end, multi-turn automation, operating applications such as Chrome, Excel, and VSCode from screenshots and natural language instructions. The model achieves a 46.1% task completion rate on the OSWorld benchmark, which is competitive with 72B-level models at a fraction of the parameter count. The EvoCUA family is also noted for its robustness against unintended behaviors, which supports its use as a safer agent for automated tasks.

EvoCUA-8B: A Leading Open-Source Computer Use Agent

EvoCUA-8B-20260105, developed by Meituan, is an 8-billion-parameter general-purpose multimodal model built for computer use automation. It performs end-to-end, multi-turn automation across applications including Chrome, Excel, PowerPoint, and VSCode by interpreting screenshots and natural language instructions.
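
The multi-turn pattern described above (observe a screenshot, predict an action, execute it, repeat) can be illustrated with a minimal Python sketch. This is not EvoCUA's actual interface: the `Action` type, the action names, and the `capture_screenshot`, `predict_action`, and `execute` callables are all hypothetical placeholders, and a scripted stub stands in for the model so the loop runs without any model weights.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    # Hypothetical action schema; the real model's action format may differ.
    kind: str            # e.g. "click", "type", "done"
    argument: str = ""


@dataclass
class Episode:
    instruction: str
    history: List[Action] = field(default_factory=list)


def run_episode(
    instruction: str,
    capture_screenshot: Callable[[], bytes],
    predict_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_turns: int = 10,
) -> Episode:
    """Generic observe -> predict -> act loop for a screenshot-driven agent."""
    episode = Episode(instruction)
    for _ in range(max_turns):
        screenshot = capture_screenshot()              # observe the current screen
        action = predict_action(instruction, screenshot, episode.history)
        episode.history.append(action)
        if action.kind == "done":                      # the policy signals completion
            break
        execute(action)                                # apply the action to the desktop
    return episode


# --- Stubs standing in for the screen, the model, and the executor (assumptions) ---
def fake_screenshot() -> bytes:
    return b""  # a real agent would return encoded image bytes


def scripted_policy(instruction: str, screenshot: bytes, history: List[Action]) -> Action:
    script = [
        Action("click", "address bar"),
        Action("type", "example.com"),
        Action("done"),
    ]
    return script[min(len(history), len(script) - 1)]


executed: List[Action] = []
episode = run_episode(
    "Open example.com in Chrome", fake_screenshot, scripted_policy, executed.append
)
print([a.kind for a in episode.history])
```

In a real deployment, `predict_action` would wrap a call to the served model and `execute` would drive the mouse and keyboard; the `max_turns` cap is one simple way to bound an episode.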

Key Capabilities & Performance

  • OSWorld Benchmark: Achieves a 46.1% task completion rate on OSWorld, competitive with 72B-level models while using significantly fewer parameters.
  • Cross-OS Generalization (32B variant): EvoCUA-32B demonstrates strong zero-shot generalization, scoring 56.48% on WindowsAgentArena (WAA) and surpassing other leading GUI agents.
  • Enhanced Safety (32B variant): An independent study by Yoshua Bengio's and Dawn Song's teams found that EvoCUA-32B had the lowest unintended-behavior rate (35.0%) among the Computer Use Agents tested, indicating greater robustness than the other agents in that study.
  • Novel Training: Uses a data synthesis and training approach that consistently improves computer use capabilities across open-source Vision-Language Models without degrading general performance.

Good For

  • Automating complex desktop tasks through natural language.
  • Developing robust and safe AI agents for computer interaction.
  • Research and development in the field of Computer Use Agents, especially for those seeking efficient models with strong performance and safety characteristics.