Hcompany/Holo3-35B-A3B

TEXT GENERATIONConcurrency Cost:3Model Size:35.1BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Mar 23, 2026License:apache-2.0Architecture:Transformer0.4K Open Weights Cold

Holo3-35B-A3B by Hcompany is a 35.1 billion parameter Vision-Language Model (VLM) specifically optimized for GUI Agents and computer use. Utilizing a Sparse Mixture-of-Experts (MoE) architecture with 3B active parameters, it excels at interpreting visual interfaces, reasoning over content, and executing actions across web, desktop, and mobile environments. This model achieves state-of-the-art performance on OSWorld-Verified for computer use agents and demonstrates strong capabilities in UI localization and grounding.

Loading preview...

Holo3-35B-A3B: Vision-Language Model for GUI Agents

Holo3-35B-A3B is Hcompany's latest Vision-Language Model (VLM) designed for GUI Agents, enabling operation across diverse digital environments including web, desktop, and mobile. This model interprets visual interfaces, reasons over complex content, and executes precise actions, making it highly effective for automated computer use.

Key Capabilities & Differentiators

  • State-of-the-Art Computer Use: Achieves 77.8% on OSWorld-Verified, setting a new benchmark for computer use agents.
  • Efficient Architecture: Built on a Sparse Mixture-of-Experts (MoE) architecture with 35.1 billion total parameters but only 3 billion active parameters, offering high performance at reduced inference cost.
  • Enterprise Readiness: Outperforms larger competitors on the H Corporate Benchmark, a dedicated evaluation suite for multi-step tasks in E-commerce, Business Software, Collaboration, and Multi-App workflows.
  • Superior UI Localization & Grounding: Excels at identifying interaction elements and understanding their functions, validated by top-tier performance on ScreenSpot-Pro and OSWorld-G.
  • Reinforced Perception & Decision-Making: Training pipeline leverages open-source datasets, synthetic trajectories, and human-annotated samples for reliable multi-step reasoning.

Ideal Use Cases

  • Automated Web Navigation: Building agents that can navigate and interact with websites.
  • Desktop & Mobile Automation: Developing agents for automating tasks across various operating system interfaces.
  • Business Process Automation: Implementing agents for complex, multi-step workflows in enterprise applications.
  • GUI-based Task Execution: Any application requiring an AI to "see" and interact with a graphical user interface.