HelloKKMe/GTA1-32B

VISION · Concurrency Cost: 2 · Model Size: 32B · Quant: FP8 · Ctx Length: 32k · Published: Jun 4, 2025 · Architecture: Transformer

HelloKKMe/GTA1-32B is a 32-billion-parameter vision-language model developed by HelloKKMe, designed specifically for GUI grounding tasks. Trained with reinforcement learning (RL) using GRPO, the model excels at accurately locating UI elements by directly rewarding actionable responses rather than verbose Chain-of-Thought reasoning. It achieves state-of-the-art performance on challenging GUI grounding benchmarks such as ScreenSpot-V2, ScreenSpot-Pro, and OSWorld-G, making it well suited to applications that require precise UI interaction.


HelloKKMe/GTA1-32B: State-of-the-Art GUI Grounding Model

HelloKKMe/GTA1-32B is a 32-billion-parameter vision-language model developed by HelloKKMe, engineered specifically for Graphical User Interface (GUI) grounding. The model leverages reinforcement learning (RL) with GRPO (Group Relative Policy Optimization) to achieve superior performance in identifying and locating UI elements.
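In practice, grounding means mapping a screenshot plus a natural-language instruction to the pixel coordinates of a target element. The sketch below shows one way to query a self-hosted instance; the endpoint URL, the served model name, and the `(x, y)` reply format are assumptions (e.g., serving the checkpoint through an OpenAI-compatible API such as vLLM's), not part of the model's official documentation.

```python
# Minimal sketch: ask a served GTA1-32B instance where to click.
# Assumes an OpenAI-compatible endpoint at http://localhost:8000/v1;
# the URL, served model name, prompt, and "(x, y)" output format are
# illustrative assumptions, not documented behavior.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Encode the screenshot as a base64 data URL for the vision input.
with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="HelloKKMe/GTA1-32B",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text",
             "text": "Locate the 'Submit' button. Answer with coordinates (x, y)."},
        ],
    }],
    temperature=0.0,  # grounding wants deterministic, actionable output
)

print(response.choices[0].message.content)  # e.g. "(412, 873)"
```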

Key Capabilities & Differentiators

  • Direct Objective Alignment: Unlike models relying on extensive Chain-of-Thought (CoT) reasoning, GTA1-32B directly rewards successful clicks and actionable responses, leading to more grounded and precise UI element identification.
  • State-of-the-Art Performance: The model consistently achieves leading results across challenging GUI grounding datasets, including ScreenSpot-V2 (93.2%), ScreenSpot-Pro (53.6%), and OSWorld-G (61.9%).
  • Benchmarking Excellence: The 32B variant shows significant gains over its base model, Qwen2.5-VL-32B-Instruct (e.g., +1.3% on ScreenSpot-V2, +5.6% on ScreenSpot-Pro, +2.3% on OSWorld-G).
  • Optimized for UI Interaction: Its training methodology makes it highly effective for tasks requiring accurate localization of interactive elements within a GUI.

When to Use This Model

  • Automated UI Testing: For precisely locating and interacting with UI elements in automated testing frameworks (see the click-dispatch sketch after this list).
  • Robotic Process Automation (RPA): To enable robots to accurately identify and click on specific GUI components.
  • Accessibility Tools: Developing tools that assist users in navigating and interacting with complex interfaces.
  • Any application requiring highly accurate GUI element localization.
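For the automation use cases above, the grounding answer still has to be turned into an input event. Below is a minimal, illustrative sketch, assuming the model replies in an `(x, y)` format and using `pyautogui` to dispatch the click; the `click_from_answer` helper is hypothetical, not part of the model's tooling.

```python
# Minimal sketch: turn a grounding answer into an actual click.
# The "(x, y)" reply format and this helper are illustrative assumptions;
# adapt the parsing to whatever format your deployment actually emits.
import re
import pyautogui  # pip install pyautogui

def click_from_answer(answer: str) -> None:
    """Parse '(x, y)' from the model's reply and click that screen point."""
    match = re.search(r"\((\d+)\s*,\s*(\d+)\)", answer)
    if match is None:
        raise ValueError(f"No coordinates found in model answer: {answer!r}")
    x, y = int(match.group(1)), int(match.group(2))
    pyautogui.click(x, y)

click_from_answer("(412, 873)")  # clicks at pixel (412, 873)
```

Note that if the serving stack resizes screenshots before inference, the returned coordinates may refer to the resized image and need rescaling to the native screen resolution before the click is dispatched.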

This model is particularly suited for scenarios where direct, grounded interaction with visual interfaces is paramount, offering a robust solution for complex GUI automation and understanding.