Name: Salesforce/GTA1-32B API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: Salesforce

Overview

Salesforce/GTA1-32B is a 32-billion-parameter multimodal model developed by Salesforce for GUI grounding and automation. Rather than generating lengthy textual reasoning before acting, as general-purpose LLMs often do, GTA1-32B is trained with reinforcement learning (RL), specifically Group Relative Policy Optimization (GRPO), which directly rewards successful GUI interactions. The model therefore emits actionable responses, such as pyautogui.click(x, y) commands, which makes it well suited to automating tasks within graphical user interfaces.
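To make the idea of rewarding successful interactions concrete, here is a minimal sketch of a click-based reward and a group-relative advantage of the kind GRPO uses. The reward definition (1 when the predicted click lands inside the target element's bounding box, 0 otherwise), the function names, and the sample values are assumptions for illustration; this page does not specify the actual training setup.

```python
from statistics import mean, pstdev


def click_reward(x, y, box):
    """Hypothetical reward: 1.0 if (x, y) lies inside box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    return 1.0 if left <= x <= right and top <= y <= bottom else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and spread of its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled click predictions for the same screenshot and instruction
# (made-up coordinates, not model output).
samples = [(120, 48), (300, 310), (131, 52), (10, 10)]
target_box = (100, 40, 160, 60)

rewards = [click_reward(x, y, target_box) for x, y in samples]
advantages = group_relative_advantages(rewards)
```

Clicks that hit the target receive a positive advantage and misses a negative one, so the policy update favors responses that would have succeeded on screen, independent of how much text accompanied them.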

Key Capabilities

State-of-the-Art GUI Grounding: Reports top results across challenging GUI grounding datasets, including ScreenSpot-V2, ScreenSpot-Pro, OSWORLD-G, and OSWORLD-G-Refined. For instance, the 32B model scores 95.2 on ScreenSpot-V2 and 63.6 on ScreenSpot-Pro, a significant improvement over the baselines it is compared against.
Agentic Performance: Demonstrates strong performance on agent benchmarks like OSWorld, OSWorld-Verified, and WindowsAgentArena, indicating its capability to execute complex multi-step tasks within various operating system environments.
Direct Action Generation: Optimized to produce direct pyautogui commands for clicks, facilitating seamless integration into automation workflows.
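As an illustration of how such output might be consumed in an automation workflow, the sketch below extracts the coordinates from a pyautogui.click(...) string and rescales them to the real screen. The exact response format, the helper names, and the premise that coordinates refer to a resized screenshot are assumptions here, not documented behavior; check the full model card before relying on them.

```python
import re

# Matches pyautogui.click(12, 34) as well as pyautogui.click(x=12, y=34).
_CLICK_RE = re.compile(
    r"pyautogui\.click\(\s*(?:x\s*=\s*)?(-?\d+(?:\.\d+)?)\s*,"
    r"\s*(?:y\s*=\s*)?(-?\d+(?:\.\d+)?)\s*\)"
)


def parse_click(response):
    """Return (x, y) from the first click command in the text, or None if absent."""
    match = _CLICK_RE.search(response)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def rescale(point, model_size, screen_size):
    """Map a point from the image size the model saw to the real screen size."""
    x, y = point
    model_w, model_h = model_size
    screen_w, screen_h = screen_size
    return round(x * screen_w / model_w), round(y * screen_h / model_h)


# Hypothetical model response and sizes, for illustration only.
point = parse_click("pyautogui.click(x=640, y=360)")
screen_xy = rescale(point, model_size=(1280, 720), screen_size=(2560, 1440))
```

Parsing the command instead of passing the model's text to eval or exec keeps the automation layer in control of what is executed; the resulting pair can then be handed to pyautogui.click after any confirmation step the deployment requires.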

Good For

Automated GUI Testing: Ideal for creating agents that can interact with and test software applications through their graphical interfaces.
Robotic Process Automation (RPA): Suitable for automating repetitive tasks that involve navigating and manipulating GUI elements.
Research in Agentic AI: Provides a robust foundation for developing and evaluating agents focused on human-computer interaction and environmental grounding.

Ethical Considerations

Users are advised to exercise caution, ensure human oversight, and comply with all applicable regulations when deploying this model, especially in production environments, due to potential accuracy limitations and security implications of automated actions.
