OpenAI has introduced Operator, an AI agent designed to automate web tasks with minimal human intervention. It is currently available only to Pro subscribers in the United States.
Operator is powered by the Computer-Using Agent (CUA) model, which combines GPT-4o's multimodal (vision) capabilities with advanced reasoning trained through reinforcement learning. This lets it operate graphical interfaces much as a person would while following user instructions.
The tool automates browser actions by reading screenshots of the page and simulating mouse and keyboard input. OpenAI intends to refine it based on user feedback and plans a broader rollout once testing is complete.
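The screenshot-then-input cycle described here can be pictured as a simple loop. The Python sketch below is only an illustration under stated assumptions: `FakeBrowser`, `Action`, `run_agent`, and `scripted_policy` are hypothetical names invented for this example, the policy is a fixed script standing in for the model, and none of it reflects OpenAI's actual implementation or API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One simulated input. All fields are hypothetical, for illustration only."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeBrowser:
    """Stand-in for a real browser: it records inputs instead of performing them."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real browser would return image bytes; here the "frame" just
        # encodes how many inputs have been sent so far.
        return f"frame-{len(self.log)}".encode()

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x},{y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def run_agent(browser: FakeBrowser,
              choose_action: Callable[[bytes], Action],
              max_steps: int = 10) -> int:
    """Observe a screenshot, pick an action, apply it; repeat until done.

    Returns the number of inputs sent to the browser.
    """
    for step in range(max_steps):
        frame = browser.screenshot()       # perceive the current page
        action = choose_action(frame)      # the model would decide here
        if action.kind == "done":
            return step
        if action.kind == "click":
            browser.click(action.x, action.y)
        elif action.kind == "type":
            browser.type_text(action.text)
        else:
            raise ValueError(f"unknown action kind: {action.kind}")
    return max_steps


def scripted_policy(frame: bytes) -> Action:
    """Hypothetical policy: a fixed script in place of a real model."""
    script = {
        b"frame-0": Action("click", x=120, y=40),
        b"frame-1": Action("type", text="running shoes"),
    }
    return script.get(frame, Action("done"))


if __name__ == "__main__":
    browser = FakeBrowser()
    steps = run_agent(browser, scripted_policy)
    print(steps, browser.log)
```

In a real system the policy would be a model call and the browser would be a live session; the `max_steps` cap is one simple way to keep such a loop from running indefinitely.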
OpenAI reports an 87% success rate on WebVoyager, a benchmark run on live sites such as Amazon, while the score drops to 58.1% on the WebArena benchmark. The company says it is working to improve these results over time.
Safeguards include asking the user to approve significant actions and monitoring activity on sensitive sites. The tool declines risky operations, such as bank transfers, and users can opt out of having their data used for model training.
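One way to picture these safeguards is as a gate that every proposed action passes through before it runs. The Python sketch below is a minimal illustration, not OpenAI's mechanism: the function `gate`, the action names, and the two category sets are assumptions made for the example, with only "bank transfers are blocked" and "significant actions need approval" taken from the description above.

```python
from typing import Callable

# Hypothetical categories, chosen only to illustrate the idea.
BLOCKED = {"bank_transfer"}                     # never performed
NEEDS_APPROVAL = {"place_order", "send_email"}  # user must confirm first


def gate(action: str, ask_user: Callable[[str], bool]) -> str:
    """Decide what happens to a proposed action.

    Returns "blocked", "approved", "declined", or "allowed".
    """
    if action in BLOCKED:
        return "blocked"
    if action in NEEDS_APPROVAL:
        # Hand control back to the user before anything significant happens.
        return "approved" if ask_user(action) else "declined"
    return "allowed"


if __name__ == "__main__":
    always_yes = lambda action: True
    for name in ("scroll", "place_order", "bank_transfer"):
        print(name, "->", gate(name, always_yes))
```

Checking the blocked set first means that a risky action is refused even if the user would have approved it.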
Companies such as DoorDash and Uber are collaborating with OpenAI to optimize Operator for everyday tasks. OpenAI plans to integrate these capabilities into ChatGPT and to release CUA to developers through an API.