Construct a computer-using agent that may carry out duties in your behalf. Use the CUA mannequin from OpenAI to create an AI Agent which runs domestically in your Machine.
I, like so many others, noticed the pc the demo from OpenAI displaying their Operator implementation. What excited me was the truth that Operator makes use of a CUA mannequin, Computer-Using Agent (CUA) mannequin.
That is such instance of how the multi-modal capabilities of fashions are increasing with imaginative and prescient and having the ability to interpret GUIs inside a browser.
Presently I shouldn’t have entry to Operator, however wished to construct an indication utility primarily based on the CUA mannequin, the place I ask a easy query, and the AI Agent opens a browser to search out the reply.
The pc use instrument and mannequin might be accessed by means of the Responses API.
In essence, the CUA mannequin examines a screenshot of the pc interface and suggests actions to take.
Extra exactly, it points computer_call(s) with directions resembling click on(x,y) or sort(textual content), which you will need to then perform in your atmosphere, adopted by offering screenshots of the outcomes.
Within the video under, I requested the AI Agent to get the climate in Cape City, Dar Es Salaam and likewise test the Apple inventory worth…