- Hugging Face has unveiled an AI tool that navigates the web on your behalf
- The Open Computer Agent uses a real web browser to complete tasks such as getting directions or booking tickets
- The open-source agent demo can see what is on the screen, click buttons, fill out forms, and move step by step through tasks like a human
Hugging Face has introduced its own take on the growing number of semi-independent agents that can run online errands for people. The new, free Open Computer Agent is like having a personal assistant that lives inside your web browser.
Part of the company's ongoing “smolagents” initiative, the Open Computer Agent can interact with websites and apps as you would, wielding an invisible mouse and keyboard to complete requests. The AI can open a browser, type into forms, click buttons, and more. Ask it to find directions, and it will go to Google Maps, enter the origin and destination, and show you the route like an obedient digital chauffeur.
You can try it yourself with the live demo. Fair warning: its popularity is causing some delays and errors due to a backlog of requests.
We are launching Computer Use in smolagents! 🥳 -> As vision models become more capable, they become able to power complex agentic workflows. Especially Qwen-VL models, which support built-in grounding, that is, the ability to locate any element in an image by its coordinates, thus to … pic.twitter.com/mi8muwzkis May 6, 2025
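The grounding idea in the post above, locating an on-screen element by its coordinates and then acting on it, can be sketched roughly as follows. This is a minimal illustration, not the smolagents API: the `Action` type, the normalized-coordinate convention, and the `FakeScreen` class are all hypothetical stand-ins. A real agent would get its coordinates from a vision model and send the click to a real browser.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """Hypothetical action a vision model might propose (an assumption)."""
    kind: str        # "click" or "type"
    x: float = 0.0   # normalized horizontal position, 0.0 (left) to 1.0 (right)
    y: float = 0.0   # normalized vertical position, 0.0 (top) to 1.0 (bottom)
    text: str = ""   # text to enter when kind == "type"


@dataclass
class FakeScreen:
    """Stand-in for a browser window; it only records what was done."""
    width: int
    height: int
    log: list = field(default_factory=list)

    def click(self, px, py):
        self.log.append(("click", px, py))

    def type_text(self, text):
        self.log.append(("type", text))


def to_pixels(x, y, width, height):
    """Clamp normalized coordinates to [0, 1] and scale them to pixels."""
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    return round(x * (width - 1)), round(y * (height - 1))


def run(actions, screen):
    """Carry out each proposed action in order and return the action log."""
    for action in actions:
        if action.kind == "click":
            px, py = to_pixels(action.x, action.y, screen.width, screen.height)
            screen.click(px, py)
        elif action.kind == "type":
            screen.type_text(action.text)
        else:
            raise ValueError(f"unknown action kind: {action.kind}")
    return screen.log
```

In this sketch the model's output is reduced to a list of `Action` objects, so the same loop could drive a real browser by swapping `FakeScreen` for an object with the same two methods.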
AI AGENT
The Open Computer Agent reflects a different philosophy on an idea that has led to similar tools such as OpenAI's Operator, Browser Use, Proxy 1.0, and Opera's Browser Operator. Like those tools, Hugging Face’s AI agent is about being an active participant instead of a passive source of information.
Like Browser Use, Open Computer Agent is open source, which means that anyone can see how it works and build on it, or at least tweak it for niche use cases. The agent is the beginning of something more flexible, not a finished product with a million legal disclaimers. That also means the demo is exactly that, a demo, not a polished package. It can get things wrong and may require you to step in for logins and CAPTCHA tests.
Booking tickets, checking store hours, running searches, looking up directions, and clicking through menus are all things many people would like to be able to do with a single natural-language prompt. It is one thing to ask ChatGPT how to find cheap flights. It is another to watch a tool go to a travel website, scroll through the listings, and try to click “Book now”.
It may be glitchy and far from flashy, but the Open Computer Agent represents an approach to AI that could become as common as AI image generators.