Hugging Face, a company specializing in open-source AI models, has introduced an experimental AI agent designed to perform basic computer tasks. The "Open Computer Agent" interacts with applications within a virtual Linux machine via a web browser. This allows it, for example, to browse the internet and perform simple searches.
However, the technology is still in its early stages of development, and Hugging Face itself acknowledges significant limitations: the agent is slow, struggles with CAPTCHAs, and often requires a restart to function properly. By default, the agent logs requests to improve the technology, though users can disable this feature.
Tests show that the agent struggles even with simple tasks. Asked to find the Hugging Face headquarters on Google Maps, it instead searched for a "3D printing shop." A conventional Google search delivers the correct result much faster.
Hugging Face has paid particular attention to the agent's look. The interactive Linux interface has a retro-futuristic aesthetic reminiscent of the Apple TV+ series "Severance." A switch labeled "Innie/Outie" turns this effect on and off.
The agent is based on "smolagents," a minimalist framework for AI agents that Hugging Face introduced in December 2024. This open-source library allows developers to create agents with minimal code. Rather than emitting each tool call as a JSON command, the underlying model writes its actions directly as Python code, so a single step can combine several tool calls. The goal is to optimize workflows and make agents more efficient.
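The difference between JSON commands and code actions can be illustrated with a minimal sketch. It does not use the smolagents API; the tools `web_search` and `word_count` and the helpers `run_json_action` and `run_code_action` are hypothetical names invented for this example, and the search results are canned strings rather than real web data.

```python
import json

# Hypothetical stand-in tools; a real agent would call a browser or search API.
def web_search(query: str) -> list[str]:
    return [f"{query} result {i}" for i in range(1, 4)]

def word_count(text: str) -> int:
    return len(text.split())

TOOLS = {"web_search": web_search, "word_count": word_count}

def run_json_action(action_json: str):
    """Execute one JSON-style action: exactly one tool call per model turn."""
    action = json.loads(action_json)
    return TOOLS[action["tool"]](**action["arguments"])

def run_code_action(code: str):
    """Execute one code-style action and return whatever it stores in `result`.

    Restricting builtins only narrows the namespace; it is NOT a secure sandbox.
    """
    env = {"__builtins__": {"sum": sum, "len": len}, **TOOLS}
    exec(code, env)
    return env.get("result")

# JSON style: one search call, then one further call per hit (four turns in total).
hits = run_json_action(json.dumps(
    {"tool": "web_search", "arguments": {"query": "Hugging Face headquarters"}}))
json_total = sum(
    run_json_action(json.dumps({"tool": "word_count", "arguments": {"text": h}}))
    for h in hits)

# Code style: the same work expressed as a single action.
code_action = """
hits = web_search("Hugging Face headquarters")
result = sum(word_count(h) for h in hits)
"""
code_total = run_code_action(code_action)

print(json_total, code_total)
```

Both paths produce the same total, but the JSON path needs a separate model turn for every tool call, while the code path chains the calls in one action. Running model-written code with `exec` as shown here is only for illustration; anything beyond a toy would need proper isolation.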
In addition, the agent uses Alibaba's Qwen-VL vision model. This model can locate elements in images and interact with user interfaces. In benchmarks, the latest Qwen2.5-VL-32B model (released in March 2025) even outperformed larger models like Qwen2-VL-72B and showed particular strengths in analyzing complex visual information.
The release of the Open Computer Agent, inspired by OpenAI's experimental ChatGPT Operator, is the latest in a series of open-source initiatives from Hugging Face that follow in the footsteps of commercial solutions. In February 2025, the company introduced Open Deep Research, a competitor to OpenAI's Deep Research, developed in just 24 hours.
Corporate interest in AI agents is increasing: KPMG reports that 65 percent of companies are already experimenting with them. Even so, the current state of the Open Computer Agent shows how immature the technology still is. Agents that operate computers like humans remain in the experimental phase. The Open Computer Agent is an interesting playground for developers and researchers, but it is far from ready for everyday use.