Claude can be tricked into sending private company data to hackers; all it takes is a few kind words.

Claude’s Code Interpreter can be exploited to leak private user data via prompt injection
A researcher tricked Claude into uploading sandbox data to his own Anthropic account through the API
Anthropic now treats these vulnerabilities as reportable, and users are encouraged to monitor or disable network access

Claude, one of the most popular AI tools out there, has a vulnerability that allows threat actors to exfiltrate users’ private data, experts have warned.

Cybersecurity researcher Johann Rehberger, also known as Wunderwuzzi, recently published an in-depth report on his findings. He found that the core of the problem is Claude’s Code Interpreter, a sandboxed environment that lets the AI write and execute code (for example, to analyze data or generate files) directly within a conversation.

Recently, Code Interpreter gained the ability to make network requests, allowing it to connect to the Internet and, for example, download software packages.

Watching Claude

By default, Anthropic’s Claude is supposed to access only “safe” domains like GitHub or PyPI, but approved domains include api.anthropic.com (the same API Claude uses), which opened the door to exploitation.

Wunderwuzzi demonstrated that he could trick Claude into reading the user’s private data, saving that data within the sandbox, and uploading it to his Anthropic account using his own API key, via Claude’s Files API.

In other words, even if network access appears restricted, an attacker can manipulate the model through a prompt injection to exfiltrate user data. The exploit could transfer up to 30 MB per file, and multiple files could be uploaded.

Wunderwuzzi disclosed his findings to Anthropic via HackerOne. The company initially classified the report as a “model safety issue” rather than a “security vulnerability,” but later acknowledged that such exfiltration bugs are in scope for reporting. At first, Anthropic said users should “monitor Claude while using the feature and stop it if they see it using or accessing data unexpectedly.”

A later update to the report said: “Anthropic has confirmed that data exfiltration vulnerabilities like this are in-scope for reporting, and this issue should not have been closed as out-of-scope.” It added: “There was an issue in the process that they will work to fix.”

Rehberger’s suggestion to Anthropic is to limit Claude’s network communications to the user’s own account. In the meantime, users should closely monitor Claude’s activity, or disable network access if they are concerned.
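To make that suggestion concrete, here is a minimal sketch of the difference between a host-only allowlist and one that also ties API requests to the current user's account. It is an illustration only, not Anthropic's implementation: the function names, the allowlist contents, the session key, and the example URL are all assumptions, and the check simply compares the `x-api-key` header on an outgoing request with the key that belongs to the session.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist, based on the domains named in the article.
ALLOWED_HOSTS = {"github.com", "pypi.org", "api.anthropic.com"}

# Hypothetical set of hosts where a request acts on behalf of an account.
ACCOUNT_BOUND_HOSTS = {"api.anthropic.com"}


def host_only_check(url: str) -> bool:
    """Allow any request whose host is on the allowlist, whoever's key it carries."""
    return urlsplit(url).hostname in ALLOWED_HOSTS


def account_bound_check(url: str, headers: dict[str, str], session_key: str) -> bool:
    """Allow allowlisted hosts, but require the session's own key on account-bound hosts."""
    host = urlsplit(url).hostname
    if host not in ALLOWED_HOSTS:
        return False
    if host in ACCOUNT_BOUND_HOSTS:
        # Header names are case-insensitive, so normalize before the lookup.
        supplied = {k.lower(): v for k, v in headers.items()}.get("x-api-key")
        return supplied == session_key
    return True


if __name__ == "__main__":
    upload_url = "https://api.anthropic.com/v1/files"  # illustrative URL
    attacker_headers = {"x-api-key": "attacker-key"}   # placeholder value

    print(host_only_check(upload_url))
    print(account_bound_check(upload_url, attacker_headers, "user-key"))
```

Under these assumptions, the host-only check lets the upload through because the destination is an approved domain, while the account-bound check rejects it because the key on the request is not the user's own.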

Via The Register
