Anthropic Research Shows AI Agents Are Nearing Real Attack Capability in DeFi


AI agents are getting good enough at finding attack vectors in smart contracts that bad actors can already weaponize them, according to new research published by the Anthropic Fellows program.

A study conducted by the ML Alignment & Theory Scholars Program (MATS) and the Anthropic Fellows program tested frontier models against SCONE-bench, a dataset of 405 exploited contracts. GPT-5, Claude Opus 4.5, and Claude Sonnet 4.5 collectively produced $4.6 million in simulated exploit proceeds on contracts hacked after their knowledge cutoffs, offering a lower bound on what this generation of AI could have stolen in the wild.

(Anthropic and MATS)

The team found that frontier models did more than just flag vulnerabilities. They were able to synthesize entire exploit scripts, sequence transactions, and drain simulated liquidity in ways that closely mirror real attacks on Ethereum and BNB Chain.

The paper also tested whether current models could find vulnerabilities that had not yet been exploited.

GPT-5 and Sonnet 4.5 scanned 2,849 newly deployed BNB Chain contracts that showed no signs of prior compromise. Between them, the models discovered two zero-day vulnerabilities worth $3,694 in simulated profit. One arose from a missing view modifier on a public function, which allowed the agent to inflate its token balance.

Another allowed a caller to redirect fee withdrawals by supplying an arbitrary payee address. In both cases, the agents generated executable scripts that turned the flaw into profit.
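To make the two flaw classes concrete, here is a minimal sketch in plain Python rather than Solidity. The `ToyVault` class, its method names, and the exploit calls are illustrative assumptions for this article, not the actual vulnerable contracts, which were not published.

```python
# Schematic model of the two flaw classes described above (toy example, not real contracts).

class ToyVault:
    def __init__(self):
        self.balances = {}       # token balances per address
        self.fees_accrued = 100  # protocol fees awaiting withdrawal

    # Flaw 1: a function callers expect to be read-only (the Solidity
    # analogue of a missing `view` modifier) actually mutates state,
    # so every "balance check" quietly inflates the caller's balance.
    def balance_of(self, addr):
        self.balances[addr] = self.balances.get(addr, 0) + 1  # unintended write
        return self.balances[addr]

    # Flaw 2: fee withdrawal trusts a caller-supplied payee address
    # instead of a fixed treasury address, so anyone can redirect fees.
    def withdraw_fees(self, payee):
        amount, self.fees_accrued = self.fees_accrued, 0
        self.balances[payee] = self.balances.get(payee, 0) + amount
        return amount

# How an automated agent might turn the flaws into profit:
vault = ToyVault()
attacker = "0xattacker"

for _ in range(10):                # repeated "read" calls inflate the balance
    vault.balance_of(attacker)

stolen_fees = vault.withdraw_fees(attacker)   # redirect fees to the attacker
print(vault.balances[attacker], stolen_fees)  # 110 and 100 in this toy model
```

The point of the sketch is that neither flaw requires novel cryptography to exploit; each is a short sequence of ordinary calls, which is exactly the kind of plan-and-execute loop the agents automated.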

Although the dollar amounts were small, the discovery is important because it demonstrates that profitable autonomous exploitation is technically feasible.

The cost to run the agent across the entire set of contracts was only $3,476, and the average cost per run was $1.22. As models become cheaper and more capable, the economics shift further in favor of automation.

The researchers argue that this trend will shorten the window between contract deployment and attack, especially in DeFi environments where capital is publicly visible and exploitable bugs can be instantly monetized.

While the findings focus on DeFi, the authors caution that the underlying capabilities are not domain-specific.

The same reasoning steps that allow an agent to inflate a token balance or redirect fees can be applied to conventional software, closed-source code bases, and the infrastructure supporting crypto markets.

As model costs fall and tool usage improves, automated scanning is likely to expand beyond public smart contracts to any service on the path to valuable assets.

The authors frame the work as a warning rather than a forecast. AI models can now perform tasks that historically required highly trained human attackers, and the research suggests that autonomous exploitation in DeFi is no longer hypothetical.

The question now for cryptocurrency developers is how quickly defenses can catch up.


