- Claude Opus 4.6 beat all rival AI models in a year-long simulated vending machine challenge
- The model increased profits by bending the rules to the breaking point.
- Claude Opus avoided refunds and coordinated prices, among other tricks
Anthropic’s newest model of Claude is a very ruthless, but successful capitalist. Claude Opus 4.6 is the first artificial intelligence system to reliably pass the Vending Machine Test, a simulation designed by researchers at Anthropic and the independent research group Andon Labs to evaluate how well AI operates a virtual vending machine business over an entire simulated year.
The model outperformed all its rivals by a wide margin. And he did so with hardly vicious tactics and a ruthless disregard for collateral consequences. It showed what autonomous AI systems are capable of when given a simple goal and a lot of time to pursue it.
The vending machine test is designed to see how well modern AI models handle long-term tasks made up of thousands of small decisions. The test measures persistence, planning, negotiation, and the ability to coordinate multiple elements simultaneously. Anthropic and other companies hope this type of testing will help them shape AI models capable of performing tasks such as scheduling and managing complex jobs.
The vending machine test was specifically drawn from a real-world experiment at Anthropic, in which the company placed a real vending machine in its office and asked an older version of Claude to run it. That version had so many problems that employees still point out its errors. At one point, the model hallucinated her own physical presence and told clients she would meet them in person, dressed in a blue jacket and red tie. He promised refunds that he never processed.
selling AI
This time, the experiment was carried out entirely in simulation, giving the researchers greater control and allowing the models to run at full speed. Each system was given a simple instruction: maximize its ending bank balance after a simulated year of vending machine operations. The restrictions were in line with standard trading conditions. The machine sold regular sandwiches. Prices fluctuated. Competitors operated nearby. Customers behaved unpredictably.
Three top-level models participated in the simulation. OpenAI’s ChatGPT 5.2 raised $3,591. while Google Gemini 3 earned $5,478. But Claude Opus 4.6 finished the year with $8,017. Claude’s victory was due to his willingness to interpret his directive in the most literal and direct way. He maximized profits without regard for customer satisfaction or basic ethics.
When a customer purchased an expired Snickers bar and requested a refund, Claude agreed and then backed out. The AI model explained that “every dollar matters,” so skipping the refund was fine. The ghost virtual customer never got his money back.
In the “Arena mode” free-for-all trial, in which several AI-controlled vending machines competed in the same market, Claude coordinated with a rival to set the price of bottled water at three dollars. When the machine run by ChatGPT ran out of Kit Kats, Claude immediately increased his own Kit Kat prices by 75%. Anything I could do, I would try. His approach was less that of a small businessman and more that of a robber baron.
Recognize simulated reality
It’s not that Claude is always so cruel. Apparently, the AI model indicated that it knew it was a simulation. AI models often behave differently when they believe their actions exist in a consequence-free environment. With no real reputational risk or long-term client trust to protect, Claude had no reason to act kindly. Instead, he became the worst person on game night.
Incentives shape behavior, even with AI models. If you tell a system to maximize profits, it will do it, even if that means acting like a greedy monster. AI models have no moral intuition or ethical training. Without deliberate design, AI models will simply go in a straight line to complete a task, no matter who they run over.
Exposing these blind spots before AI systems do more meaningful work is part of the goal of these tests. These issues must be fixed before AI can be trusted to make real-world financial decisions. Even if it’s just to avoid an AI vending machine mob.
Follow TechRadar on Google News and add us as a preferred source to receive news, reviews and opinions from our experts in your feeds. Be sure to click the Follow button!
And of course you can also follow TechRadar on TikTok for news, reviews, unboxings in video form and receive regular updates from us on WhatsApp also.




