- Average AI-generated pull request has 10.83 issues compared to 6.45 for human code, report says
- AI-generated code has fewer typos and testability issues, leaving room for human reviewers
- Microsoft patched a near-record 1,139 CVEs in 2025, but rising code volume means the picture may be less bleak than it looks
AI-generated code is prone to considerably more issues than human-written code, raising questions about the reliability of some tools, new data from CodeRabbit suggests.
Pull requests made with AI tools had an average of 10.83 issues, compared to 6.45 issues in human-generated pull requests, ultimately leading to longer reviews and the potential for more bugs to make their way into the finished product.
In addition to having 1.7x more issues overall, AI-generated pull requests also had 1.4x more critical issues and 1.7x more major issues, meaning the problems are not merely cosmetic.
AI-generated code is not as secure as you might think
Logic and correctness errors (1.75x), code quality and maintainability issues (1.64x), security flaws (1.57x), and performance problems (1.42x) were all more frequent in AI-generated code, and the report criticizes AI for introducing more serious errors that human reviewers must then fix.
Some of the problems that AI would likely introduce include improper password handling, insecure object references, XSS vulnerabilities, and insecure deserialization.
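The report doesn't include code samples, but one of the issue classes it names, reflected XSS, can be sketched in a few lines of Python. The function names and payload below are purely illustrative, not taken from the report:

```python
# Illustrative sketch of a reflected XSS bug and its fix.
# Names and payload are hypothetical, not from the CodeRabbit report.
import html

def greet_unsafe(name: str) -> str:
    # Vulnerable: untrusted input lands in the page unescaped,
    # so a <script> payload executes in the viewer's browser.
    return f"<p>Hello, {name}!</p>"

def greet_safe(name: str) -> str:
    # Fixed: escape untrusted input before rendering it as HTML.
    return f"<p>Hello, {html.escape(name)}!</p>"

payload = "<script>alert('xss')</script>"
print(greet_unsafe(payload))  # script tag survives intact
print(greet_safe(payload))    # tag is escaped and rendered inert
```

The fix is mechanical, which is exactly why automated reviewers catch this class of bug reliably, yet code generators still emit the unsafe form when the surrounding prompt doesn't mention security.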
“AI coding tools dramatically increase throughput, but they also introduce predictable and measurable weaknesses that organizations must actively mitigate,” said CodeRabbit AI Director David Loker.
That said, the picture is not entirely negative: AI improves efficiency in the initial stages of code generation, and it introduced 1.76x fewer spelling errors and 1.32x fewer testability issues than human authors.
So while the study highlights some of AI's flaws, it also demonstrates how humans and AI agents could work together in the future. Rather than displacing human workers, AI is shifting human work toward managing and reviewing machine output, with computers handling some of the tedious tasks that slow humans down in the first place.
Although Microsoft claims to have patched 1,139 CVEs in 2025, the second-highest annual total on record, that figure doesn't necessarily signal a decline in quality. With AI, developers are producing more code overall, so the proportion of questionable code may not be as alarming as the raw numbers initially suggest.
Then there’s the fact that AI models, like OpenAI’s GPT family, are constantly being improved to produce more accurate and less flawed results.