Tahoe Dev
Article · February 22, 2026

Vibe Coding Risks in 2026: What the Data Says

Studies show 45% of AI-generated code contains security vulnerabilities. Learn the real risks of vibe coding and what businesses should do before launching.

Tahoe Dev

You’ve probably heard the pitch by now: describe what you want, let AI build it, ship it by Friday. It’s called vibe coding—a term coined by Andrej Karpathy in early 2025 to describe the practice of building software by prompting an AI and accepting whatever it generates, often without reading the code at all.

And it’s impressive. Tools like Cursor, Replit, Lovable, and Claude Code can spin up working prototypes in minutes. For side projects and quick experiments, it’s a real leap forward.

But there’s a growing body of research—published in late 2025 and early 2026—that shows what happens when vibe-coded applications go to production without professional review. The numbers aren’t great.

We use AI tools every day at Tahoe Dev and they make us faster. But if you’re a business owner, founder, or decision-maker thinking about shipping AI-generated code to real users, you deserve to see the actual data before you launch.

Nearly Half of AI-Generated Code Fails Security Tests

How often does AI write insecure code? Veracode’s 2025 GenAI Code Security Report set out to answer that question systematically, testing output from over 100 large language models across 80 coding tasks in Java, JavaScript, Python, and C#.

The result: 45% of AI-generated code samples introduced security vulnerabilities from the OWASP Top 10—the industry-standard list of the most critical web application security risks.

Some languages fared worse than others. Java had the highest failure rate at 72%, while Python, C#, and JavaScript came in between 38% and 45%. Certain vulnerability types were especially persistent: LLMs failed to produce secure code against cross-site scripting in 86% of cases and log injection in 88% of cases.
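To make the cross-site scripting finding concrete, here's a hypothetical sketch (not code from the Veracode study) of the pattern these tests look for: user input interpolated straight into HTML, versus the same input escaped first.

```python
import html

def render_comment_unsafe(comment: str) -> str:
    # The pattern LLMs keep producing: user input goes straight into markup,
    # so a comment like "<script>...</script>" executes in the visitor's browser.
    return f"<p>{comment}</p>"

def render_comment_safe(comment: str) -> str:
    # Escaping special characters turns the same payload into inert text.
    return f"<p>{html.escape(comment)}</p>"

payload = '<script>alert("xss")</script>'
print(render_comment_unsafe(payload))  # script tag survives intact
print(render_comment_safe(payload))    # &lt;script&gt;... renders harmlessly
```

The fix is one standard-library call, which is exactly why the 86% failure rate is so striking: the secure version isn't harder to write, the model just doesn't reach for it.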

Veracode’s October 2025 update tested newer models released between July and October 2025. Most showed no meaningful security improvement—some actually performed slightly worse. The exception was OpenAI’s GPT-5 Mini, which hit a 72% security pass rate—the highest recorded to date. But one model improving doesn’t change the systemic picture: the average LLM still fails at writing secure code.

2,000+ Vulnerabilities Found in Real Vibe-Coded Apps

Lab tests are one thing. What about apps that are actually deployed?

In October 2025, security research firm Escape analyzed over 5,600 publicly available applications built on vibe coding platforms like Lovable.dev, Base44.com, and Create.xyz. These are platforms that let non-developers build and deploy full-stack applications without writing code.

What they found was alarming:

  • Over 2,000 security vulnerabilities across the analyzed apps
  • 400+ exposed secrets (API keys, database credentials, authentication tokens)
  • 175 instances of exposed personal data, including medical records, bank account numbers (IBANs), phone numbers, and email addresses

These weren’t theoretical risks. These were live applications with real user data sitting exposed on the open internet.
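The exposed-secrets finding usually comes down to one pattern. Here's a hedged illustration (the key and variable names are made up, not taken from any scanned app): the credential either lives in the source code, where anyone who can read the deployed bundle can take it, or it lives in the environment.

```python
import os

# The pattern the Escape scan kept finding: a live credential committed
# in source, shipped to every browser or readable in the public repo.
PAYMENT_KEY_HARDCODED = "sk_live_hypothetical_example_key"  # made-up value

def get_payment_key() -> str:
    # Safer pattern: read the secret from the environment at runtime,
    # so it never appears in the codebase or the deployed bundle at all.
    key = os.environ.get("PAYMENT_SECRET_KEY")
    if key is None:
        raise RuntimeError("PAYMENT_SECRET_KEY is not set")
    return key
```

Vibe coding platforms generate whichever pattern the prompt nudges them toward, and a non-developer has no way to know the first one is a breach waiting to happen.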

AI Code Has 1.7x More Issues Than Human-Written Code

In December 2025, CodeRabbit published the “State of AI vs Human Code Generation” report, analyzing 470 open-source GitHub pull requests—320 AI-coauthored and 150 human-only.

The results were consistent across every category they measured:

How AI-coauthored PRs compared to human-only PRs:

  • Overall issues per PR: 1.7x more (10.83 vs. 6.45)
  • Logic and correctness errors: 1.75x more
  • Code quality and maintainability: 1.64x more
  • Security findings: 1.57x more
  • Performance issues: 1.42x more
  • Critical issues: 1.4x more
  • XSS vulnerabilities: 2.74x more

The takeaway isn’t that AI-generated code is useless—it’s that it requires more review, not less. Or as The Register’s headline put it: “AI-authored code contains worse bugs than software crafted by humans.”

Only 10.5% of Functionally Correct AI Code Is Actually Secure

Here’s perhaps the most striking stat. A December 2025 academic paper titled “Is Vibe Coding Safe?” benchmarked AI-generated code across real-world tasks.

In their benchmark using SWE-Agent with Claude 4 Sonnet, 61% of generated solutions were functionally correct—the code worked as intended—but only 10.5% were secure.

That gap is the whole problem with vibe coding in one stat. Working code and safe code are not the same thing. Your app might do exactly what you asked it to do and still leave your users’ data wide open.
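One way to picture that gap—a hypothetical sketch, not the paper’s benchmark code—is a lookup function that passes every happy-path test but falls to a classic SQL injection. Both versions below are “functionally correct”; only one is secure.

```python
import sqlite3

# In-memory database standing in for a real app's user table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")

def find_user_unsafe(name: str):
    # Works for normal input, but the f-string lets crafted input
    # rewrite the query itself (classic SQL injection).
    return conn.execute(
        f"SELECT email FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(name: str):
    # Parameterized query: the driver treats input as data, never as SQL.
    return conn.execute(
        "SELECT email FROM users WHERE name = ?", (name,)
    ).fetchall()

# Both pass the functional test...
assert find_user_unsafe("alice") == find_user_safe("alice")
# ...but an injected predicate dumps every row from the unsafe version.
print(find_user_unsafe("' OR '1'='1"))  # returns all users
print(find_user_safe("' OR '1'='1"))    # returns nothing
```

A functional test suite would pass both functions, which is how a codebase lands in the 61%-correct, 10.5%-secure bucket.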

The Hidden Cost: Technical Debt at AI Speed

Security vulnerabilities get the headlines, but technical debt might be the bigger long-term cost. AI-generated code accumulates it faster than anything we’ve seen before.

Forrester’s 2025 Predictions report projected that by 2026, 75% of technology decision-makers will see their technical debt rise to moderate or high severity as the rapid development of AI solutions adds complexity to IT landscapes. The empirical data supports that projection: GitClear’s analysis of 211 million lines of code found that since AI coding tools went mainstream, copy-pasted code grew from 8.3% to 12.3% of all changed lines, refactoring dropped from 25% to under 10%, and code churn—new code rewritten within two weeks—rose from 5.5% to 7.9%.

The mechanism is straightforward. AI generates code fast, nobody fully understands the architecture because nobody designed it intentionally, and when something breaks six months later the codebase is a tangled mess that costs more to untangle than it would have cost to build right.

Developer surveys tell the same story. The 2025 Stack Overflow Developer Survey found that only 3% of developers “highly trust” AI-generated code. The people writing code for a living know it needs human oversight.

Vibe Coding’s Impact on the Software Ecosystem

The ripple effects go beyond individual applications—they’re hitting the open-source ecosystem that most modern software depends on.

In January 2026, researchers from Central European University and the Kiel Institute published “Vibe Coding Kills Open Source”, and the title isn’t hyperbole. When developers use AI to consume open-source libraries without reading documentation, reporting bugs, or engaging with maintainers, the feedback loop that sustains these projects breaks down. Usage goes up, but the community contributions that keep them alive go down.

The Register’s coverage noted that Tailwind Labs CEO Adam Wathan has reported that docs traffic dropped roughly 40% despite the framework being more popular than ever—more people using the code, fewer people understanding or supporting it.

This matters for your business because every modern application sits on top of open-source dependencies. If those dependencies degrade in quality because their maintainer communities erode, everyone’s software gets less reliable over time.

The 69-Vulnerability Benchmark

In December 2025, security firm Tenzai ran a head-to-head comparison of five leading vibe coding tools—Claude Code, OpenAI Codex, Cursor, Replit, and Devin—by having each build the same three test applications from identical prompts.

As reported by CSO Online, the 15 resulting applications contained a combined 69 vulnerabilities: roughly 45 rated low-to-medium severity, many rated high, and about half a dozen critical—including authorization flaws and business logic vulnerabilities.

Every single tool produced vulnerable code. The difference wasn’t whether the output had security problems, but how many and how severe.

So What Should You Actually Do?

None of this means you shouldn’t use AI to build software. We use AI tools at Tahoe Dev every day and they make us significantly faster. The difference is in how they’re used.

Use AI for speed, humans for safety. AI is excellent at generating boilerplate, scaffolding features, and accelerating repetitive work. Architecture decisions, security review, data handling, and deployment still need experienced human judgment.

Never ship without a code review. Every study above points in the same direction: AI-generated code contains more bugs, more security flaws, and more maintainability issues. A professional review before launch is the single most effective thing you can do.

The developers who get the best results treat AI output as a first draft—a fast junior developer whose work always needs a senior review. They read what the AI wrote, test it, refactor it, and take responsibility for what ships.

And think about maintenance from the start. The cheapest code to write is often the most expensive code to maintain. If nobody on your team understands the codebase, you’ll pay for it every time something needs to change.

Get Your Vibe-Coded App Reviewed Before You Launch

If you’ve built something with AI tools and you’re getting ready to put it in front of real users, we can help. Our Vibe Code Review service is a professional security and architecture review specifically designed for AI-generated applications. We’ll look at your code for security vulnerabilities, performance issues, architectural problems, and production readiness—and give you a clear report on what needs to be fixed before launch.

Already launched and want a second opinion? Get in touch—we’re happy to take a look.


Written by Tahoe Dev

Software engineer with 20+ years of experience building web applications, cloud infrastructure, and SaaS products. Passionate about modern development practices and helping businesses succeed with technology.