Vibe-Coded Apps Are Going to Be the Next Big Security Crisis

AI-generated code is being deployed to public servers by people who have never thought about what happens next. The industry is starting to notice. Here's what I've seen running my own infrastructure for years — and why this time feels different.

By David Sharkey | March 29, 2026 | 9 min read
[Image: Terminal output showing an insecure VPS exposed to the internet]
Tags: Security, VPS, Vibe Coding, Self Hosting, AI, DevOps, Linux

TL;DR

Millions of people are using AI to write and deploy apps without understanding what they're running or what it's exposed to. OpenAI and Anthropic are both building responses to this problem. Meanwhile, real servers are sitting wide open right now. Here's what that looks like from someone who's been running their own infrastructure for years.

I've been self-hosting my own services for a long time. Not as a side effect of a job — by choice. I run sharkey.io on my own Hetzner box. I have Unraid servers at home. I've built backup systems, managed bare metal, broken things, fixed them, broken them again. I've gone through the whole thing of setting up a fresh VPS properly: non-root user, closed ports, fail2ban, Tailscale, the works. I wrote a script to automate all of it because I got tired of doing it by hand every time.

I'm writing this because the environment has changed. The people deploying code to servers have changed. And the security posture of the average internet-facing app is about to get significantly worse before it gets better.


What vibe coding actually means in production

Vibe coding — letting an AI write your app while you describe what you want — is legitimately useful. I use it. It speeds up projects I care about and handles the boilerplate I find tedious. That's not the problem.

The problem is what happens at the end.

Someone builds a web app, an API, a tool for their business — and then they deploy it. To a VPS. Or a home server. Or something worse. The app is running. Port 80 is open. Maybe they added a domain. They show it to their friends. They move on.

And nobody — not the user, not the AI that wrote the code, not anyone — asked a single question about what that server looks like from the internet.

No firewall. Root access still on. Port 22 open, scanning bots hitting it within minutes of first boot. Default credentials on the app itself. A database bound to 0.0.0.0. Environment variables in a .env file that's publicly readable if you know the path. Dependencies that haven't been updated since the day they were installed.

This isn't a hypothetical. Much of it is the default state of a freshly provisioned VPS, and the rest is what you get when nobody changes an app's defaults. The number of people in this state is growing fast.
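
One rough way to see this on your own box is to look at which sockets are listening on every interface. The sketch below is only an illustration: it assumes you feed it the text printed by `ss -tlnH` (TCP, listening, numeric, no header), and `public_listeners` and `SAMPLE` are made-up names with made-up data, not part of any tool mentioned here.

```python
# Addresses that mean "every interface" in ss output.
WILDCARDS = {"0.0.0.0", "[::]", "::", "*"}


def public_listeners(ss_output: str) -> list[tuple[str, int]]:
    """Return (address, port) pairs for sockets listening on all interfaces.

    Expects the text produced by `ss -tlnH`.
    """
    found = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Columns: State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        address, _, port = fields[3].rpartition(":")
        if address in WILDCARDS and port.isdigit():
            found.append((address, int(port)))
    return found


# Illustrative sample, not output from a real server.
SAMPLE = """\
LISTEN 0      4096      127.0.0.53%lo:53        0.0.0.0:*
LISTEN 0      128             0.0.0.0:22        0.0.0.0:*
LISTEN 0      244             0.0.0.0:5432      0.0.0.0:*
LISTEN 0      511           127.0.0.1:3000      0.0.0.0:*
LISTEN 0      128                [::]:22           [::]:*
"""

for address, port in public_listeners(SAMPLE):
    print(f"listening on all interfaces: {address}:{port}")
```

A wildcard listener is not automatically a problem (a web server on 443 is supposed to be public), but anything on this list that you did not expect, such as a database port, deserves a second look.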


The industry is paying attention

Two things happened recently that caught my eye.

OpenAI is building Frontier — their initiative around the security and safety of AI-generated systems running in the real world. The framing is about AI systems operating autonomously, but the underlying concern is the same: code is being written and deployed faster than anyone can audit it, and the attack surface is expanding in ways that are hard to track.

Anthropic has their Mythos project, which is similarly focused on the downstream consequences of AI-assisted development at scale. Not just what AI writes, but what happens when that code runs on infrastructure that nobody secured.

Both of these are the same observation approached from the top. The models that write the code are starting to be held accountable for what that code does once it's live. That's a real shift. It means the industry is acknowledging that the generation pipeline — prompt to code to deployment — has a security problem baked into it.

The gap between what the AI produces and what a secure production deployment actually requires is not small. It's the entire difference between code that works and infrastructure that's safe.


My own stack, for context

I want to be specific about what I'm talking about when I say self-hosting, because it matters for how seriously you take these risks.

Hetzner VPS. I run sharkey.io and about a dozen other projects on a single Hetzner box. 2–4 cores, a fraction of the cost of AWS. Everything on it has been through ironboot, my own hardening script. Non-root user, SSH off port 22, UFW default-deny, fail2ban, Tailscale. The server does not have a wide-open attack surface because I spent time making sure it didn't.

Unraid servers. At home I run Unraid for local services and storage. Unraid is excellent: it makes running a self-hosted stack at home genuinely manageable. But it is also, by default, not exposed to the internet, and there are very good reasons for that. The number of Plex servers, home automation dashboards, and network storage shares I've seen accidentally exposed because someone port-forwarded without thinking about what they were doing is not zero.

531 backup system. I run a backup system I call 531: five local copies across multiple volumes, three off-site in separate geographic locations, one that's completely air-gapped and rotated manually. This is paranoid by most standards. I built it because I've lost data before and I've seen what happens when someone hasn't thought about this and something goes wrong. Backups are one of those things that feel unnecessary until the exact moment they become critical. The reason the "1" in 531 is air-gapped is that ransomware specifically targets connected backup systems: if your backup is mounted and writable, it can be encrypted along with everything else.

The point isn't that my setup is the right answer for everyone. The point is that there's a difference between infrastructure that someone has actually thought about and infrastructure that someone spun up and forgot.


What the actual attack surface looks like

If you're running a vibe-coded app on a VPS right now and you haven't hardened the server, here's a realistic picture of what you're exposed to.

Automated scanning starts immediately. Within minutes of a fresh VPS going live, bots are hitting port 22. They're testing default credentials, known CVEs, and weak passwords. This is not targeted — it's background noise from scripts running at scale. The vast majority of VPS compromises aren't targeted attacks. They're someone's server being picked up by a scanner that tried the same 200 credential combinations on ten thousand IPs that day.
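
If you want to see that background noise for yourself, a small log tally is enough. This is a sketch under assumptions: it expects OpenSSH "Failed password" lines as they typically appear in /var/log/auth.log on Debian or Ubuntu, and `failed_logins` and `SAMPLE_LOG` are illustrative names with invented log lines (the IPs are documentation addresses).

```python
import re
from collections import Counter

# Matches both "for root from ..." and "for invalid user admin from ...".
FAILED = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\S+) port \d+"
)


def failed_logins(log_text: str) -> tuple[Counter, Counter]:
    """Count failed SSH password attempts per source IP and per username."""
    by_ip: Counter = Counter()
    by_user: Counter = Counter()
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            user, ip = match.groups()
            by_ip[ip] += 1
            by_user[user] += 1
    return by_ip, by_user


# Invented sample lines for illustration only.
SAMPLE_LOG = """\
Mar 29 04:12:01 vps sshd[811]: Failed password for root from 203.0.113.5 port 40022 ssh2
Mar 29 04:12:03 vps sshd[811]: Failed password for root from 203.0.113.5 port 40022 ssh2
Mar 29 04:12:09 vps sshd[815]: Failed password for invalid user admin from 198.51.100.7 port 51810 ssh2
Mar 29 04:13:30 vps sshd[820]: Accepted publickey for deploy from 100.64.0.2 port 50514 ssh2
"""

by_ip, by_user = failed_logins(SAMPLE_LOG)
for ip, count in by_ip.most_common():
    print(f"{ip}: {count} failed attempts")
```

On a real server you would pass in the contents of the auth log instead of the sample. The usernames in `by_user` tend to be the interesting part, because they show which default accounts the scanners are guessing.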

AI-generated code ships common vulnerabilities by default. SQL injection. Command injection. Hardcoded credentials. Secrets in environment variables that get logged. Input that doesn't get validated. These aren't failures of AI specifically — they're failures of code written without a security review. The AI writes functional code. It does not write secure code by default unless you explicitly prompt for it, and even then the output isn't audited.
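
SQL injection is the easiest of these to show in a few lines. The example below is a generic illustration using Python's built-in sqlite3 module with an in-memory database; the table, the function names, and the payload are all invented for the demo and are not taken from any particular generated app.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: the input is pasted straight into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the driver passes the value separately from the SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# Classic payload: closes the quote and adds a condition that is always true.
payload = "nobody' OR '1'='1"

print("unsafe:", find_user_unsafe(conn, payload))  # returns every row
print("safe:  ", find_user_safe(conn, payload))    # returns no rows
```

Both functions "work" when you test them with a normal name, which is exactly why the unsafe version survives a quick manual check.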

Dependencies are a time bomb. The AI picks a library that was current when it was trained. That library has had two CVEs since. The VPS has no automatic updates. Nobody has looked at it in three months. This is a very normal situation that ends badly.

Docker doesn't fix it. I run Docker on every server I manage, and Docker is great. But Docker has a specific and well-documented issue: by default, published container ports bypass UFW. Docker manages its own iptables rules, and traffic to a published port is diverted before it reaches UFW's rules. A container published on 0.0.0.0:5432 (which is what -p 5432:5432 does) will be reachable from the internet even if UFW says port 5432 is closed. This is not a misconfiguration; it's the default behavior. If you're running AI-assisted apps in Docker without specifically addressing this, you may have services exposed that you believe are private.
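
A quick way to find out whether this applies to you is to list which containers publish ports on every interface. The sketch below assumes input in the shape printed by `docker ps --format '{{.Names}}\t{{.Ports}}'`; `exposed_ports` and the sample container names are hypothetical.

```python
# Host addresses that mean "reachable on every interface".
PUBLIC = {"0.0.0.0", "::", "[::]"}


def exposed_ports(docker_ps_output: str) -> dict[str, set[str]]:
    """Map container name -> host ports published on all interfaces.

    Expects lines of "<name><TAB><ports>" as printed by
    docker ps --format '{{.Names}}\\t{{.Ports}}'.
    """
    exposed: dict[str, set[str]] = {}
    for line in docker_ps_output.splitlines():
        name, _, ports = line.partition("\t")
        for mapping in ports.split(", "):
            host, arrow, _container = mapping.partition("->")
            if not arrow:
                continue  # no "->": the port is not published to the host
            address, _, port = host.rpartition(":")
            if address in PUBLIC:
                exposed.setdefault(name, set()).add(port)
    return exposed


# Illustrative sample, not output from a real host.
SAMPLE = (
    "db\t0.0.0.0:5432->5432/tcp, :::5432->5432/tcp\n"
    "web\t127.0.0.1:3000->3000/tcp\n"
    "cache\t6379/tcp\n"
)

for name, ports in exposed_ports(SAMPLE).items():
    print(f"{name}: published on all interfaces: {sorted(ports)}")
```

In the sample, only `db` is flagged. `web` is published on loopback only, and `cache` is not published at all, so neither is reachable from outside through Docker's rules.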


Why this time is different

Security problems with self-hosted apps aren't new. People have been deploying insecure things forever.

What's different now is the volume and the confidence.

Previously, deploying a server required enough technical knowledge that you were also likely to know the basics of securing it. The activation energy was high enough that it selected for a certain level of awareness. That filter is gone. An AI will take someone from "I have an idea for an app" to "the app is running on a public IP" in an afternoon, and the only thing standing between that server and the internet is whatever the user thought to ask the AI about security — which is usually nothing, because they don't know what to ask.

The confidence is the other part. AI-generated code looks complete and professional. It runs. Tests pass. There's no visible indication that the deployment is missing anything important. The person who deployed it genuinely believes they've done it correctly. That false confidence is more dangerous than ignorance, because ignorance at least sometimes prompts caution.


What to actually do

If you're running something on a VPS — vibe-coded or otherwise — and you haven't thought about the security layer, here's the baseline I'd consider non-negotiable:

Use a non-root user. Get off root immediately. Create a named admin user with sudo. Everything you run as root with no reason is risk with no benefit.

Move SSH off port 22 and disable password auth. Port 22 is noise. SSH keys only. MaxAuthTries 3. Validate the config with sshd -t before restarting the service, and keep your current session open until a new login works.
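
To check an existing server against that list, something like the following would do as a first pass. It is a simplified sketch, not a full sshd_config parser: it ignores Include and Match blocks and treats Port as a single value, so feeding it the output of `sshd -T` (the effective configuration) is more reliable than the raw file. The fallback values are OpenSSH's documented defaults; the function names and port 2222 are hypothetical.

```python
# OpenSSH defaults for the settings checked below.
DEFAULTS = {
    "port": "22",
    "passwordauthentication": "yes",
    "permitrootlogin": "prohibit-password",
    "maxauthtries": "6",
}


def parse_sshd_config(text: str) -> dict[str, str]:
    """Parse "Keyword value" lines; keywords are case-insensitive."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            # sshd uses the first value it obtains for a keyword.
            settings.setdefault(parts[0].lower(), parts[1].strip())
    return settings


def audit_sshd(text: str) -> list[str]:
    """Return human-readable findings; an empty list means nothing flagged."""
    effective = {**DEFAULTS, **parse_sshd_config(text)}
    findings = []
    if effective["port"] == "22":
        findings.append("SSH is listening on port 22")
    if effective["passwordauthentication"].lower() != "no":
        findings.append("password authentication is enabled")
    if effective["permitrootlogin"].lower() != "no":
        findings.append("root login over SSH is not fully disabled")
    if int(effective["maxauthtries"]) > 3:
        findings.append("MaxAuthTries is above 3")
    return findings


# Hypothetical hardened config for illustration.
HARDENED = """\
Port 2222
PermitRootLogin no
PasswordAuthentication no
MaxAuthTries 3
"""

print(audit_sshd(HARDENED))  # nothing flagged
print(audit_sshd(""))        # stock defaults: every check fires
```

The root-login check is an addition to the list above, included because a non-root admin user only helps if root itself can no longer log in.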

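Default-deny firewall. UFW with default-deny incoming. Open only the ports you actually need. Be aware of Docker's iptables behavior: setting "iptables": false in /etc/docker/daemon.json (the --iptables=false daemon flag) makes UFW authoritative, but it also stops Docker from managing the rules containers rely on for networking, so you take those over yourself. Otherwise handle Docker explicitly: publish container ports on 127.0.0.1 unless they are meant to be public, or add rules to the DOCKER-USER chain.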
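
The UFW side of this is only a handful of commands. The sketch below builds that list for a given set of TCP ports and prints it without running anything; `ufw_plan` is a made-up helper, and ports 80, 443, and 2222 (a stand-in for a relocated SSH port) are assumptions for the example.

```python
import shlex


def ufw_plan(allowed_tcp_ports: list[int]) -> list[list[str]]:
    """Build the ufw commands for a default-deny inbound policy.

    Nothing is executed here; the caller decides whether to run them.
    """
    plan = [
        ["ufw", "default", "deny", "incoming"],
        ["ufw", "default", "allow", "outgoing"],
    ]
    for port in sorted(set(allowed_tcp_ports)):
        if not 1 <= port <= 65535:
            raise ValueError(f"not a valid TCP port: {port}")
        plan.append(["ufw", "allow", f"{port}/tcp"])
    # Enable last, after the allow rules exist.
    plan.append(["ufw", "enable"])
    return plan


# Hypothetical port list: web traffic plus a relocated SSH port.
for command in ufw_plan([80, 443, 2222]):
    print(shlex.join(command))
```

The ordering matters: if `ufw enable` runs before the rule that allows your SSH port, you can lock yourself out of the session you are working in.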

fail2ban. Even with password auth disabled, fail2ban keeps the noise down and your logs readable. Three retries, three-hour ban, enforced through UFW.
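
As a sketch of what that policy could look like on disk, the snippet below renders a jail section in the INI format fail2ban reads (typically /etc/fail2ban/jail.local). The helper name and port 2222 are hypothetical, and the ban time is written in seconds, so three hours becomes 3 × 3600 = 10800.

```python
import configparser
import io


def render_jail(ssh_port: int, max_retry: int = 3, ban_hours: int = 3) -> str:
    """Render an [sshd] jail section as INI text."""
    jail = configparser.ConfigParser()
    jail["sshd"] = {
        "enabled": "true",
        "port": str(ssh_port),
        "maxretry": str(max_retry),
        "bantime": str(ban_hours * 3600),  # fail2ban accepts seconds
        "banaction": "ufw",                # apply bans through UFW
    }
    buffer = io.StringIO()
    jail.write(buffer)
    return buffer.getvalue()


# Port 2222 is a stand-in for whatever port sshd was moved to.
print(render_jail(2222))
```

The function only produces text. Writing it to the jail file and reloading fail2ban is left as a manual step, so you can read the result before it takes effect.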

Tailscale. Put Tailscale on everything you own. Your SSH port doesn't need to be on the internet. With Tailscale SSH enabled on the server (tailscale up --ssh), tailscale ssh user@hostname works from anywhere on your Tailnet. Remove the public SSH rule once you've confirmed it's working. This is the cleanest security improvement available with the least operational overhead.

Automatic security updates. unattended-upgrades configured for daily security patches. Auto-reboot off — you want to choose when the server restarts. This doesn't require any ongoing effort once it's set up.
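
To confirm a server is actually set up this way, you can check the effective APT settings. The sketch below assumes input in the `Key "value";` form printed by `apt-config dump`; the three keys are the standard APT::Periodic and Unattended-Upgrade options, while `audit_updates` and the sample text are illustrative.

```python
import re

# One setting per line, e.g.: APT::Periodic::Unattended-Upgrade "1";
SETTING = re.compile(r'^\s*([\w:-]+)\s+"([^"]*)";\s*$')


def parse_apt_config(text: str) -> dict[str, str]:
    """Parse `Key "value";` lines into a dict, skipping everything else."""
    settings: dict[str, str] = {}
    for line in text.splitlines():
        match = SETTING.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def audit_updates(text: str) -> list[str]:
    """Return findings; an empty list means daily patching, no auto-reboot."""
    settings = parse_apt_config(text)
    findings = []
    if settings.get("APT::Periodic::Update-Package-Lists") != "1":
        findings.append("package lists are not refreshed daily")
    if settings.get("APT::Periodic::Unattended-Upgrade") != "1":
        findings.append("unattended upgrades do not run daily")
    if settings.get("Unattended-Upgrade::Automatic-Reboot", "false") == "true":
        findings.append("automatic reboot is enabled")
    return findings


# Illustrative sample of the desired state.
DESIRED = """\
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
Unattended-Upgrade::Automatic-Reboot "false";
"""

print(audit_updates(DESIRED))  # nothing flagged
print(audit_updates(""))       # periodic updates not configured
```

An absent Automatic-Reboot key is treated as off, which matches the "you choose when the server restarts" preference above.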

I built ironboot to cover all of this in a single guided run on a fresh Ubuntu or Debian VPS. Twelve steps, all optional, everything logged. It's what I run on every server I spin up.


The trajectory

OpenAI and Anthropic both building security-focused initiatives around AI-generated code is not a coincidence. It's a recognition that the deployment gap is real and getting larger.

The code generation problem is largely solved — the models are good. The deployment problem hasn't been touched. The gap between "working app" and "working app running on a hardened server that isn't going to get compromised in the first week" is still entirely manual, entirely optional, and almost entirely skipped.

At some point someone will build tooling that closes this gap at the generation step — hardened deployment configs produced alongside the application code, infrastructure-as-code that includes the security layer, deployment pipelines that refuse to push to a server that fails a basic hardening check. That tooling doesn't exist yet in a form that a non-technical user encounters.

Until it does, the responsibility sits with whoever is running the server. And most of them don't know it.

If you're self-hosting anything, take the afternoon. Run the script. Or don't use the script and do it manually — either way, do it. A server with something real on it deserves to be treated like one.

Nobody gets hacked slowly.