Sovereign Data: Why Bare Metal Won


The cloud was never a technology decision. It was a convenience tax — and the invoice finally came due.


The Horror Story First

You've heard some version of this: a startup wakes up to a $47,000 AWS bill because traffic hit an unthrottled endpoint, and Lambda autoscaled into the void. Nobody was breached. No data was lost. The servers "worked" — technically. The bank account did not.

This is what the industry quietly calls a Denial of Wallet attack, and it's a 2026 risk that barely existed five years ago.

In a serverless, autoscaling world, a misconfigured function or a sufficiently motivated bot doesn't crash your infrastructure: it invoices you into a crisis. Your "availability" comes with a blank check attached.

This isn't an edge case anymore. It's a design flaw baked into the economic model of managed cloud.
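To make the mechanism concrete, here is a rough sketch of how a sustained flood turns into a bill on a pay-per-invocation platform. The `ServerlessPricing` class, the function name, and the rates plugged in are all illustrative assumptions, not a quote from any provider's price list; substitute real rates before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerlessPricing:
    """Hypothetical per-unit prices; substitute your provider's real rates."""
    usd_per_million_requests: float
    usd_per_gb_second: float


def flood_cost_usd(
    requests_per_second: float,
    hours: float,
    avg_duration_s: float,
    memory_gb: float,
    pricing: ServerlessPricing,
) -> float:
    """Estimate the cost of a sustained request flood when every call is billed."""
    total_requests = requests_per_second * hours * 3600
    request_cost = total_requests / 1_000_000 * pricing.usd_per_million_requests
    # Compute is billed per GB-second: requests x duration x memory.
    compute_cost = (
        total_requests * avg_duration_s * memory_gb * pricing.usd_per_gb_second
    )
    return request_cost + compute_cost


# Placeholder rates chosen for illustration only.
pricing = ServerlessPricing(usd_per_million_requests=0.20, usd_per_gb_second=0.0000167)

# Assumed scenario: a bot sustains 5,000 req/s for 48 hours against a
# 1 GB function that runs for 500 ms per call.
cost = flood_cost_usd(5000, 48, 0.5, 1.0, pricing)
print(f"Estimated bill for the flood: ${cost:,.0f}")
```

The point is the shape of the function, not the exact figure: cost is linear in traffic with no ceiling, whereas a flat-rate server under the same flood degrades or falls over but bills the same amount.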


The Cloud Was Right, Once

Let's be honest about why we all ended up here.

In 2018, running bare metal was genuinely painful. Linux servers were "snowflakes" — hand-crafted configurations that lived in one engineer's head. If the hardware failed, you spent two days rebuilding from memory and outdated runbooks. A managed RDS instance was worth the markup because the alternative was a 3am PagerDuty alert and a DBA on retainer.

The cloud's value proposition was never really about compute. It was about reducing the operational surface area for teams that couldn't afford a dedicated DevOps function. Click a button, get a Postgres database with automated backups, failover, and monitoring. The 300% markup was, arguably, fair for what you were getting.

That argument has aged poorly.


Two Things Broke the Equation

The shift didn't happen because bare metal got cheaper — though it did. It happened because two independent disruptions landed at the same time.

First: declarative infrastructure became mainstream. Tools like NixOS, Ansible, and Talos Linux turned server configuration from an art form into a text file. Your entire OS state (packages, users, services, kernel parameters) lives in version-controlled config. If a server dies, you point your playbook at a new IP and the machine rebuilds itself identically in minutes. The "snowflake server" problem that justified cloud convenience in 2018 is simply gone.
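The idea behind all of these tools is the same: describe the desired state, diff it against the actual state, and apply only the difference. A toy sketch of that loop follows. `ServerState`, `plan`, and `converge` are hypothetical names for illustration and do not correspond to any NixOS, Ansible, or Talos API.

```python
from dataclasses import dataclass, field


@dataclass
class ServerState:
    """A toy model of a machine: installed packages and enabled services."""
    packages: set[str] = field(default_factory=set)
    services: set[str] = field(default_factory=set)


def plan(desired: ServerState, actual: ServerState) -> list[str]:
    """Return the actions that move `actual` to `desired` (empty if converged)."""
    actions = []
    for pkg in sorted(desired.packages - actual.packages):
        actions.append(f"install {pkg}")
    for pkg in sorted(actual.packages - desired.packages):
        actions.append(f"remove {pkg}")
    for svc in sorted(desired.services - actual.services):
        actions.append(f"enable {svc}")
    for svc in sorted(actual.services - desired.services):
        actions.append(f"disable {svc}")
    return actions


def converge(desired: ServerState) -> ServerState:
    """Pretend to apply the plan; the result always equals the desired state."""
    return ServerState(set(desired.packages), set(desired.services))


desired = ServerState({"nginx", "postgresql"}, {"nginx", "postgresql"})

fresh_box = ServerState()  # a replacement machine with nothing on it
drifted_box = ServerState({"nginx", "postgresql", "telnetd"}, {"nginx"})

print(plan(desired, fresh_box))
print(plan(desired, drifted_box))
print(plan(desired, converge(desired)))  # nothing left to do
```

Because the plan is computed from a diff, running it twice is harmless, and a brand-new machine and a drifted one both end up in the same place. That property is what retires the snowflake server.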

Second: LLMs became competent junior DevOps engineers. This is the less-discussed half of the shift. The honest cost of bare metal was never just hardware; it was the $120–150k/year engineer you needed to keep it running. Debugging Postgres replication lag, writing a clean pg_dump rotation script, troubleshooting an Nginx upstream timeout: these tasks used to require deep specialist knowledge. Today, they're a conversation. An LLM won't replace a senior infrastructure engineer on a complex distributed system, but it has largely automated the operational toil that made bare metal feel risky for smaller teams.

Together, these two changes didn't just reduce the complexity of self-hosting. They collapsed the complexity tax that cloud providers had been monetizing for a decade.


What the Numbers Actually Look Like

The cost delta is not subtle.

Take a reasonable production setup on GCP. A Cloud SQL instance (4 vCPUs, 15GB RAM, Enterprise edition) with 500GB SSD storage runs approximately $280–300/month on its own. Add two e2-standard-4 app servers (~$96/month each), a load balancer, and Cloud Storage, and you're at $490–550/month for a mid-sized application, before factoring in egress fees, which GCP bills separately and which add up fast.

The Hetzner equivalent: a dedicated AX102 box (AMD Ryzen 9 7950X3D, 128GB DDR5 ECC RAM, 2× 1.92TB NVMe) runs €109/month (~$115). Self-hosted Postgres on NVMe storage doesn't just match managed Cloud SQL performance — for most workloads, it exceeds it, because you're no longer fighting the IOPS throttling that managed databases quietly impose to keep multi-tenant infrastructure stable. One physical machine comfortably handles what would require three separate managed services in the cloud.

The math isn't close. For a typical web application serving fewer than 100k daily active users, bare metal is 4–5× cheaper at equivalent or better performance, and that ratio widens significantly once you factor in egress.
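For anyone who wants to check the ratio, the arithmetic is short. This sketch uses only the monthly figures quoted above, with the euro price already converted to roughly $115, and leaves egress out entirely.

```python
# Monthly figures quoted in this section, in USD.
GCP_MONTHLY_RANGE = (490, 550)
HETZNER_MONTHLY = 115


def cost_ratio(cloud_usd: float, metal_usd: float) -> float:
    """How many times more the cloud setup costs than the dedicated box."""
    return cloud_usd / metal_usd


low, high = (cost_ratio(c, HETZNER_MONTHLY) for c in GCP_MONTHLY_RANGE)
annual_savings = tuple((c - HETZNER_MONTHLY) * 12 for c in GCP_MONTHLY_RANGE)

print(f"Cloud costs {low:.1f}x to {high:.1f}x the bare-metal price")
# prints: Cloud costs 4.3x to 4.8x the bare-metal price
print(f"Annual difference: ${annual_savings[0]:,} to ${annual_savings[1]:,}")
# prints: Annual difference: $4,500 to $5,220
```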


The Database Is the Point

The database deserves its own paragraph because it's almost always the actual bottleneck, and cloud pricing around databases is where the markup becomes most egregious.

Managed databases are priced as premium products, but what you're often buying is a throttled VM on shared infrastructure. A db.t3.medium on RDS gives you 2 vCPUs, 4GB of RAM, and IOPS limits that will make your DBA cry. The "managed" part — automated backups, minor version updates, multi-AZ failover — sounds like engineering but is mostly scripting that costs the provider almost nothing to run at scale.

On bare metal, for the same monthly spend, you can run Postgres on NVMe drives with 64–128GB of RAM available for shared buffers and the OS page cache. The database stops being a bottleneck not because you optimized queries, but because you stopped rationing hardware. Add a pg_dump cron written with LLM assistance in twenty minutes, a streaming replica on a second machine, and you have a production-grade setup that outperforms managed RDS on every dimension that matters for a real application.
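The pg_dump cron mentioned above is mostly a retention policy wrapped around one command. Here is a minimal sketch of that logic. The database name `appdb`, the backup path, and the "7 daily plus 4 weekly" policy are assumptions for illustration. The command is only built, not executed; a real script would pass it to `subprocess.run` and then delete the pruned files.

```python
from datetime import date, timedelta


def pg_dump_command(dbname: str, out_path: str) -> list[str]:
    """Build a pg_dump invocation that writes a custom-format archive."""
    return ["pg_dump", "--format=custom", f"--file={out_path}", dbname]


def dumps_to_delete(
    dump_dates: list[date],
    today: date,
    keep_daily: int = 7,
    keep_weekly: int = 4,
) -> list[date]:
    """Keep the last `keep_daily` days, plus Sunday dumps for a further
    `keep_weekly` weeks. Everything else is returned for deletion."""
    doomed = []
    for d in sorted(dump_dates):
        age_days = (today - d).days
        is_recent = age_days < keep_daily
        # date.weekday() returns 6 for Sunday.
        is_weekly = d.weekday() == 6 and age_days < keep_daily + 7 * keep_weekly
        if not (is_recent or is_weekly):
            doomed.append(d)
    return doomed


today = date(2026, 1, 31)
dump_dates = [today - timedelta(days=n) for n in range(31)]  # one dump per day
deleted = dumps_to_delete(dump_dates, today)

# Hypothetical database name and path, shown only to illustrate the command shape.
print(pg_dump_command("appdb", f"/var/backups/appdb-{today.isoformat()}.dump"))
print(f"{len(deleted)} of {len(dump_dates)} dumps would be pruned")
```

Keeping the retention decision in a pure function makes it testable without touching the filesystem, which matters for a script whose failure mode is silently deleting the wrong backup.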


Where Hyperscalers Still Win

This is not a universal argument, and intellectual honesty requires saying so clearly.

If your application has genuine, unpredictable, massive horizontal scaling requirements — global game launches, viral consumer products, financial systems processing millions of transactions per second — cloud hyperscalers offer something bare metal cannot easily replicate: the ability to go from 10 to 10,000 servers in minutes. Kubernetes on GKE or EKS, combined with cloud-native autoscaling, is a real engineering solution for problems that require it.

But here's the question worth asking: does your application actually have that problem?

The vast majority of B2B SaaS products, internal tools, marketplaces, and APIs will never need to scale horizontally faster than a human can provision a new server. Most "we need to handle traffic spikes" conversations are really "we need a CDN and a read replica." Most "we need multi-region" conversations are really "we need Cloudflare in front of one well-configured server."
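A quick way to answer that question is to turn daily active users into a peak request rate before reaching for an autoscaler. The sketch below is back-of-the-envelope only. The 200 requests per user per day and the 5× peak-to-average factor are assumed placeholders that you should replace with numbers from your own access logs.

```python
SECONDS_PER_DAY = 86_400


def peak_requests_per_second(
    daily_active_users: int,
    requests_per_user_per_day: float,
    peak_to_average: float = 5.0,
) -> float:
    """Estimate peak load from daily usage, assuming traffic bunches up
    by `peak_to_average` at the busiest moment."""
    average_rps = daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY
    return average_rps * peak_to_average


# Assumed workload: 100k DAU, 200 requests per user per day.
peak = peak_requests_per_second(100_000, 200)
print(f"Estimated peak: {peak:.0f} requests/second")
# prints: Estimated peak: 1157 requests/second
```

Whether one machine can absorb that depends on what each request does, so treat the output as a target for a load test rather than a verdict. It does, however, put the "10 to 10,000 servers" scenario in perspective.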

Cloud hyperscalers built their pricing model assuming every customer might one day be Netflix. Most customers are not Netflix, and they've been paying Netflix insurance premiums for a decade.


The Hybrid Line

Even committed bare-metal operators keep two things in the cloud, and there's no shame in it.

Object storage (S3, R2, Backblaze B2): Managing petabytes of data on physical drives — with redundancy, bit-rot protection, geographic replication — is a genuinely hard operational problem. The economics of S3-compatible object storage are defensible. R2 in particular, with zero egress fees, is almost impossible to beat for asset and backup storage.

Edge and CDN (Cloudflare): A global Anycast network absorbs DDoS traffic, terminates SSL, and caches static assets before requests ever touch your hardware. Paying Cloudflare to keep garbage traffic off your bare-metal NICs is not a cloud dependency — it's sensible architecture. The free tier handles most use cases.

Everything else — compute, databases, queues, caches, search — runs better and cheaper on hardware you control.


The 2026 Stack

| Layer | Choice | Why |
| --- | --- | --- |
| Compute / App Logic | Bare metal or dedicated VPS | 5–8× cheaper; LLMs handle OS config |
| Primary Database | Self-hosted Postgres on metal | Order-of-magnitude better IOPS per dollar |
| Storage / Assets | S3 / R2 / Backblaze | Durability at scale is genuinely hard |
| Security / CDN | Cloudflare | Global DDoS shield; near-zero cost |
| Infrastructure Config | Ansible / NixOS | Reproducible; version-controlled; disposable |

What Sovereignty Actually Means

The cost argument is compelling, but it's not the deepest reason to care about this.

When your data lives on infrastructure you control, you make decisions about jurisdiction, retention, and access. You are not subject to a provider's terms-of-service change, a policy update, or a compliance requirement that affects their business but not yours. You are not one account suspension away from an operational crisis.

"Sovereign data" means your infrastructure reflects your decisions, not your vendor's. In regulated industries (healthcare, finance, legal) this is not a philosophical preference. Having implemented ISO 27001 myself, I always found the cloud's shared-responsibility model an awkward fit: it asks you to sign off on controls you can't fully inspect.

But even outside regulated industries, there's something clarifying about infrastructure you understand end-to-end. When something breaks, you debug it. When something costs money, you know why. When a configuration needs to change, you change it. Cloud billing doesn't work that way.

I've lost hours to GCP's billing console, cross-referencing SKUs across compute, egress, storage operations, and networking — just to understand why the bill went up 15%. Sovereign infrastructure doesn't make costs disappear, but it makes them legible. One server, one line item, one number you can reason about.

The cloud sold convenience. Bare metal, in 2026, offers something more valuable: legibility. You know what your system is doing and why. In an era where AI is handling more of the operational complexity, that clarity is worth more than a managed dashboard and a 300–400% markup.


The complexity tax has been paid. The question now is whether you'll keep paying it.