Explain why Kubernetes isn't a good choice for hosting a relational database.

sgarland · on Sept 1, 2024

For small databases (anything under roughly 10,000,000 rows), sure, it's probably fine. Other than that, no.

* Unless you are self-hosting K8s and thus have a large amount of control over the underlying storage, the number of IOPS you're getting will be hazy at best. To be fair, this is also true of every single DBaaS, because the latency on network storage is absurd, so IOPS become somewhat meaningless.

* Unless you have modified the CPU scheduling options [0] in K8s, you have no control over core pinning or NUMA layout. This is even worse because your K8s nodes are probably multi-tenant.

* By its nature, K8s is designed to host stateless apps. It is fantastic at doing this, to be clear. I love K8s. But a system where the nodes can (and should, if you're taking advantage of spot pricing) disappear with a few minutes' warning is not a great host for an RDBMS.

* Hot take: it makes provisioning a database even easier, which means people with even less understanding or care of how databases operate will be doing so with reckless abandon, which means people like me have even more work to do cleaning up their mess. I am a big fan of gatekeeping things that keep companies afloat. If you want to touch the thing that every service is depending on, learn how it works first – I'd be thrilled to help you. But don't come in and just yolo a copy-paste YAML into prod and then start chucking JSON blobs into it because you can't be bothered to learn proper data modeling or read the RDBMS docs.

Re: [0], if you don't care about core pinning, then it's unlikely you're going to care about any of these other points, and you probably also don't understand (or care) how blindingly fast an RDBMS on metal with NVMe can be.

I am not a Luddite. To reiterate, I have administered self-hosted and managed K8s professionally. I also run it at home. I just have strong opinions about understanding fundamentals (and not causing myself extra work by allowing people who don't care about them to run infra).

[0]: https://kubernetes.io/docs/tasks/administer-cluster/cpu-mana...
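
For anyone wondering what the core-pinning point means in practice: with the kubelet's CPU manager set to the `static` policy, only containers in Guaranteed QoS pods that request a whole number of CPUs are given exclusive cores. Here is a rough sketch of that eligibility rule. The function names are made up for illustration, it only handles a single container, and it compares memory quantities as plain strings, so treat it as an approximation of the rule rather than what the kubelet actually runs.

```python
def cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "2" or "500m" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def gets_exclusive_cores(requests: dict, limits: dict) -> bool:
    """Hypothetical check: would this container be eligible for pinned cores?

    Assumes the node's kubelet uses the static CPU manager policy.
    Simplification: requests and limits must both be spelled out explicitly.
    """
    # Guaranteed QoS needs CPU and memory set, with requests equal to limits.
    for resource in ("cpu", "memory"):
        if resource not in requests or resource not in limits:
            return False
    if requests["memory"] != limits["memory"]:
        return False
    requested = cpu_millicores(requests["cpu"])
    if requested != cpu_millicores(limits["cpu"]):
        return False
    # Only whole-core requests are taken out of the shared pool.
    return requested >= 1000 and requested % 1000 == 0


pinned = gets_exclusive_cores(
    {"cpu": "4", "memory": "16Gi"}, {"cpu": "4", "memory": "16Gi"}
)
shared = gets_exclusive_cores(
    {"cpu": "1500m", "memory": "16Gi"}, {"cpu": "1500m", "memory": "16Gi"}
)
```

Under these assumptions the first container qualifies for exclusive cores and the second stays in the shared pool, because 1.5 CPUs is not a whole number of cores.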

zsoltkacsandi · on Sept 1, 2024

> it makes provisioning a database even easier, which means people with even less understanding or care of how databases operate will be doing so with reckless abandon, which means people like me have even more work to do cleaning up their mess. I am a big fan of gatekeeping things that keep companies afloat. If you want to touch the thing that every service is depending on, learn how it works first – I'd be thrilled to help you

Couldn’t agree more.

Kubernetes and such tools do not make things easier, they just give you the illusion of it.

sgarland · on Sept 1, 2024

That is an excellent way of putting it. Abstractions make you think you’ve got a handle on things, but when they break, now you have two problems – neither of which you’re probably equipped to solve.

movedx · on Sept 1, 2024

When I was doing Econometrics at Uni, the lecturer wouldn't let us use the functions in the software that did the mathematics for us. He made us learn using Spreadsheets and doing the calculations manually, _then_ he let us use the automated functionality once we understood it.

achanda358 · on Sept 1, 2024

> By its nature, K8s is designed to host stateless apps. It is fantastic at doing this, to be clear. I love K8s. But a system where the nodes can (and should, if you're taking advantage of spot pricing) disappear with a few minutes' warning is not a great host for an RDBMS.

Why is this a problem? A typical deployment will have multiple replicas, with (hopefully) small replication lag. Those should be able to be promoted to be the new primary within a minute.

porker · on Sept 1, 2024

> A typical deployment will have multiple replicas, with (hopefully) small replication lag. Those should be able to be promoted to be the new primary within a minute.

What happens within that minute to database writes?
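
To put rough numbers on that question, here is a back-of-the-envelope sketch. It assumes asynchronous replication and a constant write rate, and every figure in it is hypothetical; the function name is made up for illustration.

```python
def writes_at_risk(writes_per_second, failover_seconds, replication_lag_seconds):
    """Estimate how many writes are affected by an unplanned failover.

    Assumes asynchronous replication and a constant write rate.
    """
    return {
        # Writes that arrive while there is no primary: the application
        # has to queue, retry, or fail them.
        "blocked_during_failover": writes_per_second * failover_seconds,
        # Writes committed on the old primary but not yet shipped to the
        # replica when it died: these may be lost after promotion.
        "unreplicated_at_crash": writes_per_second * replication_lag_seconds,
    }


# Hypothetical workload: 500 writes/s, 60 s to promote, 2 s of lag.
estimate = writes_at_risk(500, 60, 2)
```

With those made-up numbers, 30,000 writes arrive during the failover window and about 1,000 committed writes had not reached the replica.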

sgarland · on Sept 1, 2024

Do you enjoy being paged for something out of your control? I don’t.

achanda358 · on Sept 1, 2024

I don't get paged for things that are not under my control; if you do, that is an organisational problem. But how is it relevant here?

In my example, I will get a page for large replication lag. But not for an unplanned failover. That will be an alert, but not a page.
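
A minimal sketch of that routing policy, in case it helps. The 30-second lag threshold and all names here are assumptions for illustration, not values from any particular alerting tool.

```python
from enum import Enum


class Severity(Enum):
    NONE = "none"
    ALERT = "alert"  # goes to a channel or dashboard, nobody is woken up
    PAGE = "page"    # wakes up whoever is on call


def classify(replication_lag_seconds: float,
             failover_occurred: bool,
             lag_page_threshold_seconds: float = 30.0) -> Severity:
    """Page on large replication lag; only alert on an unplanned failover."""
    if replication_lag_seconds > lag_page_threshold_seconds:
        return Severity.PAGE
    if failover_occurred:
        return Severity.ALERT
    return Severity.NONE
```

The ordering is the design choice: lag is checked first, so a failover that coincides with large lag still pages.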

zsoltkacsandi · on Sept 1, 2024

Well, the problem is a little bit more complicated than just having replicas.

You cannot build operational procedures based on “hope”.

High replication lag occurs for many, many reasons (it is neither a rare event nor something that you can prevent). So do network partitions.

Replication and binary logs can get corrupted, there can be deadlocks, duplicate-row errors, etc.

The thing is that database administration is a broad and complicated topic; a small mistake, or a lack of understanding of how these systems work, can easily lead to huge data loss.

anonzzzies · on Sept 1, 2024

> typical deployment

Ah yes, HN. You know there are billions of sites (WP mostly), LoB apps, etc. that run on one MySQL/PG/etc. instance, right? Replicas are not typical; they're a tiny minority.

movedx · on Sept 1, 2024

Exactly. Kubernetes and microservices? Sure, for about 0.5% of the industry. Everyone else needs two servers and a load balancer.

sgarland · on Sept 1, 2024

Technically four servers, because you’ll want an HA LB as well, but yes.

Tech is rife with people who have never set up an old school HA solution proffering advice on how a miasma of cloud services makes theirs better.

kgeist · on Sept 1, 2024

Why would I want k8s for a simple WordPress site in that case?

anonzzzies · on Sept 1, 2024

OP was talking about 'typical database setup' being a replicated db. It's not typical. Nor is the use of k8s for stuff outside HN and massive companies. Not that I mentioned k8s anyway.

DavyJone · on Sept 2, 2024

I think many of the performance points here are a trait of running on a VM in the cloud more than of K8s itself, and they would be no different if running on EC2, right?