
234 points | benocodes | 1 comment
xyst No.41838001
I wonder if an upgrade like this would be less painful if the db layer were containerized?

The migration process they described would be less painful with k8s, especially with 2100+ nodes/VMs.

__turbobrew__ No.41839595
I can tell you that k8s starts to have issues once you get over 10k nodes in a single cluster. There has been some work in 1.31 to improve scalability, but I would say that past 5k nodes things no longer “just work”: https://kubernetes.io/blog/2024/08/15/consistent-read-from-c...
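For context on what that 1.31 work targets: a list with resourceVersion "0" can be answered from the apiserver's watch cache, while a list with no resourceVersion has traditionally forced a quorum read through etcd, which is exactly what hurts at this scale. A rough client-go sketch of the two request shapes (kubeconfig path is assumed, and whether the consistent list actually hits the cache depends on your apiserver version and feature gates):

    package main

    import (
        "context"
        "fmt"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        // Assumes a kubeconfig at the default home path; adjust for your environment.
        cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            panic(err)
        }
        client, err := kubernetes.NewForConfig(cfg)
        if err != nil {
            panic(err)
        }
        ctx := context.Background()

        // resourceVersion "0": the apiserver may answer from its watch cache
        // (cheap, possibly slightly stale).
        cached, err := client.CoreV1().Pods("").List(ctx, metav1.ListOptions{ResourceVersion: "0"})
        if err != nil {
            panic(err)
        }

        // No resourceVersion: a consistent list, historically a quorum read
        // against etcd; the 1.31 work aims to serve this from the cache too.
        consistent, err := client.CoreV1().Pods("").List(ctx, metav1.ListOptions{})
        if err != nil {
            panic(err)
        }

        fmt.Printf("cached: %d pods, consistent: %d pods\n", len(cached.Items), len(consistent.Items))
    }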

The current bottleneck appears to be etcd; boltdb is just a crappy data store. I would really like to try replacing boltdb with something like sqlite or rocksdb as the data persistence layer in etcd, but that is non-trivial.
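Part of why it's non-trivial: etcd doesn't just need a key-value store, it needs ordered range scans over revision-encoded keys, cheap whole-store snapshots for catching up slow followers, and controlled write batching and compaction. A purely illustrative sketch of that contract in Go (these names are made up for illustration, not etcd's actual internal API):

    // Illustrative only: a rough sketch of the contract a replacement storage
    // engine would have to satisfy for etcd, NOT etcd's real internal interfaces.
    package backend

    import "io"

    type KV interface {
        // Range must return keys in lexicographic order; etcd's MVCC layer
        // depends on ordered iteration over revision-encoded keys.
        Range(bucket, start, end []byte, limit int64) (keys, vals [][]byte, err error)

        // BatchPut groups writes so fsync cost is amortized across many requests.
        BatchPut(bucket []byte, keys, vals [][]byte) error

        // Snapshot streams a point-in-time copy of the whole store, used when
        // a slow follower needs to be caught up.
        Snapshot() (io.ReadCloser, error)

        // Compact reclaims space held by old revisions; with boltdb this is
        // where defrag pain tends to show up.
        Compact(rev int64) error

        Close() error
    }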

You also start seeing issues where certain k8s operators do not scale either; for example, cilium currently cannot scale past 5k nodes. There is a fundamental design issue where the cilium daemonset's memory usage scales with the number of pods/endpoints in the cluster, so in large clusters the cilium daemonset can use multiple gigabytes of RAM on every node. https://docs.cilium.io/en/stable/operations/performance/scal...
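A back-of-envelope calculation of that scaling behaviour, with made-up numbers (the per-endpoint byte cost is an assumption for illustration, not a measured cilium figure):

    package main

    import "fmt"

    func main() {
        // All numbers are assumptions for illustration, not measured cilium figures.
        nodes := 5000
        podsPerNode := 100
        bytesPerEndpoint := 8 * 1024 // assumed per-endpoint state held by each agent

        endpoints := nodes * podsPerNode // cluster-wide endpoints

        // If every node's agent keeps state for every endpoint, per-node memory
        // grows with cluster size...
        perNode := float64(endpoints) * float64(bytesPerEndpoint)

        // ...and the fleet-wide total grows with nodes x endpoints.
        total := perNode * float64(nodes)

        fmt.Printf("endpoints: %d\n", endpoints)
        fmt.Printf("per-node agent state: ~%.1f GiB\n", perNode/(1<<30))
        fmt.Printf("fleet-wide agent state: ~%.1f TiB\n", total/(1<<40))
    }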

Anyways, the TL;DR is that at this scale (16k nodes) it is hard to run k8s.