Sunsetting Contabo

For a little over a year, I've been running Kubernetes via Talos Linux on Contabo VPSes.

I originally chose Contabo because it was the cheapest provider on which Talos Linux images worked out of the box. At the time, I also found value in having unreliable nodes to passively test the resilience of high-availability setups.

However, as my workloads grew, Contabo stopped being a good fit. The biggest problem was absurdly high steal times (>90%!) causing timeout issues, especially with systems that must maintain consensus (e.g. Longhorn volume replication, etcd, Kafka). Additionally, the network bandwidth was limited to 100 Mb/s, which forced aggressive downsampling of metrics and logs. That was at odds with my desire to add even more instrumentation via eBPF, with tools like Parca.
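For anyone unfamiliar with the metric: steal time is the share of CPU time during which a VM had work ready to run but the hypervisor gave the physical core to someone else. Here is a minimal sketch of how it can be computed from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. This is not the tooling I used to observe the numbers above; the function names are made up for illustration.

```python
import time


def parse_cpu_line(line: str) -> list[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def steal_percent(before: str, after: str) -> float:
    """Return the percentage of CPU time stolen between two samples.

    Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    guest and guest_nice are already counted in user and nice, so only the
    first eight fields are summed for the total.
    """
    old = parse_cpu_line(before)
    new = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(old, new)]
    deltas += [0] * (8 - len(deltas))  # very old kernels report fewer fields
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as handle:
        return handle.readline()


def sample_steal(interval_seconds: float = 1.0) -> float:
    """Sample /proc/stat twice and return the steal percentage in between."""
    first = read_cpu_line()
    time.sleep(interval_seconds)
    return steal_percent(first, read_cpu_line())
```

Calling `sample_steal()` on a node returns the steal percentage over a one-second window. Tools like `top` report the same figure in the `st` column.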

I'm not going to pretend that moving off Contabo wasn't a significant amount of work. In total, this was a 12-node cluster spanning three control plane nodes and nine workers, managing ~70 vCPUs, ~200GB RAM, and 6.2TB of storage across three separate node pools.

So, before I close this chapter, here's how my setup progressed:


SM Cluster

Aespa nodepool

I started my self-hosted Kubernetes journey with this nodepool, setting up a single control plane and three workers to create a highly available foundation for my core workloads.

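| name | role | vCPU | RAM (GB) | Storage |
| --- | --- | --- | --- | --- |
| Karina | control plane | 6 | 12 | 200GB |
| Giselle | worker | 8 | 24 | 1.2TB |
| Ningning | worker | 8 | 24 | 1.2TB |
| Winter | worker | 8 | 24 | 1.2TB |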

Red Velvet nodepool

As the cluster grew and I needed more capacity, I added this nodepool, expanding the control plane to three nodes and adding three more workers.

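| name | role | vCPU | RAM (GB) | Storage |
| --- | --- | --- | --- | --- |
| Irene | control plane | 6 | 12 | 200GB |
| Seulgi | worker (was control plane) | 4 | 6 | 400GB |
| Wendy | worker | 12 | 48 | 1.6TB |
| Joy | worker | 6 | 16 | 400GB |
| Yeri | worker | 6 | 16 | 400GB |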

Hearts2Hearts nodepool

By early 2025, the cluster's demands had grown, particularly on the control plane. API server load was increasing, especially during deploy-heavy periods and when running resource-intensive workloads like Prometheus or ArgoCD. To keep things responsive and maintain headroom, I brought in a third node pool with beefier control-plane nodes and additional workers.

This also allowed me to spread the control plane across all three node pools, improving availability and reducing the blast radius of any single failure domain.
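To make the availability argument concrete, here is a rough sketch of the quorum arithmetic. It assumes each control plane node runs one etcd member and treats each node pool as a single failure domain, which is a simplification. The helper names are hypothetical; the node names come from the tables in this post.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


def survives_pool_loss(placement: dict[str, list[str]]) -> bool:
    """True if losing any single node pool still leaves an etcd majority."""
    total = sum(len(nodes) for nodes in placement.values())
    needed = quorum(total)
    return all(total - len(nodes) >= needed for nodes in placement.values())


# Before the third pool: two of the three control plane nodes shared a pool.
two_pools = {"Aespa": ["Karina"], "Red Velvet": ["Irene", "Seulgi"]}

# After: one control plane node per pool.
three_pools = {
    "Aespa": ["Karina"],
    "Red Velvet": ["Irene"],
    "Hearts2Hearts": ["Jiwoo"],
}

print(fault_tolerance(3))             # 1
print(survives_pool_loss(two_pools))  # False
print(survives_pool_loss(three_pools))  # True
```

With three members, etcd tolerates exactly one failure either way. The difference is that with one member per pool, no single pool can take out two members at once.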

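| name | role | vCPU | RAM (GB) | Storage |
| --- | --- | --- | --- | --- |
| Jiwoo | control plane | 6 | 12 | 400GB |
| Stella | worker | 6 | 12 | 400GB |
| Ian | worker | 6 | 12 | 400GB |
| Yuha | worker | 4 | 6 | 400GB |
| Yeon | worker | 4 | 6 | 400GB |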
