📄️ CPU: noisy neighbor
What happens when one workload consumes too much CPU in a Kubernetes cluster?
📄️ Postgres: primary server failure
In our lab, we use a highly available Postgres cluster managed by the Zalando operator. Under the hood, the operator relies on Patroni to monitor cluster instances and handle automatic failover.
📄️ Instance unavailability
What happens when one of our services goes down? Most teams have a dedicated alert that monitors instance availability.