A little bit of vocabulary first! To follow this article, we need to speak the same language, so here are a few essential concepts to understand.
Requests and limits:
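A request is the amount of CPU or memory the scheduler reserves for a container (it is what placement decisions are based on), while a limit is the hard cap the container can never exceed. Here is a minimal sketch of how they are declared; the pod name, image, and values are only placeholders:

```yaml
# A container asking the scheduler for 100m CPU and 128Mi of memory (requests),
# with a hard cap of 200m CPU and 256Mi of memory (limits).
apiVersion: v1
kind: Pod
metadata:
  name: resources-demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 200m
          memory: 256Mi
```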
Pod's classes:
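The QoS class is derived from those requests and limits: Guaranteed when every container has requests equal to limits for CPU and memory, Burstable when at least one container has a request or limit set without meeting the Guaranteed criteria, and Best effort when none are set at all. Kubernetes computes it for you; as a quick sketch, for the pod above (requests != limits) its status would show:

```yaml
# Excerpt of the pod status reported by Kubernetes.
status:
  qosClass: Burstable   # other possible values: Guaranteed, BestEffort
```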
Now that we know what requests and limits are and that pods have QoS classes, let's dive into the eviction process.
When a node runs out of disk or memory, a condition is set on the Kubernetes node to indicate that it is under pressure. This condition also blocks new pods from being scheduled on the node, and an eviction process is started to free up resources.
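Concretely, that "flag" is a node condition plus a taint. Here is roughly what you would see on the node object when memory becomes scarce (a sketch only, with the node name and timestamps omitted):

```yaml
# Excerpt of a node under memory pressure: the kubelet adds a NoSchedule taint,
# which is what keeps new pods away, and flips the MemoryPressure condition to True.
spec:
  taints:
    - key: node.kubernetes.io/memory-pressure
      effect: NoSchedule
status:
  conditions:
    - type: MemoryPressure
      status: "True"
      reason: KubeletHasInsufficientMemory
      message: kubelet has insufficient memory available
```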
The kubelet of the node under pressure takes care of the eviction process. It starts failing Pods until the node's resource usage is back under the eviction threshold, meaning that the kubelet terminates all of the Pod's containers and sets its PodPhase to Failed.
If a Deployment manages the evicted Pod, the Deployment creates another Pod to be scheduled by Kubernetes.
The first thing the kubelet does is free disk space by deleting dead pods and unused images (this is a quick win). Then, if that cleanup is not enough, the kubelet evicts pods in this precise order:
- Best effort pods;
- Burstable pods that use more of the resource under pressure than they requested;
- Burstable pods that use less of the resource under pressure than they requested.
For instance, let's take a node under memory pressure. A Pod that has a memory request and only uses half of it will be evicted after a Pod that also has a memory request but is using more than it requested.
As for Guaranteed pods, they are, in theory, safe in the context of an eviction.
The most important thing:
As you may understand, it is imperative to set requests and limits on your pods correctly.
A good rule of thumb is to make your critical applications Guaranteed, most of your workloads Burstable, and keep only non-critical, fault-tolerant applications as Best effort (see the sketch below).
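Making an application Guaranteed only means giving every container requests equal to its limits, for both CPU and memory. A minimal sketch, where the names, image, and values are placeholders:

```yaml
# A "critical" application made Guaranteed: every container has
# requests equal to limits for both CPU and memory.
apiVersion: v1
kind: Pod
metadata:
  name: critical-app
spec:
  containers:
    - name: app
      image: my-critical-app:1.0   # placeholder image
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          cpu: 500m
          memory: 1Gi
```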
A few months ago, my Prometheus server pod got evicted. Looking at the Pod's events, you could see a message about memory usage exceeding the request, looking like this:
Message: The node was low on resource: memory. Container prometheus-server was using 2890108Ki, which exceeds its request of 2000Mi.
Here are the requests configured on this pod:
```
$ k describe pods prometheus-server-5c949c44f7-rc9sv | grep -iA2 Requests
    Requests:
      cpu:     500m
      memory:  2000Mi
```
Well, it's not shocking that the Pod consumes more than its memory request. The problem is that if the node the Pod is running on gets into trouble with its memory (which was my case here), the Pod will be evicted quite quickly, right after the Best effort ones.
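One way out is to give the Prometheus server a memory request in line with what it actually consumes, and ideally a matching limit so the pod becomes Guaranteed. Something along these lines, as a sketch only: it assumes the pod is managed by a Deployment with a container named prometheus-server, and the values are illustrative, not official recommendations:

```yaml
# Deployment excerpt: requests == limits for the prometheus-server container,
# which moves the pod to the Guaranteed class.
spec:
  template:
    spec:
      containers:
        - name: prometheus-server
          resources:
            requests:
              cpu: 500m
              memory: 3Gi
            limits:
              cpu: 500m
              memory: 3Gi
```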
If you want to learn more about the eviction process and how to prevent pod evictions, I encourage you to read the "Out of Resource Handling" page of the official Kubernetes documentation, which explains the configuration in more depth. It covers eviction signals, eviction thresholds, and so on.