= How do Kubernetes Probes Work =

**Summary**: This wiki page explains how Kubernetes probes are used. \\
**Date**: 1 March 2025 \\
{{tag>kubernetes}}

Kubernetes probes are described well in the [[https://kubernetes.io/docs/concepts/configuration/liveness-readiness-startup-probes/ |documentation]], and plenty of additional resources are available, but I found it hard to find a simple, clear overview that does not go into too much depth. This wiki page aims to provide just that.

== Probes ==

Kubernetes probes are periodic health checks that Kubernetes runs against a container. They help make containerised applications more reliable and robust. There are two main probes:

* Liveness Probe: This probe checks if the application is alive. If the liveness probe fails, Kubernetes will restart the container.
* Readiness Probe: This probe checks if the application is ready to serve traffic. If the readiness probe fails, Kubernetes will stop sending traffic to the pod until it is ready again.

Additionally, there is a third type of probe:

* Startup Probe: This probe checks if the application has started successfully. While the startup probe is running, the liveness and readiness probes are held back. If the startup probe fails, Kubernetes will kill the container, which is then restarted according to the pod's restart policy. This is useful for applications that may take a long time to start up. The startup probe only runs during startup: once it has succeeded, it is not run again for the lifetime of the container.

=== Probe Parameters ===

All probes share the same parameters:

* initialDelaySeconds: Number of seconds after the container has started before the probe is scheduled. The probe will first fire in a time period between the initialDelaySeconds value and (initialDelaySeconds + periodSeconds). For example, if initialDelaySeconds is 30 and periodSeconds is 100, then the first probe will fire at some point between 30 and 130 seconds.
* periodSeconds: How often the probe is performed, in seconds.
* timeoutSeconds: Number of seconds after which a probe attempt times out. A timed-out attempt counts as a failure.
* failureThreshold: The number of consecutive times the probe is allowed to fail before Kubernetes acts: a failing liveness or startup probe restarts the container, and a failing readiness probe marks the pod as not ready. A startup probe should have a higher failure threshold to account for longer startup times.
* successThreshold: The number of consecutive times the probe must report success, after it has failed, before it is considered successful again. For liveness and startup probes this value must be 1, so it can only be tuned for a readiness probe.

In YAML, the configuration of a probe looks like this, with the timing parameters set to their default values:

{{{
livenessProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  initialDelaySeconds: 0
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3
  successThreshold: 1
}}}

> Note that the configuration for readiness and startup probes is the same, except that the key is named {{{readinessProbe}}} or {{{startupProbe}}}.

== The Schedule ==

I've found the graphical representation of the probe schedule below very helpful in understanding how the probes work and interact:

[{{k8sprobes.drawio.png?800|Graphical representation of probe schedules}}]
\\

== Useful Links ==

* [[https://www.redhat.com/en/blog/liveness-and-readiness-probes |A Red Hat blog post that covers the probes in depth, with several scenarios and examples]]
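
== Example Configuration ==

To tie the three probes and their parameters together, here is a minimal sketch of a pod that uses all of them. The pod name, image, port name and the {{{/healthz}}} and {{{/ready}}} paths are placeholders chosen for illustration, not values from a real application, and the thresholds are only one reasonable starting point.

```yaml
# Hypothetical pod: name, image, port and probe paths are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0
      ports:
        - name: http
          containerPort: 8080
      # Runs first; liveness and readiness checks wait until it succeeds.
      startupProbe:
        httpGet:
          path: /healthz
          port: http
        periodSeconds: 10
        failureThreshold: 30   # allows up to 30 * 10 = 300 seconds to start
      # After startup: restart the container after 3 consecutive failures.
      livenessProbe:
        httpGet:
          path: /healthz
          port: http
        periodSeconds: 10
        failureThreshold: 3
      # After startup: stop routing traffic to the pod while this fails.
      readinessProbe:
        httpGet:
          path: /ready
          port: http
        periodSeconds: 5
        failureThreshold: 3
        successThreshold: 1
```

With these values the application gets roughly five minutes to start (failureThreshold multiplied by periodSeconds on the startup probe), while a hang after startup is detected by the liveness probe within about 30 seconds.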