Liveness and Readiness Probes in Microservices

1. Introduction

A liveness probe determines whether the container or application is operational. If it’s healthy, no intervention is needed as everything is running smoothly. If it’s unresponsive or unhealthy, corrective action, such as restarting the container, is triggered. In essence, the liveness probe verifies: “Is the container alive or not?”

A readiness probe checks if a container or application is prepared to handle incoming network traffic. If the container is not ready—such as during startup or while initializing dependencies—the probe ensures traffic is withheld until it is capable of processing requests. This prevents scenarios where premature traffic delivery could result in errors, like a 502 status code or a “connection refused” message for the client.

Simply put, a readiness probe answers the question: “Is this container ready to receive and process network traffic?”
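The difference between the two probes can be sketched as a small decision function. This is only an illustrative model: the class name ProbeDecision and the Action values are hypothetical, and a real platform (Docker, Kubernetes) applies its own policies.

```java
public class ProbeDecision {
    /** Hypothetical reactions a platform may have to the two probe results. */
    enum Action { NONE, WITHHOLD_TRAFFIC, RESTART_CONTAINER }

    /**
     * A failed liveness probe calls for corrective action (a restart).
     * A live container that is not ready is left running but receives no traffic.
     */
    static Action decide(boolean live, boolean ready) {
        if (!live) {
            return Action.RESTART_CONTAINER;
        }
        if (!ready) {
            return Action.WITHHOLD_TRAFFIC;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        // A container that is still starting up: alive, but not yet ready.
        System.out.println(decide(true, false)); // prints WITHHOLD_TRAFFIC
    }
}
```

Note that liveness is checked first: readiness is irrelevant for a container that is not alive.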

2. Implementation in Spring Boot

Ensure your pom.xml includes the necessary actuator dependency:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

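Then, in your application.yml, enable the readiness and liveness health indicators and expose them as probe endpoints. Note that YAML must be indented with spaces, not tabs, and that the two state properties belong under management.health:

management:
  health:
    readinessstate:
      enabled: true
    livenessstate:
      enabled: true
  endpoint:
    health:
      probes:
        enabled: true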

Inside a Spring Boot application, Actuator reads the liveness and readiness state from the ApplicationAvailability interface and exposes it through two dedicated health indicators: LivenessStateHealthIndicator and ReadinessStateHealthIndicator. These indicators appear on the global health endpoint (/actuator/health). They are also exposed as separate HTTP probes through health groups: /actuator/health/liveness and /actuator/health/readiness.
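To make the responses of the two probe endpoints easier to predict, here is a plain-Java model of how the availability states translate into health statuses and HTTP codes. It is a sketch, not the Actuator API: the enum names mirror Spring Boot's LivenessState and ReadinessState, the mapping follows the default behaviour of the two indicators as I understand it, and the class AvailabilityMapping itself is hypothetical.

```java
public class AvailabilityMapping {
    // Names mirror Spring Boot's availability states; these are local stand-ins.
    enum LivenessState { CORRECT, BROKEN }
    enum ReadinessState { ACCEPTING_TRAFFIC, REFUSING_TRAFFIC }
    enum Status { UP, DOWN, OUT_OF_SERVICE }

    /** Liveness: a broken application reports DOWN. */
    static Status toStatus(LivenessState state) {
        return state == LivenessState.CORRECT ? Status.UP : Status.DOWN;
    }

    /** Readiness: an application refusing traffic reports OUT_OF_SERVICE. */
    static Status toStatus(ReadinessState state) {
        return state == ReadinessState.ACCEPTING_TRAFFIC ? Status.UP : Status.OUT_OF_SERVICE;
    }

    /** Default HTTP mapping assumed here: UP gives 200, DOWN and OUT_OF_SERVICE give 503. */
    static int httpCode(Status status) {
        return status == Status.UP ? 200 : 503;
    }

    public static void main(String[] args) {
        Status readiness = toStatus(ReadinessState.REFUSING_TRAFFIC);
        System.out.println(readiness + " -> " + httpCode(readiness)); // prints OUT_OF_SERVICE -> 503
    }
}
```

In practice this means a probe client can rely on the HTTP status code alone, without parsing the response body.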

3. Updating docker-compose.yml

The healthcheck section in your Docker Compose configuration defines a health check for the configserver container. Docker runs the specified test command inside the container at regular intervals to verify that the application is functioning correctly. The configuration is shown first, followed by a breakdown of each setting:

configserver:
  image: "learnitweb/configserver:v1"
  container_name: configserver-ms
  ports:
    - "8071:8071"

  healthcheck:
    test: "curl --fail --silent localhost:8071/actuator/health/readiness | grep UP || exit 1"
    interval: 10s
    timeout: 5s
    retries: 10
    start_period: 10s

test: The command Docker runs inside the container to check whether the application is healthy. Because it runs inside the container, curl must be installed in the image.
- --fail: Makes curl exit with a non-zero code when the server returns an HTTP error status (400 or above).
- --silent: Suppresses extra curl output such as the progress meter.
- The response is piped to grep, which searches for the word “UP”, the status reported by a healthy application.
- If “UP” is not found, the command explicitly exits with status 1 (indicating failure).
interval: How often the health check is performed. Here, the health check runs every 10 seconds.
timeout: The maximum time a single health check may run before it is considered to have failed. If the command doesn’t complete within 5 seconds, that check is marked as a failure.
retries: The number of consecutive failures required to mark the container as unhealthy. If the health check fails 10 times in a row, Docker marks the container as unhealthy.
start_period: A grace period that gives the container time to initialize. Health checks still run during these 10 seconds, but failures are not counted toward retries. If a check succeeds during the grace period, the container is considered started, and subsequent failures count as usual.
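If curl is not available in the image, the same check can be written in Java. The following is a minimal sketch that mirrors the test command above: it treats an HTTP error status (400 or above) or a missing “UP” in the body as a failure. The class name ReadinessCheck and the method isReady are hypothetical, and it assumes Java 11 or later for java.net.http.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class ReadinessCheck {
    /** Returns true when the endpoint answers without an HTTP error and the body contains "UP". */
    static boolean isReady(String url, Duration timeout) {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(timeout) // plays the role of the healthcheck timeout
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // Like curl --fail: any status of 400 or above is a failure.
            boolean noHttpError = response.statusCode() < 400;
            // Like grep UP: the body must mention the UP status.
            return noHttpError && response.body().contains("UP");
        } catch (IOException e) {
            // Connection refused, timeout, and similar errors all count as "not ready".
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        String url = "http://localhost:8071/actuator/health/readiness";
        System.out.println("ready: " + isReady(url, Duration.ofSeconds(5)));
    }
}
```

To use something like this as a healthcheck command, the program would need to exit with a non-zero status when isReady returns false, just as the curl pipeline does with exit 1.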

3.1 How It Works Together

When the container starts, its health status is “starting” and a 10-second grace period (start_period) begins.
The health check runs every 10 seconds (interval). Failures during the grace period do not count toward retries.
Each check runs the curl command against the readiness endpoint.
If the command doesn’t complete within 5 seconds (timeout), the check is considered a failure.
A successful check marks the container as healthy. The container is marked as unhealthy if the health check fails 10 times consecutively (retries).
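The interplay of these settings can be simulated with a small state machine. This is a simplified sketch, not Docker's implementation: it assumes check number i runs at (i + 1) × interval seconds after start, ignores how long each check takes, and assumes that failures inside the start period are not counted until a first check has succeeded. The class name HealthcheckSimulation is hypothetical.

```java
public class HealthcheckSimulation {
    enum Status { STARTING, HEALTHY, UNHEALTHY }

    /**
     * Replays a sequence of check results (true = passed) and returns the final status.
     * Simplification: check i is assumed to run at (i + 1) * intervalSeconds.
     */
    static Status evaluate(boolean[] results, int intervalSeconds,
                           int startPeriodSeconds, int retries) {
        Status status = Status.STARTING;
        int consecutiveFailures = 0;
        boolean hasSucceeded = false;

        for (int i = 0; i < results.length; i++) {
            int secondsSinceStart = (i + 1) * intervalSeconds;
            if (results[i]) {
                // Any success resets the failure streak and marks the container healthy.
                status = Status.HEALTHY;
                consecutiveFailures = 0;
                hasSucceeded = true;
                continue;
            }
            // Failures inside the grace period are ignored until a first success.
            boolean inGracePeriod = !hasSucceeded && secondsSinceStart < startPeriodSeconds;
            if (inGracePeriod) {
                continue;
            }
            consecutiveFailures++;
            if (consecutiveFailures >= retries) {
                status = Status.UNHEALTHY;
            }
        }
        return status;
    }

    public static void main(String[] args) {
        // Two failed checks while the application boots, then a success.
        boolean[] results = {false, false, true};
        System.out.println(evaluate(results, 10, 10, 10)); // prints HEALTHY
    }
}
```

With the settings used here (interval 10s, retries 10), a container that never becomes ready stays in “starting” for roughly 100 seconds before it is marked unhealthy.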

4. Adding a dependency on another microservice

The following example makes myservice wait for configserver:

myservice:
  image: "learnitweb/myservice:v1"
  container_name: service-ms
  ports:
    - "8090:8090"
  depends_on:
    configserver:
      condition: service_healthy

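depends_on specifies that myservice starts only after the configserver service is healthy, as reported by its health check. Besides service_healthy, the condition can be service_started (the dependency only needs to have been started) or service_completed_successfully (the dependency must have run to completion and exited successfully).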
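The three conditions can be summarised as a predicate over the state of the dependency. This is an illustrative model only: the DependencyState class and its fields are hypothetical and do not correspond to any Docker API.

```java
public class DependsOnCondition {
    enum Condition { SERVICE_STARTED, SERVICE_HEALTHY, SERVICE_COMPLETED_SUCCESSFULLY }

    /** Hypothetical snapshot of a dependency's state, for illustration only. */
    static final class DependencyState {
        final boolean started;
        final boolean healthy;
        final Integer exitCode; // null while the container has not exited

        DependencyState(boolean started, boolean healthy, Integer exitCode) {
            this.started = started;
            this.healthy = healthy;
            this.exitCode = exitCode;
        }
    }

    /** Returns true when the dependent service is allowed to start. */
    static boolean satisfied(Condition condition, DependencyState state) {
        switch (condition) {
            case SERVICE_STARTED:
                return state.started;
            case SERVICE_HEALTHY:
                return state.healthy;
            case SERVICE_COMPLETED_SUCCESSFULLY:
                return state.exitCode != null && state.exitCode == 0;
            default:
                throw new IllegalArgumentException("Unknown condition: " + condition);
        }
    }

    public static void main(String[] args) {
        // configserver has been started, but its health check has not passed yet.
        DependencyState configserver = new DependencyState(true, false, null);
        System.out.println(satisfied(Condition.SERVICE_STARTED, configserver)); // prints true
        System.out.println(satisfied(Condition.SERVICE_HEALTHY, configserver)); // prints false
    }
}
```

The example shows why service_healthy is the appropriate condition for a config server: a started container is not necessarily able to serve configuration yet.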

5. Conclusion

Liveness and readiness probes are essential tools for ensuring the reliability and availability of microservices in a distributed system. They allow applications to gracefully handle failures, optimize resource utilization, and maintain high uptime.

Both probes work together to improve the robustness of your microservices by aligning system health checks with the dynamic needs of modern applications. Implementing them effectively requires understanding your application’s behavior and lifecycle. By doing so, you can build resilient systems that adapt to changes, recover from failures, and deliver a seamless experience to users.