Kafka Producer Retries in a Spring Boot Microservice

In this tutorial, you will learn how to configure Kafka Producer retries using two distinct approaches:

  1. The first approach uses the spring.kafka.producer.retries and spring.kafka.producer.properties.retry.backoff.ms configuration properties to define the number of retry attempts and the delay between them.
  2. The second approach uses the spring.kafka.producer.properties.delivery.timeout.ms property, together with spring.kafka.producer.properties.linger.ms and spring.kafka.producer.properties.request.timeout.ms, to bound the total time allowed for delivering a message, retries included.

Retries are crucial in distributed systems: without them, a single transient failure, such as a brief network drop or a broker restart, could result in a lost message, which could in turn lead to data inconsistency or a poor user experience.

If you are interested in video lessons, check out my video course Apache Kafka for Event-Driven Spring Boot Microservices.

Part 1: Configuring Producer Retries

In this part of the tutorial, we’re going to adjust some settings for our Kafka Producer in the application.properties file.

The first setting we’ll add is for the number of retry attempts. By setting spring.kafka.producer.retries, you tell Kafka how many times to try sending a message if it doesn’t go through the first time.

spring.kafka.producer.retries=10

If you don’t set this property, Kafka uses a very high default (2,147,483,647, the largest 32-bit signed integer), which is effectively infinite retries. But for our learning, we’ll set it to a smaller number like 10 to see how it operates. This means that if the first attempt to send a message fails, the producer will try up to ten more times.

The timing between these retries is managed by another setting: spring.kafka.producer.properties.retry.backoff.ms. This decides how long Kafka waits before it tries to send a message again.

spring.kafka.producer.properties.retry.backoff.ms=1000

By setting this to 1000 milliseconds, or one second, you make the producer wait one second before each retry after a failure. So, with retries set to 10 and the backoff set to 1000 milliseconds, the producer spends up to about 10 seconds waiting between attempts, plus the time each attempt itself takes, unless the message gets sent successfully before that.
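The arithmetic behind that estimate can be written out as a small sketch. The class and method names here are made up for illustration, and the model is deliberately simplified: it counts only the waiting time between attempts and ignores how long each attempt takes, as well as any jitter the client may apply to the backoff.

```java
// Illustrative helper, not part of Kafka or Spring.
class RetryBackoffBudget {

    // Upper bound on the time spent waiting between attempts.
    // Assumes a fixed backoff and ignores the duration of each attempt.
    static long totalBackoffMs(int retries, long retryBackoffMs) {
        return (long) retries * retryBackoffMs;
    }

    public static void main(String[] args) {
        int retries = 10;            // spring.kafka.producer.retries
        long retryBackoffMs = 1000L; // spring.kafka.producer.properties.retry.backoff.ms

        // Prints: Total backoff wait: 10000 ms
        System.out.println("Total backoff wait: "
                + totalBackoffMs(retries, retryBackoffMs) + " ms");
    }
}
```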

With these settings, you’re instructing your Kafka Producer to be persistent but patient, giving it enough opportunities to succeed without overloading the system or giving up too quickly.
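If you build the producer configuration in Java rather than in application.properties, the same two settings go into a configuration map under their native Kafka names, retries and retry.backoff.ms. The sketch below only builds that map with plain string keys so that it runs without any Kafka dependency. The class name is hypothetical. In a real application you would more likely use the constants ProducerConfig.RETRIES_CONFIG and ProducerConfig.RETRY_BACKOFF_MS_CONFIG and pass the map to a producer factory such as Spring’s DefaultKafkaProducerFactory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative class name; only the two map keys are real Kafka property names.
class ProducerRetryConfig {

    // Builds the retry-related part of a producer configuration map.
    static Map<String, Object> retryConfig() {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("retries", 10);             // same as spring.kafka.producer.retries=10
        props.put("retry.backoff.ms", 1000L); // same as ...properties.retry.backoff.ms=1000
        return props;
    }

    public static void main(String[] args) {
        // Prints each entry as key=value, in insertion order.
        retryConfig().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```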

Part 2: Producer Retries with Delivery and Request Timeout

While setting up Kafka Producer retries, you might initially consider individually controlling retry counts and backoff times. However, Kafka documentation suggests a more streamlined approach, focusing on the overall message delivery timeline.

Instead of managing the number of retries with separate configurations, you can use a single property as recommended:

spring.kafka.producer.properties.delivery.timeout.ms=120000

This property, spring.kafka.producer.properties.delivery.timeout.ms, sets Kafka’s delivery.timeout.ms, which defaults to 120000 milliseconds, or 2 minutes. It’s an upper bound on the total time for the entire send operation: the time a record waits before being sent, the time spent waiting for the broker’s acknowledgment, and the time spent on retries. If the producer hasn’t received a successful acknowledgment (as defined by its acks setting) within this period, it treats the send as failed and reports a timeout error.

For a moment, let’s set aside the retries and backoff.ms configurations. I will comment them out in my application:

# spring.kafka.producer.retries=10
# spring.kafka.producer.properties.retry.backoff.ms=1000

When you set delivery.timeout.ms, its value must be no less than the sum of linger.ms and request.timeout.ms. If you explicitly set it to a smaller value, the producer rejects the configuration when it is created.

spring.kafka.producer.properties.linger.ms=0
spring.kafka.producer.properties.request.timeout.ms=30000

The linger.ms property is set to 0, meaning the producer does not wait for more messages to arrive before sending a batch. Messages can still be batched together if they arrive faster than the producer can send them, but no artificial delay is added, which is suitable for scenarios where each message should be sent as soon as possible.

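The request.timeout.ms property is set to 30,000 milliseconds (30 seconds), which is the time the producer waits for a response from the broker for each individual request. If no response arrives in that time, the producer resends the request if it can, or fails it once retries are exhausted.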
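The rule that delivery.timeout.ms must be at least linger.ms plus request.timeout.ms is easy to check by hand, but a small sketch makes it concrete. The class and method names are hypothetical, and this mirrors only the arithmetic of the rule, not Kafka’s own validation code.

```java
// Illustrative check, not Kafka's own validation code.
class DeliveryTimeoutCheck {

    // True when delivery.timeout.ms >= linger.ms + request.timeout.ms.
    static boolean isValid(long deliveryTimeoutMs, long lingerMs, long requestTimeoutMs) {
        return deliveryTimeoutMs >= lingerMs + requestTimeoutMs;
    }

    public static void main(String[] args) {
        // Values from this tutorial: 120000 >= 0 + 30000
        System.out.println(isValid(120_000L, 0L, 30_000L)); // prints true

        // A delivery timeout shorter than a single request timeout breaks the rule.
        System.out.println(isValid(20_000L, 0L, 30_000L));  // prints false
    }
}
```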

To sum it up, the delivery.timeout.ms encompasses the whole send operation, whereas the request.timeout.ms is just for a single request. It’s a subtle but significant difference.
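To make that difference concrete, here is a rough worst-case model in which every request times out and the producer waits a fixed backoff before trying again. It counts how many attempts can begin before the delivery deadline passes. This is a simplified sketch with made-up names, not a reproduction of how the Kafka client schedules its work; the 1000 ms backoff is just an illustrative value.

```java
// Simplified worst-case model; names are illustrative only.
class DeliveryTimeline {

    // Counts the attempts that can start before deliveryTimeoutMs elapses,
    // assuming every attempt runs for the full requestTimeoutMs and is
    // followed by a fixed retryBackoffMs wait.
    static int attemptsStarted(long deliveryTimeoutMs, long requestTimeoutMs, long retryBackoffMs) {
        if (requestTimeoutMs + retryBackoffMs <= 0) {
            throw new IllegalArgumentException("each attempt must take some time");
        }
        int attempts = 0;
        long elapsedMs = 0;
        while (elapsedMs < deliveryTimeoutMs) {
            attempts++;                                   // an attempt starts at elapsedMs
            elapsedMs += requestTimeoutMs + retryBackoffMs;
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Attempts start at 0, 31000, 62000 and 93000 ms; the next one
        // would start at 124000 ms, after the 120000 ms deadline.
        System.out.println(attemptsStarted(120_000L, 30_000L, 1_000L)); // prints 4
    }
}
```

In this model the delivery timeout, not a retry count, is what ends the send: only a handful of 30-second requests fit into the 2-minute window.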

Conclusion

I hope this tutorial was helpful and made it easier for you to set up Kafka Producer retries in your Spring Boot applications.

If you found this useful and you’re looking to learn more about building Event-Driven Microservices, check out my other tutorials. I’ve got a whole series on using Apache Kafka to create Event-Driven Microservices with Spring Boot that you might find interesting. These guides are designed to help you step by step, just like this one, so you can keep learning and improving your skills.

Happy learning!