Retryable and Not Retryable Exceptions in Apache Kafka

In this tutorial, you will learn about retryable and non-retryable exceptions in Apache Kafka. You’ll learn how to create your own custom exceptions and how to register them with Spring Kafka’s DefaultErrorHandler object.

To demonstrate how retryable and not retryable exceptions work, I will use the Kafka Consumer Spring Boot Microservice, which I created earlier.

If you are interested in video lessons then check my video course Apache Kafka for Event-Driven Spring Boot Microservices.

What are Retryable and Not Retryable Exceptions?

In Apache Kafka, retryable and not retryable exceptions are types of errors that occur when a Kafka consumer tries to read or process messages from a Kafka cluster. Understanding the difference between these two is essential for handling errors effectively in your consumer applications.

Retryable Exceptions

Retryable Exceptions are those types of errors that are potentially temporary and may resolve on their own if the operation is retried. For example, if there is a network glitch causing a temporary loss of connection to the Kafka broker, the consumer might throw a LeaderNotAvailableException. This is considered retryable because retrying the connection after a short delay might succeed as the network issue may resolve by then.

Here are some characteristics of retryable exceptions:

  • They often result from temporary issues, like network failures or resource unavailability.
  • They do not indicate a failure in the logic of the application.
  • The Kafka consumer can often recover from these exceptions by retrying the operation after a delay.

Not Retryable Exceptions

Not Retryable Exceptions are the opposite. They typically indicate a more serious problem that is not likely to be resolved by simply retrying. For instance, if you encounter an InvalidConfigurationException, it implies that there’s something wrong with your consumer’s setup that needs to be addressed and fixed. Retrying the operation without making any changes will only result in the same error being thrown again.

Characteristics of non-retryable exceptions include:

  • They often indicate a permanent issue, such as incorrect configuration or a logic error.
  • Simply retrying the operation without any change is unlikely to be successful.
  • They require a review and correction of the underlying problem before the operation can succeed.

In your Kafka consumer application, distinguishing between these two types of exceptions is important because it helps you decide when to retry an operation and when to take corrective action.
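
As a point of reference, the Kafka client library itself encodes this distinction: its transient errors extend org.apache.kafka.common.errors.RetriableException (note Kafka’s own “Retriable” spelling), while permanent errors do not. The small, self-contained sketch below is only an illustration of that idea and is not part of the microservice we build in this tutorial:

import org.apache.kafka.common.errors.LeaderNotAvailableException;
import org.apache.kafka.common.errors.RetriableException;

public class RetryCheckExample {

    public static void main(String[] args) {
        // LeaderNotAvailableException extends RetriableException, so Kafka treats it as transient.
        Exception exception = new LeaderNotAvailableException("leader not available yet");

        if (exception instanceof RetriableException) {
            System.out.println("Transient error: retrying after a short delay may succeed.");
        } else {
            System.out.println("Permanent error: fix the configuration or code before retrying.");
        }
    }
}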

In this tutorial, you will learn how to create your own custom exceptions and how to register them as retryable or not retryable exception types.

Creating Custom Retryable and Not Retryable Exceptions

To give you more control over how your consumer handles errors, you can create custom exceptions. Let’s create two types of exceptions: one that’s retryable and one that’s not.

Creating a Custom Retryable Exception

First up is the retryable exception. This is the kind you want your consumer to attempt to process again, under the belief that the issue may resolve itself. For example, a network hiccup is usually transient and can often be overcome by retrying the connection.

Let’s begin by creating a new class called RetryableException.

public class RetryableException extends RuntimeException {
    public RetryableException(Exception exception) {
        super(exception);
    }

    public RetryableException(String message) {
        super(message);
    }
}

In this class, I extend RuntimeException because I want my custom exception to be unchecked, meaning it doesn’t need to be declared or handled explicitly. I provide two constructors:

  1. One that accepts another exception. This is useful when you want to wrap an existing exception that you’ve caught, but still mark it as retryable.
  2. Another that accepts a String message, allowing you to throw a new exception with a custom error message.
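
To illustrate both constructors, here is a short, hypothetical sketch; the class name, method names, and request URL are made up for this example and are not part of the original microservice:

import org.springframework.web.client.ResourceAccessException;
import org.springframework.web.client.RestTemplate;

public class RetryableExceptionUsageExample {

    private final RestTemplate restTemplate = new RestTemplate();

    // Constructor 1: wrap a caught exception, preserving its stack trace, and mark it as retryable.
    public String fetchRemoteData(String requestUrl) {
        try {
            return restTemplate.getForObject(requestUrl, String.class);
        } catch (ResourceAccessException ex) {
            throw new RetryableException(ex);
        }
    }

    // Constructor 2: signal a retryable condition directly with a custom message.
    public void requireConnection(boolean connected) {
        if (!connected) {
            throw new RetryableException("Connection temporarily lost, please retry.");
        }
    }
}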

Creating a Custom Not Retryable Exception

Next, I will create an exception for when retrying won’t help—this is the non-retryable exception. When you throw this type of exception, you’re signaling that the issue requires attention before the consumer can proceed.

Here’s how you create the NotRetryableException class:

public class NotRetryableException extends RuntimeException {
    public NotRetryableException(Exception exception) {
        super(exception);
    }

    public NotRetryableException(String message) {
        super(message);
    }
}

Registering Custom Retryable and Not Retryable Exceptions

Now that we have our custom exceptions, let’s integrate them into our Kafka consumer’s error handling. This is where the DefaultErrorHandler comes into play. In Spring Kafka, the DefaultErrorHandler is responsible for managing how exceptions are handled during the consumption of messages.

Understanding the DefaultErrorHandler

The DefaultErrorHandler is a configurable component in the Spring Kafka framework that dictates what happens when an exception is thrown while processing a message. If the exception is retryable, the handler can attempt to process the message again. If it’s not retryable, or if all retry attempts fail, the message can be forwarded to a dead-letter topic or logged for further investigation.

The DefaultErrorHandler uses a BackOff mechanism to determine the interval between retries and how many attempts to make. This is crucial to prevent a flood of retries in rapid succession, which can overwhelm your system.

You will see how to set it up in the code example below.

Configuring the DefaultErrorHandler

Let’s look at how to configure the DefaultErrorHandler using Spring’s Java configuration. We’ll register our custom exceptions to inform the handler which ones should trigger a retry and which should not.

Here’s the code snippet:

@Bean
ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory(
        ConsumerFactory<String, Object> consumerFactory, KafkaTemplate<String, Object> kafkaTemplate) {
    
    // Configure the errorHandler with a DeadLetterPublishingRecoverer and a fixed back-off policy.
    DefaultErrorHandler errorHandler = new DefaultErrorHandler(
        new DeadLetterPublishingRecoverer(kafkaTemplate),
        new FixedBackOff(5000, 3)); // 5-second delay between retries, at most 3 retry attempts

    // Register our custom exceptions with the errorHandler.
    errorHandler.addNotRetryableExceptions(NotRetryableException.class);
    errorHandler.addRetryableExceptions(RetryableException.class);

    // Create a container factory and set the custom errorHandler.
    ConcurrentKafkaListenerContainerFactory<String, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setCommonErrorHandler(errorHandler);
    
    return factory;
}

In the configuration above:

  • new DeadLetterPublishingRecoverer(kafkaTemplate) is used to publish failed messages to a dead-letter topic after retries are exhausted. To learn more about dead-letter topics, read “Kafka Consumer: Send Message to a Dead Letter Topic”.
  • new FixedBackOff(5000, 3) sets the back-off policy with a 5000-millisecond (5-second) interval between retries and a maximum of 3 retry attempts. To learn more about retries, read “Kafka Producer Retries in Spring Boot Microservice”.
  • The method addNotRetryableExceptions(Class<? extends Exception>... exceptionTypes) is used to tell the handler which exceptions are non-retryable. When such an exception is encountered, the handler will not attempt to retry and will perform the recovery action immediately.
  • On the flip side, addRetryableExceptions(Class<? extends Exception>... exceptionTypes) informs the handler of exceptions that are considered retryable. If one of these exceptions occurs, the handler will follow the back-off policy we set up before considering the message as failed and moving it to the dead-letter topic or performing any other recovery action.
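
If a fixed delay between retries does not fit your needs, Spring Kafka also provides an exponential back-off. The snippet below is an alternative sketch of the errorHandler configuration using ExponentialBackOffWithMaxRetries (from the org.springframework.kafka.support package); the specific intervals and retry count are example values only, and kafkaTemplate refers to the same bean parameter as above:

// Start with a 1-second delay, double it on every retry, cap it at 10 seconds, allow at most 4 retries.
ExponentialBackOffWithMaxRetries backOff = new ExponentialBackOffWithMaxRetries(4);
backOff.setInitialInterval(1000);
backOff.setMultiplier(2.0);
backOff.setMaxInterval(10000);

DefaultErrorHandler errorHandler = new DefaultErrorHandler(
        new DeadLetterPublishingRecoverer(kafkaTemplate),
        backOff);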

In the next section, we’ll see how to actually throw a RetryableException and manage the retry process in action. This will give you a complete picture of how exception handling works in a real-world Kafka consumer application.

Throwing a RetryableException or a NotRetryableException

In this section, I’m going to explain how you can throw a RetryableException or a NotRetryableException, depending on the kind of issue you encounter while processing messages from Kafka.

Processing Kafka Messages

When your Kafka consumer receives a message, Spring will invoke a method annotated with @KafkaHandler inside your listener class. This is the method you’ll use to process the message. Let me show you how to do this with a ProductCreatedEvent message.

Here’s the Java method for handling the Kafka message:

@KafkaHandler
public void handle(ProductCreatedEvent productCreatedEvent) {
    LOGGER.info("I received a new event: " + productCreatedEvent.getTitle() + " with productId: "
            + productCreatedEvent.getProductId());

    String requestUrl = "http://localhost:8082/response/200";

    try {
        ResponseEntity<String> response = restTemplate.exchange(requestUrl, HttpMethod.GET, null, String.class);

        if (response.getStatusCode().value() == HttpStatus.OK.value()) {
            LOGGER.info("I received a response from a remote service: " + response.getBody());
        }
    } catch (ResourceAccessException ex) {
        LOGGER.error("There was an error: " + ex.getMessage());
        throw new RetryableException(ex);
    } catch (HttpServerErrorException ex) {
        LOGGER.error("There was an error: " + ex.getMessage());
        throw new NotRetryableException(ex);
    } catch (Exception ex) {
        LOGGER.error("An unexpected error occurred: " + ex.getMessage());
        throw new NotRetryableException(ex);
    }
}
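
Note that @KafkaHandler methods are only picked up when the enclosing class is annotated with @KafkaListener at the class level. The sketch below shows one possible shape of that enclosing class; the class name, topic name, and constructor wiring are assumptions for illustration and may differ from your own consumer:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.KafkaHandler;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
@KafkaListener(topics = "product-created-events-topic") // topic name is an example
public class ProductCreatedEventHandler {

    private static final Logger LOGGER = LoggerFactory.getLogger(ProductCreatedEventHandler.class);

    private final RestTemplate restTemplate;

    public ProductCreatedEventHandler(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    // The handle(ProductCreatedEvent) method shown above lives here.
}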

Deciding When to Throw a RetryableException

In the try block, I am making an HTTP request to another microservice. If there’s a network issue or the remote service is temporarily down, a ResourceAccessException might be thrown. When I catch this exception, I log the error message and then throw a RetryableException. This lets my Kafka consumer know that it should try to process the message again.

If you catch an HttpServerErrorException, it means there’s a problem on the server side that’s likely not going to be fixed by a simple retry. Here, you can log the issue and throw a NotRetryableException, signalling that it’s not worth retrying.

For all other types of exceptions, which are unexpected, you can also throw a NotRetryableException.

Here’s what happens if you throw a NotRetryableException:

  1. Immediate Handling: Since NotRetryableException is registered with DefaultErrorHandler as a non-retryable exception, the error handler will not use the retry logic that it would typically apply for retryable exceptions.
  2. Recovery Logic: The DefaultErrorHandler is configured with a DeadLetterPublishingRecoverer. This recoverer comes into action immediately after a non-retryable exception is caught. Instead of retrying, it will publish the failed message to a dead-letter topic.
  3. No Retries: No retries are attempted for the message. The message is moved to the dead-letter topic straight away. This is different from a retryable exception, where the message would be retried a few times (as per the back-off policy) before being moved to the dead-letter topic.
  4. Manual Intervention: Typically, messages in the dead-letter topic are investigated to understand why they failed and to determine if the issue can be resolved. Depending on the nature of the problem, you might fix an issue in the system or with the message itself and then reprocess the message manually.
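
For completeness, DeadLetterPublishingRecoverer publishes failed records, by default, to a topic named after the original one with a “.DLT” suffix. A minimal, hypothetical sketch of a listener that reads those dead-lettered messages for investigation could look like this (the topic name, group id, and class name are assumptions, not code from the original microservice):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class DeadLetterTopicHandler {

    private static final Logger LOGGER = LoggerFactory.getLogger(DeadLetterTopicHandler.class);

    // Reads from the ".DLT" topic that DeadLetterPublishingRecoverer writes to by default.
    @KafkaListener(topics = "product-created-events-topic.DLT", groupId = "dlt-investigation")
    public void handleDeadLetter(ProductCreatedEvent event) {
        // Log the failed event so it can be investigated and, if appropriate, reprocessed manually.
        LOGGER.warn("Dead-lettered event received. productId: {}", event.getProductId());
    }
}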

Conclusion

I hope this tutorial has been helpful for you. To deepen your knowledge and learn more about how Apache Kafka can be used to build Event-Driven Spring Boot Microservices, I invite you to explore my other tutorials on Apache Kafka. They’re designed to give you practical skills and insights that you can apply to your projects.

If you are interested in video lessons then check my video course Apache Kafka for Event-Driven Spring Boot Microservices.