We tried following the exception handling guidance in the spring-cloud-stream documentation (Preface), but it is not working as expected. I have attached our Spring application YAML for your review in case something needs to be corrected.
Do you have a sample project we can refer to?
Hi @Karthikaiselvan,
Is your handler imperative or reactive? I tested the settings out with an imperative function as documented in the reference guide and they seem to work for me w/ the Solace binder. Note that I also set `logging.level.org.springframework.retry: DEBUG` so I could see that the retry library was indeed using my values.
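For reference, a minimal sketch of the kind of consumer retry settings I tested (the binding name, destination, and group here are hypothetical; the property names are standard Spring Cloud Stream consumer properties):

```yaml
spring:
  cloud:
    stream:
      bindings:
        myConsumer-in-0:              # hypothetical binding name
          destination: my/topic
          group: myGroup
          consumer:
            maxAttempts: 5            # total processing attempts
            backOffInitialInterval: 1000
            backOffMultiplier: 2.0
            backOffMaxInterval: 10000

# turn on retry debug logging to confirm the values are being picked up
logging:
  level:
    org.springframework.retry: DEBUG
```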
Note that we’re currently adding some error handling enhancements in v3 of the SCSt binder, and I plan on providing some best practices and examples of consumer error handling w/ SCSt and the Solace binder once that is available.
Hi @Karthikaiselvan,
Thanks for the clarification. Per the Cloud Stream docs, when using reactive functions the framework relies on Reactor’s retryBackoff capabilities for retries, which only appear to honor maxAttempts, backOffInitialInterval, and backOffMaxInterval. The other settings apply only to the RetryTemplate used with imperative functions. I’ll reach out to a dev at Spring and see if they have a good example we can point to.
As a word of warning, you should only use Spring Cloud Stream w/ reactive functions when message loss can be tolerated. At the framework level, messages are handed off to the Mono/Flux and ACKed immediately, so if your app were to crash the message would already have been acknowledged back to the broker and removed from the queue. For this reason you’ll see most of our examples/guidance using imperative functions.
Hi @Karthikaiselvan,
I spoke to the Cloud Stream project lead at Spring and he confirmed my suspicion that the SCSt Reference Guide is incorrect: those retry settings do not work with reactive functions. He opened this issue to fix the reference guide. Basically, Cloud Stream hands the messages to the Flux, and you would need to handle the retry logic inside the Flux using Reactor’s retry capabilities. I believe this might be a good place to start while they update the docs.
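To illustrate, here’s a hedged sketch of what handling retries inside the Flux might look like using Reactor’s own retry operators. The class, method, and backoff values are all hypothetical; with Spring Cloud Stream the function would be registered as a `@Bean`, but it’s shown as a plain method here so the retry logic stands on its own:

```java
import java.time.Duration;
import java.util.function.Function;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.util.retry.Retry;

public class ReactiveRetryExample {

    // Cloud Stream hands the message stream to this Flux; any retry must be
    // done here with Reactor operators, not the framework's RetryTemplate.
    public static Function<Flux<String>, Mono<Void>> myReactiveConsumer() {
        return flux -> flux
            .concatMap(payload -> Mono.fromRunnable(() -> process(payload))
                // up to 3 retries with exponential backoff, capped at 10s
                .retryWhen(Retry.backoff(3, Duration.ofSeconds(1))
                                .maxBackoff(Duration.ofSeconds(10)))
                // once retries are exhausted, drop (or log/route) the message
                // so one poison message doesn't terminate the whole stream
                .onErrorResume(e -> Mono.empty()))
            .then();
    }

    private static void process(String payload) {
        // business logic; throwing here triggers the retry above
    }
}
```

Keep in mind this only retries within the app; per the warning above, the message has already been ACKed by the time your Flux sees it.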
Hi @Karthikaiselvan, I just wanted to let you know that the team over at Spring added the docs and closed the issue so hopefully this will be more clear going forward.
@marc said:
Hi @Karthikaiselvan,
Thanks for the clarification. Per the Cloud Stream docs, when using reactive functions the framework relies on Reactor’s retryBackoff capabilities for retries, which only appear to honor maxAttempts, backOffInitialInterval, and backOffMaxInterval. The other settings apply only to the RetryTemplate used with imperative functions. I’ll reach out to a dev at Spring and see if they have a good example we can point to.
As a word of warning, you should only use Spring Cloud Stream w/ reactive functions when message loss can be tolerated. At the framework level, messages are handed off to the Mono/Flux and ACKed immediately, so if your app were to crash the message would already have been acknowledged back to the broker and removed from the queue. For this reason you’ll see most of our examples/guidance using imperative functions.
If message loss can’t be tolerated, is there a way to handle it with reactive functions, or is the only option to use imperative functions?
Hi @Karthikaiselvan, the only option would be imperative functions. Note that this is out of the Solace binder’s control and happens at the framework level.
Note that we’re currently adding some error handling enhancements in v3 of the SCSt binder, and I plan on providing some best practices and examples of consumer error handling w/ SCSt and the Solace binder once that is available. The general guidance is:
Reject messages that your app has no clue what to do with and where retrying wouldn’t help, for example when vital information is missing or the message isn’t parseable. These are routed to the error queue for further processing by another app, or possibly manual intervention.
Requeue messages that you think would be processed successfully if retried. This is usually due to infrastructure issues, for example a downstream service not responding or your microservice instance being unable to reach the database. Once max-retries or the TTL on a message is reached, it falls to the DMQ for further troubleshooting.
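As a rough sketch of the reject-to-error-queue side, configuration might look like the following. The `autoBindErrorQueue` property comes from the Solace binder’s consumer properties; the binding name is hypothetical, so verify the exact property names against your binder version:

```yaml
spring:
  cloud:
    stream:
      bindings:
        myConsumer-in-0:            # hypothetical binding name
          destination: my/topic
          group: myGroup
          consumer:
            maxAttempts: 3          # framework-level retries before giving up
      solace:
        bindings:
          myConsumer-in-0:
            consumer:
              # provision an error queue and route rejected messages to it
              autoBindErrorQueue: true
```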