I have a general question regarding the difference between a retransmission and a redelivery when consuming messages from a queue. Sorry for the long text, I just wanted to make clear what I am trying to achieve / understand.
We have a customer where I can see a lot of retransmissions happening from time to time.
Once he restarts his application we see a (longer) period of message redeliveries, which takes another application restart to resolve before the application successfully consumes the messages again.
Those retransmissions seem to happen randomly. As I have no insight into their application and have to trust that “there are no errors in our logs, we just see that messages pile up in the queue”, I wanted to figure out what causes a retransmission of a message in the first place to better understand the issue.
As I understand it, the client API prefetches messages from a queue to “local”, acknowledges them as “delivered” back to the broker, and hands them to the client application, which acknowledges them as consumed (if successful). I tried to set up a small testing scenario with a client which simply does not acknowledge the messages.
They all get transferred and I can see them as “Unacknowledged Messages” in the consumer view of the queue. But now, no matter what I try, everything just causes a redelivery, never a retransmission.
I tried to:
Disconnect the client forcefully on client side (CTRL+C)
Disconnect the client forcefully on broker side
Add iptables rules to interrupt network traffic from the client towards the broker
Close the context object (I use Apache Qpid 2.0 JMS AMQP)
Some of those cause timeout disconnects, some of them disconnect me right away, but every time, once I reconnect, I just get all messages sent again and the redelivery count increases.
So my question(s) would be:
What causes a “Message Transport Retransmitted” to happen, compared to a redelivery?
How can I simulate / force one ?
I thought it might happen when the client API does not acknowledge receipt of the messages back to the broker quickly enough, but I am not sure how to interrupt this.
Maybe someone here can help me.
Best Regards,
Jan-Filip.
You are correct that message-retransmission indicates the message was not acknowledged at the network/transport level by the API. The timeout for this is 2 seconds and it would be very difficult to simulate or cause this to happen. The connection between the API and the broker is TCP, which is reliable, and everything in the API is designed to make sure those acknowledgments are sent in time.
Filtering packets with iptables is likely to cause either the API or the broker to detect a network failure and reconnect, which will then lead to redelivery instead of retransmission.
Operationally, when this is seen, it tends to indicate a very degraded network. If the end-to-end transmission time exceeds 2 seconds you might see this. Or, if a huge number of TCP retransmissions are occurring due to network errors, this might degrade the network enough to cause it. In many of these cases though, the API or broker will detect this and disconnect the socket anyway, leading to the reconnect scenario again.
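To make the distinction concrete, here is a toy model of the two outcomes described above. It is only a sketch of the reasoning, not broker or API code: the class, enum, and method names are hypothetical, and the only value taken from this thread is the 2-second transport acknowledgment timeout. It assumes that a lost connection always ends in redelivery on reconnect, and that a retransmission can only be observed while the connection stays up.

```java
import java.time.Duration;

// Toy model only: all names here are hypothetical, not part of any broker or client API.
final class DeliveryOutcomeSketch {

    enum Outcome { DELIVERED, RETRANSMITTED, REDELIVERED }

    // Transport-level acknowledgment timeout, as stated in the reply above.
    static final Duration TRANSPORT_ACK_TIMEOUT = Duration.ofSeconds(2);

    /**
     * Classifies what an operator would likely observe for one message.
     *
     * @param connectionLostBeforeAppAck true if the connection dropped before the
     *                                   application acknowledged the message
     * @param transportAckDelay          time until the API's transport-level
     *                                   acknowledgment reached the broker
     */
    static Outcome classify(boolean connectionLostBeforeAppAck, Duration transportAckDelay) {
        // A detected failure and reconnect leads to redelivery, however slow the link was.
        if (connectionLostBeforeAppAck) {
            return Outcome.REDELIVERED;
        }
        // Connection still up, but the transport-level ack arrived after the timeout.
        if (transportAckDelay.compareTo(TRANSPORT_ACK_TIMEOUT) > 0) {
            return Outcome.RETRANSMITTED;
        }
        // Healthy case: the message now only waits for the application acknowledgment.
        return Outcome.DELIVERED;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, Duration.ofMillis(50))); // healthy network
        System.out.println(classify(false, Duration.ofSeconds(3))); // degraded, still connected
        System.out.println(classify(true, Duration.ofSeconds(3)));  // failure detected, reconnect
    }
}
```

This is why the tests in the question always ended in redelivery: each of them broke the connection, so the first branch applied before a slow acknowledgment could ever be observed.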