Solace Java apps, shutdown hooks and deadlocks

Aaron
edited September 5 in Tips and Tricks #1

I build a lot of JCSMP apps, and in my latest project (my PrettyDump console pretty-print message listener), I ran into an issue I thought was pretty interesting and thought I'd share here.

It's a terminal app, and I wanted to use a shutdown hook so that it would capture Ctrl+C (SIGINT) on the command line to initiate a graceful shutdown. This also applies to any app running inside Kubernetes, where an auto-scaler or something wants to terminate your app by sending it a SIGTERM; the JVM runs its shutdown hooks for both signals. So I made a shutdown hook inside my main(String[]) method:

final Thread shutdownThread = new Thread(() -> {
    System.out.println("\nShutdown hook triggered, quitting...");
    config.isShutdown = true;
    if (flowQueueReceiver != null) flowQueueReceiver.close();  // will remove the temp queue if required
    if (directConsumer != null) directConsumer.close();  // shutdown Direct consumer if configured
    try {
        Thread.sleep(200);
        session.closeSession();
        Thread.sleep(300);
    } catch (InterruptedException e) {  // ignore, we're quitting anyway
    }
    logger.info("### PrettyDump finishing!");
    System.out.println("Goodbye! 👋🏼");
});
shutdownThread.setName("Shutdown Hook thread");
Runtime.getRuntime().addShutdownHook(shutdownThread);

Looks about right, right? Catch the Ctrl+C, print out some helper text, initiate a graceful shutdown by closing my FlowReceiver (the "bind" to the queue I'm receiving messages from) and/or close my Direct subscriber/consumer object, disconnect the Session, and quit.

The only problem is: if I've temporarily lost my connection to the broker (e.g. some network issue, or I'm offline, or the Message VPN on the broker got shut down) and I'm in a reconnect loop, the calls flowQueueReceiver.close() and directConsumer.close() are blocking calls to the broker!! So my shutdown hook actually blocks at those two close() calls (lines 4 and 5 of the snippet above) if I'm disconnected and the API is trying to reconnect.

NOW: at first I thought that the API's reconnection thread wasn't configured as daemon, and was therefore preventing the app from quitting, but it is a daemon thread (so that's good!). This is actually just a feature/quirk of how Java shutdown hooks work: the JVM waits for every shutdown hook to finish before it halts, and daemon threads keep running during the shutdown sequence rather than being terminated automatically. So if the shutdown hook makes a blocking call to something, it gets stuck. In my app, as soon as I could reconnect to the broker, the close() calls would run successfully and then it would quit. Haha I had to reconnect to quit..! 😅

To address this, I keep track of my connection status (using the JCSMPReconnectEventHandler or the SessionEventHandler) with a volatile boolean variable, and check that I'm currently connected before trying to close() those objects:

if (config.isConnected) {  // if we're disconnected, skip this because these will block/lock waiting on the reconnect to happen
    if (flowQueueReceiver != null) flowQueueReceiver.close();  // will remove the temp queue if required
    if (directConsumer != null) directConsumer.close();
}
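For anyone wondering what the connection tracking behind that check might look like, here is a minimal sketch. It deliberately does not depend on the JCSMP library: ConnectionTracker, isConnected() and closeIfConnected() are hypothetical names, and preReconnect()/postReconnect() only mirror the callback names of JCSMPReconnectEventHandler. In a real app you would wire the same logic into that handler (or a SessionEventHandler). It also assumes the tracker is created once the session is already connected.

```java
// Hypothetical helper (not part of JCSMP): mirrors the reconnect callback
// names so the same logic can be moved into a real JCSMPReconnectEventHandler.
final class ConnectionTracker {
    // volatile so the shutdown hook thread sees updates made by the API's reconnect thread
    private volatile boolean connected = true;  // assumption: created after a successful connect

    // Intended to be called when the API starts trying to reconnect (connection lost).
    boolean preReconnect() {
        connected = false;
        return true;  // assumption: true means "carry on with the reconnect attempt"
    }

    // Intended to be called once the connection has been re-established.
    void postReconnect() {
        connected = true;
    }

    boolean isConnected() {
        return connected;
    }

    // Runs the (possibly blocking) close action only while connected.
    // Returns true if the action was run, false if it was skipped.
    boolean closeIfConnected(Runnable closeAction) {
        if (!connected) {
            return false;  // disconnected: skip, the call would block until reconnect
        }
        closeAction.run();
        return true;
    }
}
```

Inside the shutdown hook this would be used along the lines of tracker.closeIfConnected(() -> flowQueueReceiver.close()). Note there is still a small window where the connection can drop between the check and the close() call, so this narrows the problem rather than eliminating it.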

Luckily session.closeSession() is non-blocking, so there's no issue with that one.
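A different technique from the connection check above, offered only as a sketch and not something PrettyDump does: put an upper bound on how long the shutdown hook waits for any blocking close. The idea is to run the blocking call on a separate daemon thread and join it with a timeout. BoundedClose and closeWithTimeout are hypothetical names, and this uses only the plain JDK, nothing Solace-specific.

```java
// Hypothetical helper: bounds how long a shutdown hook waits on a blocking call.
final class BoundedClose {
    private BoundedClose() {}

    // Runs closeAction on a daemon thread and waits at most timeoutMs for it.
    // Returns true if the action finished in time, false if we gave up waiting.
    static boolean closeWithTimeout(Runnable closeAction, long timeoutMs) {
        if (timeoutMs <= 0) {
            // Thread.join(0) would wait forever, which is exactly what we want to avoid
            throw new IllegalArgumentException("timeoutMs must be positive");
        }
        Thread worker = new Thread(closeAction, "bounded-close");
        worker.setDaemon(true);
        worker.start();
        try {
            worker.join(timeoutMs);  // returns early if the action completes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt status
        }
        return !worker.isAlive();
    }
}
```

In the hook, something like BoundedClose.closeWithTimeout(() -> flowQueueReceiver.close(), 500) would let the hook move on after half a second even if the close is still stuck. Once every shutdown hook has returned the JVM halts, so the abandoned worker thread does not keep the process alive. The trade-off is that a close which was skipped this way may leave a temp queue or subscription for the broker to clean up.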

Anyway! Hope this might help somebody in the future… just a weird situation you might not run into unless you're doing failover testing. Obviously you can still force-quit the app with kill -9 (i.e. a SIGKILL signal), and that's exactly what Kubernetes does to a container if it hasn't exited within the termination grace period after receiving a SIGTERM, but this little hack makes for a nicer/cleaner experience.
