Handling failures of ISession.Connect with .NET API

Kiwidude · August 2020

There is a comment in the ISession documentation for .NET of "Connect and disconnect on demand" but no more detail on that.

Currently our app has:

Code at service startup to initialise Solace, create session, call ISession.Connect()
Additional threads utilising that ISession to create guaranteed message flows etc.

The problem we are seeing right now is that ISession.Connect() is failing (which is expected, because some infrastructure is not yet ready). What is not clear to me is what we should be doing about that. At the moment we just log the failure and continue, so our additional application threads each make calls like ISession.CreateFlow() and of course also fail with OperationErrorException. They each have their own retry logic, so every x interval they try again to call ISession.CreateFlow().

My questions are:
1. If the connectivity magically becomes possible (i.e. our infrastructure issues are sorted), will that repeated ISession.CreateFlow() call be sufficient to initialise the session connection, without another attempted call to ISession.Connect()?
2. If that is the case then is ISession.Connect() really an "optional" call that offers the early ability to do a health check but it isn't necessary to block any application code from continuing to "try" to use that session?

Basically I am trying to determine what sort of "recovery" mechanism we need to have in place in our codebase to retain the integrity of our application, without having to restart our services if Solace connectivity happens to be down for a moment in time. If the Solace API is for the most part "self-recovering" (throwing exceptions while connectivity is down but reconnecting internally when possible, so that subsequent API calls work), then all we need is retry logic when we see those exceptions. If instead the ISession instance needs disposing and recreating until you get a valid connection, then that requires a whole different approach.

Aaron · August 2020

Hey @Kiwidude ! Ok, is this the documentation you're talking about, the intro paragraph? https://docs.solace.com/API-Developer-Online-Ref-Documentation/net/html/d0313c0c-596e-ddb3-0467-7ae2039bfb86.htm or something else?

I think what it's trying to say is that this ISession object allows users (applications) to initiate connect() and disconnect() attempts on demand, create Flows, and such.

Now, full disclosure, I'm much more familiar with the Java API JCSMP, but a lot of it is still the same. So first off, your Flows in other threads will never get created if the underlying session is not established first... so your "outer" / parent / whatever object needs to get that Session established. There are session properties for determining how long your connect attempts should be (and reconnect, but that's handled separately). https://docs.solace.com/API-Developer-Online-Ref-Documentation/net/html/82816aab-350c-a890-cc35-ac125b35421c.htm

Now, you could have the API try to connect indefinitely, but I'd rather have the API try once or twice, fail, have control pass back to the application, where you log, and then retry. Or back off and retry. It puts the control in the hands of your app, rather than just letting the API try to connect indefinitely (or for a long time). Anyhow, I took a quick browse and couldn't see an "is connected" type method for the C# API, so you should probably make a Boolean var of some sort isConnected that you set to true when the API successfully connects, or you get the UpNotice session event. This var comes in handy later if the API experiences a temporary disconnect (e.g. network flap, broker HA failover) and the API automatically reconnects for you, but you could set this Boolean var to false while the reconnection is happening.
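As a rough illustration of the "try, fail, hand control back, log, back off and retry" idea above, here is a minimal sketch. `ConnectRetry` and `WithBackoff` are hypothetical names, not part of the Solace API, and the connect attempt is passed in as a delegate so the sketch runs without a broker. In a real application the delegate would presumably wrap `ISession.Connect()` and check its return code.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the Solace API.
public static class ConnectRetry
{
    // Calls tryConnect up to maxAttempts times, doubling the wait between
    // attempts up to maxDelay. Returns true as soon as one attempt succeeds.
    // The optional sleep delegate exists so the waits can be replaced in tests.
    public static bool WithBackoff(Func<bool> tryConnect, int maxAttempts,
                                   TimeSpan initialDelay, TimeSpan maxDelay,
                                   Action<TimeSpan> sleep = null)
    {
        sleep = sleep ?? (d => Thread.Sleep(d));
        TimeSpan delay = initialDelay;

        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            bool ok;
            try
            {
                ok = tryConnect();
            }
            catch (Exception ex)
            {
                // Control is back with the application: log and carry on.
                Console.Error.WriteLine($"Connect attempt {attempt} failed: {ex.Message}");
                ok = false;
            }

            if (ok) return true;

            if (attempt < maxAttempts)
            {
                sleep(delay);
                delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
            }
        }
        return false;
    }
}
```

The caller would set its own isConnected flag when this returns true, and decide for itself what to do (log, alert, try again later) when it returns false.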

Anyhow, so short answer is no: your Flows will never be successfully created until your Session is established. Best to check that Boolean var before attempting to create them. HOWEVER, once it is established, the API will automatically try to reconnect if the connection goes away, and once it does reconnect, the Flows will be reestablished and the application doesn't have to do anything. Make sure that your reconnection Session properties are set up to keep retrying for long enough.

Solace has software brokers in a bunch of different form factors (Docker, VMs, etc.) that you could download and run locally, and experiment with shutting them down, or just closing the messaging port to simulate outages. Let us know if you'd like some tips on how you could test these various connection / reconnection scenarios locally with your own infrastructure, rather than waiting for it to become available. https://solace.com/downloads/

Thanks!!

Kiwidude · August 2020

Hi @Aaron, thanks very much for the detailed reply. Yes that was exactly the page I was referring to.

All of what you said makes sense... except for the bit about adding my own boolean to keep track of the session connected state. I 100% agree I "could" do that, except it just seemed very odd that ISession did not already have an IsConnected property itself if this was the recommended pattern for dealing with connection failures. If every downstream user of the API has to maintain their own state variable listening to the UpNotice etc then surely it would instead make sense to have a property exposed automatically by the API on ISession? It was so obvious an omission that it made me question whether actually the reason was that the ISession didn't "need" to maintain that state and that each call to CreateFlow etc would attempt to "reconnect" internally if needed, based on that page I referred to above.

I do have a Solace PubSub+ Cloud test instance set up on AWS. Would I be able to use that to simulate the connection not being available? I've been hunting through the screens on that trying to find a way I could effectively temporarily take it "offline" but no success so far, so if you have a pointer for that it would be most appreciated. Would love to be able to test whatever solution we implement at will.

Aaron · August 2020

Yup, I kind of agree. I think it's just that for the majority of messaging applications, Solace ones included, the very first thing the app does when it starts up is connect. So if it can't connect initially, that (usually) indicates some failure, and in a normal situation the Session would almost always be connected. Then, during a network outage or HA failover when the API is attempting to reconnect, that temporary disconnection is kind of hidden from the application. Only once all the reconnection attempts are exhausted does the Session declare itself as "DOWN". However, for GUIs and just for more detailed state management, I watch for the "RECONNECTING" session event to indicate I've lost the connection, and then "RECONNECTED" to say it's back up.
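To make the event-watching idea concrete, here is a small sketch of a tracker that turns those session events into an application-level state. The `LinkEvent` and `LinkState` enums and the `LinkStateTracker` class are hypothetical stand-ins invented for this sketch. They mirror the events named above but are not the Solace types, so the session event handler would have to translate the real events into `OnEvent` calls.

```csharp
using System;

// Hypothetical stand-ins for the session events discussed in the thread.
public enum LinkEvent { Up, Reconnecting, Reconnected, Down }
public enum LinkState { Disconnected, Connected, Reconnecting }

// Hypothetical helper, not part of the Solace API.
public sealed class LinkStateTracker
{
    private readonly object _lock = new object();
    private LinkState _state = LinkState.Disconnected;

    // Raised only when the state actually changes (handy for a GUI indicator).
    public event Action<LinkState> StateChanged;

    public LinkState State
    {
        get { lock (_lock) { return _state; } }
    }

    // Feed this from the session event handler.
    public void OnEvent(LinkEvent e)
    {
        LinkState next;
        switch (e)
        {
            case LinkEvent.Up:
            case LinkEvent.Reconnected:
                next = LinkState.Connected;
                break;
            case LinkEvent.Reconnecting:
                next = LinkState.Reconnecting;
                break;
            default:
                next = LinkState.Disconnected;
                break;
        }

        bool changed;
        lock (_lock)
        {
            changed = _state != next;
            _state = next;
        }
        if (changed) StateChanged?.Invoke(next);
    }
}
```

Keeping "Reconnecting" separate from "Disconnected" lets the application tell "the API is still trying on its own" apart from "the API has given up and the app needs to act".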

Either way, I'm going to ask internally if there's an obvious/good reason why there is no "isConnected()". I thought there was in JCSMP (the Java API), but I can't find it. I'll let you know if I hear any decent reason.

BTW, if you are really getting into the Solace API, a good reference that is a bit buried in our documentation is the Solace Messaging API Developer Guide. Check that out.

As for disconnect/reconnect testing in Solace Cloud, and this will also work if you have a broker running locally, you can either shut down the whole Message VPN (kicking off ALL clients), or shut down the Client Username that you're using for that app (which will also kick off any other connected clients using the same username).

Message VPN shutdown

In PubSub+ Manager, click on Message VPN --> Settings --> double-click on "Enabled" to shut it down --> Click Apply
You can also do it programmatically using SEMPv2 (management API)... check https://docs.solace.com/API-Developer-Online-Ref-Documentation/swagger-ui/config/index.html#/msgVpn/updateMsgVpn and the parameter you want is "enabled".
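For the programmatic route, a sketch of building that SEMPv2 call from C# might look like the following. The `SempVpnToggle` class is a hypothetical helper. It assumes the updateMsgVpn operation is a PATCH on `/SEMP/v2/config/msgVpns/{msgVpnName}` with HTTP Basic authentication, and the host, port, VPN name and credentials in the usage note are placeholders to replace with your own management details.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

// Hypothetical helper: builds (but does not send) a SEMPv2 request that
// sets the "enabled" attribute of a Message VPN.
public static class SempVpnToggle
{
    public static HttpRequestMessage BuildRequest(
        Uri sempBase, string vpnName, bool enabled, string user, string password)
    {
        var url = new Uri(sempBase,
            "/SEMP/v2/config/msgVpns/" + Uri.EscapeDataString(vpnName));

        var request = new HttpRequestMessage(new HttpMethod("PATCH"), url)
        {
            // Only the attribute being changed needs to be in the body.
            Content = new StringContent(
                "{\"enabled\":" + (enabled ? "true" : "false") + "}",
                Encoding.UTF8, "application/json")
        };

        // Assumption: the management user authenticates with Basic auth.
        string token = Convert.ToBase64String(
            Encoding.UTF8.GetBytes(user + ":" + password));
        request.Headers.Authorization = new AuthenticationHeaderValue("Basic", token);
        return request;
    }
}
```

To send it, pass the request to `HttpClient.SendAsync`, for example `await new HttpClient().SendAsync(SempVpnToggle.BuildRequest(new Uri("https://broker.example.com:943"), "my-vpn", false, "admin-user", "admin-password"))`, then build a second request with `enabled` set to true to bring the VPN back.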

Client-Username shutdown

Click on Access Control --> Client Username --> tick the box next to your username --> Action (top right) --> Change Status --> disable

Hope that helps!!

Kiwidude · August 2020

Thanks again @Aaron, another very helpful reply. I can see that enabled setting now. There are a lot of screens and tabs, so I'm not surprised I missed that one.

Our use case is that Solace is not the "only" purpose of the hosting service - it is a general feeds service that initialises connectivity to a potential number of in/out types (e.g. file systems, SFTP, MQ etc). The main application cannot work without this service running. However if the Solace feeds were not available (e.g. the client's Solace infrastructure is down for patching/maintenance at the time our application service starts up) it would be inappropriate for us to completely stop our feeds service - instead we want it to sit there allowing the other feed threads to do their thing and for the Solace connectivity to "come to life" with some sort of retry logic.

I had initially hoped that the retry logic would be limited to the places where we call CreateFlow(); however, the creation of the ISession is something that I had been doing at service startup. I need to revisit all that code by the sounds of it. Rather than just handing off the ISession to any thread that asks for it, I need to wrap that with a check for whether the session is currently considered "connected", as well as put the ISession.Connect() call into an async thread with retry during startup. Obviously if your internal talks lead to further thoughts around this they would be appreciated, but thanks to your input I have a way to test this approach anyways.
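One possible shape for that wrapper, sketched under assumptions: `SessionGate<TSession>` is a hypothetical class (not part of the Solace API) that only hands out the session while it is marked as up. `TSession` would be `ISession` in this application, and `MarkUp` / `MarkDown` would be called from the background connect loop and the session event handler.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper, not part of the Solace API.
public sealed class SessionGate<TSession> where TSession : class
{
    private readonly TSession _session;
    private readonly ManualResetEventSlim _up = new ManualResetEventSlim(false);

    public SessionGate(TSession session)
    {
        _session = session ?? throw new ArgumentNullException(nameof(session));
    }

    // Call when the connect succeeds or the session reports it is back up.
    public void MarkUp() => _up.Set();

    // Call when the session reports that it is reconnecting or down.
    public void MarkDown() => _up.Reset();

    // Non-blocking check for worker threads that have their own retry timers.
    public bool TryGet(out TSession session)
    {
        session = _up.IsSet ? _session : null;
        return session != null;
    }

    // Blocking variant: waits up to 'timeout' for the session to come up.
    public bool WaitForSession(TimeSpan timeout, out TSession session)
    {
        session = _up.Wait(timeout) ? _session : null;
        return session != null;
    }
}
```

A feed thread would call `TryGet` (or `WaitForSession`) before attempting CreateFlow() and skip the attempt when it returns false. The state can still change between the check and the call, so the existing exception handling around CreateFlow() remains necessary.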
