Handling failures of ISession.Connect with .NET API

Kiwidude
Kiwidude Member Posts: 7

There is a comment in the ISession documentation for .NET of "Connect and disconnect on demand" but no more detail on that.

Currently our app has:

  • Code at service startup to initialise Solace, create session, call ISession.Connect()
  • Additional threads utilising that ISession to create guaranteed message flows etc.

The problem we are seeing right now is that ISession.Connect() is failing (which is expected, because some infrastructure is not yet ready). What is not clear to me is what we should be doing about that. At the moment we just log it and continue, so our additional application threads each call ISession.CreateFlow() and, of course, also fail with OperationErrorException. They each have their own retry logic, so every x interval they try again to call ISession.CreateFlow().

My questions are:
1. If the connectivity magically becomes possible (i.e. our infrastructure issues are sorted), will that repeated ISession.CreateFlow() call be sufficient to initialise the session connection, without another attempted call to ISession.Connect()?
2. If that is the case then is ISession.Connect() really an "optional" call that offers the early ability to do a health check but it isn't necessary to block any application code from continuing to "try" to use that session?

Basically I am trying to determine what sort of "recovery" mechanism we need in our codebase to retain the integrity of our application, without having to restart our services if Solace connectivity happens to be down for a moment. If the Solace API is for the most part "self-recovering" (it just throws exceptions while connectivity is down, but reconnects internally when possible so that subsequent API calls work), then all we need is retry logic when we see those exceptions. If instead the ISession instance needs disposing and recreating until you get a valid connection, that requires a whole different approach.

Answers

  • Aaron
    Aaron Member, Administrator, Moderator, Employee Posts: 644 admin

    Hey @Kiwidude ! Ok, is this the documentation you're talking about, the intro paragraph? https://docs.solace.com/API-Developer-Online-Ref-Documentation/net/html/d0313c0c-596e-ddb3-0467-7ae2039bfb86.htm or something else?

    I think what it's trying to say is that this ISession object allows users (applications) to initiate connect() and disconnect() attempts on demand, create Flows, and such.

    Now, full disclosure, I'm much more familiar with the Java API JCSMP, but a lot of it is still the same. So first off, your Flows in other threads will never get created if the underlying session is not established first... so your "outer" / parent / whatever object needs to get that Session established. There are session properties for determining how long your connect attempts should be (and reconnect, but that's handled separately). https://docs.solace.com/API-Developer-Online-Ref-Documentation/net/html/82816aab-350c-a890-cc35-ac125b35421c.htm

    Now, you could have the API try to connect indefinitely, but I'd rather have the API try once or twice, fail, and pass control back to the application, where you log and then retry, or back off and retry. That puts control in the hands of your app, rather than letting the API try to connect indefinitely (or for a long time). Anyhow, I took a quick browse and couldn't see an "is connected" type method for the C# API, so you should probably keep a Boolean variable of some sort, e.g. isConnected, that you set to true when the API successfully connects or when you get the UpNotice session event. This variable comes in handy later if the API experiences a temporary disconnect (e.g. network flap, broker HA failover) and automatically reconnects for you: you can set it to false while the reconnection is happening.
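
    A minimal sketch of that "try, fail, hand control back, back off, retry" loop. It does not touch the Solace library: `ConnectRetry`, `tryConnect` and `wait` are hypothetical names introduced here, and `tryConnect` is assumed to wrap the real ISession.Connect() call and report whether it succeeded.

    ```csharp
    using System;

    public static class ConnectRetry
    {
        // Calls tryConnect until it succeeds or maxAttempts is reached.
        // The wait doubles after each failure, capped at maxDelay.
        // Returns the number of attempts used on success, or -1 if all failed.
        public static int ConnectWithBackoff(
            Func<bool> tryConnect,   // hypothetical wrapper around ISession.Connect()
            int maxAttempts,
            TimeSpan initialDelay,
            TimeSpan maxDelay,
            Action<TimeSpan> wait)   // e.g. Thread.Sleep; injectable so tests need not sleep
        {
            TimeSpan delay = initialDelay;
            for (int attempt = 1; attempt <= maxAttempts; attempt++)
            {
                bool connected;
                try
                {
                    connected = tryConnect();
                }
                catch (Exception ex)
                {
                    // An exception counts as a failed attempt: log it and carry on.
                    Console.Error.WriteLine($"Connect attempt {attempt} failed: {ex.Message}");
                    connected = false;
                }

                if (connected) return attempt;

                if (attempt < maxAttempts)
                {
                    wait(delay);
                    delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
                }
            }
            return -1;
        }
    }
    ```

    Keeping the loop in application code means the app decides how to log, how long to back off, and when to give up, instead of leaving all of that to the API's own connect-retry settings.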

    Anyhow, the short answer is no: your Flows will never be successfully created until your Session is established. Best to check that Boolean variable before attempting to create them. HOWEVER, once the Session is established, the API will automatically try to reconnect if the connection goes away, and once it does reconnect, the Flows will be re-established and the application doesn't have to do anything. Make sure that your reconnection Session properties are set up to keep retrying for long enough.
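
    One way the Boolean flag and the "check before creating Flows" guard could look. This is only a sketch: `LinkEvent` is a stand-in enum, not the real Solace session event type, and `ConnectionState` and `TryCreateFlow` are hypothetical names. In a real app the session event callback would translate UpNotice (and whatever down/reconnecting/reconnected events the API raises) into these calls.

    ```csharp
    using System;
    using System.Threading;

    // Stand-in for the session events the API raises; NOT the real Solace enum.
    public enum LinkEvent { Up, Down, Reconnecting, Reconnected }

    public sealed class ConnectionState
    {
        private int _connected; // 0 = not connected, 1 = connected

        // Safe to read from any application thread.
        public bool IsConnected => Volatile.Read(ref _connected) == 1;

        // Call this from the session event callback.
        public void OnEvent(LinkEvent e)
        {
            switch (e)
            {
                case LinkEvent.Up:
                case LinkEvent.Reconnected:
                    Volatile.Write(ref _connected, 1);
                    break;
                case LinkEvent.Down:
                case LinkEvent.Reconnecting:
                    Volatile.Write(ref _connected, 0);
                    break;
            }
        }

        // Runs createFlow only while connected; returns false if it was skipped.
        public bool TryCreateFlow(Action createFlow)
        {
            if (!IsConnected) return false;
            createFlow();
            return true;
        }
    }
    ```

    The flag can still change between the check and the call, so the threads creating Flows would keep their existing exception handling and retry; the guard only avoids attempts that are known to be pointless.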

    Solace has software brokers in a bunch of different form factors (Docker, VMs, etc.) that you could download and run locally, and experiment with shutting them down, or just closing the messaging port to simulate outages. Let us know if you'd like some tips on how you could test these various connection / reconnection scenarios locally with your own infrastructure, rather than waiting for it to become available. https://solace.com/downloads/

    Thanks!!

  • Kiwidude
    Kiwidude Member Posts: 7
    edited August 2020 #4

    Hi @Aaron, thanks very much for the detailed reply. Yes that was exactly the page I was referring to.

    All of what you said makes sense... except for the bit about adding my own boolean to keep track of the session connected state. I 100% agree I "could" do that, except it just seemed very odd that ISession did not already have an IsConnected property itself if this was the recommended pattern for dealing with connection failures. If every downstream user of the API has to maintain their own state variable listening to the UpNotice etc then surely it would instead make sense to have a property exposed automatically by the API on ISession? It was so obvious an omission that it made me question whether actually the reason was that the ISession didn't "need" to maintain that state and that each call to CreateFlow etc would attempt to "reconnect" internally if needed, based on that page I referred to above.

    I do have a Solace PubSub+ Cloud test instance set up on AWS. Would I be able to use that to simulate the connection not being available? I've been hunting through the screens trying to find a way to temporarily take it "offline", but with no success so far, so if you have a pointer for that it would be most appreciated. I would love to be able to test whatever solution we implement at will.

  • Kiwidude
    Kiwidude Member Posts: 7

    Thanks again @Aaron, another very helpful reply. I can see that enabled setting now; there are a lot of screens and tabs, so I'm not surprised I missed that one 🙂

    Our use case is that Solace is not the "only" purpose of the hosting service - it is a general feeds service that initialises connectivity to a potential number of in/out types (e.g. file systems, SFTP, MQ etc). The main application cannot work without this service running. However if the Solace feeds were not available (e.g. the client's Solace infrastructure is down for patching/maintenance at the time our application service starts up) it would be inappropriate for us to completely stop our feeds service - instead we want it to sit there allowing the other feed threads to do their thing and for the Solace connectivity to "come to life" with some sort of retry logic.

    I had initially hoped that the retry logic would be limited to the places where we call CreateFlow(); however, the creation of the ISession is something that I had been doing at service startup. I need to revisit all that code by the sounds of it. Rather than just handing off the ISession to any thread that asks for it, I need to wrap that with a check for whether the session is currently considered "connected", and also move the ISession.Connect() call into an async thread with retry during startup. Obviously, if your internal talks lead to further thoughts around this they would be appreciated, but thanks to your input I have a way to test this approach anyway.
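
    A rough sketch of that plan, assuming nothing about the Solace types: `SessionGateway`, `StartAsync`, `SetConnected` and `TryGetSession` are hypothetical names, `TSession` stands in for ISession, and `tryConnect` is assumed to wrap ISession.Connect() and return true on success.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    public sealed class SessionGateway<TSession> where TSession : class
    {
        private readonly TSession _session;
        private readonly Func<TSession, bool> _tryConnect; // hypothetical connect wrapper
        private volatile bool _connected;

        public SessionGateway(TSession session, Func<TSession, bool> tryConnect)
        {
            _session = session;
            _tryConnect = tryConnect;
        }

        // Keeps trying to connect on a background task so that service startup
        // is not blocked; stops on the first success or when cancelled.
        public Task StartAsync(TimeSpan retryDelay, CancellationToken token)
        {
            return Task.Run(async () =>
            {
                while (!token.IsCancellationRequested)
                {
                    try
                    {
                        if (_tryConnect(_session)) { _connected = true; return; }
                    }
                    catch (Exception ex)
                    {
                        Console.Error.WriteLine($"Connect failed, will retry: {ex.Message}");
                    }
                    await Task.Delay(retryDelay, token);
                }
            }, token);
        }

        // Lets a session event handler flip the flag on later disconnects/reconnects.
        public void SetConnected(bool connected) => _connected = connected;

        // Hands out the session only while it is considered connected.
        public bool TryGetSession(out TSession session)
        {
            session = _connected ? _session : null;
            return session != null;
        }
    }
    ```

    Feed threads would call TryGetSession before creating a Flow and fall back to their existing retry interval when it returns false, so the other feed types keep running while Solace is unavailable.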