Appropriate DEBUG logging switches to log the events that keep message spools in sync in HA

rdesoju
rdesoju Member Posts: 66
edited February 2022 in PubSub+ Event Broker #1

Hi,
I am looking for the specific DEBUG logging switches that show the critical events exchanged between the Primary, Standby and Monitoring nodes to keep the message spools in sync. I would also like to see the events raised when the message spools are detected to be out of sync, and the events raised when the monitoring node plays a role in leader election.

I have already tried REDUNDANCY, AD_REDUN, MSGBUS, MATELINK, etc.

Could someone please suggest the appropriate DEBUG logging switches?

Thanks,
Raghu

Comments

  • marc
    marc Member, Administrator, Moderator, Employee Posts: 972 admin

    Hi @rdesoju,
    A list of events can be found here: https://docs.solace.com/Solace-PubSub-Event-Reference/event_ref_boiler.html
    Based on what you're asking for, I would start by looking at the events under SYSTEM -> SYSTEM_AD and SYSTEM -> SYSTEM_HA. For example, I think you might be interested in SYSTEM_HA_REDUN_GROUP_NODE_JOINED.

    I would also recommend taking a look at the events under VPN -> VPN_AD so you're aware of message spool usage, endpoint usage, etc. at the individual VPN level.

    Hope that helps!
    -Marc

  • TomF
    TomF Member, Employee Posts: 412 Solace Employee

    HA going out of sync isn't a DEBUG-level event, and if you're seeing it, it's time to root cause it. It's important to note that the "leader" isn't elected: which node is primary is a configuration option, and the backup becomes active only if contact with the primary is lost. The redundancy and system_ha events will tell you all about the sequence of events that causes this to happen, for instance the HA links going down due to a network problem, the primary being declared unreachable, and the backup activating.

  • rdesoju
    rdesoju Member Posts: 66

    @TomF When contact with the primary is lost, does the backup become active as soon as it detects that the primary is not reachable? Or does it wait for confirmation from the Monitoring node in order to assert that the primary is truly not reachable?

  • TomF
    TomF Member, Employee Posts: 412 Solace Employee

    @rdesoju The backup waits for confirmation from the monitor that the monitor cannot see the primary. The backup also checks it can't see the primary. Only if both are true can the backup take activity.

    In this case, the primary can see neither the monitor nor the backup. Since the backup is then free to take activity, the primary knows it must give up activity.
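
    To make the two rules above concrete, here is a minimal sketch in Python. It only models the decision logic as described in this thread and is not how the broker implements it; the `Visibility` class, its field names and both function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visibility:
    """Hypothetical snapshot of who can reach whom (True = reachable)."""
    backup_sees_primary: bool
    monitor_sees_primary: bool
    primary_sees_backup: bool
    primary_sees_monitor: bool


def backup_may_take_activity(v: Visibility) -> bool:
    # The backup takes activity only if BOTH the backup and the monitor
    # have lost sight of the primary.
    return (not v.backup_sees_primary) and (not v.monitor_sees_primary)


def primary_must_give_up_activity(v: Visibility) -> bool:
    # A primary that can see neither the monitor nor the backup must assume
    # the backup is free to take activity, so it gives up activity.
    return (not v.primary_sees_monitor) and (not v.primary_sees_backup)


# Primary fully isolated: the backup may activate and the primary steps down.
isolated = Visibility(False, False, False, False)

# Only the link between primary and backup is down; the monitor still sees
# the primary, so the backup stays standby and the primary keeps activity.
mate_link_down = Visibility(False, True, False, True)
```

    With these definitions, `backup_may_take_activity(mate_link_down)` is `False`, which is the point of waiting for the monitor: losing only the link between the primary and the backup is not enough to trigger a failover.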

  • rdesoju
    rdesoju Member Posts: 66

    @TomF, thanks for the clarification.