I followed the steps documented in the HA redundancy section of the docs:
Configuring High-availability (HA) Redundancy Groups
Both the primary and backup nodes are online and I can see redundancy working. The monitoring node, however, is offline, and show redundancy group run on the monitoring node shows all nodes as offline.
I have tried multiple times and the behaviour is consistent; not sure what I am missing.
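Roughly, the group configuration I applied on each node looked like this (reconstructing from the documented steps, so treat it as a sketch; node names and IPs below are placeholders, not my actual values):

solace> enable
solace# configure
solace(configure)# redundancy
solace(configure/redundancy)# group
solace(configure/redundancy/group)# node primary1
solace(configure/redundancy/group/node)# node-type message-routing-node
solace(configure/redundancy/group/node)# connect-via 10.0.0.x
solace(configure/redundancy/group/node)# exit
solace(configure/redundancy/group)# node backup1
solace(configure/redundancy/group/node)# node-type message-routing-node
solace(configure/redundancy/group/node)# connect-via 10.0.0.y
solace(configure/redundancy/group/node)# exit
solace(configure/redundancy/group)# node monitor1
solace(configure/redundancy/group/node)# node-type monitoring-node
solace(configure/redundancy/group/node)# connect-via 10.0.0.z
solace(configure/redundancy/group/node)# exit

The same three node entries were configured identically on all three brokers.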
Thanks
Madhu
Hi @madhu
It’d be helpful to share more details, like what your setup looks like. Is it Kubernetes-based, or are you doing multiple VMs or EC2 instances maybe?
And a few screenshots or log snippets would be great too.
Thanks Arih for the quick reply.
Three machines, all EC2 instances:
Primary
Backup
Monitoring Node
Primary node (screenshot)
Backup node (screenshot)
Debug on monitoring node (screenshot)
Thanks
Madhu
that’s awesome!
can you also share the content of command.log from the monitor node?
also, did you somehow change the hostname or IP address of the monitoring node sometime after setup?
Versions I have tried: 10.2.1.51 and 10.3.0.32, with the AMI from the community standard instances:
aws-marketplace/solace-pubsub-standard-10.3.0.32-amzn2-15.2.0-ac3bbfe4-a7d2-4591-bbc5-f43908c43764
Thanks
Madhu
noted, can you show us the output of this command from the monitor node as well as from one of the primary/backup nodes?
solace> show ip vrf management
hmm, that looks correct.
can we go back to command.log and show the full content from the beginning?
that looks good as well… I’m running out of ideas
from the monitor node linux shell, are you able to ping 10.0.0.126 and 10.0.0.128?
Yes, they are reachable. Are there any specific ports beyond what the documentation provides…? Or does this offline status for the monitor node have some bug?
Thanks
Madhu
I observe the below error message in the monitoring node debug log. Not exactly sure what it means; I just performed the steps provided in the documentation, with no customization on any node.
2023-04-26T05:52:56.174+00:00 <local0.err> ip-10-0-0-126 appuser[387]: /usr/sw ConsulFSM.cpp:860 (REDUNDANCY - 0x00000000) ConsulProxyThread(9)@controlplane(10) ERROR Could not determine self node configuration
2023-04-26T05:52:57.175+00:00 <local0.err> ip-10-0-0-126 appuser[387]: /usr/sw ConsulFSM.cpp:860 (REDUNDANCY - 0x00000000) ConsulProxyThread(9)@controlplane(10) ERROR Could not determine self node configuration
2023-04-26T05:52:58.176+00:00 <local0.err> ip-10-0-0-126 appuser[387]: /usr/sw ConsulFSM.cpp:860 (REDUNDANCY - 0x00000000) ConsulProxyThread(9)@controlplane(10) ERROR Could not determine self node configuration
2023-04-26T05:52:59.177+00:00 <local0.err> ip-10-0-0-126 appuser[387]: /usr/sw ConsulFSM.cpp:860 (REDUNDANCY - 0x00000000) ConsulProxyThread(9)@controlplane(10) ERROR Could not determine self node configuration
Thanks
Madhu
I’m not sure either. You tried two different versions and they both had the same issue? Are you sure all the firewall rules are correct…? TCP & UDP?
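for a quick sanity check from the monitor node's linux shell, something along these lines should confirm the TCP side (port numbers from memory of the docs, so double-check for your version: 8300-8302 TCP+UDP for group membership between all three nodes, plus 8741 between primary and backup for the mate-link):

# TCP reachability of the group membership ports from the monitor node
for host in 10.0.0.126 10.0.0.128; do
  for port in 8300 8301 8302; do
    nc -vz -w 2 "$host" "$port"   # reports success or refusal/timeout per port
  done
done

UDP is harder to verify with nc alone, so for 8300-8302 it's usually easier to just confirm the security group rules allow both protocols.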
hey @madhu,
can you check one more thing by issuing the below command on all nodes:
solace> show router name
the name each node reports should exactly match the node name configured for it under the redundancy group.
Wow Arih, that did the trick.
My mistake: for some reason I missed updating the router-name on the monitoring node, and I repeated that same miss consistently in all my attempts :(.
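For anyone else hitting this, the fix was just setting the router name on the monitoring node back to the name used in the group config. From memory it was along these lines (treat the exact command path as an assumption and check the router name section of the docs for your version):

solace> enable
solace# configure
solace(configure)# router-name monitor1

where monitor1 is whatever node name you configured for the monitoring node in the redundancy group.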
Thanks
Madhu
Hahaha, no worries, that’s muscle memory I guess
Good to hear it got solved!
As an added note, there are options to use Helm charts, Docker Compose, CloudFormation, etc. that can help automate these steps. Or even better, just use solace.com/cloud if you need a broker that just runs.
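for example, a single broker for local testing is one docker command (a sketch based on the solace/solace-pubsub-standard image docs; the credentials here are only for a throwaway dev instance):

# minimal single-node PubSub+ Standard broker for dev/testing
docker run -d --name=solace --shm-size=1g \
  -p 8080:8080 -p 55555:55555 \
  --env username_admin_globalaccesslevel=admin \
  --env username_admin_password=admin \
  solace/solace-pubsub-standard

and for container-based HA the router names and group membership are passed in as config keys up front, so the manual rename mistake from this thread is much harder to make.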
@madhu glad you got it resolved!!
What was it before? And what did you have to change it to? I’m surprised there wasn’t an easier error/status notification somewhere in the CLI showing a mismatch or something.
I think I found the pattern: it seems that when we change the node to a monitoring node, it overwrites the node-name.
@Aaron, there is no error on the monitoring node. On a messaging node we do get an error if the node-name doesn’t match.
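That would also explain the debug log: with the overwritten name, the monitor could not find itself in the group configuration, hence the "Could not determine self node configuration" errors. So the safe order seems to be: change the node type first, then re-check the router name and re-set it if it was overwritten, and only then verify with:

solace> show router name
solace> show redundancy group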