Role of monitoring node
In an HA group setup, the primary is in subnet 1, the secondary is in subnet 2, the monitoring node is in subnet 3, and the clients are in subnet 4.
At the beginning, the secondary is ACTIVE, and all clients are connected to it and processing messages.
Network access is then blocked to subnet 2 and subnet 3. In this case, clients could not connect to the available primary node, since a minimum of two nodes is required to process messages.
Next, I unblocked network access to subnet 3, where the monitoring node is running. Clients immediately connected to the primary, and the Consul logs show that the monitoring node is the current leader.
I was under the impression that the monitoring node plays only a passive role in leader election, and that whichever node holds the leadership is the one able to become ACTIVE.
Could someone shed some light on this behavior please?
Thanks,
Raghu
Comments
Hi @rdesoju, as you've said, the triplet requires two nodes to be active and connected. The reason for this is that we must avoid "split brain." Imagine you are the primary in subnet 1. You start by seeing the secondary and the monitor. Then you lose contact with both. What should you do? You could start processing messages, but how do you know that the backup, which you knew was active, isn't still doing that? You have no way of knowing. If you were to start processing messages and it turned out the backup was healthy, you'd be in an indeterminate state - split brain.
So, to avoid the case where an isolated node starts processing messages when 2 other nodes could also be processing messages, we do not allow an isolated node to take activity.
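To make that rule concrete, here is a minimal Python sketch. It only illustrates the "at least two of three" idea described above; it is not the broker's actual logic, and the function name and the fixed group size of three are assumptions made for the example.

```python
def may_take_activity(reachable_peers: int, group_size: int = 3) -> bool:
    """Return True if this node can see a majority of the HA group.

    Hypothetical helper for illustration only. `reachable_peers` is the
    number of OTHER group members this node can currently contact.
    """
    visible = reachable_peers + 1  # a node always counts itself
    return visible > group_size // 2


# An isolated node sees only itself (1 of 3), so it must stay standby.
print(may_take_activity(0))
# A node that can reach one other member (2 of 3) is allowed to be active.
print(may_take_activity(1))
```

An isolated node cannot tell "everyone else is down" apart from "I am cut off", so refusing activity below a majority is the safe choice.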
Another quick point: the monitor isn't really performing what I'd call "leadership election." The "leader" is nominated in the configuration - it's the primary. Only in the case of a failure or administrative action does the backup take activity.
Hi @TomF,
Based on your explanation, when the primary is active it will always take the leadership, since the configuration allows it to. Am I right?
Does a messaging node need to own the leadership to be able to become ACTIVE and process messages?
I have seen the following combinations in the logs:
When the primary is leader, voting requests from the monitoring node and the standby are rejected.
When I stopped the primary, the secondary became leader; when the primary was restored, its voting request was rejected because the secondary was already leader.
In one particular situation, where the monitoring node was started first, the primary was started next, and the standby was down, the monitoring node became leader.
So, what does leadership really indicate here?
Thanks,
Raghu
@rdesoju. Yes. If the primary is active, it will own all the activity. The backup will only take activity if we know the primary isn't active. Emphasis on know.
This isn't leadership election. There is no voting. The monitor is there to ensure we know whether the primary or backup is active - not to determine which is which. You determine that via static configuration.
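As a rough illustration of "static configuration decides, the monitor only helps you know", here is a small Python sketch. It is a simplified model under stated assumptions, not how the broker is implemented: the node names, the `next_active` function, and the rule that a healthy active node keeps activity are all hypothetical choices made for the example.

```python
from typing import Optional, Set

# Assumed static configuration: the primary is preferred over the backup.
# The monitor is deliberately absent, so it can never take activity.
MESSAGING_NODES = ("primary", "backup")


def next_active(connected: Set[str], current_active: Optional[str]) -> Optional[str]:
    """Pick the active node within one group of mutually reachable members.

    `connected` is the set of group members that can still see each other;
    `current_active` is the node that was active before the change.
    """
    if len(connected) < 2:
        return None  # fewer than two members visible: nobody may be active
    candidates = [n for n in MESSAGING_NODES if n in connected]
    if not candidates:
        return None  # only non-messaging members are present
    if current_active in candidates:
        return current_active  # a healthy active node keeps activity
    return candidates[0]  # otherwise follow the configured preference
```

Applied to the scenario in the original post: `next_active({"primary", "backup", "monitor"}, "backup")` returns `"backup"`, `next_active({"primary"}, "backup")` returns `None` while subnets 2 and 3 are blocked, and `next_active({"primary", "monitor"}, "backup")` returns `"primary"` once subnet 3 is reachable again. The monitor is never returned; it only supplies the second member needed to know it is safe to proceed.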
Note this isn't a Kafka cluster or another leadership-style cluster. It isn't a cluster at all. It's an HA group. HA groups can be clustered together - that's our DMR feature. The penalty you pay for having HA groups is twice the number of brokers. The benefits are no cluster rebalancing, predictable failover times, and no cluster collapse syndrome.