Clarifications regarding sticky load balancing example architecture by Mathew Hobbis

Hello,

I have read the example implementation of sticky load balancing by Mathew Hobbis on this link and am slightly confused about a few things.

I was hoping I could get some answers to the following:

Question 1:

Regarding "Consumer Group Clients": is this the same client as the LBG Client, or is it a different service?


Question 2 (multipart question):

In the following code snippet for Consumer Group Clients (assuming that Consumer Group Clients is in the same codebase as the LBGClient):

if (((loop % clientCount) - clientNumber) == 0) {
    consumers[loop].primary = true;
} else {
    consumers[loop].primary = false;
}

What exactly is clientCount? Is it the current number of active consumers, or is it the maximum number (i.e. clientCount = queueCount)?

What is clientNumber? Is it the identifier of the current consumer (i.e. clientIndex)? If so, how can I know my index if I am brought up dynamically? Suppose a scenario with 4 queues and 2 consumers. If I am brought up as a third consumer, am I supposed to know in advance that I am the third? Is there no way to discover this dynamically?


Thank you very much in advance

Comments

  • Aaron
    Aaron Member, Administrator, Moderator, Employee Posts: 638 admin

    Hi @brigadier90 . How's it going? I'll poke Mat and tell him to come take a look at your post here. I'm reading through the blog right now, trying to refresh myself.

    For your Q1, yes, I think "Consumer Group Clients" refers to the "load balanced group clients". These are the queue consumer apps that can be added/removed from the group as load dictates. Same thing, different name. Also, I did notice about halfway through the blog a mention of "Consumer Group Consumer", and I think that was meant to be the "CG Admin" process, which keeps track of everything and assigns queues and topic hash subscriptions.

    For your Q2, I'm not sure. On the blog there's another conditional block snippet:

    if ((queuenumber % clientnumber) - clientindex == 0) primaryqueue = true else primaryqueue = false

    So I'm guessing (?) that your clientCount is the same as clientnumber in this snippet, which looks to be the current number of clients. And clientindex is the current index for that particular consumer, which it is configured with at startup by the "admin" process.

    But I'm just speculating! Ok, enough from me. I'll ask Mat if he can come take a look.

  • brigadier90
    brigadier90 Member Posts: 5
    edited April 2022 #3

    Hi @Aaron! thank you very much for your reply, that does help clarify things. Very much appreciated!

  • mhobbis
    mhobbis Member, Employee Posts: 2 Solace Employee

    Hi @brigadier90, My apologies for duplication of terms. It's a hazard of repurposing material for different groups. In the text a Consumer Group Client is the same as an LBG Client and is the same as the Consumer Group Consumer.

    In terms of the code snippet there are a number of attributes that are related and govern which queues a client thinks it is responsible for -

    • The total number of queues - loop counts through the queues in the snippet.
    • The max number of LBG clients (clientCount) that will run. clientCount <= queueCount
    • The LBG Client number (clientNumber) - an index of the client within the group, e.g., if there are 10 clients each will have a number 0 - 9

    What the code needs to do is work out which of the queues that a client is connected to (which will be all of the queues) are primary queues for that client. So if I have 10 queues and 5 clients, then each client will connect to all 10 queues. However, client 0, for example, will treat Q0 and Q5 as primary, client 1 will treat Q1 and Q6 as primary, etc.
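
    As a minimal sketch of that rule (the function name `primary_queues` and the range checks are my own additions for illustration, not something from the blog), the modulo test from the snippet can be applied to the 10-queue, 5-client example like this:

```python
from typing import List


def primary_queues(queue_count: int, client_count: int, client_number: int) -> List[int]:
    """Return the queue indices that this client treats as primary."""
    if client_count > queue_count:
        raise ValueError("client_count must not exceed queue_count")
    if not 0 <= client_number < client_count:
        raise ValueError("client_number must be in the range 0..client_count-1")
    # Same test as the snippet: ((loop % clientCount) - clientNumber) == 0
    return [q for q in range(queue_count) if (q % client_count) - client_number == 0]


if __name__ == "__main__":
    # 10 queues shared between 5 clients, as in the example above.
    for client in range(5):
        print(f"client {client}: primary queues {primary_queues(10, 5, client)}")
```

    With these numbers every queue ends up primary for exactly one client, which is the property the pattern relies on.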

    In the LBG Client code, once you have processed and ACKed your message / completed your transaction, check to see if the allowed time has passed (60s in the blog). If the time has passed, then walk through the list of queues and close and recreate the flows to all queues for which this client does not count as primary. This action will eventually ensure that the client that does count the queue as primary ends up servicing it.
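
    A rough sketch of that check, under stated assumptions: `open_flow` and `close_flow` are hypothetical callbacks standing in for whatever your messaging API uses to bind and close a flow, and the class name `StickyRebalancer` is made up for this example. It is not the blog's actual code.

```python
import time
from typing import Any, Callable, Dict, List


class StickyRebalancer:
    """Recycles flows to non-primary queues once the allowed time has passed."""

    def __init__(
        self,
        queue_count: int,
        client_count: int,
        client_number: int,
        open_flow: Callable[[int], Any],    # hypothetical: binds a flow to queue N
        close_flow: Callable[[Any], None],  # hypothetical: closes that flow
        interval_s: float = 60.0,           # 60s, as in the blog
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.queue_count = queue_count
        self.client_count = client_count
        self.client_number = client_number
        self._open_flow = open_flow
        self._close_flow = close_flow
        self._interval_s = interval_s
        self._clock = clock
        # Every client connects to every queue.
        self.flows: Dict[int, Any] = {q: open_flow(q) for q in range(queue_count)}
        self._last_recycle = clock()

    def is_primary(self, queue: int) -> bool:
        return queue % self.client_count == self.client_number

    def after_ack(self) -> List[int]:
        """Call after a message is processed and ACKed. Returns the recycled queues."""
        now = self._clock()
        if now - self._last_recycle < self._interval_s:
            return []
        recycled = []
        for queue in range(self.queue_count):
            if not self.is_primary(queue):
                # Close and recreate the flow so this client gives up its place.
                self._close_flow(self.flows[queue])
                self.flows[queue] = self._open_flow(queue)
                recycled.append(queue)
        self._last_recycle = now
        return recycled
```

    The clock is injectable only so the timing can be tested without waiting; in a real client you would leave the default.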

    When the LBG client is started, it must be started knowing its clientNumber / index within the group.
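
    One simple way to do that, sketched here as an assumption rather than anything the blog prescribes, is to pass the values in at launch. The environment variable names `LBG_CLIENT_NUMBER` and `LBG_CLIENT_COUNT` are hypothetical:

```python
import os
from typing import Mapping, Tuple


def load_client_identity(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Read (client_number, client_count) from the environment and validate them.

    LBG_CLIENT_NUMBER and LBG_CLIENT_COUNT are hypothetical variable names
    chosen for this sketch; whatever launches the client must set them.
    """
    client_count = int(env["LBG_CLIENT_COUNT"])
    client_number = int(env["LBG_CLIENT_NUMBER"])
    if client_count < 1:
        raise ValueError("LBG_CLIENT_COUNT must be at least 1")
    if not 0 <= client_number < client_count:
        raise ValueError("LBG_CLIENT_NUMBER must be in the range 0..LBG_CLIENT_COUNT-1")
    return client_number, client_count
```

    Failing fast on a bad index matters here, because two clients started with the same clientNumber would both claim the same primary queues.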

    I hope that helps. Let me know if you need anything else.

    /Mat

  • brigadier90
    brigadier90 Member Posts: 5

    Hi @mhobbis, Thank you very much for your reply and for the clarification.

    I understand what I must do now.

    Much appreciated, best regards :)

  • Aaron
    Aaron Member, Administrator, Moderator, Employee Posts: 638 admin

    @brigadier90 keep us posted on your progress. Let us know if/how it turns out!

  • Piyush Poply
    Piyush Poply Member Posts: 3

    Thanks @brigadier90 @Aaron and @mhobbis for this thread, it helped me too to understand the concept of the sticky load balancing approach. However, I am still unclear on the following:


    How will our client service instances get to know their sequence or index number? Is there a way we can maintain that count and share it across all the instances of the service?

    If there is such a way, can it be done in a microservice-based architecture where all the instances run in their own separate pods / Docker containers?

  • Mallu Golageri
    Mallu Golageri Member Posts: 10 ✭✭
    edited January 4 #8

    @mhobbis ,

    Regarding your statement below:

    "When the LBG client is started it must be started knowing its clientNumber / index within the group".

    How does the client know its number when started? What's the solution for this?

    In my case, all queues and topics are created in advance. How do I know that the consumerCount for the first pod started is 1, for the second is 2, for the third is 3, etc.?

    There is a way to use StatefulSets in Kubernetes, but we don't want to use that. Can you please help me here?

    Thanks in advance.

    @Aaron @Piyush Poply @brigadier90

  • Mallu Golageri
    Mallu Golageri Member Posts: 10 ✭✭
    edited January 5 #9

    @mhobbis & @Aaron ,

    In the sticky load balancing pattern:

    If I have 100 exclusive queues and 100 consumers, what about the secondary connections of each consumer, i.e. 99? Does this mean we need to periodically keep dropping/unbinding 99 secondary connections from each client/consumer? Isn't that costly?
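
    To put numbers on that, here is a small arithmetic sketch. It assumes, as described earlier in the thread, that every client binds to every queue and that a queue is primary when `queue % clientCount == clientNumber`; the function name `flow_counts` is made up for illustration.

```python
from typing import Tuple


def flow_counts(queue_count: int, client_count: int, client_number: int) -> Tuple[int, int]:
    """Return (primary, secondary) flow counts for one client."""
    # Queues client_number, client_number + client_count, ... are primary.
    primary = len(range(client_number, queue_count, client_count))
    secondary = queue_count - primary
    return primary, secondary


if __name__ == "__main__":
    queues, clients = 100, 100
    per_client = [flow_counts(queues, clients, c) for c in range(clients)]
    print("client 0 (primary, secondary):", per_client[0])
    print("secondary flows recycled per interval, whole group:",
          sum(secondary for _, secondary in per_client))
```

    So the number of flows recycled per interval grows with clients x queues, which is worth keeping in mind when sizing the group.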

    Also, do Solace non-exclusive partitioned queues use exclusive queues with the sticky load balancing pattern under the hood to achieve partitioning?

  • mhobbis
    mhobbis Member, Employee Posts: 2 Solace Employee

    Hi,

    The article on sticky load balancing was a workaround until the 'partitioned queues' feature made it into the product. IMHO you should be looking to deprecate the workaround and utilise the supported feature in its place - https://docs.solace.com/Messaging/Guaranteed-Msg/Queues.htm#partitioned-queues

    I believe this feature will remove all of the problems associated with the workaround, especially indexing the clients, as all of the state is handled by the broker and the API with no user-side code required.

    /Mat

  • vn01
    vn01 Member Posts: 17 ✭✭✭
    edited August 12 #11

    @mhobbis With partitioned queues not supporting XA transactions, if we want to use XA transactions, is sticky load balancing the only alternative?

  • Travis
    Travis Member Posts: 4

    Hi there,

    Thank you for sharing your insights and questions about Mathew Hobbis's implementation of sticky load balancing. It's great to see your engagement with the material, and I appreciate your effort to clarify these concepts.

    Regarding your first inquiry about "Consumer Group Clients" and whether they are the same as the LBG Client: as Mat confirms earlier in this thread, they are the same thing. "Consumer Group Client", "LBG Client" (load balanced group client) and "Consumer Group Consumer" are different names in the blog for the queue-consuming application.

    For your multipart question, let's break it down:

    1. clientCount: Per Mat's answer earlier in this thread, clientCount is the maximum number of LBG clients that will run in the group, with clientCount <= queueCount. It is not the number of consumers that happen to be active at that moment.
    2. ClientNumber: This variable likely acts as an index or identifier for each consumer in the group. When consumers start up, they can be assigned an index based on their order of initialization or based on their registration with a load balancer or coordinator service.

    Regarding your concern about dynamically discovering your index as a new consumer, it’s a common challenge in distributed systems. Many implementations address this by utilizing a service discovery mechanism or a coordination service (like Zookeeper or etcd) to maintain an up-to-date registry of active consumers. This way, when a new consumer joins, it can query the service to obtain its unique identifier and understand its role in the load balancing scheme.

    If you’re facing issues with dynamic indexing, consider integrating a service discovery tool to streamline this process. It will not only help you manage consumer indices but also improve fault tolerance and scalability.
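
    To make the idea concrete, here is a minimal in-memory sketch of the "lowest free index" logic such a registry would need. The class `IndexRegistry` is purely illustrative and is not a ZooKeeper or etcd API; a real deployment would have to back it with that service's atomic operations and some way to expire the entries of crashed instances.

```python
import threading
from typing import Dict


class IndexRegistry:
    """In-memory stand-in for a coordination service that hands out client indices."""

    def __init__(self, max_clients: int) -> None:
        self._max_clients = max_clients
        self._owner_by_index: Dict[int, str] = {}
        self._lock = threading.Lock()

    def claim(self, instance_id: str) -> int:
        """Return the index held by instance_id, claiming the lowest free one if needed."""
        with self._lock:
            for index, owner in self._owner_by_index.items():
                if owner == instance_id:
                    return index  # re-claiming is idempotent
            for index in range(self._max_clients):
                if index not in self._owner_by_index:
                    self._owner_by_index[index] = instance_id
                    return index
        raise RuntimeError("all client indices are taken")

    def release(self, instance_id: str) -> None:
        """Free the index held by instance_id, if any, so a new instance can reuse it."""
        with self._lock:
            for index, owner in list(self._owner_by_index.items()):
                if owner == instance_id:
                    del self._owner_by_index[index]


if __name__ == "__main__":
    registry = IndexRegistry(max_clients=3)
    print(registry.claim("pod-a"), registry.claim("pod-b"))
    registry.release("pod-a")
    print(registry.claim("pod-c"))  # reuses the index that pod-a gave up
```

    Reusing the lowest free index keeps the indices dense, which the modulo-based primary queue test depends on.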

    I hope this helps clarify your questions! If you have any more doubts or need further assistance, feel free to ask.