Correct handling of keepalive with go-amqp

ahabel Member Posts: 9 ✭✭

Hi Solace Community,

As long as the SMF API for Go is not yet available, we would like to create a reference implementation in Go using AMQP and PubSub+.
Instead of QPID, we want to use https://github.com/Azure/go-amqp, because it does not require C bindings and looks much easier to handle.

The example on GitHub works pretty well, but we've run into a problem with timeouts and keepalives. When creating a client, the idle timeout for connections defaults to 60 seconds, which means that listening on a queue for more than 60 seconds without activity results in a read tcp 127.0.0.1:46732->127.0.0.1:5672: i/o timeout error.

// stripped all error handling...
// (assumes "context", "time" and "github.com/Azure/go-amqp" are imported and ctx is a context.Context)

client, err := amqp.Dial("amqp://localhost:5672",
        amqp.ConnSASLPlain("admin", "admin"),
        // amqp.ConnIdleTimeout(20*time.Second), // explicit idle timeout; the default is 1 minute
        amqp.ConnIdleTimeout(0),                 // 0 disables the idle timeout entirely
)
session, err := client.NewSession()

// my queue is called "logging" and receives system and VPN events via a subscription on #LOG/>
receiver, err := session.NewReceiver(
        amqp.LinkSourceAddress("logging"),
        amqp.LinkCredit(10),
)

for {
        // Receive the next message
        err = receiver.HandleMessage(ctx, handleMessage) // handleMessage just logs and accepts the message
        if err != nil {
                return err // <<<---- This is called after ConnIdleTimeout
        }
}

Another option is to set ConnIdleTimeout to 0, but then the client does not notice when the connection is lost: for example, if egress on the queue is disabled and re-enabled, the client keeps waiting in handleMessage while messages pile up on the broker.

Of course, one could set a timeout and just reconnect the whole client in an endless loop, or create a sender that publishes "heartbeats" every minute to fake activity on the queue, but both feel horribly wrong.
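
For illustration, the reconnect variant would look roughly like this against the same go-amqp API as above (consumeOnce, the 5-second back-off and the plain re-dial on any receive error are just placeholders for the sake of the sketch):

// Rough sketch of the reconnect-loop workaround, same go-amqp API as above.
package main

import (
        "context"
        "log"
        "time"

        "github.com/Azure/go-amqp"
)

// handleMessage just logs the message; the real handler would also accept it.
func handleMessage(msg *amqp.Message) error {
        log.Printf("received: %s", msg.GetData())
        return nil
}

// consumeOnce dials the broker and consumes until the connection errors out,
// e.g. when the idle timeout fires or egress on the queue is disabled.
func consumeOnce(ctx context.Context) error {
        client, err := amqp.Dial("amqp://localhost:5672",
                amqp.ConnSASLPlain("admin", "admin"),
                // default idle timeout (1 minute) stays in place; the error is handled by reconnecting
        )
        if err != nil {
                return err
        }
        defer client.Close()

        session, err := client.NewSession()
        if err != nil {
                return err
        }

        receiver, err := session.NewReceiver(
                amqp.LinkSourceAddress("logging"),
                amqp.LinkCredit(10),
        )
        if err != nil {
                return err
        }

        for {
                if err := receiver.HandleMessage(ctx, handleMessage); err != nil {
                        return err // i/o timeout or real connection loss: caller re-dials
                }
        }
}

func main() {
        ctx := context.Background()
        for {
                if err := consumeOnce(ctx); err != nil {
                        log.Printf("consumer stopped: %v, reconnecting in 5s", err)
                        time.Sleep(5 * time.Second)
                }
        }
}

It works, but it tears down and rebuilds the whole AMQP connection every time the idle timeout fires, which is exactly the kind of plumbing I'd rather not have in application code.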

Did we miss some settings? Something in the client-profile that could help?
What Go / AMQP APIs are you using?
Other suggestions?

Thanks and regards,
Andreas

Comments

  • Aaron Member, Administrator, Moderator, Employee Posts: 508 admin

    Hi Andreas @ahabel. Welcome to the Community, and thanks for the very interesting first post! I don't know Go, I'm a Java guy (unfortunately!), but I'll see if I can find some smart people at HQ who know the AMQP protocol well enough to give you some guidance. I know that other AMQP APIs seem to work well with Solace, so hopefully it wouldn't be a big fix to make this Azure Go API work too.

    Stay tuned!

    And if any Community members have something more to add here, please do!!

  • marc Member, Administrator, Moderator, Employee Posts: 914 admin
    edited August 2021

    Hi @ahabel,

    Glad to hear the API has been working well apart from the timeouts/keepalives. That honestly sounds like a bug to me; your timeout/keepalive config shouldn't affect your reconnection/retry logic IMO. Maybe open an enhancement request on the API?

    In the meantime, does the connection stay open if you publish a message? I've seen a lot of different organizations have their apps publish heartbeat messages every 30 seconds (the interval varies) stating that they're healthy, connected, etc. Then down the road you can always add a subscribing app that consumes the heartbeats for monitoring purposes if necessary.
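
    Something like this minimal sketch, for example (the "heartbeats" address, the 30-second interval and the credentials are just placeholders, adjust them to your setup):

    // Sketch of a heartbeat publisher with Azure/go-amqp.
    package main

    import (
            "context"
            "log"
            "time"

            "github.com/Azure/go-amqp"
    )

    func main() {
            ctx := context.Background()

            client, err := amqp.Dial("amqp://localhost:5672",
                    amqp.ConnSASLPlain("admin", "admin"),
            )
            if err != nil {
                    log.Fatal(err)
            }
            defer client.Close()

            session, err := client.NewSession()
            if err != nil {
                    log.Fatal(err)
            }

            sender, err := session.NewSender(
                    amqp.LinkTargetAddress("heartbeats"),
            )
            if err != nil {
                    log.Fatal(err)
            }

            // Publish a small message every 30 seconds so there is always
            // some traffic on the otherwise idle connection.
            for range time.Tick(30 * time.Second) {
                    if err := sender.Send(ctx, amqp.NewMessage([]byte("alive"))); err != nil {
                            log.Fatalf("heartbeat failed: %v", err)
                    }
            }
    }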

    Hope that helps!

  • ahabel Member Posts: 9 ✭✭

    Hi @Aaron and @marc ,

    thanks for your replies. To be honest, I haven't tested sending heartbeats yet, but I will probably do so to get better insight before creating an issue on the Azure AMQP API.

    Overall I don't like the idea of using "real" messages as heartbeats, especially not in a "reference implementation". I feel it's never good practice to push such low-level concerns into application logic. It also skews your statistics, you have to make sure not to interfere with other consumers, and with lots of clients it puts additional "useless" load on the event broker.

    Maybe it's best to wait for your Go API :wink: