DMQ Eligible Flag set on Publisher (violates pub sub principles)

Robert
Robert Member Posts: 58 ✭✭

The principle of pub/sub is to decouple publishers and subscribers from each other.
Requiring the publisher to set DMQ Eligible to true (default = false) before a correctly configured subscriber can make use of a DMQ (Dead Message Queue) violates this principle.

Even worse, the subscriber may have done everything correctly:

  • Created DMQ
  • Linked DMQ to main queue
  • Defined TTL
  • Defined limited retry (not unlimited)

but if the publisher does not set that flag, the messages are lost (discarded).
That is especially risky, as message loss is the worst thing that can happen in a message flow.
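
The scenario above can be sketched with a toy model. This is not Solace API code: `Message`, `Queue` and `expire` are hypothetical names, and the only rule modeled is the one described in this post, namely that an expired message reaches the DMQ only if the queue has a DMQ linked and the publisher marked the message as DMQ eligible.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    payload: str
    dmq_eligible: bool = False  # default of the classic APIs, as described in this thread


@dataclass
class Queue:
    name: str
    dmq: Optional["Queue"] = None  # subscriber-side set-up: link to the DMQ
    messages: List[Message] = field(default_factory=list)


def expire(queue: Queue, msg: Message) -> str:
    """Model a message exceeding its TTL (or retry limit) on the queue."""
    queue.messages.remove(msg)
    if queue.dmq is not None and msg.dmq_eligible:
        queue.dmq.messages.append(msg)
        return "moved to DMQ"
    return "discarded"


dmq = Queue("orders.dmq")
main = Queue("orders", dmq=dmq)           # subscriber did everything right
forgotten = Message("order-1")            # publisher forgot the flag
flagged = Message("order-2", dmq_eligible=True)
main.messages += [forgotten, flagged]

print(expire(main, forgotten))  # discarded
print(expire(main, flagged))    # moved to DMQ
```

Even in this small model the outcome for `order-1` is decided entirely by the publisher, although the queue and DMQ are configured identically for both messages.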

  • So can anyone explain the rationale and purpose of that flag?
  • Why can it not default to true (instead of false), or even be removed?
  • Why does it get switched back from true to false when the message is stored in the DMQ?
    (That makes a simple copy of messages from the DMQ back to the main queue more complex than needed. When moving a message back from the DMQ to the main queue, you must set the flag back to true, because an error could occur again on re-publish, which should park the message in the DMQ again. If the flag is not set, the message would be lost.)

Answers

  • nram
    nram Member, Employee Posts: 80 Solace Employee

    @Robert, that's an interesting semantics question. It's true that the Pub/Sub paradigm is designed for writing loosely coupled applications. But the glue still needs to be specified somewhere, for example a common queue name, a topic subscription on the queue managed on the broker, etc.

    On this specific question of DMQ and TTL: historically, the TTL value was provided by the publisher. You can argue that a DMQ violates this constraint. The publisher meant for the message to be unavailable for consumption after N milliseconds, but by expiring the message over to the DMQ it's available forever. So allowing the publisher to also control whether messages should be expired to the DMQ makes sense.

    This raises another question: is the same behavior right when the TTL is set by the infrastructure (on the queue) and not by the publisher? My take is yes: the behavior is uniform and backward compatible.
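
    How a publisher TTL and a queue-level TTL might combine can be sketched as follows. The rules here are assumptions for illustration, not confirmed in this thread: 0 means "not set", the queue expires nothing unless respect-ttl is enabled, and otherwise the smaller of the values that are set applies. `effective_ttl_ms` is a hypothetical helper, not a Solace API.

```python
from typing import Optional


def effective_ttl_ms(message_ttl_ms: int, queue_max_ttl_s: int, respect_ttl: bool) -> Optional[int]:
    """Return the TTL in ms after which a message expires, or None if it never expires.

    Assumed rules: 0 means "not set"; nothing expires unless respect-ttl is
    enabled; otherwise the smaller of the values that are set applies.
    """
    if not respect_ttl:
        return None
    candidates = [t for t in (message_ttl_ms, queue_max_ttl_s * 1000) if t > 0]
    return min(candidates) if candidates else None


print(effective_ttl_ms(3000, 10, True))   # 3000: publisher TTL is shorter
print(effective_ttl_ms(0, 10, True))      # 10000: only the queue sets a TTL
print(effective_ttl_ms(3000, 10, False))  # None: queue ignores TTLs
```

    In the second case the publisher never asked for expiry at all, yet under the behavior discussed here its DMQ Eligible flag still decides whether the expired message is kept.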

    As for the default value being false: Solace decides on default values based on common use cases and feedback from the field. In older releases there was a single DMQ per VPN, so having this flag true for all TTL-enabled messages would fill the DMQ with potentially unwanted messages and prevent other queues from expiring to it. Obviously there is a wide array of use cases and usages, and irrespective of what we pick, it's unlikely to be universally acceptable :-)

    When messages are expired to the DMQ, they are intentionally prevented from expiring again (it's called a Dead Message Queue for a reason :-)).

    Here is Larry David's (Seinfeld) take on DMQ, for fun's sake: https://www.youtube.com/watch?v=lBInjuIW6Lw

  • marc
    marc Member, Administrator, Moderator, Employee Posts: 956 admin

    Hi @Robert,

    I also wanted to chime in here! I agree with the points that you've made and I think it's safe to say if we were starting from scratch today DMQ Eligible would be true by default for the reasons that you stated above. And for these reasons, DMQ Eligible is indeed set to true by default in our new Java and Python Messaging APIs. It is also set to true in our newer connectors like Boomi, MuleSoft and in our Spring Cloud Stream Binder. That said, we do have 15+ years of history and for backwards compatibility reasons the default for DMQ Eligible is false for our classic SMF APIs, such as JCSMP.

    *Note: just a heads up that if you are using JMS, you can set DMQ Eligible at the Connection Factory level.

    Why does it get switched back from true to false when the message is stored in the DMQ? (That makes a simple copy of messages from the DMQ back to the main queue more complex than needed. When moving a message back from the DMQ to the main queue, you must set the flag back to true, because an error could occur again on re-publish, which should park the message in the DMQ again. If the flag is not set, the message would be lost.)

    This is interesting to me as well. I've seen this brought up a few times in the past and would suggest that if you work for a customer w/ paid support, please open a feature request via a support ticket. I can't say it will be worked on soon, but the more requests that come in for a feature, the more likely it is to be prioritized.

  • Robert
    Robert Member Posts: 58 ✭✭

    @nram The problem is not that some settings can be made at the message level (by the publisher), the queue level, or on connection factories. That exists in any JMS implementation. The simple problem is that a setting on the publisher side should not undermine a correct DMQ set-up on the subscriber side at the risk of losing messages. The subscriber can do everything right, and when the publisher misses this flag the messages are gone. That is the worst situation you can have. I think the Solace team recognized the problem, given that the newer APIs override the default of false with true to avoid that risk.

    I also understand that there may be history behind this. But even the case you describe, with a single DMQ per VPN, should not matter if no DMQ is set up on the subscriber side. So to be clear: the design AS OF TODAY should keep subscriber decisions with the subscriber and publisher decisions with the publisher. Risking message loss although the subscriber did everything right should never happen.
    As lost means lost. ;-)

  • nram
    nram Member, Employee Posts: 80 Solace Employee

    @Robert, agree with you there. When the TTL & DMQ properties are available elsewhere (not just on the publisher API), it's tricky. As @marc mentioned, it looks like Solace is moving towards ensuring message expiry to the DMQ by default.

  • Robert
    Robert Member Posts: 58 ✭✭

    Does anyone have news on the DMQ Eligible handling in Solace?

    I call it a historically driven anti-pattern, and I hope we can just get rid of that flag or set it to true by default. The risk is just too high that the publisher forgets to set it, and that impacts the subscriber, which should be decoupled and independent from the publisher.

    Also, JMSToolBox cannot be used to copy messages back, as the flag turns back to false in the DMQ and you would need to change it on the way back to the main queue (resubmit).

    Then this utility, which covers moving messages from the DMQ to the main queue and setting the flag to true, would no longer be needed:

    richard-lawrence/Solace-JCSMP-Queue-Resender: JCSMP example to read messages from one Solace PubSub+ queue (e.g. a Dead Message Queue) and re-send them to a different queue (github.com)
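
    The resend step that a client-side utility has to perform can be sketched as below. This is a toy model, not the code of the linked project and not Solace API code: `Message` and `resend_from_dmq` are hypothetical names, and `publish` stands in for whatever the client uses to send a message to the main queue.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable


@dataclass(frozen=True)
class Message:
    payload: str
    dmq_eligible: bool


def resend_from_dmq(dmq_messages: Iterable[Message], publish: Callable[[Message], None]) -> int:
    """Republish each DMQ message as a new message with DMQ Eligible set again."""
    count = 0
    for msg in dmq_messages:
        # In the DMQ the flag reads false, so it has to be turned on again;
        # otherwise a second failure would discard the message.
        publish(replace(msg, dmq_eligible=True))
        count += 1
    return count


main_queue = []
parked = [Message("order-1", dmq_eligible=False)]
print(resend_from_dmq(parked, main_queue.append))  # 1
print(main_queue[0].dmq_eligible)                  # True
```

    A generic copy-and-paste tool has no reason to know about this one extra step, which is why it is easy to miss.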

  • Robert
    Robert Member Posts: 58 ✭✭

    @marc or anyone else from the Solace team, can you share the status of the DMQ settings?

    As a reminder:

    • The publisher impacts DMQ handling on the consumer side (architecturally wrong and dangerous, as you risk message loss on the consumer side without being able to influence it)
    • JMSToolBox, which Solace often refers to as a utility for tasks such as browse, copy and paste, does not work with a DMQ because the flag turns to false in the DMQ. I never understood why.

    It was mentioned that the default may be changed to true, but if that is the case I would like to know exactly where this change was made: which library and from which version.

    I also still see the same problem in the portal. For example, Try Me still uses the default of false, so every time I use it for a demo I forget and wonder why the message did not end up in the DMQ.

    Many thanks for bringing clarity to the Solace community.

  • Aaron
    Aaron Member, Administrator, Moderator, Employee Posts: 623 admin

    Hey @Robert. Another thread here..! I'll ask our Product Line Management if there's any movement on implementing a broker-side configuration for DMQ override... would probably be at the queue level, similar to max-ttl setting.

    Note that for the MQTT and AMQP protocols and our new "next-gen" APIs (Java (not JCSMP), Go, Python), DMQ-Eligible is true by default.

    Regarding the use of JMSToolBox to move messages around (not the best, as this is a client, and therefore it creates a new message), Solace has a brand new feature that allows copying a message from one queue to another within the broker. Very useful!

  • Robert
    Robert Member Posts: 58 ✭✭

    @Aaron thanks for your reply, and thanks for bringing this up with Product Management. I have also now drawn attention to an old feature around this to get feedback.

    When posting that new APIs will include a change (e.g. the DMQ Eligible default turning from false to true), it helps to mention version numbers. As soon as they are known, it would be great to share them so that customers know when to switch to the new libraries to get that change.

    Regarding the news about copying messages, I am very happy to see it, but the documentation does not state clearly what happens to the DMQ Eligible flag. As the flag turns from true (main queue) to false (DMQ), it must be switched back to true on copy-back to avoid the risk of message loss.

    Message VPN-level Message Spool Administration (solace.com)

    Also, the copy only supports one message, but customers often need the following:

    • Copy all messages back (missing)
    • A combination of the filters below:
    • Filter on a specific message to copy back (now covered by the copy command)
    • Filter on a time range to copy back (missing)
    • Filter on the number of retries to copy back (which would require a DMQ retry count to be added and updated any time a message is pushed back to the DMQ)
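
    The combination of filters in the wish list above could look like the sketch below. Everything here is hypothetical and only illustrates the request: the broker's copy command offers no such options according to this post, and the `parked_at` and `dmq_retry_count` fields are assumed metadata that would first have to exist.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ParkedMessage:
    message_id: str
    parked_at: datetime    # hypothetical: when the message reached the DMQ
    dmq_retry_count: int   # hypothetical: how often it was pushed back to the DMQ


def select_for_copy_back(
    messages: Iterable[ParkedMessage],
    message_id: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    max_retries: Optional[int] = None,
) -> List[ParkedMessage]:
    """Return the messages matching every filter that is set; no filter selects all."""
    selected = []
    for msg in messages:
        if message_id is not None and msg.message_id != message_id:
            continue
        if start is not None and msg.parked_at < start:
            continue
        if end is not None and msg.parked_at > end:
            continue
        if max_retries is not None and msg.dmq_retry_count > max_retries:
            continue
        selected.append(msg)
    return selected


dmq = [
    ParkedMessage("m1", datetime(2024, 1, 1, 9, 0), 0),
    ParkedMessage("m2", datetime(2024, 1, 1, 12, 0), 3),
    ParkedMessage("m3", datetime(2024, 1, 2, 9, 0), 1),
]
print([m.message_id for m in select_for_copy_back(dmq)])  # ['m1', 'm2', 'm3']
print([m.message_id for m in select_for_copy_back(
    dmq, end=datetime(2024, 1, 1, 23, 59), max_retries=2)])  # ['m1']
```

    The retry filter is what would stop a poison message from cycling between the main queue and the DMQ indefinitely.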


  • Robert
    Robert Member Posts: 58 ✭✭

    @Aaron, any news on that from you or any Solace expert?

    It would really help to get a clearer roadmap of what Solace is planning to do to get away from this historical architectural anti-pattern, in which a publisher influences behavior on the consumer side, even at the risk of losing messages. That is the last thing you want in a guaranteed delivery set-up, which most companies run their business events on.

  • Aaron
    Aaron Member, Administrator, Moderator, Employee Posts: 623 admin

    Hi @Robert, how you doing? I've contacted a couple account people who should hopefully be in touch with you out-of-band. Sounds like you could use a meeting with our PLM. But to answer some of your questions:

    There is feature work being done right now to provide more granular broker controls on DMQ Eligibility. Override capabilities will be added both to Guaranteed Endpoints (queues and TEs) and to Client Profiles, and they will be able to supplant whatever the publisher set as the DMQe flag on each message. PLM didn't give me a timeline, but the design documentation is pretty much done.
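
    One way to picture such an override is a three-state setting, sketched below. This is purely an assumption for illustration: the feature was not released at the time of this post, the name `broker_override` is hypothetical, and how an endpoint override and a Client Profile override would rank against each other is not stated here, so only a single override is modeled.

```python
from typing import Optional


def effective_dmq_eligible(publisher_flag: bool, broker_override: Optional[bool] = None) -> bool:
    """Return the flag the broker would act on.

    broker_override is hypothetical: None means no override is configured,
    True or False replaces whatever the publisher set.
    """
    return publisher_flag if broker_override is None else broker_override


print(effective_dmq_eligible(False))                        # False: publisher decides
print(effective_dmq_eligible(False, broker_override=True))  # True: broker setting wins
```

    With an override of true on the queue, the owner of the queue would no longer depend on every publisher remembering the flag.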

    I've asked about message copying enhancements, waiting to hear back.

    OH! And I verified that the DMQe flag gets put back as true when copying a message back to the main/other queue. My steps:

    • Set up q1 with max-ttl set to 10 seconds, and respect-ttl = true
    • Set up DMQ, with q1 pointing to it
    • Publish message to q1 with DMQe = true
    • After 10 seconds, message expires to DMQ
    • Browse message in DMQ, confirm DMQe flag is not present
    • Admin copy message back to q1
    • Consume out of q1, confirm DMQe flag is present again. Yup!
    • If I don't consume out of q1, message moves to DMQ again (now there are two, b/c it is a "copy" command).
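
    The steps above can be replayed in a toy model. This is not broker or Solace API code: `Queue.expire_all` and `admin_copy` are hypothetical names that only encode the behavior observed in this post (the flag is absent in the DMQ, and the copy is DMQ eligible again).

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    payload: str
    dmq_eligible: bool


@dataclass
class Queue:
    name: str
    dmq: Optional["Queue"] = None
    messages: List[Message] = field(default_factory=list)

    def expire_all(self) -> None:
        """TTL elapsed: eligible messages move to the DMQ with the flag cleared, others are dropped."""
        expired, self.messages = self.messages, []
        if self.dmq is not None:
            self.dmq.messages += [replace(m, dmq_eligible=False) for m in expired if m.dmq_eligible]


def admin_copy(source: Queue, index: int, target: Queue) -> None:
    """Copy (not move) one message; the copy is DMQ eligible again, as observed above."""
    target.messages.append(replace(source.messages[index], dmq_eligible=True))


dmq = Queue("q1.dmq")
q1 = Queue("q1", dmq=dmq)
q1.messages.append(Message("hello", dmq_eligible=True))

q1.expire_all()                      # message expires to the DMQ
print(dmq.messages[0].dmq_eligible)  # False
admin_copy(dmq, 0, q1)               # admin copies it back to q1
print(q1.messages[0].dmq_eligible)   # True
q1.expire_all()                      # nobody consumes, so it expires again
print(len(dmq.messages))             # 2
```

    The final count of two reflects that the command copies rather than moves, so the original stays in the DMQ until it is removed.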

    Hope that helps!