Hey @vn01 … I’ve asked internally, but I’m pretty sure XA support for PQs is not on the near-term roadmap. So let’s brainstorm some other possible patterns.
You want XA because you’d like to have the DB commit, the outgoing message publish, and the ACK of the received message all handled as one “blob”. Or possibly multiple messages (a batch)? If so, how many?
PQs do support regular Session transactions, so at least you could have some of it as a single operation.
So: PQ with multiple consumers, key = USER_ID. A single consumer reads one (or a bunch?) of messages off the queue… then posts them into a DB. When the commit from the DB is successful, the app publishes one (or multiple?) notification messages on another topic (probably Guaranteed, I’m guessing), and then ACKs the messages back on the PQ. The two Solace-related parts (the notification publish and the PQ ACK) could go in a Session transaction, so you don’t need additional logic that waits for the publish ACK before ACKing the PQ message(s).
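Roughly, that flow could look like the sketch below. It’s plain Python with in-memory stand-ins, not the Solace API: `Message`, `FakeDb`, `FakeSessionTransaction`, `process_batch`, and the `notify/user/` topic prefix are all names I made up for illustration. The only point is the ordering: DB commit first, then publish + ACK staged together and committed as one unit.

```python
from dataclasses import dataclass


@dataclass
class Message:
    message_id: str
    user_id: str
    payload: str


class FakeDb:
    """Hypothetical stand-in for the real database."""

    def __init__(self):
        self.rows = {}

    def insert_and_commit(self, messages):
        for m in messages:
            self.rows[m.message_id] = (m.user_id, m.payload)


class FakeSessionTransaction:
    """Hypothetical stand-in for a broker Session transaction.

    Publishes and ACKs are only staged until commit() is called.
    """

    def __init__(self):
        self.staged_publishes = []
        self.staged_acks = []
        self.published = []
        self.acked = []

    def publish(self, topic, body):
        self.staged_publishes.append((topic, body))

    def ack(self, message):
        self.staged_acks.append(message.message_id)

    def commit(self):
        self.published.extend(self.staged_publishes)
        self.acked.extend(self.staged_acks)
        self.staged_publishes.clear()
        self.staged_acks.clear()

    def rollback(self):
        self.staged_publishes.clear()
        self.staged_acks.clear()


def process_batch(batch, db, tx, topic_prefix="notify/user/"):
    # Step 1: the DB commit happens on its own, outside the broker transaction.
    db.insert_and_commit(batch)
    # Step 2: notification publish and PQ ACK succeed or fail together.
    try:
        for m in batch:
            tx.publish(topic_prefix + m.user_id, m.message_id)
            tx.ack(m)
        tx.commit()
    except Exception:
        tx.rollback()
        raise
```

Note the gap between step 1 and step 2: that is exactly where the two failure cases come from.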
So the issue is: what happens if #1: the DB commit is successful, but then either the Solace notification publish fails (due to queue full maybe?) and therefore the Session transaction would fail… OR #2: if the consumer app crashes after the DB commit and the notification doesn’t go out / PQ messages aren’t ACKed. Right? Any other failure cases…?
For #1, if the app is still up and doesn’t lose state, it could just continue to try to resend the notification messages for some period of time. At least b/c of the PQ key, we know that further messages about this USER_ID won’t be processed until this batch is sorted out, so we won’t have any “gaps” in our notification messages, just a delay. If the app decides to eventually give up, then it would have to somehow roll back (or mark as cancelled) the previous DB update. But again, because of PQ, no other apps should be dealing with this specific USER_ID. NOTE that the “max handoff time” on the PQ configuration is very important here… it must be at least the amount of time that the app will wait / try to deal with this message. This is because when consumers scale up and a new consumer gets added to the PQ, the partition that this message came from could get yanked away and given to the new guy… and the original (blocking/waiting) app doesn’t know… so we don’t want the new consumer to get this message while the first app is still trying to deal with it.
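A minimal sketch of that bounded retry for case #1, assuming the app knows the max handoff time it was configured against. `publish_with_deadline`, the safety margin, and the retry interval are all hypothetical; `publish` is whatever callable does the real (transacted) publish + ACK and raises on failure.

```python
import time


def publish_with_deadline(publish, max_handoff_seconds,
                          safety_margin_seconds=1.0,
                          retry_interval_seconds=0.5,
                          clock=time.monotonic, sleep=time.sleep):
    """Retry publish() until it succeeds or the retry budget runs out.

    Returns True on success. Returns False when the caller should give up
    and compensate (roll back or mark cancelled the earlier DB update).
    """
    # Keep the total retry time strictly below the max handoff time, so the
    # partition cannot be handed to another consumer while we are retrying.
    budget = max_handoff_seconds - safety_margin_seconds
    if budget <= 0:
        raise ValueError("max handoff time must exceed the safety margin")
    deadline = clock() + budget
    while True:
        try:
            publish()
            return True
        except Exception:  # a real app would catch only its publish errors
            if clock() + retry_interval_seconds >= deadline:
                return False
            sleep(retry_interval_seconds)
```

The `clock` and `sleep` parameters are only there so the timing can be faked in a test; in the app you’d leave the defaults.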
For #2, if the app crashes midway through, then the partition will eventually move over to a new consumer, who will receive the same messages that weren’t ACKed. The app can check the “redelivered” flag on each message as a hint that this message might have been processed before. Obviously, the app would/should check the DB to make sure this message hadn’t already been inserted. If not: carry on as normal; if so: skip the insert, attempt to send the notification message, and ACK the received PQ message.
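The decision for case #2 could be sketched like this. Everything here is a hypothetical placeholder: `handle_message`, the dict-shaped message with `message_id` and `redelivered` keys, and the four callables that stand in for the real DB and broker operations. The DB lookup is only done when the flag is set, which treats the flag as a cheap hint; checking the DB on every message would be the more cautious variant.

```python
def handle_message(msg, already_in_db, insert_into_db, send_notification, ack):
    """Process one PQ message, tolerating redelivery after a crash.

    Returns "inserted" or "skipped_insert" so the caller can log which
    path was taken.
    """
    if msg["redelivered"] and already_in_db(msg["message_id"]):
        # A previous consumer committed the DB work but never got as far
        # as the notification / ACK, so only finish those steps.
        path = "skipped_insert"
    else:
        insert_into_db(msg)
        path = "inserted"
    send_notification(msg)
    ack(msg)
    return path
```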
So yeah… I think it could be done… just with some extra checks by the PQ consumer to make sure this message hadn’t been inserted into the DB already. Might be a bit slower, but probably faster than an XA transaction / 2PC. Hopefully whoever is publishing the messages onto the PQ inserts a UUID or something similar (application message ID?) that the PQ consumers/DB inserter apps can use to verify whether a message has already been processed.
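One way to do that “already processed?” check is to let the DB enforce it, by making the publisher’s message ID the primary key. Here’s a small sketch using SQLite from the Python standard library, purely as an example; your real DB will have its own equivalent of `INSERT OR IGNORE`. The `user_events` table, its columns, and the `open_db` / `insert_once` helpers are made up for illustration.

```python
import sqlite3


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    # The publisher-assigned message ID is the primary key, so a second
    # insert of the same message cannot create a duplicate row.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_events ("
        " message_id TEXT PRIMARY KEY,"
        " user_id TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def insert_once(conn, message_id, user_id, payload):
    """Insert the row unless this message ID was already stored.

    Returns True if a new row was written, False if it was a duplicate.
    """
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO user_events (message_id, user_id, payload)"
            " VALUES (?, ?, ?)",
            (message_id, user_id, payload),
        )
    return cur.rowcount == 1
```

With something like this, a redelivered message costs one extra round trip to the DB rather than a duplicate row.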