In the .NET API, why isn't the `CorrelationKey` set on the `RejectedMessageError`?

NasdaqMickelson Member Posts: 3
edited May 8 in General Discussions #1

I have an application that sends Persistent messages, with a session event handler checking for SessionEvent.RejectedMessageError statuses. I want to add some special handling logic for the scenario where a persistent message is too large (>30MB), but for some reason the SessionEventArgs.CorrelationKey is always null, despite it being set properly in the request handling logic. When the message is under the limit, the same logic results in the expected SessionEvent.Acknowledgement event with the SessionEventArgs.CorrelationKey properly set.

Is this expected behavior? Or is there a bug in either my code or the .NET API?

Here is the condensed sending logic:

public Task SendMessage()
{

  ITopic topic = CreateTopic(metadata, topicPrefix);

  IMessage responseMessage = ContextFactory.Instance.CreateMessage();

  responseMessage.Destination = topic;
  responseMessage.DeliveryMode = DeliveryMode.Persistent;


  TaskCompletionSource completionSource = new(TaskCreationOptions.RunContinuationsAsynchronously);

  responseMessage.CorrelationKey = completionSource;
  responseMessage.AckImmediately = true;
        
  // … additional message building

  return Task.CompletedTask;
}

Here is the Session event handler:

private void HandleSessionEvent(object? sender, SessionEventArgs args)
{
  switch (args.Event)
  {
    case SessionEvent.Acknowledgement:
    {
      // CorrelationKey is not null here
      (args.CorrelationKey as TaskCompletionSource)?.SetResult();
      break;
    }
    case SessionEvent.RejectedMessageError:
    {
      // CorrelationKey is null here
      (args.CorrelationKey as TaskCompletionSource)?.SetException(
        new Exception($"Return Message Rejected: {args.Info}")
      );
      break;
    }
  }
}

Answers

  • nicholasdgoodman
    nicholasdgoodman Member, Employee Posts: 36 Solace Employee

    Hello @NasdaqMickelson,

    I looked into this last night but so far have been unable to reproduce the issue as you describe. I induced errors in a number of different ways, such as shutting down ingress on the queue or sending a message that exceeds the maximum message size configured for the queue. However, I did not experiment with very large attachments (>30MB), so I will attempt that next.

    I am curious if the situation you are observing is a rejection by the broker, or merely a failed send. Are you checking the return code of ISession.Send(…)? Can you verify the queue received but rejected the message by going into the Broker Manager UI, Queues > { Queue Name } > Stats under the section "Incoming Messages Not Queued"?

    One final question, does your SendMessage() method actually return Task.CompletedTask or is that a workaround because of your issue? (Presumably you meant completionSource.Task, correct?)

  • NasdaqMickelson
    NasdaqMickelson Member Posts: 3
    edited May 10 #4

    Hey @nicholasdgoodman thanks so much for taking a look at this!

    However, I did not experiment with very large attachments (>30MB) so will attempt that next.

    This is the critical piece, I believe, because the scenario occurs when the event type is RejectedMessageError and the info is 'Document Is Too Large'. Reproducing it reliably requires sending a very large binary attachment.

    Are you checking the return code of ISession.Send(…)

    Yes, it is coming back with an OK response. It is worth noting that we are using the Send(IMessage[], int, int, out int) method (i.e. sending multiple messages) and that the out int messagesSent value is what we would expect.

    Can you verify the queue received but rejected the message by going into the Broker Manager UI, Queues > { Queue Name } > Stats under the section "Incoming Messages Not Queued"?

    It looks like these stats are zero, so the message never got to the queue. Could it be that Solace prevents them from even entering the Solace system because of the size limit being exceeded?

    [D]oes your SendMessage() method actually return Task.CompletedTask or is that a workaround because of your issue?

    No, it actually waits for all the tasks to complete and has some logic to handle errored flows. The code is something like the following.

    public async Task SendMessages(…)
    {
      // see above

      await WaitForAcknowledgements(messages);
    }

    public async Task WaitForAcknowledgements(IEnumerable<IMessage> messages)
    {
      var tasks = messages.Select(m => m.CorrelationKey).OfType<TaskCompletionSource>().Select(t => t.Task).ToList();

      // Ensure we don't wait too long with a timeout
      await Task.WhenAny(Task.WhenAll(tasks), Task.Delay(10000));

      // Simplified logic for determining the success of the Tasks for demonstration purposes
      if (tasks.Exists(t => !t.IsCompleted))
      {
        throw new Exception("Something went wrong");
      }
    }
    

  • nicholasdgoodman
    nicholasdgoodman Member, Employee Posts: 36 Solace Employee
    #5 Answer ✓

    Hello @NasdaqMickelson - So I did some deep digging and indeed was able to reproduce the issue you have described, specifically for cases when the message attachment is too large.

    This was a bit of a head scratcher, so I spoke with some of the Solace developers and discovered that you have found a known corner-case bug which has been previously identified and has a fix scheduled for the near future.

    The issue ultimately stems from the fact that when a broker receives an oversized message, it performs an optimization and rejects it early, before handing it off to the part of the code that deals with guaranteed messaging. So although the client-side application receives a rejection, it's as if the message was rejected as a Direct Message, which means no correlation data is sent.

    This is the only case in which a guaranteed message (GM) can result in a rejection without correlation data. The planned fix within the SDK will be to prevent sending oversized messages client-side and likely return an error on the Send call outright.

    As a workaround, your client-side code can guard specifically for this - my recommendation would be to add a special check for message sizes greater than 9MB (if the queue can accept a 10MB message) and reject them before sending.

    Wish we had something more clever, but at present this appears to be the most workable approach.
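
    A minimal sketch of such a guard, assuming the payload is available as a byte[] before it is assigned to the message. The OversizeGuard class, TryAccept method, and the exact 9MB threshold are hypothetical names and values for illustration, not part of the Solace API; the check only measures the payload itself, which is why it leaves headroom under the queue limit.

    ```csharp
    using System;
    using System.Threading.Tasks;

    // Hypothetical helper (not part of the Solace SDK).
    public static class OversizeGuard
    {
        // Assumed limit: 9MB, leaving headroom under a 10MB queue limit.
        public const long MaxPayloadBytes = 9L * 1024 * 1024;

        // Returns true if the payload is small enough to send. Otherwise it
        // fails the TaskCompletionSource locally, because the broker's
        // rejection would arrive without a CorrelationKey.
        public static bool TryAccept(byte[] payload, TaskCompletionSource completion,
                                     long maxBytes = MaxPayloadBytes)
        {
            if (payload.LongLength <= maxBytes)
            {
                return true;
            }

            completion.TrySetException(new InvalidOperationException(
                $"Payload of {payload.LongLength} bytes exceeds the {maxBytes}-byte limit; not sent."));
            return false;
        }
    }
    ```

    In the sending logic above, this would be called before the message is handed to Send, for example `if (!OversizeGuard.TryAccept(payload, completionSource)) return completionSource.Task;`, so the caller observes a faulted task instead of waiting for a rejection event that cannot be correlated.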

  • NasdaqMickelson
    NasdaqMickelson Member Posts: 3

    Thank you @nicholasdgoodman! I really appreciate the response. Is there any way I can track the progress on that issue (some public Jira board, maybe?) or do I need to keep my eyes peeled for a Solace SDK update?

    Again, thank you for your help

  • nicholasdgoodman
    nicholasdgoodman Member, Employee Posts: 36 Solace Employee

    There is no public way to track the fix progress; however, we have linked this post to our internal development ticketing system and will provide an update here when the fix becomes available.

  • Aaron
    Aaron Member, Administrator, Moderator, Employee Posts: 541 admin

    Hey gents. Good debugging! I've added a note to our internal bug tracking tool (REF #57039) to ping this thread when this is fixed. Don't hold your breath though, there is a backlog of much more interesting features to be implemented. At least there's a mostly workable workaround for this.

    Thanks!