Weird chars at start of text payload!?

Aaron
Aaron Member, Administrator, Moderator, Employee Posts: 664 admin
edited November 2024 in Tips and Tricks #1

TL;DR: it's not a mistake or error… it's just a structured Text Message.

Hi all! I'm making a definitive post about this because it's been asked countless times, and I still can't find a good "one-page" reference response. The issue: sometimes you'll see weird characters at the beginning of your text payload. For example, this is from the JCSMP HelloWorld sample:

Destination:                            Topic 'solace/samples/jcsmp/hello/aaron'
Priority: 4
Class Of Service: USER_COS_1
DeliveryMode: DIRECT
Message Id: 6
Binary Attachment: len=26
1c 1a 48 65 6c 6c 6f 20 57 6f 72 6c 64 20 66 72 ..Hello.World.fr
6f 6d 20 41 61 72 6f 6e 21 00 om.Aaron!.

Or if you just print out the raw payload as a String: ∟↓Hello World from Aaron!

See those first two bytes? 0x1c and 0x1a? What are they? In the Solace world, when sending a TextMessage, you are not just sending a raw String as the binary payload. The API is actually constructing a Structured Data Type (SDT) formatted message, where there is a single field (the String) in the container. Those first few bytes (could be 2-6 I think) describe the field and its size. In the dump above, 0x1a is 26, which matches the len=26 of the whole attachment: 2 header bytes, 23 characters of text, and a trailing null byte.
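To make the header concrete, here is a minimal JavaScript sketch that takes the 26 bytes from the hex dump above and pulls the text out by hand. The helper name peekShortSdtString is made up for this post, and it only handles the exact shape seen in that dump (first byte 0x1c, one length byte equal to the total size, trailing null byte); anything else returns null. It's for illustration only. In a real app, use the API's getters instead of parsing bytes yourself.

```javascript
// Bytes copied from the hex dump above (Binary Attachment: len=26).
const dump =
  "1c 1a 48 65 6c 6c 6f 20 57 6f 72 6c 64 20 66 72 " +
  "6f 6d 20 41 61 72 6f 6e 21 00";
const bytes = Uint8Array.from(dump.split(" "), (hex) => parseInt(hex, 16));

// Hypothetical helper (not part of any Solace API).
// Assumption: only the layout observed in the dump is handled, i.e.
// 0x1c, then a single byte holding the total length, then the text, then 0x00.
function peekShortSdtString(buf) {
  if (buf.length < 3 || buf[0] !== 0x1c) return null;   // not the shape we know
  const totalLength = buf[1];                           // 0x1a === 26 in the dump
  if (totalLength !== buf.length) return null;          // length must cover everything
  if (buf[buf.length - 1] !== 0x00) return null;        // expect the trailing null
  // Skip the 2 header bytes and the trailing null, decode the rest as UTF-8.
  return new TextDecoder("utf-8").decode(buf.subarray(2, buf.length - 1));
}

console.log(bytes.length);               // total attachment size
console.log(peekShortSdtString(bytes));  // the text without the header bytes
```

A plain binary payload such as the UTF-8 bytes of "plain" does not start with 0x1c, so the helper returns null for it rather than guessing.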

Solace messages can be one of: TextMessage, BytesMessage, MapMessage, or StreamMessage. Thanks to JMS for these.

So your receiver/consumer should ideally check what type of message it is receiving and deal with it appropriately. In JCSMP Java, this looks something like:

public void onReceive(BytesXMLMessage message) {
    if (message instanceof TextMessage) {
        TextMessage msg = (TextMessage)message;
        String payload = msg.getText();
        // do more
    } else if (message instanceof BytesMessage) {
        BytesMessage msg = (BytesMessage)message;
        byte[] payload = msg.getData();   // NOT getBytes() strangely, that gets the XML payload
        // are you sure the payload is a string?
        String strPayload = new String(payload, StandardCharsets.UTF_8);
        // more
    } else if (message instanceof MapMessage) {  // not often used anymore
        MapMessage msg = (MapMessage)message;
        SDTMap map = msg.getMap();
    } else if (message instanceof StreamMessage) {  // not often used anymore
        StreamMessage msg = (StreamMessage)message;
        SDTStream stream = msg.getStream();
    } else {
        // should be impossible, these are the only 4 types
    }
    . . .

The newer Java API hides this stuff from you, BTW. Other APIs, I'm not sure..?

But in JavaScript/NodeJS, same thing… you need to check what type of message you've received and deal with it appropriately:

session.on(solace.SessionEventCode.MESSAGE, function (message) {
    if (message.getType() == solace.MessageType.TEXT) {
        var strPayload = message.getSdtContainer().getValue();
        // do stuff
    } else if (message.getType() == solace.MessageType.BINARY) {
        var payload = message.getBinaryAttachment(); // binary attachment, could be String or Uint8Array
        // do stuff
    } else {
        // either a stream or a map SDT
    }
    . . .

See JavaScript docs on MessageType, and on getType().

A lot of people stumble into this with JavaScript, since getBinaryAttachment() (usually) returns a String, and they might not notice if their publisher app (also probably JavaScript) is sending plain Strings as raw binary / BytesMessage instead of as an SDT TextMessage. This issue usually shows up when you start mixing different publisher languages, APIs, or protocols, and the apps are not all formatting messages the exact same way (e.g. a Java publisher sending TextMessages to a JavaScript consumer).
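If you do end up handling a raw binary attachment in JavaScript, a small normalizing helper can hide the String-vs-Uint8Array difference. This is a sketch only: attachmentToText is a hypothetical name, and it assumes that the String form carries one received byte per character (char codes 0-255) and that the payload really is UTF-8 text. Check both assumptions against your own setup.

```javascript
// Hypothetical helper (not part of the Solace API): turn whatever
// getBinaryAttachment() handed back into a JavaScript string.
// Assumptions: a String attachment holds one byte per character,
// and the underlying bytes are UTF-8 encoded text.
function attachmentToText(attachment) {
  if (attachment == null) return null;  // no attachment at all
  const bytes =
    typeof attachment === "string"
      ? Uint8Array.from(attachment, (ch) => ch.charCodeAt(0) & 0xff)
      : attachment;                     // already a Uint8Array
  return new TextDecoder("utf-8").decode(bytes);
}

// The same two bytes (0xc3 0xa9, UTF-8 for "é") in both forms:
console.log(attachmentToText("caf\u00c3\u00a9"));
console.log(attachmentToText(new Uint8Array([0x63, 0x61, 0x66, 0xc3, 0xa9])));
```

Note this only helps with plain binary payloads. If the publisher sent an SDT TextMessage, the header bytes are still in the raw attachment, so check the message type first as shown above.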

Oh, and if you want to send text messages with JavaScript, do something like this:

var msg = solace.SolclientFactory.createMessage();
msg.setDestination(solace.SolclientFactory.createTopicDestination("hello/world"));
msg.setSdtContainer(solace.SDTField.create(solace.SDTFieldType.STRING, "here is my text."));

Note that this choice between publishing as structured text vs. binary also applies to REST/HTTP Messaging publishers. This is binary:

curl -u user:pw http://localhost:9000/hello/world -d 'hello bytes message'

And this is a structured TextMessage:

curl -u user:pw http://localhost:9000/hello/world -H 'content-type:text/plain' -d 'hello text message'

See Solace REST encoding docs here on HTTP Content-Type Mapping to Solace Message Types.
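For publishers that use fetch instead of curl, the same two requests could be built roughly like this. buildRestPublish is a hypothetical helper, and the host, port, topic and credentials are simply the ones from the curl lines above. For the binary case it sends application/x-www-form-urlencoded, which is what curl -d sends by default, so it should mirror the first curl command.

```javascript
// Hypothetical helper: builds the URL and fetch options for a REST publish.
// asText = true  -> Content-Type text/plain (structured TextMessage, as in the second curl)
// asText = false -> curl -d's default Content-Type (binary, as in the first curl)
function buildRestPublish(baseUrl, topic, body, asText, user, pw) {
  const headers = {
    // Same as curl -u user:pw; btoa is fine for plain ASCII credentials.
    Authorization: "Basic " + btoa(user + ":" + pw),
    "Content-Type": asText ? "text/plain" : "application/x-www-form-urlencoded",
  };
  return {
    url: baseUrl + "/" + topic,
    options: { method: "POST", headers: headers, body: body },
  };
}

const bytesReq = buildRestPublish(
  "http://localhost:9000", "hello/world", "hello bytes message", false, "user", "pw");
const textReq = buildRestPublish(
  "http://localhost:9000", "hello/world", "hello text message", true, "user", "pw");

console.log(bytesReq.url, bytesReq.options.headers["Content-Type"]);
console.log(textReq.url, textReq.options.headers["Content-Type"]);

// To actually send (needs a broker with REST messaging listening on that port):
// fetch(textReq.url, textReq.options).then((res) => console.log(res.status));
```

The only difference between the two requests is the Content-Type header, which is the whole point: the body is the same kind of string either way.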

Finally, the broker has some smarts built into it for helping with protocol translation. For example, if I subscribe with an MQTT client on topic hello/world and publish the two (SMF) HTTP messages above (or equivalently, one binary message and one text message), the MQTT consumer receives both string payloads correctly (no weird extra chars).

I really hope this post helps people, and I've included enough keywords for Google to pagerank it highly..! 😁

Comments

  • Aaron
    Aaron Member, Administrator, Moderator, Employee Posts: 664 admin

    Update here! 🎉 A colleague of mine just ran into exactly this situation, except with a REST publisher sending text data (that the broker converted into a TextMessage) and receiving via C# .NET CSCSMP API. Using regular / raw binary attachment, they could see the formatting bytes. So I thought I would include a few more snippets here for hints on how to check if receiving a TextMessage with C#:

    using System;        // Console
    using System.Text;   // Encoding
    using SolaceSystems.Solclient.Messaging.SDT;
     ...
    var messageText = SDTUtils.GetText(message);
    
    if (messageText != null) {
        // yay, you have a string already!
        Console.WriteLine("Message text content: {0}", messageText);
    } else {
        var messageContainer = SDTUtils.GetContainer(message);
        if (messageContainer != null) {
            // this is either an SDTMap or SDTStream message then...
        } else {
            // so regular binary attachment... this line below assumes that data is actually a UTF-8 string!
            Console.WriteLine("Message binary content: {0}", Encoding.UTF8.GetString(message.BinaryAttachment));
        }
    }
    

    There are no "instanceof" type checks like in Java, but the helper methods inside SDTUtils can help figure out if your message is TextMessage, MapMessage, StreamMessage, or just regular BytesMessage.

    This is apparently true for all the APIs that derive from C (C#, Python, Go): there's no concept of a message type like in Java. You either need to try the different getters, or carry the type information in a user property or something.

    For completeness, the CCSMP (C API) functions for testing/retrieval are:

    solClient_msg_getBinaryAttachmentMap
    solClient_msg_getBinaryAttachmentStream
    solClient_msg_getBinaryAttachmentString
    and
    solClient_msg_getBinaryAttachment
    

    The last will succeed for any payload type as it just returns the raw contents. The other 3 will return SOLCLIENT_NOT_FOUND if the payload type doesn't match.