Parsing characters from Binary data

Naruto Member Posts: 13 ✭✭

Hi @marc @Aaron @all

I am deserialising a stream of binary data using the following payload format:


typedef struct {
    char messageType;   // 'N' or 'G' ('N' - New Normal Order, 'G' - New Spread Order)
    long timestamp;     // Numeric Time in nanoseconds from 01-Jan-1980 00:00:00
    double orderID;     // Numeric Day Unique Order Reference Number
    int token;
    char orderType;     // 'B' - Buy Order, 'S' - Sell Order
    int price;
    int quantity;       // Numeric Quantity of the order
} OrderStruct;

After deserialisation, unwanted data is being fetched for the char bytes.
For example:
Order 1: Type:^P Timestamp:1929249811 OrderID:0.000000 Token:74732288 OrderType:^@ Price:1364066304 Quantity:2127492296

The first 16 bytes are the snapshot header, which is being fetched correctly. The problem occurs after those 16 bytes, i.e. while fetching the order messages.
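
A first step that often helps here is to look at the raw bytes that follow the header before mapping them onto any struct. The sketch below is illustrative only: `hex_dump` and `SNAPSHOT_HEADER_LEN` are hypothetical names, and the only fact taken from this post is the 16-byte header length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* From the post: the first 16 bytes of the packet are the snapshot header. */
#define SNAPSHOT_HEADER_LEN 16

/* Hypothetical helper: formats up to `count` bytes of `buf`, starting at
 * `offset`, as "XX XX XX " text into `out`.
 * Returns the number of bytes actually formatted. */
size_t hex_dump(const unsigned char *buf, size_t len, size_t offset,
                size_t count, char *out, size_t out_size)
{
    size_t n = 0;

    assert(buf != NULL && out != NULL);
    if (out_size == 0)
        return 0;
    out[0] = '\0';

    /* Each byte needs 3 characters ("XX "), plus room for the final NUL. */
    while (n < count && offset + n < len && (n + 1) * 3 < out_size) {
        snprintf(out + n * 3, out_size - n * 3, "%02X ", buf[offset + n]);
        n++;
    }
    return n;
}
```

Calling `hex_dump(packet, packet_len, SNAPSHOT_HEADER_LEN, 32, text, sizeof text)` and printing `text` would show whether an 'N' (0x4E) or 'G' (0x47) byte really sits at offset 16, or whether the order record starts somewhere else.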


Comments

  • Aaron
    Aaron Member, Administrator, Moderator, Employee Posts: 664 admin
    edited April 2024 #2

    Hi @Naruto. Ok… is this C? This is just your data structure format, i.e. the "in-memory" representation within your app. It doesn't say how the data is encoded on the wire. You need that information from the publisher application or its documentation. Trying to reverse engineer it is hard! For example:

    • If there are only 2 options for messageType, it could be encoded as a single bit! Or maybe a whole ASCII char (1 byte), or a UTF-16 char (2 bytes)? Or maybe 4 bytes with a custom encoding, to allow for future expansion.
    • If your timestamp is nanos since epoch, it should look something like 1712321293665926900 (which is right now), not 1929249811.
      • EDIT: just noticed that it's nanos since Jan 1 1980, not 1970 like the normal Unix epoch. That's odd..? So the expected number should be a bit smaller than the one above (by 315532800 seconds, i.e. 3652 days), something like: 1396788493665926900
    • orderID is a double? Are you sure? That seems very odd. That should certainly be an int or long. Or a String.
    • You have control/escape characters ^@ and ^P listed, which probably correspond to binary 0x00 and 0x10. These are non-printable control codes, not the 'N'/'G' or 'B'/'S' letters your struct expects.
    • A quantity of 2,127,492,296? Someone is buying or selling 2.1 billion items?
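
    The checks in the list above can be sketched as a few small C helpers. This is only an illustration: all function and macro names are made up here, and the one-day tolerance on the timestamp is an arbitrary assumption. The allowed characters and the 1980 epoch come from the struct comments in the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Seconds between 1970-01-01 and 1980-01-01: 3652 days
 * (10 years plus the leap days of 1972 and 1976). */
#define EPOCH_1980_OFFSET_SEC 315532800LL
#define NANOS_PER_SEC         1000000000LL

/* Convert Unix epoch seconds to nanoseconds since 1980-01-01 00:00:00. */
int64_t unix_sec_to_nanos_1980(int64_t unix_sec)
{
    assert(unix_sec >= EPOCH_1980_OFFSET_SEC); /* times before 1980 are not expected */
    return (unix_sec - EPOCH_1980_OFFSET_SEC) * NANOS_PER_SEC;
}

/* Per the struct comments: messageType is 'N' or 'G'. */
bool message_type_ok(char c) { return c == 'N' || c == 'G'; }

/* Per the struct comments: orderType is 'B' or 'S'. */
bool order_type_ok(char c) { return c == 'B' || c == 'S'; }

/* ASSUMPTION: a decoded timestamp is "plausible" if it lies within
 * one day of the current time. The tolerance is arbitrary. */
bool timestamp_ok(int64_t ts_nanos, int64_t now_unix_sec)
{
    int64_t now = unix_sec_to_nanos_1980(now_unix_sec);
    int64_t day = 86400LL * NANOS_PER_SEC;
    return ts_nanos > now - day && ts_nanos < now + day;
}
```

    Every value in the sample output from the question fails these checks: 0x10 is not a valid messageType, 0x00 is not a valid orderType, and 1929249811 is nowhere near a current nanosecond timestamp. That points to the bytes being read at the wrong offsets or widths, rather than to one bad field.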

    Please go and ask the application developers/architects for the encoding format for these messages.
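
    Once the encoding is known, decoding field by field with explicit widths and byte order is usually safer than copying bytes straight into the struct. The sketch below is NOT the real wire format. It assumes, purely for illustration, a packed little-endian layout with a 1-byte char, 8-byte timestamp, 8-byte IEEE-754 orderID and 4-byte ints (30 bytes in total). `DecodedOrder`, `decode_order` and `WIRE_ORDER_LEN` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* In-memory form with fixed-width types (hypothetical name). */
typedef struct {
    char    messageType;
    int64_t timestamp;
    double  orderID;
    int32_t token;
    char    orderType;
    int32_t price;
    int32_t quantity;
} DecodedOrder;

/* ASSUMED wire size: 1 + 8 + 8 + 4 + 1 + 4 + 4 bytes, no padding. */
#define WIRE_ORDER_LEN 30

/* Read a little-endian 32-bit value, independent of host byte order. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Read a little-endian 64-bit value, independent of host byte order. */
static uint64_t read_u64_le(const unsigned char *p)
{
    return (uint64_t)read_u32_le(p) | (uint64_t)read_u32_le(p + 4) << 32;
}

/* Decode one order from `buf`. Returns the number of bytes consumed,
 * or 0 if the buffer is too short. */
size_t decode_order(const unsigned char *buf, size_t len, DecodedOrder *out)
{
    const unsigned char *p = buf;
    uint64_t bits;

    assert(buf != NULL && out != NULL);
    assert(sizeof(double) == sizeof bits); /* assumes a 64-bit IEEE-754 double */
    if (len < WIRE_ORDER_LEN)
        return 0;

    out->messageType = (char)*p;                   p += 1;
    out->timestamp   = (int64_t)read_u64_le(p);    p += 8;
    bits             = read_u64_le(p);             p += 8;
    memcpy(&out->orderID, &bits, sizeof bits);     /* reinterpret the bit pattern */
    out->token       = (int32_t)read_u32_le(p);    p += 4;
    out->orderType   = (char)*p;                   p += 1;
    out->price       = (int32_t)read_u32_le(p);    p += 4;
    out->quantity    = (int32_t)read_u32_le(p);    p += 4;

    return (size_t)(p - buf);
}
```

    Reading each field at an explicit offset sidesteps the padding a C compiler typically inserts into the in-memory struct (for example after messageType) and the fact that `long` is 4 bytes on some platforms and 8 on others. If the publisher turns out to use big-endian or different field widths, only the read helpers and offsets need to change.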