OutOfMemoryError: parsing SMF Message (message too big): GC overhead limit exceeded

sateesh Member Posts: 5
edited October 2019 in PubSub+ Event Broker #1

Hi,
We are using Solace JMS to read data from JMS queues, and we constantly see the following error, which brings down the client:

    java.io.IOException: OutOfMemoryError parsing SMF message (message too big): Java heap space
           at com.solacesystems.jcsmp.protocol.smf.SMFWireMessageHandler.readMessage(SMFWireMessageHandler.java:87)
           at com.solacesystems.jcsmp.protocol.nio.impl.SubscriberMessageReader.processRead(SubscriberMessageReader.java:95)
           at com.solacesystems.jcsmp.protocol.nio.impl.SubscriberMessageReader.read(SubscriberMessageReader.java:138)
           at com.solacesystems.jcsmp.protocol.smf.SimpleSmfClient.read(SimpleSmfClient.java:1142)
           at com.solacesystems.jcsmp.protocol.nio.impl.SyncEventDispatcherReactor.processReactorChannels(SyncEventDispatcherReactor.java:206)
           at com.solacesystems.jcsmp.protocol.nio.impl.SyncEventDispatcherReactor.eventLoop(SyncEventDispatcherReactor.java:157)
           at com.solacesystems.jcsmp.protocol.nio.impl.SyncEventDispatcherReactor$SEDReactorThread.run(SyncEventDispatcherReactor.java:338)
           at java.lang.Thread.run(Thread.java:748)

We are running the Java app with JVM args -Xms2048m and -Xmx12288m.
It seems like there is a resource leak somewhere in the code.

Thanks
Sateesh

Answers

  • sateesh Member Posts: 5
    edited October 2019 #2

    To add some context to the above issue: we are running 50 consumers, because a single consumer is unable to keep up with the pace at which data is produced.
    We ran SdkPerf to check the stats, and a single consumer only receives around 60 messages per second, while we need to consume 2,000 messages per second. Hence we run 50 consumers, each with its own JNDI context, JMS Connection, and JMS Session (roughly as in the sketch below).
    If we share the same JMS Connection across the 50 consumers we cannot reach the 2,000 msg/sec rate, which is why we gave each consumer an independent JNDI context, Connection, and Session.
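
    Each consumer is set up roughly like the minimal sketch below. It uses the standard Solace JNDI initial context factory; the broker URL, credentials, and JNDI names (JNDI/CF, JNDI/Q/myqueue) are placeholders for our real values:

        import java.util.Hashtable;

        import javax.jms.Connection;
        import javax.jms.ConnectionFactory;
        import javax.jms.MessageConsumer;
        import javax.jms.Queue;
        import javax.jms.Session;
        import javax.naming.Context;
        import javax.naming.InitialContext;

        public class SolaceQueueConsumer {

            // Builds one fully independent consumer: its own JNDI context,
            // its own JMS Connection, and its own Session.
            public static MessageConsumer start(String url, String user, String pass) throws Exception {
                Hashtable<String, Object> env = new Hashtable<>();
                env.put(Context.INITIAL_CONTEXT_FACTORY,
                        "com.solacesystems.jndi.SolJNDIInitialContextFactory");
                env.put(Context.PROVIDER_URL, url);          // e.g. "tcp://broker-host:55003" (placeholder)
                env.put(Context.SECURITY_PRINCIPAL, user);   // "username@vpn" (placeholder)
                env.put(Context.SECURITY_CREDENTIALS, pass);
                Context ctx = new InitialContext(env);

                ConnectionFactory cf = (ConnectionFactory) ctx.lookup("JNDI/CF");   // placeholder JNDI name
                Connection connection = cf.createConnection();
                Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);

                Queue queue = (Queue) ctx.lookup("JNDI/Q/myqueue");                 // placeholder JNDI name
                MessageConsumer consumer = session.createConsumer(queue);
                connection.start();
                return consumer;    // the caller then calls consumer.receive() on its own thread
            }
        }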

  • Aaron Member, Administrator, Moderator, Employee Posts: 594 admin

    Hi Sateesh. Ok, interesting to hear about your issues. I'm wondering a couple things:
    - How big are your messages? I'm wondering why even SdkPerf can only receive 60 msg/s. Are you bandwidth limited? Or CPU limited?
    - If you run the SdkPerf consumer again, with -cc=2 (2 consumers), does it scale linearly? If not, does running it with "-cc=2 -cpc" help? (Example command just after this list.)
    - On your queue, what is the value of "Max Delivered Unacked Msgs Per Flow"? This is essentially the "prefetch" from the queue that the API can receive
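
    A receive-only SdkPerf run for that second bullet would look roughly like this one-liner. It is only a sketch: the host, credentials, and queue name are placeholders, and the flag spellings are from memory, so double-check them against the SdkPerf help output:

        ./sdkperf_java.sh -cip=broker-host:55555 -cu=username@vpnname -sql=your/queue/name -cc=2 -cpc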

    It sounds like the API is pulling down a bunch of messages from the queue that the application is not processing fast enough, and that is what causes the heap error. If you turn down max-delivered-unacked-msgs-per-flow to something much smaller (e.g. 10) it should help prevent that.

    Let me know if that helps!

  • [Deleted User] Posts: 0 ✭✭

    @sateesh I wanted to check if the above response from Aaron was able to help. If not, please let us know and we can continue to assist!

  • sateesh Member Posts: 5

    Hi Aaron,
    Our messages are at most 2 KB. After increasing the memory significantly (running with 25 GB now) we are able to get the data, and when I ran SdkPerf with -cc=15 or -cc=50 it ran fine.

    How can I configure "max-unacked-per-flow"? Could you please let me know? I hope this is a client-side setting.
    Thanks
    Sateesh

  • Aaron Member, Administrator, Moderator, Employee Posts: 594 admin

    @sateesh it is a configuration on the queue. The default is 10,000. You can configure it either in the WebManager GUI or via the CLI.
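
    On the CLI it would look roughly like this sketch from memory; the VPN name (default) and queue name (your/queue/name) are placeholders, and the exact syntax can differ by broker version, so verify it against the Solace docs:

        enable
        configure
        message-spool message-vpn default
        queue your/queue/name
        max-delivered-unacked-msgs-per-flow 10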

    Still, 25 GB for a single client application sounds pretty crazy! I'm wondering if there is something else going on here. By my quick calcs, that many consumers pulling as many 2 kB messages as they can should only consume about 1 GB of RAM.

    Did the performance with SdkPerf scale linearly with the number of client connections? 60 msg/s @ 2 kB is pretty bad. Are you in the cloud? I'm wondering about the disk performance...?

  • Sateesh Kommineni Member Posts: 3

    Hi Aaron,
    We did some more debugging and it looks like the issue is not with the Solace client. After we receive a message from the Solace JMS queue we put it into an in-memory queue to be processed by downstream processors (one being Kafka), and when the Kafka brokers cannot keep up with the rate, that in-memory queue grows and we eventually run out of memory.
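
    A bounded hand-off buffer between the JMS listener and the Kafka publisher would cap that growth; a minimal sketch (the class, methods, and capacity here are made up for illustration):

        import java.util.concurrent.ArrayBlockingQueue;
        import java.util.concurrent.BlockingQueue;

        import javax.jms.Message;

        public class HandoffBuffer {

            // Bounded queue: once it is full, producers block instead of growing the heap.
            private final BlockingQueue<Message> buffer = new ArrayBlockingQueue<>(10_000);

            // Called from the JMS MessageListener / receive thread.
            public void onSolaceMessage(Message msg) throws InterruptedException {
                // put() blocks while the buffer is full, so the JMS side slows down
                // (backpressure) instead of the in-memory queue growing without bound.
                buffer.put(msg);
            }

            // Called from the Kafka publisher thread(s).
            public Message nextForKafka() throws InterruptedException {
                return buffer.take();   // blocks until a message is available
            }
        }

    With CLIENT_ACKNOWLEDGE and a reasonably small max-delivered-unacked-msgs-per-flow, blocking here should also stop the broker from delivering much more than the configured window while Kafka catches up.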

    We did multiple tests using SdkPerf, and the maximum we could get was about 50 messages per second per consumer, so in order to keep up with the incoming flow rate we have to use 50 consumers.
    Another thing we are noticing is that the consumers are constantly getting disconnected (they won't keep the connection alive):

    3951194 [SwimTool] INFO c.s.j.p.impl.TcpClientChannel - Connection attempt failed to host 'extacywjhtc5178' ConnectException com.solacesystems.jcsmp.JCSMPTransportException: (Client name: /1277/#000d0192 Local port: -1 Remote addr: extacywjhtc5178:55003) - Timeout happened when reading response from the router. cause: java.net.SocketTimeoutException ((Client name: /1277/#000d0192 Local port: -1 Remote addr: extacywjhtc5178:55003) - )
    3954194 [SwimTool] INFO c.s.j.p.impl.TcpClientChannel - Channel Closed (smfclient 804)
    3954195 [SwimTool] INFO c.s.j.p.impl.TcpClientChannel - Channel Closed (smfclient 804)
    3959195 [SwimTool] INFO c.s.j.p.impl.TcpClientChannel - Channel Closed (smfclient 802)
    3959197 [SwimTool] INFO c.s.j.p.impl.TcpClientChannel - Connecting to host 'orig=tcp://extacywjhtc5178, scheme=tcp://, host=extacywjhtc5178' (host 1 of 1, smfclient 806, attempt 1 of 1, this_host_attempt: 1 of 1)
    Thanks
    Sateesh