PubSub Docker image fails to start in an LXC/LXD container

nictas Member Posts: 5

Hi,
I have Docker installed in an LXC/LXD container and I'm trying to start Solace PubSub with the latest image (9.8.0.12). Unfortunately, I keep running into the Unable to raise event; rc(would block) error:

Host Boot ID: cdbf8a80-964d-4368-9174-90d6a004189d
Starting VMR Docker Container: Fri Jan 15 15:59:41 UTC 2021
Setting umask to 077
SolOS Version: soltr_9.8.0.12
2021-01-15T15:59:43.006+00:00 <syslog.info> a99e745b4c44 rsyslogd: [origin software="rsyslogd" swVersion="8.2012.0" x-pid="102" x-info="https://www.rsyslog.com"] start
2021-01-15T15:59:44.013+00:00 <local6.info> a99e745b4c44 appuser[100]: rsyslog startup
2021-01-15T15:59:45.037+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT  INFO: Log redirection enabled, beginning playback of startup log buffer
2021-01-15T15:59:45.056+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT  INFO: /usr/sw/var/soltr_9.8.0.12/db/dbBaseline does not exist, generating from confd template
2021-01-15T15:59:45.099+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT  INFO: repairDatabase.py: no database to process
2021-01-15T15:59:45.116+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT  INFO: Finished playback of log buffer
2021-01-15T15:59:45.136+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT  INFO: Updating dbBaseline with dynamic instance metadata
2021-01-15T15:59:45.392+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT  INFO: Generating SSH key
ssh-keygen: generating new host keys: RSA1 RSA DSA ECDSA ED25519 
2021-01-15T15:59:45.848+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT  INFO: Starting solace process
2021-01-15T15:59:47.604+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT  INFO: Launching solacedaemon: /usr/sw/loads/soltr_9.8.0.12/bin/solacedaemon --vmr -z -f /var/lib/solace/config/SolaceStartup.txt -r -1
2021-01-15T15:59:52.411+00:00 <local0.info> a99e745b4c44 appuser[186]: /usr/sw/loads/soltr_9.8.0.12/scripts/post:69    WARN   Unable to read /sys/fs/cgroup/blkio/blkio.weight
2021-01-15T15:59:52.414+00:00 <local0.info> a99e745b4c44 appuser[186]: /usr/sw/loads/soltr_9.8.0.12/scripts/post:69    WARN   Unable to read /sys/fs/cgroup/blkio/blkio.weight_device
2021-01-15T16:00:00.752+00:00 <local0.warning> a99e745b4c44 appuser[1]: /usr/sw                        main.cpp:754                          (SOLDAEMON    - 0x00000000) main(0)@solacedaemon                          WARN     Determining platform type: [  OK  ]
2021-01-15T16:00:00.966+00:00 <local0.warning> a99e745b4c44 appuser[1]: /usr/sw                        main.cpp:754                          (SOLDAEMON    - 0x00000000) main(0)@solacedaemon                          WARN     Running pre-startup checks: [  OK  ]
Unable to raise event; rc(would block)

The command I'm using to run the Docker image is:

docker run -it --shm-size 1073741824 --expose 8080 --expose 55555 -P -e "username_admin_globalaccesslevel=admin" -e "username_admin_password=admin" --security-opt apparmor:unconfined solace/solace-pubsub-standard:9.8.0.12

I took the --security-opt apparmor:unconfined option from this thread, but it doesn't seem to have any effect in my case.
The error is relatively easy to reproduce:
1. Start on a machine with Ubuntu 20.04 installed.
2. Install LXC/LXD: apt install lxd lxd-client
3. Configure LXC/LXD (I used the default values for everything): lxd init
4. Launch an LXC container with Ubuntu on it: lxc launch ubuntu:20.04 ubuntuone
5. Stop it, so that we can apply the additional configuration that allows Docker to run inside it: lxc stop ubuntuone
6. Run the following commands:

lxc config set ubuntuone security.nesting true
lxc config set ubuntuone security.privileged true
lxc config set ubuntuone raw.lxc "lxc.apparmor.profile=unconfined"
7. Start the LXC container: lxc start ubuntuone
8. Connect to it: lxc exec ubuntuone -- /bin/bash
9. Install Docker by following the instructions here: https://docs.docker.com/engine/install/ubuntu/
10. Try to start Solace with the command I listed above.

Other Docker images like PostgreSQL seem to run just fine in that LXC container, but Solace doesn't for some reason.

Can you please help me debug this issue? I've been at it for a couple of days and I've made barely any progress. I can download any logs that you may want to take a look at. Just let me know what you need.

Comments

  • Martin
    Martin Member Posts: 1
    edited January 2021 #2

    I am also just getting started with Solace and I'm seeing this error in the console on my Windows machine: Unable to raise event; rc(would block). Were you able to fix this?

  • nictas
    nictas Member Posts: 5

    @Martin I'm not sure whether the cause of your problem is the same. I've seen that error message in other Solace threads, but with different underlying causes (this one, for example). Also, I have no problems running Solace on my Windows machine via Docker. It'd be great if someone from Solace could point us to where we can get some logs that help to narrow down the cause.

  • nictas
    nictas Member Posts: 5
    edited January 2021 #4

    I tried several other things that also didn't help:
    1. I disabled apparmor everywhere as shown here: https://help.ubuntu.com/community/AppArmor#Disable_AppArmor_framework
    2. I increased the memory limits of the LXC container:

    lxc config set ubuntuone limits.memory 3GB
    lxc config set ubuntuone limits.memory.enforce soft
    

    I also looked for some logs that would help narrow down the problem. The /var/lib/solace/diags/shutdownMessage and /usr/sw/loads/currentload/.lastRespawnsAction files mentioned in this thread did not exist. The only files with any content in the /var/lib/solace/diags directory were confd.log, messages and system-resources.log. Unfortunately, they don't seem to contain anything useful, but I'm attaching them anyway, along with the output of docker inspect <solace-container-name>. Note that I set the Docker command to /bin/bash: with the default /sbin/boot.sh command, the container is removed as soon as the script fails with Unable to raise event; rc(would block), and I'm unable to retrieve any logs.

    Is there any other place I could look for Solace logs?

  • marc
    marc Member, Administrator, Moderator, Employee Posts: 959 admin

    Hi @nictas,
    I don't know of anyone running our Docker container using LXC/LXD, so I'd be interested in learning what the issue is (note that this deployment scenario is not supported). In the container you should find some logs under /usr/sw/jail/logs/ that can help. I'd start with the system.log, but the event.log or debug.log might also prove useful.

    Hope that helps! Let us know what you find.

  • nictas
    nictas Member Posts: 5

    Big thanks for your response, Marc!
    Of the three files you mentioned, only debug.log exists. The error seems to be:
    Unable to fallocate file /usr/sw/internalSpool/softAdb/.diskTest to size 4096: Errno(95) Operation not supported
    I'll do my best to figure out why that fails tomorrow and I'll update this discussion.
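
    To pull lines like the one quoted above out of debug.log quickly, a small filter along these lines may help. It is only a sketch: the function name find_errno_lines is made up for illustration, and the "Errno(" pattern is an assumption based solely on that single message, so other failures may be logged differently.

    ```shell
    # Print every line of a log file that mentions an errno, with line numbers.
    # find_errno_lines is a hypothetical helper, not part of any Solace tooling.
    find_errno_lines() {
        # -n: prefix each match with its line number
        # -F: treat the pattern as a fixed string, not a regular expression
        grep -n -F 'Errno(' "$1"
    }
    ```

    Inside the container this could be called as: find_errno_lines /usr/sw/jail/logs/debug.log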

  • marc
    marc Member, Administrator, Moderator, Employee Posts: 959 admin

    It sounds like it may be some sort of permissions issue. Something on this docs page might help: https://docs.solace.com/Configuring-and-Managing/SW-Broker-Specific-Config/Docker-Tasks/Config-Arbitrary-User.htm

  • nictas
    nictas Member Posts: 5
    edited January 2021 #8

    It looks like the problem was the file system my LXC container was using: ZFS. As soon as I switched to ext4, Solace started without any issues. Apparently, fallocate has some limitations on ZFS, as indicated by this message in Solace's debug.log:

    Unable to fallocate file /usr/sw/internalSpool/softAdb/.diskTest to size 4096: Errno(95) Operation not supported
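
    To check whether a directory's file system accepts this call before starting the broker, a probe like the following can be run inside the LXC container. This is a sketch under assumptions: probe_fallocate is a made-up name, it needs the fallocate utility from util-linux, and it relies on that utility failing (rather than falling back to writing zeros) when the file system lacks support, which should be its behaviour unless the --posix option is given.

    ```shell
    # Probe whether the file system backing a directory supports fallocate.
    # probe_fallocate is a hypothetical helper name used for illustration.
    probe_fallocate() {
        dir="${1:-.}"
        probe="$dir/.fallocate-probe.$$"
        # Try to preallocate 4096 bytes, the same size as in the debug.log message.
        if fallocate -l 4096 "$probe" 2>/dev/null; then
            echo "fallocate supported in $dir"
            rc=0
        else
            echo "fallocate NOT supported in $dir"
            rc=1
        fi
        # Clean up the probe file whether or not the allocation succeeded.
        rm -f "$probe"
        return "$rc"
    }
    ```

    A reasonable directory to test is the one that holds Docker's data (/var/lib/docker by default), for example: probe_fallocate /var/lib/docker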
    

    For anyone that has the same problem: you can move your LXC container to a storage pool backed by a different file system (LXD's lvm driver formats volumes as ext4 by default) with the following commands:

    lxc storage create lvm lvm # I think you can also use "btrfs", if that's what you prefer.
    lxc move <container-name> <container-name>-tobemoved
    lxc move <container-name>-tobemoved <container-name> --storage lvm
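
    To confirm which file system the container ends up on, the type can be queried from inside it. A minimal sketch, assuming GNU coreutils stat is available; fs_type is a made-up helper name.

    ```shell
    # Print the file system type backing a directory (defaults to the current one).
    # fs_type is a hypothetical helper; stat -f -c %T is GNU coreutils syntax.
    fs_type() {
        stat -f -c %T "${1:-.}"
    }
    ```

    Running fs_type / before the move would be expected to report zfs. Afterwards it typically reports ext2/ext3 for an ext4 volume, since GNU stat does not distinguish the ext variants. On the host, lxc storage list shows the driver behind each storage pool.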
    

    Thanks again for your help, Marc! By the way, it would be awesome if you could include the actual error in the output of /sbin/boot.sh somehow.

  • marc
    marc Member, Administrator, Moderator, Employee Posts: 959 admin

    Glad you figured it out @nictas and thank you very much for sharing the fix for others!
    I'll provide that feedback internally and see if we can make that error easier to find.