PubSub Docker image fails to start in an LXC/LXD container
Hi,
I have Docker installed in an LXC/LXD container and I'm trying to start Solace PubSub with the latest image (9.8.0.12). Unfortunately, I keep running into the "Unable to raise event; rc(would block)" error:
Host Boot ID: cdbf8a80-964d-4368-9174-90d6a004189d
Starting VMR Docker Container: Fri Jan 15 15:59:41 UTC 2021
Setting umask to 077
SolOS Version: soltr_9.8.0.12
2021-01-15T15:59:43.006+00:00 <syslog.info> a99e745b4c44 rsyslogd: [origin software="rsyslogd" swVersion="8.2012.0" x-pid="102" x-info="https://www.rsyslog.com"] start
2021-01-15T15:59:44.013+00:00 <local6.info> a99e745b4c44 appuser[100]: rsyslog startup
2021-01-15T15:59:45.037+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT INFO: Log redirection enabled, beginning playback of startup log buffer
2021-01-15T15:59:45.056+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT INFO: /usr/sw/var/soltr_9.8.0.12/db/dbBaseline does not exist, generating from confd template
2021-01-15T15:59:45.099+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT INFO: repairDatabase.py: no database to process
2021-01-15T15:59:45.116+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT INFO: Finished playback of log buffer
2021-01-15T15:59:45.136+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT INFO: Updating dbBaseline with dynamic instance metadata
2021-01-15T15:59:45.392+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT INFO: Generating SSH key
ssh-keygen: generating new host keys: RSA1 RSA DSA ECDSA ED25519
2021-01-15T15:59:45.848+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT INFO: Starting solace process
2021-01-15T15:59:47.604+00:00 <local0.info> a99e745b4c44 appuser: EXTERN_SCRIPT INFO: Launching solacedaemon: /usr/sw/loads/soltr_9.8.0.12/bin/solacedaemon --vmr -z -f /var/lib/solace/config/SolaceStartup.txt -r -1
2021-01-15T15:59:52.411+00:00 <local0.info> a99e745b4c44 appuser[186]: /usr/sw/loads/soltr_9.8.0.12/scripts/post:69 WARN Unable to read /sys/fs/cgroup/blkio/blkio.weight
2021-01-15T15:59:52.414+00:00 <local0.info> a99e745b4c44 appuser[186]: /usr/sw/loads/soltr_9.8.0.12/scripts/post:69 WARN Unable to read /sys/fs/cgroup/blkio/blkio.weight_device
2021-01-15T16:00:00.752+00:00 <local0.warning> a99e745b4c44 appuser[1]: /usr/sw main.cpp:754 (SOLDAEMON - 0x00000000) main(0)@solacedaemon WARN Determining platform type: [ OK ]
2021-01-15T16:00:00.966+00:00 <local0.warning> a99e745b4c44 appuser[1]: /usr/sw main.cpp:754 (SOLDAEMON - 0x00000000) main(0)@solacedaemon WARN Running pre-startup checks: [ OK ]
Unable to raise event; rc(would block)
The command I'm using to run the Docker image is:
docker run -it --shm-size 1073741824 --expose 8080 --expose 55555 -P -e "username_admin_globalaccesslevel=admin" -e "username_admin_password=admin" --security-opt apparmor:unconfined solace/solace-pubsub-standard:9.8.0.12
I saw the --security-opt apparmor:unconfined part in this thread, but it doesn't seem to do anything in my case.
The error is relatively easy to reproduce:
1. Start on a machine with Ubuntu 20.04 installed.
2. Install LXC/LXD: apt install lxd lxd-client
3. Configure LXC/LXD (I used the default values for everything): lxd init
4. Launch an LXC container with Ubuntu on it: lxc launch ubuntu:20.04 ubuntuone
5. Stop it, so that we can perform some additional configurations that allow Docker to run inside it: lxc stop ubuntuone
6. Run the following commands:
lxc config set ubuntuone security.nesting true
lxc config set ubuntuone security.privileged true
lxc config set ubuntuone raw.lxc "lxc.apparmor.profile=unconfined"
7. Start the LXC container: lxc start ubuntuone
8. Connect to it: lxc exec ubuntuone -- /bin/bash
9. Install Docker by following the instructions here: https://docs.docker.com/engine/install/ubuntu/
10. Try to start Solace with the command I listed above.
Other Docker images like PostgreSQL seem to run just fine in that LXC container, but Solace doesn't for some reason.
Can you please help me with debugging this issue? I've been at it for a couple of days and I've barely had any progress on it. I can download any logs that you may want to take a look at. Just let me know what you need.
Comments
-
@Martin I'm not sure whether the cause of your problem is the same. I've seen that error message in other Solace threads, but with different underlying causes (this one, for example). Also, I have no problems running Solace on my Windows machine via Docker. It'd be great if someone from Solace could point us to where we can get some logs that help to narrow down the cause.
-
I tried several other things that also didn't help:
1. I disabled apparmor everywhere as shown here: https://help.ubuntu.com/community/AppArmor#Disable_AppArmor_framework
2. I increased the memory limits of the LXC container:
lxc config set ubuntuone limits.memory 3GB
lxc config set ubuntuone limits.memory.enforce soft
I also looked for some logs that would help narrow down the problem. The /var/lib/solace/diags/shutdownMessage and /usr/sw/loads/currentload/.lastRespawnsAction files mentioned in this thread did not exist. The only files with any content in the /var/lib/diags directory were confd.log, messages and system-resources.log. Unfortunately, they don't seem to contain anything useful, but I'm attaching them anyway along with the output of docker inspect <solace-container-name>. Note that the Docker command is set to /bin/bash, because if I use the default /sbin/boot.sh command, the container is removed as soon as the script fails with "Unable to raise event; rc(would block)" and I'm unable to retrieve any logs.
Is there any other place I could look for Solace logs?
-
Hi @nictas,
I don't know of anyone running our Docker container using LXC/LXD, so I'd be interested in learning what the issue is (note that this deployment scenario is not supported). In the container you should find some logs under /usr/sw/jail/logs/ that can help. I'd start with the system.log, but the event.log or debug.log might also prove useful.
Hope that helps! Let us know what you find.
-
Big thanks for your response Marc!
Of the three files you mentioned, only debug.log exists. The error seems to be:
Unable to fallocate file /usr/sw/internalSpool/softAdb/.diskTest to size 4096: Errno(95) Operation not supported
I'll do my best to figure out why that fails tomorrow and I'll update this discussion.
-
It sounds like it may be some sort of permissions issue. Something on this docs page might help: https://docs.solace.com/Configuring-and-Managing/SW-Broker-Specific-Config/Docker-Tasks/Config-Arbitrary-User.htm
-
It looks like the problem was the file system I was using in my LXC container - ZFS. As soon as I switched to ext4, Solace started without any issues. Apparently, fallocate has some limitations on ZFS, as indicated by this message in Solace's debug.log:
Unable to fallocate file /usr/sw/internalSpool/softAdb/.diskTest to size 4096: Errno(95) Operation not supported
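If you want to check a directory for this before starting the broker, one option is to call fallocate(2) directly and look at the errno. Below is a minimal Python sketch, assuming 64-bit Linux and a libc that exports fallocate. The function name probe_fallocate and the probe file prefix are made up for illustration and are not part of Solace. It deliberately avoids os.posix_fallocate, because glibc emulates that call when the file system lacks native support, which would probably hide the failure seen here.

```python
import ctypes
import os
import tempfile


def probe_fallocate(directory, size=4096):
    """Return 0 if fallocate(2) succeeds in `directory`, else the errno value.

    Hypothetical helper: it creates a temporary file in `directory`, asks the
    kernel to allocate `size` bytes for it, and always removes the file again.
    """
    # dlopen(NULL): resolve fallocate from the libc already loaded in this process.
    libc = ctypes.CDLL(None, use_errno=True)
    # int fallocate(int fd, int mode, off_t offset, off_t len); off_t assumed 64-bit.
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int

    fd, path = tempfile.mkstemp(prefix=".fallocate-probe-", dir=directory)
    try:
        ctypes.set_errno(0)
        rc = libc.fallocate(fd, 0, 0, size)  # mode 0 = plain allocation
        return 0 if rc == 0 else ctypes.get_errno()
    finally:
        os.close(fd)
        os.unlink(path)
```

A result of 95 (EOPNOTSUPP on Linux) would match the Errno(95) in debug.log. I'd run it against a directory on the same storage as Docker's data inside the LXC container, and pass the result to os.strerror() for a readable message.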
For anyone that has the same problem: You can change your LXC container's file system with the following commands:
lxc storage create lvm lvm # I think you can also use "btrfs", if that's what you prefer.
lxc move <container-name> <container-name>-tobemoved
lxc move <container-name>-tobemoved <container-name> --storage lvm
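To confirm which file system a path actually sits on, before and after the move, you can look it up in /proc/mounts from inside the LXC container. Here is a small Python sketch of that lookup. The function name fs_type_for is made up for illustration, and it assumes the path is absolute with symlinks already resolved. The mounts text is passed in as a string so the lookup can be tried on any sample.

```python
import os


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts. The longest matching mount
    point wins, and a later entry overrides an earlier one for the same point.
    Returns None if nothing matches.
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # /proc/mounts writes spaces in mount points as the escape \040.
        mount_point = fields[1].replace("\\040", " ")
        fs_type = fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best
```

For example, fs_type_for("/var/lib/docker", open("/proc/mounts").read()) should report zfs as the type on a ZFS-backed container and something else (ext4 in my case) after moving it to the new storage pool. Inside the Solace Docker container itself the result will likely reflect Docker's storage driver instead, so the LXC container is the better place to run it.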
Thanks again for your help Marc! By the way, it would be awesome if you could include the actual error in the output of /sbin/boot.sh somehow.