[pkg-lxc-devel] Bug#978065: lxc: After upgrade lxc to 4.0.5-1, cannot start with lxc.cap.drop sys_admin

Pierre-Elliott Bécue peb at debian.org
Fri Jun 11 20:59:51 BST 2021


On Tuesday, 26 January 2021 at 00:59:52+0100, Andras Korn wrote:
> Hi,
> 
> I hit the same issue.
> 
> I upgraded from 1:4.0.4-6 to 1:4.0.5-2, and from kernel 5.9.0-4-amd64 to 5.10.0-2-amd64, and some of my containers that used to work before don't work anymore. The ones that still work don't drop sys_admin.
> 
> stracing lxc-start I see this:
> 
> openat2(33</usr/lib/x86_64-linux-gnu/lxc/rootfs>, "/sys/fs/cgroup", {flags=O_RDONLY|O_CLOEXEC|O_PATH, resolve=RESOLVE_NO_XDEV|RESOLVE_NO_MAGICLINKS|RESOLVE_NO_SYMLINKS|RESOLVE_BENEATH}, 24) = -1 EXDEV (Invalid cross-device link)
> 
> The corresponding message from lxc-start with loglevel debug is:
> 
> lxc-start unifiadmin 20210125231743.129 ERROR    conf - conf.c:lxc_mount_auto_mounts:727 - Invalid cross-device link - Failed to mount "/sys/fs/cgroup"
> 
> Some context from lxc-start log output:
> 
> lxc-start unifiadmin 20210125231742.854 INFO     start - start.c:lxc_init:837 - Container "unifiadmin" is initialized
> lxc-start unifiadmin 20210125231742.876 WARN     cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.monitor.unifiadmin"
> lxc-start unifiadmin 20210125231742.886 INFO     cgfsng - cgroups/cgfsng.c:cgfsng_monitor_create:1368 - The monitor process uses "lxc.monitor.unifiadmin" as cgroup
> lxc-start unifiadmin 20210125231742.904 WARN     cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.payload.unifiadmin"
> lxc-start unifiadmin 20210125231742.916 INFO     cgfsng - cgroups/cgfsng.c:cgfsng_payload_create:1471 - The container process uses "lxc.payload.unifiadmin" as cgroup
> lxc-start unifiadmin 20210125231742.944 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWNS
> lxc-start unifiadmin 20210125231742.944 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWPID
> lxc-start unifiadmin 20210125231742.945 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWUTS
> lxc-start unifiadmin 20210125231742.945 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWIPC
> lxc-start unifiadmin 20210125231742.945 INFO     start - start.c:lxc_spawn:1700 - Cloned CLONE_NEWNET
> lxc-start unifiadmin 20210125231742.945 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved mnt namespace via fd 31
> lxc-start unifiadmin 20210125231742.945 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved pid namespace via fd 32
> lxc-start unifiadmin 20210125231742.946 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved uts namespace via fd 33
> lxc-start unifiadmin 20210125231742.946 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved ipc namespace via fd 34
> lxc-start unifiadmin 20210125231742.946 DEBUG    start - start.c:lxc_try_preserve_namespaces:166 - Preserved net namespace via fd 35
> lxc-start unifiadmin 20210125231742.949 INFO     cgfsng - cgroups/cgfsng.c:cgfsng_setup_limits_legacy:2881 - Limits for the legacy cgroup hierarchies have been setup
> lxc-start unifiadmin 20210125231742.955 WARN     cgfsng - cgroups/cgfsng.c:cgfsng_setup_limits:2942 - Invalid argument - Ignoring cgroup2 limits on legacy cgroup system
> lxc-start unifiadmin 20210125231743.315 INFO     network - network.c:instantiate_veth:285 - Retrieved mtu 1500 from intra
> lxc-start unifiadmin 20210125231743.666 INFO     network - network.c:instantiate_veth:333 - Attached "veth-unifi" to bridge "intra"
> lxc-start unifiadmin 20210125231743.687 DEBUG    network - network.c:instantiate_veth:449 - Instantiated veth tunnel "veth-unifi <--> vethv7jzuF"
> lxc-start unifiadmin 20210125231743.699 WARN     start - start.c:do_start:1166 - Using /dev/null from the host for container init's standard file descriptors. Migration will not work
> lxc-start unifiadmin 20210125231743.704 INFO     start - start.c:do_start:1198 - Unshared CLONE_NEWCGROUP
> lxc-start unifiadmin 20210125231743.731 DEBUG    storage - storage/storage.c:get_storage_by_name:211 - Detected rootfs type "dir"
> lxc-start unifiadmin 20210125231743.734 DEBUG    conf - conf.c:lxc_mount_rootfs:1259 - Mounted rootfs "/var/lib/lxc/unifiadmin/rootfs" onto "/usr/lib/x86_64-linux-gnu/lxc/rootfs" with options "(null)"
> lxc-start unifiadmin 20210125231743.738 INFO     conf - conf.c:setup_utsname:751 - Set hostname to "unifiadmin"
> lxc-start unifiadmin 20210125231743.740 DEBUG    network - network.c:lxc_network_setup_in_child_namespaces_common:3510 - Network device "" has been setup
> lxc-start unifiadmin 20210125231743.977 DEBUG    network - network.c:setup_hw_addr:3360 - Mac address "00:16:3e:11:22:33" on "eth0" has been setup
> lxc-start unifiadmin 20210125231743.103 DEBUG    network - network.c:lxc_network_setup_in_child_namespaces_common:3510 - Network device "eth0" has been setup
> lxc-start unifiadmin 20210125231743.103 INFO     network - network.c:lxc_setup_network_in_child_namespaces:3532 - Network has been setup
> lxc-start unifiadmin 20210125231743.116 DEBUG    conf - conf.c:mount_entry:1943 - Remounting "/shared/cache/apt/lists" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/var/lib/apt/lists" to respect bind or remount options
> lxc-start unifiadmin 20210125231743.116 DEBUG    conf - conf.c:mount_entry:1962 - Flags for "/shared/cache/apt/lists" were 1038, required extra flags are 14
> lxc-start unifiadmin 20210125231743.117 DEBUG    conf - conf.c:mount_entry:1971 - Mountflags already were 5134, skipping remount
> lxc-start unifiadmin 20210125231743.117 DEBUG    conf - conf.c:mount_entry:2006 - Mounted "/shared/cache/apt/lists" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/var/lib/apt/lists" with filesystem type "none"
> lxc-start unifiadmin 20210125231743.118 DEBUG    conf - conf.c:mount_entry:1943 - Remounting "/shared/cache/apt/archives" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/var/cache/apt/archives" to respect bind or remount options
> lxc-start unifiadmin 20210125231743.118 DEBUG    conf - conf.c:mount_entry:1962 - Flags for "/shared/cache/apt/archives" were 1038, required extra flags are 14
> lxc-start unifiadmin 20210125231743.118 DEBUG    conf - conf.c:mount_entry:1971 - Mountflags already were 5134, skipping remount
> lxc-start unifiadmin 20210125231743.118 DEBUG    conf - conf.c:mount_entry:2006 - Mounted "/shared/cache/apt/archives" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/var/cache/apt/archives" with filesystem type "none"
> lxc-start unifiadmin 20210125231743.119 DEBUG    conf - conf.c:mount_entry:1943 - Remounting "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/null" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/kcore" to respect bind or remount options
> lxc-start unifiadmin 20210125231743.119 DEBUG    conf - conf.c:mount_entry:1962 - Flags for "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/null" were 1024, required extra flags are 0
> lxc-start unifiadmin 20210125231743.120 DEBUG    conf - conf.c:mount_entry:1971 - Mountflags already were 4096, skipping remount
> lxc-start unifiadmin 20210125231743.120 DEBUG    conf - conf.c:mount_entry:2006 - Mounted "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/null" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/kcore" with filesystem type "none"
> lxc-start unifiadmin 20210125231743.123 DEBUG    conf - conf.c:mount_entry:1943 - Remounting "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" to respect bind or remount options
> lxc-start unifiadmin 20210125231743.123 DEBUG    conf - conf.c:mount_entry:1962 - Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
> lxc-start unifiadmin 20210125231743.123 DEBUG    conf - conf.c:mount_entry:2006 - Mounted "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" with filesystem type "none"
> lxc-start unifiadmin 20210125231743.125 DEBUG    conf - conf.c:mount_entry:1943 - Remounting "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" to respect bind or remount options
> lxc-start unifiadmin 20210125231743.125 DEBUG    conf - conf.c:mount_entry:1962 - Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
> lxc-start unifiadmin 20210125231743.125 DEBUG    conf - conf.c:mount_entry:2006 - Mounted "/sys/fs/fuse/connections" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" with filesystem type "none"
> lxc-start unifiadmin 20210125231743.127 DEBUG    conf - conf.c:mount_entry:2006 - Mounted "run" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/run" with filesystem type "tmpfs"
> lxc-start unifiadmin 20210125231743.128 DEBUG    conf - conf.c:mount_entry:2006 - Mounted "none" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/shm" with filesystem type "tmpfs"
> lxc-start unifiadmin 20210125231743.129 ERROR    conf - conf.c:lxc_mount_auto_mounts:727 - Invalid cross-device link - Failed to mount "/sys/fs/cgroup"
> lxc-start unifiadmin 20210125231743.130 ERROR    conf - conf.c:lxc_setup:3365 - Failed to setup remaining automatic mounts
> lxc-start unifiadmin 20210125231743.130 ERROR    start - start.c:do_start:1218 - Failed to setup container "unifiadmin"
> lxc-start unifiadmin 20210125231743.131 ERROR    sync - sync.c:__sync_wait:36 - An error occurred in another process (expected sequence number 5)
> lxc-start unifiadmin 20210125231743.132 DEBUG    network - network.c:lxc_delete_network:3665 - Deleted network devices
> lxc-start unifiadmin 20210125231743.133 ERROR    lxccontainer - lxccontainer.c:wait_on_daemonized_start:859 - Received container state "ABORTING" instead of "RUNNING"
> lxc-start unifiadmin 20210125231743.134 ERROR    lxc_start - tools/lxc_start.c:main:308 - The container failed to start
> lxc-start unifiadmin 20210125231743.135 ERROR    lxc_start - tools/lxc_start.c:main:311 - To get more details, run the container in foreground mode
> lxc-start unifiadmin 20210125231743.135 ERROR    start - start.c:__lxc_start:1999 - Failed to spawn container "unifiadmin"
> lxc-start unifiadmin 20210125231743.136 ERROR    lxc_start - tools/lxc_start.c:main:313 - Additional information can be obtained by setting the --logfile and --logpriority options
> lxc-start unifiadmin 20210125231743.136 WARN     start - start.c:lxc_abort:1012 - No such process - Failed to send SIGKILL via pidfd 30 for process 15227
> lxc-start unifiadmin 20210125231743.748 INFO     conf - conf.c:run_script_argv:342 - Executing script "/usr/share/lxcfs/lxc.reboot.hook" for container "unifiadmin"
> lxc-start unifiadmin 20210125231744.288 INFO     conf - conf.c:run_script_argv:342 - Executing script "/usr/share/lxcfs/lxc.reboot.hook" for container "unifiadmin"
> 
> If I don't drop the sys_admin capability, it works again.
> 
> Before the upgrade, it also worked if I dropped sys_admin.
> 
> The configfile for this guest is:
> 
> ----- 8< -----
> lxc.include = /usr/share/lxc/config/common.conf
> 
> lxc.apparmor.profile = generated
> lxc.apparmor.allow_nesting = 0
> lxc.hook.version = 1
> 
> lxc.mount.entry = run  run     tmpfs rw,nodev,relatime,mode=755,size=20m,create=dir 0 0
> lxc.mount.entry = none dev/shm tmpfs rw,nosuid,nodev,mode=1777,size=100m,create=dir 0 0
> lxc.cap.drop = sys_resource audit_write block_suspend linux_immutable mac_admin mac_override sys_admin sys_module sys_pacct sys_rawio sys_resource sys_time sys_tty_config syslog
> lxc.start.auto = 1
> lxc.cgroup.devices.deny = a
> lxc.cgroup.devices.allow = c 1:3 rwm
> lxc.cgroup.devices.allow = c 1:5 rwm
> lxc.cgroup.devices.allow = c 1:7 rwm
> lxc.cgroup.devices.allow = c 1:8 rwm
> lxc.cgroup.devices.allow = c 1:9 rwm
> lxc.cgroup.devices.allow = c 5:0 rwm
> lxc.cgroup.devices.allow = c 5:1 rwm
> lxc.cgroup.devices.allow = c 5:2 rwm
> lxc.cgroup.devices.allow = c 136:* rwm
> lxc.cgroup.devices.allow = c 10:229 rwm
> lxc.cgroup.devices.allow = c 254:0 rm
> lxc.cgroup.devices.allow = c 10:200 rwm
> lxc.cgroup.devices.allow = c 10:228 rwm
> lxc.cgroup.devices.allow = c 10:232 rwm
> 
> lxc.autodev = 0
> lxc.tty.dir = 
> lxc.tty.max = 0
> 
> # Container specific configuration
> lxc.rootfs.path = dir:/var/lib/lxc/unifiadmin/rootfs
> lxc.uts.name = unifiadmin
> lxc.arch = amd64
> 
> # Network configuration
> lxc.net.0.type = empty
> lxc.net.1.type = veth
> lxc.net.1.link = intra
> lxc.net.1.flags = up
> lxc.net.1.name = eth0
> lxc.net.1.veth.pair = veth-unifi
> lxc.net.1.hwaddr = 00:16:3e:11:22:33
> lxc.mount.fstab = /var/lib/lxc/unifiadmin/fstab
> ----- >8 -----
> 
> The fstab doesn't reference cgroup or /sys.
> 
> I googled around and found this post from 2012: https://lists.linuxfoundation.org/pipermail/containers/2012-November/030827.html -- based on this, maybe the problem is that cap_sys_admin is dropped too early now?
> 
> Also, https://github.com/lxc/lxc/issues/1737 looks related. https://blog.iwakd.de/lxc-cap_sys_admin-jessie also suggests that running containers without cap_sys_admin used to be possible.
> 
> Or maybe I should be using cgroup2?
> 
> FWIW, both host and guest use runit, so systemd is not involved; runit doesn't interfere with cgroups or capabilities on its own in any way.

As far as I know, running a container without cap_sys_admin is not
possible if the init in the container is systemd.

I can't see how this can be dealt with unless systemd's developers make
it less demanding in terms of capabilities.
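
Since the report says the containers that still start are the ones that
do not drop sys_admin, a quick way to tell affected guests apart is to
check each config for that capability. The following is only a minimal
sketch under stated assumptions: the helper names dropped_capabilities
and drops_sys_admin are hypothetical, the parser reads a single config
text and does not follow lxc.include, and it relies on the behaviour
documented in lxc.container.conf(5) that an lxc.cap.drop line with no
value clears the capabilities listed so far.

```python
def dropped_capabilities(config_text):
    """Return the capability names listed on lxc.cap.drop lines, in order.

    Hypothetical helper: it parses one config text only and does not
    follow lxc.include, so capabilities dropped by included files
    (such as common.conf) are not reported.
    """
    dropped = []
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or key.strip() != "lxc.cap.drop":
            continue
        names = value.split()
        if not names:
            # Per lxc.container.conf(5), an empty value clears the
            # drop list accumulated up to this point.
            dropped.clear()
            continue
        for name in names:
            # Keep each capability once, even if listed twice.
            if name not in dropped:
                dropped.append(name)
    return dropped


def drops_sys_admin(config_text):
    """True if the config text drops the sys_admin capability."""
    return "sys_admin" in dropped_capabilities(config_text)


# Lines taken from the config quoted in the report.
EXAMPLE = """\
lxc.include = /usr/share/lxc/config/common.conf
lxc.cap.drop = sys_resource audit_write block_suspend linux_immutable mac_admin mac_override sys_admin sys_module sys_pacct sys_rawio sys_resource sys_time sys_tty_config syslog
lxc.start.auto = 1
"""

if __name__ == "__main__":
    print(drops_sys_admin(EXAMPLE))
```

Run against each /var/lib/lxc/<name>/config, this only flags which
guests drop sys_admin; it says nothing about why the cgroup mount fails.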

-- 
Pierre-Elliott Bécue
GPG: 9AE0 4D98 6400 E3B6 7528  F493 0D44 2664 1949 74E2
It's far easier to fight for principles than to live up to them.