[pkg-lxc-devel] Bug#824519: lxc: After CPU hotplug, cores remain idle under LXC

Davide baldiniebaldini at gmail.com
Tue May 17 02:20:09 UTC 2016


Package: lxc
Version: 1.1.4
Severity: normal

-- Description:

After I take cores offline and bring them back online by toggling the "/sys/devices/system/cpu/cpu*/online" flags, without physically removing the CPUs, all affected cores remain unusable by processes running inside an LXC environment. Only the untouched cores stay usable by LXC processes. Once set back online, the affected cores become usable again only from the non-virtualized host.

I determine that the cores are unusable by LXC processes by observing in 'atop' that these processes' threads run only on the untouched cores, while the affected cores stay idle even though the number of runnable threads exceeds the number of cores. From a fresh hardware reboot until these hotplug flags are toggled, all cores are usable by LXC processes.
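
For anyone without 'atop', a rough equivalent of this check is to tally threads by the CPU they last ran on. This is only a sketch: it assumes the procps 'ps' with its "psr" column, and the helper name count_threads_per_cpu is made up for illustration.

```shell
# Read one CPU number per line on stdin (for example the "psr" column of ps)
# and print "<cpu> <thread count>" pairs, sorted by CPU number.
count_threads_per_cpu() {
    awk '{ n[$1]++ } END { for (c in n) print c, n[c] }' | sort -n
}
```

Usage on the host: ps -eLo psr= | count_threads_per_cpu. This counts every thread on the host, including per-CPU kernel threads, so the container's threads would still need to be filtered out by PID.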

-- How to reproduce:

    for c in /sys/devices/system/cpu/cpu*/online; do
        echo 0 > "$c"    # take the core offline
        echo 1 > "$c"    # bring it back online
    done

Then run multiple processes or multiple threads inside an LXC environment.
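
One thing worth checking after this step, although the report does not confirm it, is whether the container's cpuset still lists the cores that were toggled. The sketch below compares two kernel CPU lists. The function names are hypothetical, and the cgroup path in the usage line is taken from the LXC log in this report, assuming a cgroup v1 cpuset mount.

```shell
# Expand a kernel CPU list such as "0,4-7" into one CPU number per line.
expand_cpulist() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        [ -n "$lo" ] || continue
        seq "$lo" "${hi:-$lo}"
    done
}

# Print the CPUs that appear in the first list but not in the second.
missing_cpus() {
    for cpu in $(expand_cpulist "$1"); do
        expand_cpulist "$2" | grep -qx "$cpu" || echo "$cpu"
    done
}
```

Usage: missing_cpus "$(cat /sys/devices/system/cpu/online)" "$(cat /sys/fs/cgroup/cpuset/lxc/ve-106/cpuset.cpus)". Any CPU printed is online on the host but absent from the container's cpuset.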

-- Hardware:

Supermicro X7DBR-3, socket 771
2 x Intel Xeon L5410, 4 cores per chip

-- Kernel:

4.5.4 and 4.1.1 are affected.
CONFIG_BOOTPARAM_HOTPLUG_CPU0 = false.
KVM module is loaded.
LXC is in use, with cgroups.

-- Dmesg:

After toggling cores 1 to 7 off and then on, dmesg shows:
    [ 2214.621148] smpboot: CPU 1 is now offline
    [ 2214.711114] smpboot: CPU 2 is now offline
    [ 2214.801249] smpboot: CPU 3 is now offline
    [ 2214.901186] smpboot: CPU 4 is now offline
    [ 2215.010119] Broke affinity for irq 25
    [ 2215.011273] smpboot: CPU 5 is now offline
    [ 2215.110125] Broke affinity for irq 25
    [ 2215.111162] smpboot: CPU 6 is now offline
    [ 2215.220102] Broke affinity for irq 15
    [ 2215.220123] Broke affinity for irq 25
    [ 2215.221157] smpboot: CPU 7 is now offline
    [ 2231.230726] x86: Booting SMP configuration:
    [ 2231.230731] smpboot: Booting Node 0 Processor 1 APIC 0x4
    [ 2231.340393] smpboot: Booting Node 0 Processor 2 APIC 0x1
    [ 2231.430533] smpboot: Booting Node 0 Processor 3 APIC 0x5
    [ 2231.530515] smpboot: Booting Node 0 Processor 4 APIC 0x2
    [ 2231.640622] smpboot: Booting Node 0 Processor 5 APIC 0x6
    [ 2231.730666] smpboot: Booting Node 0 Processor 6 APIC 0x3
    [ 2231.860766] smpboot: Booting Node 0 Processor 7 APIC 0x7
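
The pairing of offline and boot messages above can be checked mechanically. The following sketch, with a hypothetical helper name, reads dmesg text and prints any CPU that was logged offline and not booted again afterwards. It assumes the smpboot message wording shown in this report.

```shell
# Read dmesg output on stdin and print CPUs whose last logged event
# was "is now offline" rather than a subsequent "Booting ... Processor N".
cpus_still_offline() {
    awk '
        /smpboot: CPU [0-9]+ is now offline/ {
            for (i = 1; i < NF; i++) if ($i == "CPU") off[$(i + 1)] = 1
        }
        /smpboot: Booting Node [0-9]+ Processor [0-9]+/ {
            for (i = 1; i < NF; i++) if ($i == "Processor") delete off[$(i + 1)]
        }
        END { for (c in off) print c }
    ' | sort -n
}
```

Usage: dmesg | cpus_still_offline. Empty output means every core the kernel logged as offline was also logged as booted again, which would point at something above the kernel's hotplug path.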

After a hardware reboot and further testing, I noticed that hotplugging the CPUs and then issuing command (1) causes dmesg to report a segfault in "irqbalance":
    [178598.011156] smpboot: CPU 7 is now offline
    [178600.301137] smpboot: CPU 6 is now offline
    [178601.911157] smpboot: CPU 5 is now offline
    [178603.540116] Broke affinity for irq 25
    [178603.541173] smpboot: CPU 4 is now offline
    [178622.960604] smpboot: Booting Node 0 Processor 4 APIC 0x2
    [178624.620572] smpboot: Booting Node 0 Processor 5 APIC 0x6
    [178626.030665] smpboot: Booting Node 0 Processor 6 APIC 0x3
    [178627.701053] smpboot: Booting Node 0 Processor 7 APIC 0x7
    [178628.485603] irqbalance[28503]: segfault at 30203034 ip 000000000804d81f sp 00000000ff9bb2c0 error 4 in irqbalance[8048000+a000]

Command (1) is:
    # lxc-cgroup -P /media/raid1/ -n ve-106 cpuset.cpus '0,7'
    lxc-cgroup: lxc_cgroup.c: main: 103 failed to assign '0,7' value to 'cpuset.cpus' for 've-106'
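
A write like this is typically rejected by the legacy cpuset controller when the parent cgroup does not itself contain the requested CPUs, so it may help to print cpuset.cpus at each level of the hierarchy. A minimal sketch, assuming a cgroup v1 layout; show_cpuset_chain is a hypothetical helper, not part of LXC:

```shell
# Print cpuset.cpus for a cgroup and for each of its ancestors.
# $1: cpuset mount root, $2: cgroup path relative to that root.
show_cpuset_chain() {
    root=$1
    rel=$2
    while :; do
        file="$root${rel:+/$rel}/cpuset.cpus"
        # Skip levels where the file is missing or unreadable.
        [ -r "$file" ] && printf '%s: %s\n' "${rel:-/}" "$(cat "$file")"
        [ -n "$rel" ] || break
        case $rel in
            */*) rel=${rel%/*} ;;   # step up one directory
            *)   rel= ;;            # next iteration is the mount root
        esac
    done
}
```

Usage: show_cpuset_chain /sys/fs/cgroup/cpuset lxc/ve-106. If the "lxc" level lists fewer CPUs than the root, that level would need to be widened before '0,7' can be assigned to the container.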

-- LXC log files:

The bottom of /media/raid1/ve-106/ve-106.log, starting about 2 days before the tests were performed:
    lxc-start 1463263845.605 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/perf_event/lxc/ve-106
    lxc-start 1463263845.634 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/net_cls/lxc/ve-106
    lxc-start 1463263845.634 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/freezer/lxc/ve-106
    lxc-start 1463263845.634 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/devices/lxc/ve-106
    lxc-start 1463263845.634 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/memory/lxc/ve-106
    lxc-start 1463263845.634 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/blkio/lxc/ve-106
    lxc-start 1463263845.634 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpuacct/lxc/ve-106
    lxc-start 1463263845.634 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpu/lxc/ve-106
    lxc-start 1463263845.634 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpuset/lxc/ve-106
    lxc-start 1463263931.037 ERROR    lxc_utils - utils.c:open_without_symlink:1575 - No such file or directory - Error examining sysrq-trigger in /usr/local/lib/lxc/rootfs/proc/sysrq-trigger
    lxc-start 1463263931.085 ERROR    lxc_utils - utils.c:open_without_symlink:1575 - No such file or directory - Error examining sysrq-trigger in /usr/local/lib/lxc/rootfs/proc/sysrq-trigger
    lxc-start 1463265074.414 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/perf_event/lxc/ve-106
    lxc-start 1463265074.442 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/net_cls/lxc/ve-106
    lxc-start 1463265074.442 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/freezer/lxc/ve-106
    lxc-start 1463265074.442 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/devices/lxc/ve-106
    lxc-start 1463265074.442 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/memory/lxc/ve-106
    lxc-start 1463265074.442 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/blkio/lxc/ve-106
    lxc-start 1463265074.442 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpuacct/lxc/ve-106
    lxc-start 1463265074.442 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpu/lxc/ve-106
    lxc-start 1463265074.442 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpuset/lxc/ve-106
    lxc-start 1463265157.980 ERROR    lxc_utils - utils.c:open_without_symlink:1575 - No such file or directory - Error examining sysrq-trigger in /usr/local/lib/lxc/rootfs/proc/sysrq-trigger
    lxc-start 1463265158.072 ERROR    lxc_utils - utils.c:open_without_symlink:1575 - No such file or directory - Error examining sysrq-trigger in /usr/local/lib/lxc/rootfs/proc/sysrq-trigger
    lxc-start 1463267729.785 ERROR    lxc_utils - utils.c:open_without_symlink:1575 - No such file or directory - Error examining sysrq-trigger in /usr/local/lib/lxc/rootfs/proc/sysrq-trigger
    lxc-start 1463267729.826 ERROR    lxc_utils - utils.c:open_without_symlink:1575 - No such file or directory - Error examining sysrq-trigger in /usr/local/lib/lxc/rootfs/proc/sysrq-trigger
    lxc-start 1463268765.226 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/perf_event/lxc/ve-106
    lxc-start 1463268765.246 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/net_cls/lxc/ve-106
    lxc-start 1463268765.246 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/freezer/lxc/ve-106
    lxc-start 1463268765.246 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/devices/lxc/ve-106
    lxc-start 1463268765.246 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/memory/lxc/ve-106
    lxc-start 1463268765.246 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/blkio/lxc/ve-106
    lxc-start 1463268765.246 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpuacct/lxc/ve-106
    lxc-start 1463268765.246 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpu/lxc/ve-106
    lxc-start 1463268765.246 ERROR    lxc_cgfs - cgfs.c:cgroup_rmdir:166 - cgroup_rmdir: failed to open /sys/fs/cgroup/cpuset/lxc/ve-106
    lxc-start 1463268850.613 ERROR    lxc_utils - utils.c:open_without_symlink:1575 - No such file or directory - Error examining sysrq-trigger in /usr/local/lib/lxc/rootfs/proc/sysrq-trigger
    lxc-start 1463268850.633 ERROR    lxc_utils - utils.c:open_without_symlink:1575 - No such file or directory - Error examining sysrq-trigger in /usr/local/lib/lxc/rootfs/proc/sysrq-trigger

-- System Information:
Debian Release: 8.2
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.5.4 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=ANSI_X3.4-1968) (ignored: LC_ALL set to C)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)


