[pkg-go] Bug#956555: runc init hangs on reopening exec.fifo

Jan Nordholz j.nordholz at tu-berlin.de
Sun Apr 12 21:50:35 BST 2020


Package: runc
Version: 1.0.0~rc10+dfsg1-1
Severity: normal

Hi,

I'm unable to run even the simplest Docker container on my system.
Even 'docker build' or 'docker run -it' on an image built from
"FROM busybox:latest" leaves 'runc init' stuck. This is what I get...

=====
jan@p53:~/tmp$ docker run -it my:bb
docker: Error response from daemon: no status provided on response: unknown.
ERRO[0001] error waiting for container: context canceled 
=====

... and what I found looking around:

=====
root     27066 23255  0 15:46 ?        Sl     0:00      \_ docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/5a53a9fbfd52215143a737ef16f398f0d1debe8b0e955bbff9c12f2f9b78da33 -address /var/run/docker/containerd/containerd.sock -containerd-binary /usr/
root     27082 27066  4 15:46 pts/0    Ssl+   0:00          \_ runc init
=====
root@p53:/tmp/runc-1.0.0~rc10+dfsg1# docker container ls -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
5a53a9fbfd52        my:bb               "sh"                     12 seconds ago      Created                                 amazing_villani
=====
root@p53:/tmp/runc-1.0.0~rc10+dfsg1# strace -p27082
strace: Process 27082 attached
openat(AT_FDCWD, "/proc/self/fd/6", O_WRONLY|O_CLOEXEC^Cstrace: Process 27082 detached
 <detached ...>
=====
root@p53:/tmp/runc-1.0.0~rc10+dfsg1# lsof -p27082
COMMAND     PID USER   FD      TYPE DEVICE SIZE/OFF     NODE NAME
runc:[2:I 27082 root  cwd       DIR  253,4     4096   131073 /
runc:[2:I 27082 root  rtd       DIR  253,4     4096   131073 /
runc:[2:I 27082 root  txt       REG  253,0  8940480 27669163 /
runc:[2:I 27082 root  mem       REG  253,0  1831600 27670317 /usr/lib/x86_64-linux-gnu/libc-2.30.so
runc:[2:I 27082 root  mem       REG  253,0   329960 27665569 /usr/lib/x86_64-linux-gnu/libseccomp.so.2.4.3
runc:[2:I 27082 root  mem       REG  253,0   146912 27670379 /usr/lib/x86_64-linux-gnu/libpthread-2.30.so
runc:[2:I 27082 root  mem       REG  253,0   169720 27669833 /usr/lib/x86_64-linux-gnu/ld-2.30.so
runc:[2:I 27082 root    0u      CHR  136,0      0t0        3 /dev/pts/0
runc:[2:I 27082 root    1u      CHR  136,0      0t0        3 /dev/pts/0
runc:[2:I 27082 root    2u      CHR  136,0      0t0        3 /dev/pts/0
runc:[2:I 27082 root    5w     FIFO   0,11      0t0   371326 pipe
runc:[2:I 27082 root    6u     FIFO   0,19      0t0   371323 /run/docker/runtime-runc/moby/5a53a9fbfd52215143a737ef16f398f0d1debe8b0e955bbff9c12f2f9b78da33/exec.fifo
runc:[2:I 27082 root    8u  a_inode   0,12        0     1039 [eventpoll]
runc:[2:I 27082 root    9u      CHR  136,0      0t0        3 /dev/pts/0
=====
root@p53:/tmp/runc-1.0.0~rc10+dfsg1# lsof /run/docker/runtime-runc/moby/5a53a9fbfd52215143a737ef16f398f0d1debe8b0e955bbff9c12f2f9b78da33/exec.fifo
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
runc:[2:I 27082 root    6u  FIFO   0,19      0t0 371323 /run/docker/runtime-runc/moby/5a53a9fbfd52215143a737ef16f398f0d1debe8b0e955bbff9c12f2f9b78da33/exec.fifo
=====

Manually running 'cat .../exec.fifo' yields a single '0' and lets the
'runc init' process go on to exec the desired shell. However, I cannot
attach to the container, as docker insists it's not running.
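
For illustration, the blocking handshake observed here can be reproduced
outside of runc. The following is a minimal sketch, not runc's actual
code: the function names and the temporary FIFO path are made up for the
example, and the writer simply mimics the single '0' seen above. The
writer's open(O_WRONLY) blocks until some process opens the FIFO for
reading, which is the role 'cat' played in the manual test.

```python
import os
import tempfile
import threading


def writer(path):
    # Blocks in open() until a reader opens the FIFO, like the stuck
    # openat(..., O_WRONLY|O_CLOEXEC) in the strace output.
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, b"0")
    finally:
        os.close(fd)


def handshake():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "exec.fifo")  # hypothetical path
        os.mkfifo(path)
        t = threading.Thread(target=writer, args=(path,))
        t.start()
        # Opening for reading plays the role of 'cat exec.fifo' and
        # releases the blocked writer.
        with open(path, "rb") as f:
            data = f.read()
        t.join()
        return data


if __name__ == "__main__":
    print(handshake())
```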

Some googling showed that there have been races in opening that fifo
in the past [1]. I don't understand how this call can still be blocked:
no matter which open came first (the O_RDWR one that resulted in fd 6
or the blocked O_WRONLY one), once a reader is present the O_WRONLY
open should complete.
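
The expectation stated above can be checked in isolation. This is a
minimal sketch, assuming Linux (where opening a FIFO with O_RDWR
succeeds without blocking; POSIX leaves that case undefined). The helper
names and the temporary path are hypothetical. It uses O_NONBLOCK so the
probe never hangs: a write-only open of a FIFO with no reader fails with
ENXIO, and succeeds once a descriptor with read access exists. It says
nothing about how runc itself obtained fd 6.

```python
import errno
import os
import tempfile


def writer_open_would_complete(path):
    """Return True if a write-only open of the FIFO would not block."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno == errno.ENXIO:  # no reader present
            return False
        raise
    os.close(fd)
    return True


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "exec.fifo")  # hypothetical path
        os.mkfifo(path)
        before = writer_open_would_complete(path)
        # On Linux an O_RDWR open counts as a reader and does not block.
        rdwr = os.open(path, os.O_RDWR)
        try:
            after = writer_open_would_complete(path)
        finally:
            os.close(rdwr)
        return before, after


if __name__ == "__main__":
    print(demo())
```

If the second probe reports True on an affected system while 'runc init'
still hangs, that would suggest fd 6 was not opened with read access in
the way lsof's "6u" implies; that is a guess to be verified, not a
finding.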

Per-thread backtraces, in case they help (yes, different PID,
same problem as above):

=====
(gdb) inf thr
  Id   Target Id                 Frame 
* 1    LWP 27577 "runc:[2:INIT]" syscall.Syscall6 () at syscall/asm_linux_amd64.s:53
  2    LWP 27579 "runc:[2:INIT]" runtime.futex () at runtime/sys_linux_amd64.s:536
  3    LWP 27580 "runc:[2:INIT]" runtime.futex () at runtime/sys_linux_amd64.s:536
  4    LWP 27581 "runc:[2:INIT]" runtime.futex () at runtime/sys_linux_amd64.s:536
  5    LWP 27582 "runc:[2:INIT]" runtime.futex () at runtime/sys_linux_amd64.s:536
(gdb) bt
#0  syscall.Syscall6 () at syscall/asm_linux_amd64.s:53
#1  0x000000000052ef7b in golang.org/x/sys/unix.openat (dirfd=-100, path=..., flags=524289, mode=0, fd=<optimized out>, err=...) at golang.org/x/sys/unix/zsyscall_linux_amd64.go:89
#2  0x00000000007967f2 in golang.org/x/sys/unix.Open (path=..., fd=<optimized out>, err=..., mode=<optimized out>, perm=<optimized out>) at golang.org/x/sys/unix/syscall_linux.go:150
#3  github.com/opencontainers/runc/libcontainer.(*linuxStandardInit).Init (l=0xc00015da70, ~r0=...) at github.com/opencontainers/runc/libcontainer/standard_init_linux.go:188
#4  0x0000000000784ad4 in github.com/opencontainers/runc/libcontainer.(*LinuxFactory).StartInitialization (l=<optimized out>, err=...) at github.com/opencontainers/runc/libcontainer/factory_linux.go:380
#5  0x00000000007f84ab in main.glob..func6 (context=<optimized out>, ~r1=...) at github.com/opencontainers/runc/init.go:43
#6  0x00000000007b767e in github.com/urfave/cli.HandleAction (action=..., context=0xc0000d49a0, err=...) at github.com/urfave/cli/app.go:523
#7  0x00000000007b83dc in github.com/urfave/cli.Command.Run (c=..., ctx=0xc0000d4840, err=...) at github.com/urfave/cli/command.go:174
#8  0x00000000007b5725 in github.com/urfave/cli.(*App).Run (a=0xc0000b6000, arguments=..., err=...) at github.com/urfave/cli/app.go:276
#9  0x00000000007eeac4 in main.main () at github.com/opencontainers/runc/main.go:145
(gdb) thr 2
[Switching to thread 2 (LWP 27579)]
#0  runtime.futex () at runtime/sys_linux_amd64.s:536
536     runtime/sys_linux_amd64.s: No such file or directory.
(gdb) bt
#0  runtime.futex () at runtime/sys_linux_amd64.s:536
#1  0x000000000042fc04 in runtime.futexsleep (addr=0xc89470 <runtime.sched+272>, val=0, ns=60000000000) at runtime/os_linux.go:50
#2  0x000000000040f50e in runtime.notetsleep_internal (n=0xc89470 <runtime.sched+272>, ns=60000000000, ~r2=<optimized out>) at runtime/lock_futex.go:193
#3  0x000000000040f5e1 in runtime.notetsleep (n=0xc89470 <runtime.sched+272>, ns=60000000000, ~r2=<optimized out>) at runtime/lock_futex.go:216
#4  0x000000000043e7de in runtime.sysmon () at runtime/proc.go:4316
#5  0x0000000000436d13 in runtime.mstart1 () at runtime/proc.go:1201
#6  0x0000000000436c2e in runtime.mstart () at runtime/proc.go:1167
#7  0x0000000000800cbc in crosscall_amd64 () at gcc_amd64.S:35
#8  0x00007f340dbed700 in ?? ()
#9  0x00007f340dbecfc0 in ?? ()
#10 0x00007ffe8e5c56ef in ?? ()
#11 0x000000c000000900 in ?? ()
#12 0x0000000000436bc0 in ?? () at runtime/proc.go:1080
#13 0x0000000000000000 in ?? ()
(gdb) thr 3
[Switching to thread 3 (LWP 27580)]
#0  runtime.futex () at runtime/sys_linux_amd64.s:536
536     in runtime/sys_linux_amd64.s
(gdb) bt
#0  runtime.futex () at runtime/sys_linux_amd64.s:536
#1  0x000000000042fb86 in runtime.futexsleep (addr=0xca5600 <runtime.sig>, val=0, ns=-1) at runtime/os_linux.go:44
#2  0x000000000040f466 in runtime.notetsleep_internal (n=0xca5600 <runtime.sig>, ns=-1, ~r2=<optimized out>) at runtime/lock_futex.go:174
#3  0x000000000040f66c in runtime.notetsleepg (n=0xca5600 <runtime.sig>, ns=-1, ~r2=<optimized out>) at runtime/lock_futex.go:228
#4  0x00000000004480bc in os/signal.signal_recv (~r0=<optimized out>) at runtime/sigqueue.go:147
#5  0x00000000007d32f2 in os/signal.loop () at os/signal/signal_unix.go:23
#6  0x000000000045fa01 in runtime.goexit () at runtime/asm_amd64.s:1357
#7  0x0000000000000000 in ?? ()
(gdb) thr 4
[Switching to thread 4 (LWP 27581)]
#0  runtime.futex () at runtime/sys_linux_amd64.s:536
536     in runtime/sys_linux_amd64.s
(gdb) bt
#0  runtime.futex () at runtime/sys_linux_amd64.s:536
#1  0x000000000042fb86 in runtime.futexsleep (addr=0xc000040bc8, val=0, ns=-1) at runtime/os_linux.go:44
#2  0x000000000040f38f in runtime.notesleep (n=0xc000040bc8) at runtime/lock_futex.go:151
#3  0x0000000000438110 in runtime.stopm () at runtime/proc.go:1928
#4  0x000000000043922f in runtime.findrunnable (gp=0xc00002a000, inheritTime=false) at runtime/proc.go:2391
#5  0x0000000000439ede in runtime.schedule () at runtime/proc.go:2524
#6  0x000000000043a21d in runtime.park_m (gp=0xc000074600) at runtime/proc.go:2610
#7  0x000000000045d91b in runtime.mcall () at runtime/asm_amd64.s:318
#8  0x00007f3404000020 in ?? ()
#9  0x0000000000800000 in nsexec () at nsexec.c:947
#10 0x0000000000000000 in ?? ()
(gdb) thr 5
[Switching to thread 5 (LWP 27582)]
#0  runtime.futex () at runtime/sys_linux_amd64.s:536
536     in runtime/sys_linux_amd64.s
(gdb) bt
#0  runtime.futex () at runtime/sys_linux_amd64.s:536
#1  0x000000000042fb86 in runtime.futexsleep (addr=0xca5518 <runtime.newmHandoff+24>, val=0, ns=-1) at runtime/os_linux.go:44
#2  0x000000000040f38f in runtime.notesleep (n=0xca5518 <runtime.newmHandoff+24>) at runtime/lock_futex.go:151
#3  0x0000000000438032 in runtime.templateThread () at runtime/proc.go:1906
#4  0x0000000000436d13 in runtime.mstart1 () at runtime/proc.go:1201
#5  0x0000000000436c2e in runtime.mstart () at runtime/proc.go:1167
#6  0x0000000000800cbc in crosscall_amd64 () at gcc_amd64.S:35
#7  0x00007f33fffff700 in ?? ()
#8  0x00007f33ffffefc0 in ?? ()
#9  0x00007ffe8e5c578f in ?? ()
#10 0x000000c000074180 in ?? ()
#11 0x0000000000436bc0 in ?? () at runtime/proc.go:1080
#12 0x0000000000000000 in ?? ()
=====
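
As a cross-check between the gdb and strace output, the numeric
flags=524289 in frame #1 of thread 1 can be decoded. This is a small
sketch, assuming the Linux x86-64 flag values that Python's os module
exposes on such a system; the helper name is made up.

```python
import os

FLAGS_FROM_GDB = 524289  # `flags` argument in frame #1 of thread 1


def decode_open_flags(flags):
    """Render an open(2) flags integer as symbolic names."""
    access = flags & os.O_ACCMODE
    names = [{
        os.O_RDONLY: "O_RDONLY",
        os.O_WRONLY: "O_WRONLY",
        os.O_RDWR: "O_RDWR",
    }.get(access, hex(access))]
    rest = flags & ~os.O_ACCMODE
    for name in ("O_CLOEXEC", "O_NONBLOCK", "O_CREAT",
                 "O_EXCL", "O_TRUNC", "O_APPEND"):
        bit = getattr(os, name)
        if rest & bit:
            names.append(name)
            rest &= ~bit
    if rest:
        names.append(hex(rest))  # bits this sketch does not name
    return "|".join(names)


if __name__ == "__main__":
    print(decode_open_flags(FLAGS_FROM_GDB))
```

On such a system this yields O_WRONLY|O_CLOEXEC, matching the blocked
openat() call shown by strace.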

My system is an up-to-date Debian unstable (but with a custom kernel).

Thanks!

Jan

[1]: https://github.com/opencontainers/runc/pull/1698


