Bug#985786: systemd: should be restartable even with mounted network filesystems
Marc Lehmann
debian-reportbug at plan9.de
Tue Mar 23 14:05:59 GMT 2021
Package: systemd
Version: 247.3-1
Severity: normal
Dear Maintainer,
we were just debugging an issue where systemd-networkd did not restart on
some systems, with the following error:
Mar 23 13:29:50 cert systemd[1]: Starting Network Service...
Mar 23 13:29:50 cert systemd[564743]: systemd-networkd.service: Failed to set up mount namespacing: /run/systemd/unit-r...
Mar 23 13:29:50 cert systemd[564743]: systemd-networkd.service: Failed at step NAMESPACE spawning /lib/systemd/systemd-...
Mar 23 13:29:50 cert systemd[1]: systemd-networkd.service: Main process exited, code=exited, status=226/NAMESPACE
Mar 23 13:29:50 cert systemd[1]: systemd-networkd.service: Failed with result 'exit-code'.
Mar 23 13:29:50 cert systemd[1]: Failed to start Network Service.
As it turns out, this is due to an extremely fragile setup of networkd -
essentially, it requires working network connectivity to be restartable.
In our case, the cause of the failure above (which, btw., we could only
diagnose by strace'ing systemd, as systemd gives absolutely no useful error
message for this) was a shared network filesystem mounted as /shared. Since
we had a network problem (which restarting networkd with a better config
was supposed to fix), at some point accesses to /shared started failing
with ENOTCONN.
This in turn prevented systemd from setting up the private fs namespace,
which eventually caused the failure:
[pid 564996] statx(4, "shared", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, 0, 0x7ffdadcf3d30) = -1 ENOTCONN (Transport endpoint is not connected)
So while systemd and the systemd-networkd.service work correctly as
designed here, I think networkd should not have to rely on a working
network to be restartable.
I don't know what is at fault or how to solve this problem, but I think
this expectation (to be able to restart networkd to fix the network config)
is a reasonable one, so I consider it a bug that this currently can't be
done.
Also, systemd should diagnose that it cannot access /shared (or whatever
path is the problem), rather than just giving a generic error message that
something failed without telling what.
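
To illustrate the kind of diagnosis meant here, the following is a rough
Python sketch (not systemd code, and not how systemd implements its
namespace setup) that walks the mount table and reports every mount point
that cannot be stat'ed, together with the errno name. The function names
are made up for this example, and os.lstat is only an approximation of the
statx call seen in the strace output. Note that on a hung (rather than
disconnected) network filesystem, the stat call itself may block.

```python
import errno
import os
import re

# /proc/self/mounts escapes space, tab, newline and backslash as \ooo (octal).
_OCTAL_ESCAPE = re.compile(r"\\([0-3][0-7]{2})")


def unescape_mount_path(field):
    """Decode the octal escapes used in the mount table."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), field)


def mount_points(mounts_text):
    """Return the mount point (second field) of every mount table line."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            points.append(unescape_mount_path(fields[1]))
    return points


def probe(path, stat_fn=os.lstat):
    """Return None if path can be stat'ed, else the symbolic errno name."""
    try:
        stat_fn(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


def unreachable_mounts(mounts_text, stat_fn=os.lstat):
    """List (mount point, errno name) for every mount that fails to stat."""
    failures = []
    for path in mount_points(mounts_text):
        err = probe(path, stat_fn)
        if err is not None:
            failures.append((path, err))
    return failures


if __name__ == "__main__":
    # Hypothetical usage: run as root on the affected machine.
    with open("/proc/self/mounts") as fh:
        for path, err in unreachable_mounts(fh.read()):
            print(f"{path}: {err}")
```

An error message naming the failing path and errno in this way would have
made the strace session unnecessary.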
Thanks for considering my concerns!
-- System Information:
Debian Release: 10.8
APT prefers stable
APT policy: (990, 'stable'), (500, 'unstable-debug'), (500, 'testing-debug'), (500, 'stable-updates'), (500, 'stable-debug'), (500, 'unstable'), (500, 'testing'), (1, 'experimental-debug'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386, x32
Kernel: Linux 5.8.18-050818-generic (SMP w/8 CPU threads)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_DK.UTF-8, LC_CTYPE=en_DK.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages systemd depends on:
ii adduser 3.118
ii libacl1 2.2.53-4
ii libapparmor1 2.13.2-10
ii libaudit1 1:2.8.4-3
ii libblkid1 2.33.1-0.1
ii libc6 2.30-4
ii libcap2 1:2.25-2
ii libcrypt1 1:4.4.17-1
ii libcryptsetup12 2:2.3.4-2
ii libgcrypt20 1.8.4-5
ii libgnutls30 3.7.0-7
ii libgpg-error0 1.35-1
ii libip4tc2 1.8.7-1
ii libkmod2 26-1
ii liblz4-1 1.8.3-1
ii liblzma5 5.2.4-1
ii libmount1 2.36.1-7
ii libpam0g 1.3.1-5
ii libseccomp2 2.5.1-1
ii libselinux1 3.1-3
ii libsystemd0 247.3-1
ii libzstd1 1.4.8+dfsg-1
ii mount 2.33.1-0.1
ii systemd-timesyncd [time-daemon] 247.3-1
ii util-linux 2.36.1-7
Versions of packages systemd recommends:
ii dbus 1.12.20-1
Versions of packages systemd suggests:
ii policykit-1 0.105-25
ii systemd-container 247.3-1
Versions of packages systemd is related to:
pn dracut <none>
ii initramfs-tools 0.139
pn libnss-systemd <none>
ii libpam-systemd 247.3-1
ii udev 247.3-1
-- no debconf information