[Pkg-systemd-maintainers] Bug#719945: systemd: Hangs during shutdown (likely NFS-related)
Michael Biebl
biebl at debian.org
Tue Jan 28 16:56:24 GMT 2014
On 28.01.2014 14:28, Sam Morris wrote:
> On Sun, Jan 26, 2014 at 10:35:29AM +0100, Michael Stapelberg wrote:
>> control: tag -1 + pending
>>
>> Hi Sam,
>>
>> Sam Morris <sam at robots.org.uk> writes:
>>> I rebuilt with the attached patch and it does the trick. I think it's
>>> also the fix applied to fix
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=999061>.
>> Thanks, I merged it:
>> http://anonscm.debian.org/gitweb/?p=pkg-systemd/systemd.git;a=commitdiff;h=cf19e2b
>>
>> --
>> Best regards,
>> Michael
>
> Hm. It seems that the problem isn't fixed after all. I was fooled
> because I was able to reboot a few times without the problem happening,
> but I've now reproduced it with the patch applied.
>
> I've attached a debug log of the shutdown process. You can see
> ifup@eth0.service being stopped on line 8596, but home.mount isn't
> unmounted until line 9612.
>
> On line 9588 you can see that nfs-common.service is stopped before the
> NFS unmount operation completes (line 9654). I'm not an NFS expert but I
> think this service should only be stopped after all NFS filesystems are
> unmounted, so that the NFS server is informed of any locks being
> released on the filesystem (and probably other things on other NFS
> versions). This is the ordering in /etc/rc6.d as well.
Can you attach the output of

  systemctl show nfs-common.service ifup@eth0.service your-nfs-mount.mount

> On line 9617, you can see that the NFS mount is being unmounted with a
> simple '/bin/umount /home' which fails since there are still user
> processes running with files open. In order to avoid potential data loss
> I get the feeling that something should be killing these processes off
> politely before the filesystem rug is yanked away from underneath them,
> but I think that's a bug for another time. When booting with sysvinit,
> /etc/init.d/umountnfs.sh uses the -f and -l options when running umount,
> which at least ensures that the filesystem will be unmounted even if the
> network is down. From my log it appears that systemd doesn't even start
> this service during the shutdown process. If it's intended that systemd
> takes over its job, then the correct options should be used (-f and -l,
> on any kernel version supported by systemd), and the service should be
> masked. If not, then umountnfs.service should be started during the
> shutdown process. Unless you have another suggestion, I'll give this a
> go and see how it works out.
>
> FYI, here's a summary of how NFS mounting during boot, and unmounting
> during shutdown, is handled in Debian.
>
> By default, d-i configures network interfaces as follows:
>
> allow-hotplug eth0
> iface eth0 inet dhcp
>
> This causes NFS mounts to be activated by ifup, via
> /etc/network/if-up.d/mountnfs, during hotplug time, but only if all
> other 'auto' interfaces have previously been brought up.
>
> The user can also configure their network interface with 'auto' instead
> of 'allow-hotplug'. In this case, NFS mounts are still mounted when ifup
> for the final 'auto' interface is run, but this will instead happen
> during the start of networking.service.
>
> There's also an /etc/default/rcS variable,
> ASYNCMOUNTNFS. By default this is unset, corresponding to 'yes'. If set
> to 'no', then NFS mounts are not activated as above; instead they are
> activated by mountnfs.service. This service is masked in the Debian
> systemd package, so I think we can say that ASYNCMOUNTNFS=no is not
> currently supported with our systemd setup.
>
> Under sysvinit, unmounting at shutdown is handled by
> /etc/init.d/umountnfs.sh, which runs before nfs-common, and then
> rpcbind, are stopped. As noted above, umountnfs.service is not started
> during shutdown under systemd.
This is all a great mess under sysvinit.
umountnfs.service is blacklisted as mounts (also remote ones) are
directly handled by systemd.
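
To make "directly handled" concrete, here is a rough sketch of the kind of mount unit systemd works with for an NFS /home. The server name and export path are made up for illustration, and in practice the unit is generated from /etc/fstab rather than written by hand. The ordering lines show what is expected for a network mount and are not copied from a real system.

```ini
# home.mount (the unit name has to match the mount point /home)
# Hypothetical example: server.example.org:/export/home is an assumed export.
[Unit]
Description=NFS home directories (sketch)
# Network mounts are started after these units at boot. Stop ordering is
# the reverse of start ordering, so at shutdown the mount is unmounted
# before these units are stopped.
After=remote-fs-pre.target network.target network-online.target
Before=remote-fs.target

[Mount]
What=server.example.org:/export/home
Where=/home
Type=nfs
```

Anything that must outlive the unmount at shutdown therefore has to be ordered before one of the units listed in After=.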
> Interfaces can also be configured with NetworkManager, which adds
> another axis to the configuration space. Simple configuration of a wired
> network interface should still work, but I think some work has to be
> done (currently by the admin) to enable
> NetworkManager-wait-online.service in order to get systemd to delay
> activating the NFS mounts until NM determines that a network connection
> is available.
>
> Incidentally, NetworkManager-wait-online.service looks wrong to me; I
> think it should declare Wants= and Before= on network-online.target,
> since that is the name of the target documented in systemd.special(7);
> however I think that it's not actually broken with its current
> settings--they will just result in network.target itself being delayed
> until NetworkManager-wait-online.service starts up, and since the .mount
> units generated by systemd-fstab-generator are After= both network.target
> and network-online.target, the mounts will still be activated at the
> right time. If NetworkManager-wait-online.service were changed to use
> network-online.target instead, then could we enable
> NetworkManager-wait-online.service by default without delaying the
> startup of any services that don't run After= that target, i.e., none in
> the default install?
NM-wait-online is only really relevant for boot. It's a service which
blocks (by default for up to 30 seconds) and waits until a network
connection is established. And yeah, I think NM in unstable is currently
broken in that regard. The introduction of network-online.target is
something more recent. IIRC this should be fixed in the experimental
version of NM.
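
For reference, a sketch of the ordering one would expect once the unit hooks into network-online.target. This is an assumed layout for illustration, not a copy of the file shipped in any NM version, and the [Service] section is left out.

```ini
# NetworkManager-wait-online.service, [Unit] and [Install] sections only.
# Assumed layout, for illustration.
[Unit]
Requires=NetworkManager.service
After=NetworkManager.service
# Delay network-online.target (not network.target) until NM reports
# that a connection is available.
Before=network-online.target

[Install]
# When enabled, the service is started whenever network-online.target is.
WantedBy=network-online.target
```

With this arrangement only units that order themselves After=network-online.target, such as remote mounts, wait for the connection. Everything else starts without the delay.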
> As for shutting down, NetworkManager should only be stopped after remote
> filesystems are unmounted. I'm not sure if this is the case already.
> I've no idea how to deal with horrible cases such as when the user
> reboots the system while they have mounted an NFS share via a VPN
> connection that will be killed when they log out.
Since /usr could be on NFS, this is going to be tricky. That said, I
don't think NM has a problem here since it no longer shuts down the
interfaces when NM is stopped (at least ethernet devices).
As for ifup@.service: it might be a problem that we use
DefaultDependencies=yes (the default).
We probably need to use DefaultDependencies=no and tweak the dependencies.
We will probably also need native .service files for nfs-common and
rpcbind so we can ensure the correct ordering.
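
A minimal, untested sketch of that idea, showing only the [Unit] sections. The exact set of dependencies is an assumption and would still need to be worked out against the real units.

```ini
# --- ifup@.service, [Unit] section only (sketch) ---
[Unit]
# Drops the implicit Conflicts=shutdown.target / Before=shutdown.target
# (and the sysinit/basic ordering) that services get by default.
DefaultDependencies=no
# With the defaults gone, order explicitly: the interface comes up before
# network.target, so if it is stopped at all, that happens only after
# units ordered After=network.target have stopped.
Before=network.target

# --- nfs-common.service, [Unit] section only (sketch) ---
[Unit]
Wants=rpcbind.service
After=rpcbind.service
# Remote mounts are ordered After=remote-fs-pre.target. Stopping runs in
# reverse, so NFS mounts are unmounted before nfs-common is stopped.
Before=remote-fs-pre.target
```

The point of both fragments is that shutdown ordering falls out of the boot-time Before=/After= declarations, so getting the start order right is what fixes the stop order.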
Michael
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?