[Pkg-systemd-maintainers] Bug#719945: systemd: Hangs during shutdown (likely NFS-related)
Sam Morris
sam at robots.org.uk
Tue Jan 28 17:43:15 GMT 2014
On Tue, Jan 28, 2014 at 05:56:24PM +0100, Michael Biebl wrote:
> Can you attach the output of
> systemctl show nfs-common.service ifup@eth0.service your-nfs-mount.mount
Id=nfs-common.service
Names=nfs-common.service
WantedBy=multi-user.target graphical.target sysinit.target
Conflicts=shutdown.target
Before=multi-user.target graphical.target sysinit.target shutdown.target
After=rpcbind.service time-sync.target systemd-journald.socket
Description=LSB: NFS support files common to client and server
LoadState=loaded
ActiveState=active
SubState=running
SourcePath=/etc/init.d/nfs-common
InactiveExitTimestamp=Tue 2014-01-28 09:04:43 GMT
InactiveExitTimestampMonotonic=10940314
ActiveEnterTimestamp=Tue 2014-01-28 09:04:43 GMT
ActiveEnterTimestampMonotonic=11236958
ActiveExitTimestampMonotonic=0
InactiveEnterTimestampMonotonic=0
CanStart=yes
CanStop=yes
CanReload=no
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=no
OnFailureIsolate=no
IgnoreOnIsolate=no
IgnoreOnSnapshot=no
NeedDaemonReload=no
JobTimeoutUSec=0
ConditionTimestamp=Tue 2014-01-28 09:04:43 GMT
ConditionTimestampMonotonic=10892427
ConditionResult=yes
Type=forking
Restart=no
NotifyAccess=none
RestartUSec=100ms
TimeoutUSec=0
TimeoutStartUSec=0
TimeoutStopUSec=0
WatchdogUSec=0
WatchdogTimestampMonotonic=0
StartLimitInterval=10000000
StartLimitBurst=5
StartLimitAction=none
ExecStart={ path=/etc/init.d/nfs-common ; argv[]=/etc/init.d/nfs-common
start ; ignore_errors=no ; start_time=[Tue 2014-01-28 09:04:43 GMT] ;
stop_time=[Tue 2014-01-28 09:04:43 GMT] ; pid=950 ; code=exited ;
status=0 }
ExecStop={ path=/etc/init.d/nfs-common ; argv[]=/etc/init.d/nfs-common
stop ; ignore_errors=no ; start_time=[n/a] ; stop_time=[n/a] ; pid=0 ;
code=(null) ; status=0/0 }
PermissionsStartOnly=no
RootDirectoryStartOnly=no
RemainAfterExit=yes
GuessMainPID=no
MainPID=0
ControlPID=0
Result=success
UMask=0022
LimitCPU=18446744073709551615
LimitFSIZE=18446744073709551615
LimitDATA=18446744073709551615
LimitSTACK=18446744073709551615
LimitCORE=18446744073709551615
LimitRSS=18446744073709551615
LimitNOFILE=4096
LimitAS=18446744073709551615
LimitNPROC=257456
LimitMEMLOCK=65536
LimitLOCKS=18446744073709551615
LimitSIGPENDING=257456
LimitMSGQUEUE=819200
LimitNICE=0
LimitRTPRIO=0
LimitRTTIME=18446744073709551615
OOMScoreAdjust=0
Nice=0
IOScheduling=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SecureBits=0
CapabilityBoundingSet=18446744073709551615
MountFlags=0
PrivateTmp=no
PrivateNetwork=no
SameProcessGroup=no
ControlGroupModify=no
ControlGroupPersistent=no
IgnoreSIGPIPE=no
NoNewPrivileges=no
KillMode=process
KillSignal=15
SendSIGKILL=yes
ExecMainStartTimestampMonotonic=0
ExecMainExitTimestampMonotonic=0
ExecMainPID=0
ExecMainCode=0
ExecMainStatus=0
DefaultControlGroup=name=systemd:/system/nfs-common.service
ControlGroups=blkio:/system/nfs-common.service
memory:/system/nfs-common.service cpu:/system/nfs-common.service
name=systemd:/system/nfs-common.service
Id=ifup@eth0.service
Names=ifup@eth0.service
Requires=basic.target
BindsTo=sys-subsystem-net-devices-eth0.device
Conflicts=shutdown.target
Before=shutdown.target
After=local-fs.target systemd-journald.socket basic.target
Description=ifup for eth0
LoadState=loaded
ActiveState=active
SubState=exited
FragmentPath=/lib/systemd/system/ifup@.service
UnitFileState=static
InactiveExitTimestamp=Tue 2014-01-28 09:04:45 GMT
InactiveExitTimestampMonotonic=12759852
ActiveEnterTimestamp=Tue 2014-01-28 09:04:45 GMT
ActiveEnterTimestampMonotonic=12759852
ActiveExitTimestampMonotonic=0
InactiveEnterTimestampMonotonic=0
CanStart=yes
CanStop=yes
CanReload=no
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnFailureIsolate=no
IgnoreOnIsolate=no
IgnoreOnSnapshot=no
NeedDaemonReload=no
JobTimeoutUSec=0
ConditionTimestamp=Tue 2014-01-28 09:04:45 GMT
ConditionTimestampMonotonic=12719044
ConditionResult=yes
Type=simple
Restart=no
NotifyAccess=none
RestartUSec=100ms
TimeoutUSec=1min 30s
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
WatchdogUSec=0
WatchdogTimestampMonotonic=0
StartLimitInterval=10000000
StartLimitBurst=5
StartLimitAction=none
ExecStart={ path=/sbin/ifup ; argv[]=/sbin/ifup --allow=hotplug %I ;
ignore_errors=no ; start_time=[Tue 2014-01-28 09:04:45 GMT] ;
stop_time=[Tue 2014-01-28 09:04:55 GMT] ; pid=1011 ; code=exited ;
status=0 }
ExecStop={ path=/sbin/ifdown ; argv[]=/sbin/ifdown %I ; ignore_errors=no
; start_time=[n/a] ; stop_time=[n/a] ; pid=0 ; code=(null) ; status=0/0
}
PermissionsStartOnly=no
RootDirectoryStartOnly=no
RemainAfterExit=yes
GuessMainPID=yes
MainPID=0
ControlPID=0
Result=success
UMask=0022
LimitCPU=18446744073709551615
LimitFSIZE=18446744073709551615
LimitDATA=18446744073709551615
LimitSTACK=18446744073709551615
LimitCORE=18446744073709551615
LimitRSS=18446744073709551615
LimitNOFILE=4096
LimitAS=18446744073709551615
LimitNPROC=257456
LimitMEMLOCK=65536
LimitLOCKS=18446744073709551615
LimitSIGPENDING=257456
LimitMSGQUEUE=819200
LimitNICE=0
LimitRTPRIO=0
LimitRTTIME=18446744073709551615
OOMScoreAdjust=0
Nice=0
IOScheduling=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SecureBits=0
CapabilityBoundingSet=18446744073709551615
MountFlags=0
PrivateTmp=no
PrivateNetwork=no
SameProcessGroup=no
ControlGroupModify=no
ControlGroupPersistent=no
IgnoreSIGPIPE=yes
NoNewPrivileges=no
KillMode=control-group
KillSignal=15
SendSIGKILL=yes
ExecMainStartTimestamp=Tue 2014-01-28 09:04:45 GMT
ExecMainStartTimestampMonotonic=12759840
ExecMainExitTimestamp=Tue 2014-01-28 09:04:45 GMT
ExecMainExitTimestampMonotonic=12759840
ExecMainPID=1011
ExecMainCode=1
ExecMainStatus=0
DefaultControlGroup=name=systemd:/system/ifup@.service/ifup@eth0.service
ControlGroups=blkio:/system/ifup@.service/ifup@eth0.service
memory:/system/ifup@.service/ifup@eth0.service
cpu:/system/ifup@.service/ifup@eth0.service
name=systemd:/system/ifup@.service/ifup@eth0.service
Id=home.mount
Names=home.mount
Requires=-.mount
Wants=network-online.target
Conflicts=umount.target
Before=remote-fs.target umount.target
After=remote-fs-pre.target network.target network-online.target
systemd-journald.socket -.mount
Description=/home
LoadState=loaded
ActiveState=active
SubState=mounted
FragmentPath=/run/systemd/generator/home.mount
SourcePath=/etc/fstab
InactiveExitTimestamp=Tue 2014-01-28 09:04:55 GMT
InactiveExitTimestampMonotonic=22539250
ActiveEnterTimestamp=Tue 2014-01-28 09:04:55 GMT
ActiveEnterTimestampMonotonic=22539250
ActiveExitTimestampMonotonic=0
InactiveEnterTimestampMonotonic=0
CanStart=yes
CanStop=yes
CanReload=yes
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=no
OnFailureIsolate=no
IgnoreOnIsolate=yes
IgnoreOnSnapshot=no
NeedDaemonReload=no
JobTimeoutUSec=0
ConditionTimestampMonotonic=0
ConditionResult=no
Where=/home
What=gaia:/home
Options=rw,nosuid,nodev,relatime,rw,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.0.0.253,mountvers=3,mountport=4002,mountproto=udp,local_lock=none,addr=10.0.0.253
Type=nfs
TimeoutUSec=1min 30s
ControlPID=0
DirectoryMode=0755
Result=success
UMask=0022
LimitCPU=18446744073709551615
LimitFSIZE=18446744073709551615
LimitDATA=18446744073709551615
LimitSTACK=18446744073709551615
LimitCORE=18446744073709551615
LimitRSS=18446744073709551615
LimitNOFILE=4096
LimitAS=18446744073709551615
LimitNPROC=257456
LimitMEMLOCK=65536
LimitLOCKS=18446744073709551615
LimitSIGPENDING=257456
LimitMSGQUEUE=819200
LimitNICE=0
LimitRTPRIO=0
LimitRTTIME=18446744073709551615
OOMScoreAdjust=0
Nice=0
IOScheduling=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SecureBits=0
CapabilityBoundingSet=18446744073709551615
MountFlags=0
PrivateTmp=no
PrivateNetwork=no
SameProcessGroup=yes
ControlGroupModify=no
ControlGroupPersistent=no
IgnoreSIGPIPE=yes
NoNewPrivileges=no
KillMode=control-group
KillSignal=15
SendSIGKILL=yes
DefaultControlGroup=name=systemd:/system/home.mount
ControlGroups=blkio:/system/home.mount memory:/system/home.mount
cpu:/system/home.mount name=systemd:/system/home.mount
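For anyone who would rather check the ordering in output like the above mechanically than by eye, here is a rough Python sketch. It assumes that each unit's block starts with an Id= line and that property names begin with an upper-case letter; both hold for the output in this mail, but I have not verified that either is guaranteed. The function names are made up for illustration. Lines that do not look like a property are treated as wrapped continuations of the previous one.

```python
import re

# Assumption: property names start with an upper-case letter (true for the
# output in this mail); anything else is treated as a wrapped continuation.
PROP = re.compile(r"^([A-Z][A-Za-z0-9]*)=(.*)$")


def parse_show(text):
    """Split multi-unit `systemctl show` output into {unit_id: {prop: value}}."""
    units = {}
    current = None
    last_key = None
    for line in text.splitlines():
        match = PROP.match(line)
        if match:
            key, value = match.groups()
            if key == "Id":
                # Assumption: an Id= line opens each unit's block.
                current = units.setdefault(value, {})
            if current is None:
                continue
            current[key] = value
            last_key = key
        elif current is not None and last_key is not None and line.strip():
            # Re-join a value that the mail client wrapped onto a new line.
            current[last_key] += " " + line.strip()
    return units


def ordered_after(units, unit, other):
    """Return True if `unit` lists `other` in its After= property."""
    return other in units[unit].get("After", "").split()


SAMPLE = """\
Id=home.mount
Before=remote-fs.target umount.target
After=remote-fs-pre.target network.target network-online.target
systemd-journald.socket -.mount
Type=nfs
"""

units = parse_show(SAMPLE)
print(ordered_after(units, "home.mount", "network-online.target"))  # True
print(ordered_after(units, "home.mount", "nfs-common.service"))     # False
```

The second result is the interesting one: in the sample, as in the full output, home.mount is ordered after the network targets but not after nfs-common.service.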
> > Under sysvinit, unmounting at shutdown is handled by
> > /etc/init.d/umountnfs.sh, which runs before nfs-common, and then
> > rpcbind, are stopped. As noted above, umountnfs.service is not started
> > during shutdown under systemd.
>
> This is all a great mess under sysvinit.
> umountnfs.service is blacklisted as mounts (also remote ones) are
> directly handled by systemd.
Sounds good--but umountnfs.service isn't masked, it just happens not to
run, due to its dependencies. If systemd is to manage the mounts then it
needs to use -l and -f when deactivating NFS mounts.
If systemd is intended to mount the filesystem in the first place then that
means we should somehow kill off /etc/network/if-up.d/mountnfs since it
gets run really early (during coldplug, via /lib/udev/net.agent).
> > Interfaces can also be configured with NetworkManager, which adds
> > another axis to the configuration space. Simple configuration of a wired
> > network interface should still work, but I think some work has to be
> > done (currently by the admin) to enable
> > NetworkManager-wait-online.service in order to get systemd to delay
> > activating the NFS mounts until NM determines that a network connection
> > is available.
> >
> > Incidentally, NetworkManager-wait-online.service looks wrong to me; I
> > think it should declare Wants= and Before= on network-online.target,
> > since that is the name of the target documented in systemd.special(7);
> > however I think that it's not actually broken with its current
> > settings--they will just result in network.target itself being delayed
> > until NetworkManager-wait-online.service starts up, and since the .mount
> > units generated by systemd-fstab-generator are After= both network.target
> > and network-online.target, the mounts will still be activated at the
> > right time. If NetworkManager-wait-online.service were changed to use
> > network-online.target instead, then could we enable
> > NetworkManager-wait-online.service by default without delaying the
> > startup of any services that don't run After= that target, i.e., none in
> > the default install?
>
> NM-wait-online is only really relevant for boot. It's a service which
> blocks (by default up until 30 secs) and waits until a network
> connection is established. And yeah, I think NM in unstable is currently
> broken in that regard. The introduction of network-online.target is
> something more recent. IIRC this should be fixed in the experimental
> version of NM.
Ah, I didn't check NM in experimental. The unit file looks like it does
the right thing, thanks.
> > As for shutting down, NetworkManager should only be stopped after remote
> > filesystems are unmounted. I'm not sure if this is the case already.
> > I've no idea how to deal with horrible cases such as when the user
> > reboots the system while they have mounted an NFS share via a VPN
> > connection that will be killed when they log out.
>
> Since /usr could be on NFS, this is going to be tricky. That said, I
> don't think NM has a problem here since it no longer shuts down the
> interfaces when NM is stopped (at least ethernet devices).
I agree that /usr, /var and so on being on NFS will be tricky. Hm, I
wonder if it's reasonable to mount all such filesystems, including
/home, from the initramfs? Maybe not in Debian though.
> As for ifup@.service: it might be a problem that we use
> DefaultDependencies=yes (the default).
> We probably need to use DefaultDependencies=no and tweak the dependencies.
> We will probably also need native .service files for nfs-common and
> rpcbind so we can ensure the correct ordering.
Sounds reasonable, especially since it's not systemd that activates
ifup@.service, but /lib/udev/net.agent. Under sysvinit, the order is
umountnfs → nfs-common → rpcbind → networking; but
/etc/init.d/networking bugs out without doing anything if any active
filesystems (or swap) use the network.
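To illustrate why native units with explicit ordering would help: systemd stops units in the reverse of their After= ordering, so a chain of After= dependencies is enough to reproduce the sysvinit stop order. Here is a small Python sketch of that reasoning. The After= relations in it are hypothetical (only nfs-common.service being After=rpcbind.service matches the output above; the other two edges and the networking.service name are assumptions for illustration), and it needs Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# unit -> set of units it is ordered After=.
# Hypothetical edges, except nfs-common After=rpcbind which the
# systemctl output above already shows.
AFTER = {
    "home.mount": {"nfs-common.service"},
    "nfs-common.service": {"rpcbind.service"},
    "rpcbind.service": {"networking.service"},
    "networking.service": set(),
}


def stop_order(after):
    """Return a stop order: the reverse of a valid start order."""
    # static_order() yields each unit only after everything it is After=.
    start = list(TopologicalSorter(after).static_order())
    return list(reversed(start))


print(" -> ".join(stop_order(AFTER)))
# home.mount -> nfs-common.service -> rpcbind.service -> networking.service
```

With those edges the mount goes away first and the network last, which is the same sequence as umountnfs, nfs-common, rpcbind, networking.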
Regards,
--
Sam Morris <https://robots.org.uk/>
3412 EA18 1277 354B 991B C869 B219 7FDB 5EA0 1078