jessie: help debugging NFS shares not mounted at boot, double mounts with mount -a, and @reboot cronjobs

Sandro Tosi morph at debian.org
Tue Feb 23 15:12:19 GMT 2016


On Tue, Feb 23, 2016 at 9:19 AM, Sandro Tosi <morph at debian.org> wrote:
> quick update: we had a couple of (real) nfs issues and
> misconfiguration (meeh) that made the script fail even if it shouldnt
> have, so no news yet; the reboot loop just restarted and will
> periodically check it and report back if something comes up.

so well, we just needed to wait a bit :)

here it is:

# journalctl -alb | grep -nE "cron|NFSSERVER"
1012:Feb 23 06:50:42 SERVER systemd[1]: Installed new job mnt-NFSSERVER_VOL.mount/start as 99
1014:Feb 23 06:50:42 SERVER systemd[1]: Installed new job cron.service/start as 101
1714:Feb 23 06:50:43 SERVER systemd[1]: Mounting /mnt/NFSSERVER_VOL...
1716:Feb 23 06:50:43 SERVER systemd[1]: About to execute: /bin/mount -n XXX.YYY.32.75:/vol/VOL /mnt/NFSSERVER_VOL -t nfs -o ro,intr,nolock,tcp,rdirplus,noatime,_netdev
1718:Feb 23 06:50:43 SERVER systemd[1]: mnt-NFSSERVER_VOL.mount changed dead -> mounting
1720:Feb 23 06:50:43 SERVER systemd[574]: Executing: /bin/mount -n XXX.YYY.32.75:/vol/VOL /mnt/NFSSERVER_VOL -t nfs -o ro,intr,nolock,tcp,rdirplus,noatime,_netdev
1905:Feb 23 06:52:13 SERVER systemd[1]: mnt-NFSSERVER_VOL.mount mounting timed out. Stopping.
1906:Feb 23 06:52:13 SERVER systemd[1]: mnt-NFSSERVER_VOL.mount changed mounting -> mounting-sigterm
1915:Feb 23 06:52:13 SERVER systemd[1]: Child 574 belongs to mnt-NFSSERVER_VOL.mount
1916:Feb 23 06:52:13 SERVER systemd[1]: mnt-NFSSERVER_VOL.mount mount process exited, code=killed status=15
1917:Feb 23 06:52:13 SERVER systemd[1]: mnt-NFSSERVER_VOL.mount changed mounting-sigterm -> mounted
1918:Feb 23 06:52:13 SERVER systemd[1]: Job mnt-NFSSERVER_VOL.mount/start finished, result=done
1919:Feb 23 06:52:13 SERVER systemd[1]: Mounted /mnt/NFSSERVER_VOL.
2025:Feb 23 06:52:13 SERVER systemd[1]: About to execute: /usr/sbin/cron -f $EXTRA_OPTS
2026:Feb 23 06:52:13 SERVER systemd[1]: Forked /usr/sbin/cron as 786
2027:Feb 23 06:52:13 SERVER systemd[1]: cron.service changed dead -> running
2028:Feb 23 06:52:13 SERVER systemd[1]: Job cron.service/start finished, result=done
2029:Feb 23 06:52:13 SERVER systemd[786]: Executing: /usr/sbin/cron -f
2038:Feb 23 06:52:13 SERVER cron[786]: (CRON) INFO (pidfile fd = 3)
2128:Feb 23 06:52:13 SERVER cron[786]: (CRON) INFO (Running @reboot jobs)
2300:Feb 23 06:52:13 SERVER systemd[1]: mnt-NFSSERVER_VOL.mount changed mounted -> failed
2301:Feb 23 06:52:13 SERVER systemd[1]: Failed to destroy cgroup /system.slice/mnt-NFSSERVER_VOL.mount: Device or resource busy
2302:Feb 23 06:52:13 SERVER systemd[1]: Unit mnt-NFSSERVER_VOL.mount entered failed state.
2303:Feb 23 06:52:13 SERVER systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/unit/mnt_2dNFSSERVER_5fVOL_2emount interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=30 reply_cookie=0 error=n/a
2304:Feb 23 06:52:13 SERVER systemd[1]: Sent message type=signal sender=n/a destination=n/a object=/org/freedesktop/systemd1/unit/mnt_2dNFSSERVER_5fVOL_2emount interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=31 reply_cookie=0 error=n/a

so 1m30s passed and the mount didn't come up, which is (one of) the original
issues (usually, running mount -t nfs -a will bring it up, even right
after the failure at boot, so it seems like it's not retried?). i checked
the journalctl output around those lines for additional messages relevant
to this.
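
for reference, the 1m30s can be read straight off the timestamps in the log
(06:50:43 when the mount was started, 06:52:13 when it timed out). a minimal
sketch of that arithmetic; elapsed_seconds is just a hypothetical helper name,
and it assumes both stamps fall on the same day:

```shell
# elapsed_seconds START END
# hypothetical helper: difference in seconds between two HH:MM:SS
# timestamps, assuming both belong to the same day
elapsed_seconds() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, s, ":"); split(b, e, ":")
        print (e[1]*3600 + e[2]*60 + e[3]) - (s[1]*3600 + s[2]*60 + s[3])
    }'
}

# "Mounting /mnt/NFSSERVER_VOL..." vs "mounting timed out. Stopping."
elapsed_seconds 06:50:43 06:52:13   # prints 90
```

90 seconds is, as far as i know, systemd's default start timeout, so this
looks like the stock timeout firing rather than something specific to our
setup.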

also note that cron.service is started, even though we configured:

# grep remote-fs /etc/systemd/system/cron.service
Requires=remote-fs.target
After=remote-fs.target

checking the status of that target:

# systemctl status remote-fs.target
● remote-fs.target - Remote File Systems
   Loaded: loaded (/lib/systemd/system/remote-fs.target; enabled)
  Drop-In: /run/systemd/generator/remote-fs.target.d
           └─50-insserv.conf.conf
   Active: active since Tue 2016-02-23 06:52:13 EST; 3h 15min ago
     Docs: man:systemd.special(7)

Feb 23 06:52:13 SERVER systemd[1]: Starting Remote File Systems.
Feb 23 06:52:13 SERVER systemd[1]: Job remote-fs.target/start finished, result=done
Feb 23 06:52:13 SERVER systemd[1]: Reached target Remote File Systems.

so at the same moment that mnt-NFSSERVER_VOL.mount is marked as failed,
remote-fs.target is reported as active/reached (which seems the wrong
status to me), and in fact the only failed unit is:

# systemctl --failed
  UNIT                    LOAD   ACTIVE SUB    DESCRIPTION
● mnt-NFSSERVER_VOL.mount loaded failed failed /mnt/NFSSERVER_VOL

1 loaded units listed. Pass --all to see loaded but inactive units, too.
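
one thing that might be worth trying, since remote-fs.target is reached even
though the mount failed, is to make cron.service depend on the mount point
itself instead of on the target. a rough, untested sketch, assuming
RequiresMountsFor= (documented in systemd.unit(5)) behaves on jessie's systemd
as described there; write_cron_dropin and the nfs-mount.conf file name are
hypothetical, only the directive and the mount path come from the real setup:

```shell
# write_cron_dropin DIR
# hypothetical helper: writes a drop-in into DIR that ties cron.service
# to the NFS mount point rather than to remote-fs.target
write_cron_dropin() {
    dir="$1"
    mkdir -p "$dir"
    cat > "$dir/nfs-mount.conf" <<'EOF'
[Unit]
RequiresMountsFor=/mnt/NFSSERVER_VOL
EOF
}

# intended use (as root), followed by a reload so systemd picks it up:
#   write_cron_dropin /etc/systemd/system/cron.service.d
#   systemctl daemon-reload
```

i can't say whether this would change anything given that the mount job
itself finished with result=done before the unit flipped to failed, so take
it as an experiment rather than a fix.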


can I provide more logs/info? do you see anything wrong with this
configuration that we might want to change?
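
in the meantime, a stopgap we could use for the @reboot jobs is to have them
wait for the share themselves. a minimal sketch, assuming mountpoint(1) is
available on the box; wait_for_mount is a hypothetical helper name and the
300s limit in the usage comment is arbitrary:

```shell
# wait_for_mount PATH TIMEOUT
# hypothetical helper: poll once per second until PATH is a mountpoint;
# returns 0 as soon as it is, 1 if TIMEOUT seconds pass without that
wait_for_mount() {
    path="$1"
    remaining="$2"
    while ! mountpoint -q "$path"; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# intended use from the script an @reboot entry runs:
#   wait_for_mount /mnt/NFSSERVER_VOL 300 || exit 1
```

this only hides the ordering problem from the jobs, it doesn't explain why
the mount is not retried.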

thanks a ton again!

-- 
Sandro "morph" Tosi
My website: http://sandrotosi.me/
Me at Debian: http://wiki.debian.org/SandroTosi
G+: https://plus.google.com/u/0/+SandroTosi

More information about the Pkg-systemd-maintainers mailing list