jessie: help debugging NFS shares not mounted at boot, double mounts with mount -a, and @reboot cronjobs

Felipe Sateler fsateler at debian.org
Thu Feb 18 17:39:40 GMT 2016


On 18 February 2016 at 14:05, Sandro Tosi <morph at debian.org> wrote:
> On Thu, Feb 18, 2016 at 4:49 PM, Felipe Sateler <fsateler at debian.org> wrote:
>> On 18 February 2016 at 13:41, Sandro Tosi <morph at debian.org> wrote:
>>> On Thu, Feb 18, 2016 at 4:11 PM, Felipe Sateler <fsateler at debian.org> wrote:
>>>> Could the networking script be exiting too early?
>>>
>>> Which network script in particular are you referring to? We are
>>> configuring our network in /etc/network/interfaces
>>
>> That would be networking.service (i.e., /etc/init.d/networking).
>>
>> Are there more lines corresponding to the PIDs of the failed mounts
>> (the number between [])?
>
> I'm afraid I stupidly didn't save the logs for this machine's state,
> and in attempting to replicate it, I ended up in the situation
> described in the email from Feb 12th (with one mount coming up later
> than the other, but cron being started anyway).

That mail has:
1711:Feb 10 16:44:40 SERVER systemd[1]: mnt-NFSSERVER.mount changed dead -> mounting
1817:Feb 10 16:44:43 SERVER systemd[1]: mnt-NFSSERVER.mount changed mounting -> mounted
1818:Feb 10 16:44:43 SERVER systemd[1]: Job mnt-NFSSERVER.mount/start finished, result=done
1819:Feb 10 16:44:43 SERVER systemd[1]: Mounted /mnt/NFSSERVER.
2106:Feb 10 16:44:43 SERVER systemd[1]: mnt-NFSSERVER.mount changed mounted -> dead
2107:Feb 10 16:44:43 SERVER systemd[1]: Failed to destroy cgroup /system.slice/mnt-NFSSERVER.mount: Device or resource busy
2632:Feb 10 16:44:54 SERVER systemd[1]: mnt-NFSSERVER.mount changed dead -> mounted

So the mount finishes successfully but is declared dead almost
immediately afterwards. This would suggest that systemd regards
remote-fs.target as up at the time the mount exits successfully, but
the mount unit then dies and revives some seconds later.

Are there any possibly relevant logs during that time?
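
In case it helps, something along these lines should pull the relevant
window out of the journal on the affected machine (unit names are taken
from the excerpt above, so substitute the real mount unit; this is only
a sketch of what to collect):

  journalctl -b -u mnt-NFSSERVER.mount -u remote-fs.target -u cron.service
  systemctl status mnt-NFSSERVER.mount remote-fs.target
  systemctl list-dependencies --reverse mnt-NFSSERVER.mount

The last command lists what pulls the mount unit in, which should show
whether remote-fs.target (and anything ordered after it, such as cron)
actually depends on this mount.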


>
> Let me know if you prefer to investigate this latest state (the
> machine is still in that state and has not been touched since, and it
> appears to be somehow relevant to the situation at hand), or whether
> you want me to keep rebooting the node until we manage to replicate
> the same situation as above.


Well, let's debug the most reproducible one ;) But the above seems to
imply that the problem does not happen every time? I presume it happens
frequently enough to be a problem, but does the system sometimes manage
to boot successfully?

>
>>>
>>>> Do you have more
>>>> interfaces in these machines? Are all of them configured as auto or
>>>> static?
>>>
>>> on this particular machine there is a single eth0 interface configured as auto
>>
>> So this is not the same setup as the previous one you posted? I'm
>> getting a bit confused...
>
> Yes, this has always been the same setup; my question about multiple
> NICs is because we have seen this behavior on machines with multiple
> interfaces, and we were wondering if that could make the issue more
> likely to happen, but it was more of a curiosity.
>
> The machine I am providing logs from is always the same one, with the
> exact same configuration unless specified otherwise (like disabling
> services as requested, etc.).

Well, that does not match the info you sent in the Feb 9 mail (an
inet static configuration).
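
For reference, in /etc/network/interfaces the "auto" marker and the
address method are independent, so "configured as auto" could mean
either of these (addresses below are made-up placeholders):

  auto eth0
  iface eth0 inet dhcp

  # or

  auto eth0
  iface eth0 inet static
      address 192.0.2.10
      netmask 255.255.255.0
      gateway 192.0.2.1

Which of the two actually applies matters here, because as far as I
know ifup returns as soon as the address is assigned in the static
case, while the dhcp case waits for a lease first.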


-- 

Saludos,
Felipe Sateler



