jessie: help debugging NFS shares not mounted at boot, double mounts with mount -a, and @reboot cronjobs

Felipe Sateler fsateler at debian.org
Wed Mar 2 15:38:53 GMT 2016


On 2 March 2016 at 11:46, Sandro Tosi <morph at debian.org> wrote:
> Thanks Felipe for assisting me with this and sorry for the late reply!
>
> On Thu, Feb 25, 2016 at 7:19 PM, Felipe Sateler <fsateler at debian.org> wrote:

>> Maybe the timeout is just too short. Maybe adding
>> x-systemd.device-timeout=90s helps?
>
>
> I think that's already 90s by default? At least we see the mount fail after
> 90s, so maybe we should set it to a lower or higher value? I'm unsure if that's
> what could trigger a "retry" of the mount, because as soon as I see the
> machine online, I can log in and issue a mount -t nfs -a and all the missing
> mountpoints (still pending for systemd) are promptly mounted, so it's as
> if they are "frozen" and a retry would just make them successful.

I think that mount itself will not retry unless it is given the bg
option, and with bg it goes to the background after the first failed
attempt, which would cause systemd to think the mounts are ready too
early.

So yeah, I think trying with a larger timeout might be useful (if the
server/network is overloaded). Another option would be to specify
ordering relations so that they do not happen simultaneously:

Edit /etc/systemd/system/mnt-NFSSERVER_VOL.mount.d/local.conf and write:

[Unit]
After=mnt-NFSSERVER_VOL2.mount

It may be worthwhile to try adding a bunch of these (one for each
mount), so that they are mounted in order, to see if that changes
anything.
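
Something along these lines might save some typing. It is only a
sketch: the list of unit names is a placeholder (mnt-NFSSERVER_VOL3.mount
is made up for illustration), and UNIT_DIR is a variable I introduce
here so the files land in a scratch directory for review rather than
directly under /etc/systemd/system.

```shell
#!/bin/sh
# Sketch: chain NFS mount units so each one is ordered after the previous one.
# UNIT_DIR is a scratch directory by default (an assumption of this sketch);
# the real drop-in location would be /etc/systemd/system.
UNIT_DIR="${UNIT_DIR:-./dropins-preview}"

# Placeholder unit names, listed in the order they should be mounted.
MOUNTS="mnt-NFSSERVER_VOL2.mount mnt-NFSSERVER_VOL.mount mnt-NFSSERVER_VOL3.mount"

prev=""
for unit in $MOUNTS; do
    # The first unit in the list gets no drop-in; every later one waits
    # for the unit listed just before it.
    if [ -n "$prev" ]; then
        mkdir -p "$UNIT_DIR/$unit.d"
        printf '[Unit]\nAfter=%s\n' "$prev" > "$UNIT_DIR/$unit.d/local.conf"
    fi
    prev="$unit"
done
```

After checking the generated files, they would need to be copied under
/etc/systemd/system and followed by a "systemctl daemon-reload" for
systemd to pick them up.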

>
>> A completely different alternative is to set up the nfs mounts as
>> automounts instead of real mounts (i.e., set the x-systemd.automount
>> option). This would have the additional benefit of removing the need to
>> specify a dependency on remote-fs.target.
>
>
> I don't think we want that: we prefer (for various reasons) to have all the
> mountpoints, services and processes running at machine boot, and to be
> already there when we want to use them

OK.

>> I have to confess I don't have many more ideas on where to look...
>
>
> ok, at least we are not alone in not understanding what's going on :)
>
> but as you can imagine, this is a rather nasty issue and, while we are
> moving forward with the adoption of jessie, it is making a lot of people
> uncomfortable and skeptical (I just want to express the feelings we have, not
> any complaints about the quality of systemd or debian :) )
>
> would you think it might be viable to contact systemd upstream about this?
> Jessie runs 215 while upstream released 229 so the risk of a "get the latest
> version and report back" is high, and it's not something easily doable i
> guess (?).

Well, if you have a test machine showing the problem that can be
upgraded to stretch, it would be great to test it.

Indeed, upstream bug reports are only accepted for the last couple of
releases. On the upstream mailing list you may get better responses.

> would you prefer to start this discussion yourself, as you might
> have a well established relationship with systemd upstream? I can start
> the discussion myself as well, and copy the systemd debian maint ml, as
> you prefer

I think it's best if you ask there, as there will likely be follow-up
questions. It would also be great to have an as-full-as-possible log of a
problematic boot. It is hard to follow "anonymized" and filtered logs,
especially when the anonymization rules change between mails ;)
Could you attach one? Or mail me privately if you prefer, and I can
take a look to see if there is anything else that may be fishy.
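
Roughly what I have in mind, as a sketch to run on the affected
machine right after a bad boot. OUT and MOUNT_UNIT are names I made up
for this example, and the unit name is only a placeholder for one of
your failing mounts:

```shell
#!/bin/sh
# Sketch: gather an unfiltered record of the current boot.
OUT="${OUT:-./boot-report}"                          # hypothetical output directory
MOUNT_UNIT="${MOUNT_UNIT:-mnt-NFSSERVER_VOL.mount}"  # placeholder unit name
mkdir -p "$OUT"

if command -v journalctl >/dev/null 2>&1; then
    # Whole journal of the current boot, with precise timestamps.
    journalctl -b -o short-precise > "$OUT/journal.txt" 2>&1 || true
    # Journal entries for one of the failing mount units only.
    journalctl -b -u "$MOUNT_UNIT" > "$OUT/mount-unit.txt" 2>&1 || true
fi

if command -v systemctl >/dev/null 2>&1; then
    # State of every mount unit systemd knows about, loaded or not.
    systemctl list-units --type=mount --all > "$OUT/mount-units.txt" 2>&1 || true
    systemctl status "$MOUNT_UNIT" > "$OUT/mount-status.txt" 2>&1 || true
fi

# The fstab entries the mount units are generated from.
cp /etc/fstab "$OUT/fstab.txt" 2>/dev/null || true

echo "report written to $OUT"
```

Booting with systemd.log_level=debug on the kernel command line first
should make the journal more verbose, which may help here.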

Other than that, I'm out of ideas.

-- 

Saludos,
Felipe Sateler



