Bug#829180: possible root cause identified

Daniel Pocock daniel at pocock.pro
Sat Jan 14 14:49:04 GMT 2017


I've posted the details below in the upstream bug report on GitHub as well.

This may be applicable to jessie users who have btrfs-tools v3.17-1.1
and/or manually created the /lib/udev/rules.d/99-btrfs.rules file

As a potential workaround/solution, users can upgrade to btrfs-progs
v4.7.3-1 from jessie-backports and/or remove the udev rules file(s).



When testing during the wheezy freeze, I tried putting
a btrfs filesystem on a pair of LVM logical volumes.

I observed that the btrfs scan was being done before the logical volumes
were online, so they weren't mountable without manually running the
btrfs scan again.  This was reported in Debian bug #685311.

A solution was proposed in the bug discussion:
creating the file /lib/udev/rules.d/99-btrfs.rules[1]

Debian wheezy was released with btrfs-tools 0.19+20120328-7.1

The udev rules file was included in the package 0.19+20120328-8,
after the wheezy release.

However, the fix included in the package used a different filename:
/lib/udev/rules.d/80-btrfs-lvm.rules

That version of the fix was included in Debian jessie in the package
btrfs-tools 3.17-1.1.  Anybody who had read the bug and manually created
99-btrfs.rules on their wheezy system would now have two files on their
system:

/lib/udev/rules.d/80-btrfs-lvm.rules
/lib/udev/rules.d/99-btrfs.rules

It was after upgrading to jessie that I started having the problem
where some filesystems would randomly fail to mount during boot.

The latest version in jessie-backports, btrfs-progs 4.7.3-1, doesn't
include the 80-btrfs-lvm.rules file any more.  When I installed that
version of the package, dpkg would have removed 80-btrfs-lvm.rules, but
the other file, 99-btrfs.rules, which I had created manually, remained
on the system because dpkg was never aware of it.
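
For anybody who wants to check their own system for a leftover file like
this, here is a rough sketch in Python.  It is not part of the original
report.  It only looks at the two file names mentioned above and asks
`dpkg -S` whether any package owns each one.  The function names
(dpkg_owns, find_unowned) are made up for illustration, and it assumes
dpkg has the files registered under /lib, as on jessie.

```python
import os
import subprocess

# The two rules files named in this message.
RULES_FILES = [
    "/lib/udev/rules.d/80-btrfs-lvm.rules",
    "/lib/udev/rules.d/99-btrfs.rules",
]


def dpkg_owns(path):
    """Return True if `dpkg -S` finds a package that owns `path`."""
    try:
        result = subprocess.run(
            ["dpkg", "-S", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # dpkg itself is not installed, so nothing can be owned by it.
        return False
    return result.returncode == 0


def find_unowned(paths, owns=dpkg_owns):
    """Return the paths that exist on disk but belong to no package."""
    return [p for p in paths if os.path.exists(p) and not owns(p)]


if __name__ == "__main__":
    for path in find_unowned(RULES_FILES):
        print("not tracked by dpkg:", path)
```

Any file it prints was created by hand (or left behind some other way)
and will survive package upgrades, so it has to be reviewed and removed
manually.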

I've now removed the file 99-btrfs.rules and put the system in a reboot
loop for about 45 minutes.  It rebooted about 18 times without any of
the mounts failing.

Therefore, I feel it is likely this udev file, combined with systemd
attempting mounts much earlier, was the reason the mounts were
intermittently failing.

While I understand that systemd may not be able to identify every
specific irregularity like this, some ideas for improvement come to mind:

- when systemd finds the udev file (99-btrfs.rules was mentioned
  in the journalctl output), could it be more conservative by default
  and delay trying to mount any devices that are associated with
  udev rules like this?

- why was this connection with btrfs scan only visible in the journalctl
  output after I upgraded to stretch and systemd 232?  There was no
  hint about the problem in jessie / systemd 215.

- when a mount does fail with EBUSY, could more detail be emitted in
  the kernel log?  This would help anybody in a situation like this,
  whether it is caused by btrfs scan or anything else that potentially
  inhibits a mount.



1. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=685311#14


