Bug#362028: Found in lenny too

Colin Watson cjwatson at ubuntu.com
Tue Jun 30 15:42:50 UTC 2009


tags 362028 patch
user ubuntu-devel at lists.ubuntu.com
usertags 362028 ubuntu-patch karmic
thanks

On Sat, Aug 09, 2008 at 07:46:08PM +0200, Robert Millan wrote:
> On Sat, Aug 09, 2008 at 05:15:07PM +0200, Raphael Champeimont (Almacha) wrote:
> > The best is you look at
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=362028 message #40.
> 
> Installing GRUB in a partition together with a filesystem usually works,
> but is not well supported, and in general I don't think it's a good idea.

(IME installing GRUB as a secondary boot loader is a very common
request, particularly from people who've put a lot of effort into the
boot loader currently in their MBR and get upset when a new operating
system installation decides to trash it. I don't much like it myself as
a setup, but I think if we want other systems to play nicely with us
then it behooves us to put some effort into playing nicely with them, so
I wouldn't like to remove this option from the installer. The
grub2/os-prober setup looks as though it should work well for a lot of
people, but it *is* still a bit "we can dual-boot as long as *I* run
things" :-), and not everyone likes that.)

I ran into this bug as https://bugs.launchpad.net/bugs/185878, and
finally got round to reproducing and investigating it myself today. It
seems to be fairly easy to reproduce as this sort of thing goes; just
doing a straightforward installation on JFS and telling grub-installer
to install to /dev/sda1 did it for me. Furthermore, if you zero out the
region occupied by the stage 1.5 using dd, you can reproduce it again
and again, which obviously made debugging rather easier.

Ironically, the problem is with GRUB's use of a couple of functions that
have this comment above them:

  /* Linux-only functions, because Linux has a bug that the disk cache for
     a whole disk is not consistent with the one for a partition of the
     disk.  */

Unfortunately, GRUB uses them in a way that runs straight into that very
problem: the buffer caches for two file descriptors open on /dev/sda and
/dev/sda1 are not necessarily coherent! GRUB opens /dev/sda, caches that
file descriptor, and then (as part of embed_func) opens /dev/sda1 and
writes the stage 1.5 to that. install_func then uses the cached file
descriptor that's still open on /dev/sda and tries to read the stage 1.5
back from it. Since the caches aren't coherent, that read may not see
what was just written through /dev/sda1, and the installation fails in
the way that's being observed.

The simple fix is to call the BLKFLSBUF ioctl on any cached file
descriptor open on the disk device after writing to the partition
device. The attached patch (to be installed in
debian/patches/cache_coherency.diff, with the obvious additional entry
at the end of debian/patches/00list) does this.

I haven't yet tested whether GRUB 2 suffers from the same problem, but
will do so.

Thanks,

-- 
Colin Watson                                       [cjwatson at ubuntu.com]





More information about the Pkg-grub-devel mailing list