Bug#239111: Grub is shockingly bad code

Robert McQueen robot101 at hadesian.co.uk
Mon Jan 12 19:57:44 UTC 2009


Robert Millan wrote:
> Grmf.  I was making wrong assumptions.  This is not about block lists
> (I still think block lists suck, but let's be fair...):
...
> So we freeze the filesystem and afterwards try to write to it.  Not a
> good idea...

Indeed not.

> #239111 initial report claims GRUB hangs before we added
> xfs_freeze.diff, but according to what Rob says, not freezing makes it
> work for him.

I didn't test that, I was basing on my experiences of installing GRUB
manually. I usually do:

mkdir /boot/grub
cp -a /usr/lib/grub/i386-pc/* /boot/grub
vi /boot/grub/device.map
grub --device-map=/boot/grub/device.map
# doesn't work because XFS hasn't put /boot/grub onto disk ...
# swear a bit
xfs_freeze -f /
xfs_freeze -u /
grub --device-map=/boot/grub/device.map
# works
update-grub
# makes me a menu.lst file

This never froze for me.

> Rob, can you test removing the patch completely?

I suppose, it would at least make it fail rather than fuck my system,
but based on my understanding of the problem, won't make installations
on XFS work reliably.

> I'm almost certain it's the fwrite() call in install_func.  I could be
> wrong, but I still don't see why we would want to use xfs_freeze anyway.

You use xfs_freeze to force XFS to flush the metadata for the newly
created stage* files out of the journal and into the directory entries
themselves. sync() puts the data on disk and the metadata in the
journal, but xfs_freeze gives the extra guarantee of the journal being
clean and empty so the filesystem can be shapshotted and mounted as
read-only (with no journal replay).

> And if we _really_ need xfs_freeze, it means we're doing something wrong.

I thought the wrongness was well known: the GRUB shell tries to read the
block device directly and traverse the filesystem directories itself to
to find which blocks on the disk are where the stage* files are located.
XFS considers putting the metadata into the journal an acceptable
implementation of sync(), but grub's filesystem reading code ignores the
journal, so can't find the files. If the shell used FIBMAP ioctl to find
the blocks when running under the OS, it'd always find them and all
would be sweetness and light. Thats what Steve was trying to do, I
think, before he gave up and sent the OP on this thread.

Regards,
Rob





More information about the Pkg-grub-devel mailing list