Bug#508834: grub-common: grub-probe is painfully slow to execute due to excessive ioctl(BLKFLSBUF)

Lars Ellenberg lars.ellenberg at linbit.com
Tue Jan 12 20:12:09 UTC 2010


as I just had to run update-grub quite a few times,
hit this bug, and found it somewhat annoying,
I tried to track it down.

	# time update-grub
	Searching for GRUB installation directory ... found: /boot/grub
	Searching for default file ... found: /boot/grub/default
	Testing for an existing GRUB menu.lst file ... found: /boot/grub/menu.lst
	Searching for splash image ... found: /boot/grub/splash.xpm.gz
	Found kernel: ...
	Updating /boot/grub/menu.lst ... done


	real    1m18.955s
	user    0m2.612s
	sys     0m41.603s


80 seconds wall clock time.
doh.
(yes, I did that several times, and it is consistent)

my root is on LVM, btw.

strace and gdb revealed
that it spends much of its time repeatedly
calling open_device for my PV.
basically you get this access pattern:
	open()
	getgeom
	flushbufs (that is the ioctl(BLKFLSBUF))
	lseek
	read 4k
	close

repeated about 32 thousand times ;)

I tracked this down to
  grub_lvm_scan_device,
  disk/lvm.c:289   err = grub_disk_read (disk, 0, mda_offset, mda_size, metadatabuf);

or, more precisely, to this loop in grub_disk_read:
  /* Until SIZE is zero...  */
  while (size)
    { 
      char *data;
      grub_disk_addr_t start_sector;
      grub_size_t len;
      grub_size_t pos;

      /* For reading bulk data.  */
      start_sector = sector & ~(GRUB_DISK_CACHE_SIZE - 1);
      pos = (sector - start_sector) << GRUB_DISK_SECTOR_BITS;
      len = ((GRUB_DISK_SECTOR_SIZE << GRUB_DISK_CACHE_BITS)
             - pos - real_offset);
      if (len > size)
        len = size;

      /* Fetch the cache.  */
      data = grub_disk_cache_fetch (disk->dev->id, disk->id, start_sector);
      if (data)
        ...
      else
        {
          /* Otherwise read data from the disk actually.  */
          if ((disk->dev->read) (disk, start_sector,
                                 GRUB_DISK_CACHE_SIZE, tmp_buf)
              != GRUB_ERR_NONE)

now, guess what GRUB_DISK_CACHE_SIZE is?  right, 8 sectors of 512 bytes, i.e. 4 KiB.
and what happens to be my mda_size? hm, happens to be 128 MiB.

so I changed:
# cat debian/patches/00_increase_grub_disk_cache.diff
--- a/include/grub/disk.h.orig  2010-01-12 19:30:32.125277561 +0100
+++ b/include/grub/disk.h       2010-01-12 19:33:47.261239353 +0100
@@ -131,8 +131,8 @@
 #define GRUB_DISK_CACHE_NUM    1021

 /* The size of a disk cache in sector units.  */
-#define GRUB_DISK_CACHE_SIZE   8
-#define GRUB_DISK_CACHE_BITS   3
+#define GRUB_DISK_CACHE_BITS   11
+#define GRUB_DISK_CACHE_SIZE   (1<<GRUB_DISK_CACHE_BITS)

 /* This is called from the memory manager.  */
 void grub_disk_cache_invalidate_all (void);


rebuilt, installed:
root@rum:~/src# dpkg -i grub-common_1.96+20080724-16.1_amd64.deb

timed again:
	# time update-grub
	...
	Updating /boot/grub/menu.lst ... done


	real    0m13.463s
	user    0m1.668s
	sys     0m5.592s


not too bad for a "one line" patch.
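
for the record, here is a rough sketch of the arithmetic behind those numbers.
it assumes 512-byte sectors and that every cache-entry-sized chunk of the
metadata area is a cache miss, so each chunk costs one full
open/flush/read/close round trip. the helper names are made up for this
mail, they are not GRUB functions.

```c
#include <stdint.h>

/* Assumption: 512-byte sectors, i.e. GRUB_DISK_SECTOR_BITS == 9. */
#define SECTOR_BITS 9

/* Bytes covered by one cache entry for a given GRUB_DISK_CACHE_BITS. */
static uint64_t cache_entry_bytes(unsigned cache_bits)
{
    return (uint64_t)1 << (SECTOR_BITS + cache_bits);
}

/* Device reads needed to cover size_bytes when every
   cache-entry-sized chunk is a miss (rounded up). */
static uint64_t reads_needed(uint64_t size_bytes, unsigned cache_bits)
{
    uint64_t chunk = cache_entry_bytes(cache_bits);
    return (size_bytes + chunk - 1) / chunk;
}
```

with a 128 MiB metadata area, reads_needed((uint64_t)128 << 20, 3) comes out
at 32768, which matches the "about 32 thousand" seen in strace, and with
GRUB_DISK_CACHE_BITS at 11 (1 MiB per entry) it drops to 128.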


comments welcome. maybe this approach is not the best, maybe 1 MiB is overkill,
maybe I forgot to adjust some other "grub disk cache" internals...

maybe rather (or additionally) keep a single static last-used cache entry
inside open_device(), which could save a lot of syscalls as well.
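
to make that idea a bit more concrete, a minimal sketch of what I mean,
assuming it is acceptable to keep the descriptor open between calls.
open_device_cached, last_path, last_fd and real_opens are hypothetical names,
this is not the actual open_device() from the grub source.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not GRUB's open_device(): remembers the last
   path opened and hands back the same descriptor on a repeat call. */
static char last_path[4096];
static int last_fd = -1;
static unsigned long real_opens;   /* how often open(2) was really called */

static int open_device_cached(const char *path)
{
    if (last_fd >= 0 && strcmp(path, last_path) == 0)
        return last_fd;            /* hit: no open/ioctl/close at all */

    if (strlen(path) >= sizeof(last_path))
        return -1;                 /* path too long to remember */

    if (last_fd >= 0) {
        close(last_fd);            /* evict the previously cached device */
        last_fd = -1;
    }

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    real_opens++;
    strcpy(last_path, path);       /* safe: length checked above */
    last_fd = fd;
    return fd;
}
```

the caller must then not close() the returned descriptor, and the getgeom and
flushbufs calls would only run on a real open. whether skipping the flush on
a cache hit is safe for grub-probe is something I have not checked.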

also, as has been mentioned before, some special mode for grub-probe
would be great, so that it need not be called five times from update-grub.

what does grub upstream have to say about this?


Cheers,

	Lars




