RFC: Improving recovery from kernel or initramfs upgrade failure

Ben Hutchings ben at decadent.org.uk
Sun Apr 17 21:07:37 UTC 2016


Sometimes a kernel or initramfs-tools/dracut upgrade will result in
failure to boot, and the previous kernel or initramfs will no longer be
available as a fallback.

- update-initramfs has a config option, backup_initramfs, but it is off
  by default.  Changing that default may cause /boot to fill up.
- Kernel upgrades that involve an ABI bump don't overwrite the previous
  kernel, but most upgrades within a stable release don't bump the ABI.

We need a general way to preserve the old kernel, modules and initramfs
until we know that the new ones are good.  This may require user
interaction with the boot loader, and that won't be possible with every
boot loader, but we can cover most systems by making this work in GRUB.

Kernel image:
- Whenever we replace the kernel that was used for the current boot,
  keep the old kernel image as a backup
- Whenever we boot successfully with the primary kernel for a given
  kernel version string, delete the backup kernel (unless configured
  not to)
- How to identify which kernel was used?
  - Maybe by `uname -v`
  - Maybe by BOOT_IMAGE on /proc/cmdline (but this is GRUB specific)
- Rescue boot entry selects the backup kernel

Initramfs:
- Whenever we rebuild the initramfs that was used for the current boot,
  keep the old one as a backup even if backup_initramfs is disabled
- Whenever we boot successfully with the primary initramfs for a kernel
  version and backup_initramfs is disabled, delete the backup initramfs
- How to identify which initramfs was used?
  - Include a UUID in each initramfs
  - Copy it to /run/initramfs at boot
  - Keep a mapping of UUID to filename & hash somewhere in /var
- Rescue boot entry selects the backup initramfs
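
The UUID lookup and the deletion rule above could be sketched roughly as
follows.  The mapping layout (UUID to a record with "filename" and
"sha256" keys) and the function names are assumptions made for
illustration; the proposal does not fix a format:

```python
from typing import Dict, Optional

# Hypothetical mapping format: UUID -> {"filename": ..., "sha256": ...}
InitramfsMap = Dict[str, Dict[str, str]]


def booted_initramfs(mapping: InitramfsMap, booted_uuid: str) -> Optional[str]:
    """Return the filename recorded for the booted UUID, or None."""
    entry = mapping.get(booted_uuid)
    return entry["filename"] if entry else None


def may_delete_backup(mapping: InitramfsMap, booted_uuid: str,
                      primary_filename: str, backup_initramfs: bool) -> bool:
    """Apply the rule: delete the backup only after a boot with the
    primary initramfs, and only if backup_initramfs is disabled."""
    if backup_initramfs:
        return False
    # An unknown UUID never matches, so the backup is kept in that case.
    return booted_initramfs(mapping, booted_uuid) == primary_filename
```

Here booted_uuid stands for whatever was copied to /run/initramfs at
boot.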

Modules:
- Whenever we replace the modules that are currently used, link the old
  ones into a backup directory under /lib/modules/<kversion>
- Some new modules may fail to load on top of the old kernel image, and
  network drivers may not be included in the initramfs, so it may be
  difficult to install the old kernel package
- Whenever we boot successfully, delete the backup modules
- Rescue boot adds a configuration file under /run/depmod.d that puts
  the backup directory at the front of the search path, then runs
  depmod
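
As a rough illustration of the depmod.d override, the sketch below
renders a `search` line that lists the backup directory first.  The
subdirectory name "backup" is hypothetical, and the default of
"updates built-in" for the existing search order is an assumption that
should be checked against the installed depmod configuration:

```python
from typing import Sequence


def render_depmod_override(
    backup_subdir: str = "backup",
    existing_search: Sequence[str] = ("updates", "built-in"),
) -> str:
    """Return depmod.d config text that searches backup_subdir first.

    depmod gives the highest priority to the first directory listed
    after the "search" keyword, so the backup directory goes in front
    of the existing entries.
    """
    dirs = [backup_subdir] + [d for d in existing_search if d != backup_subdir]
    return "search " + " ".join(dirs) + "\n"
```

A rescue boot would write this text to a file under /run/depmod.d and
then run depmod for the booted kernel version.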

Did I miss anything?  Does this look workable?

Ben.


-- 
Ben Hutchings
Make three consecutive correct guesses and you will be considered an expert.