[Pkg-xen-devel] Bug#1118711: xen-utils-common: Xen domains shutdown fails (/run/xen/qmp-libxl-* missing); mdadm not stopped, needs resync on boot

Wiebe Cazemier wiebe at halfgaar.net
Fri Oct 24 08:40:29 BST 2025


Package: xen-utils-common
Version: 4.17.5+23-ga4e5191dc0-1+deb12u1
Severity: normal
Tags: upstream

Dear Maintainer,

On a server using mdadm+LVM, with Xen domain storage on logical volumes,
an mdadm array was reported as dirty and needed a resync after the
server was shut down and started again. I was debating whether this
falls under the 'data loss' justification for severity 'serious', but
I'll let you decide.

Also, I know this is Debian 12 (oldstable), but because the problem is
so specific, I still wanted to report it.

The problem is that when xendomains is being stopped, it has apparently
lost control over two domains and can't shut them down. The following
log excerpt shows the QMP socket errors, followed by blkdeactivate
skipping deactivation of the volume groups:


# journalctl --since '2025-10-22 18:00:00' | grep -E '(blkdeactivate|xendomains|-- Boot)'
Oct 22 19:48:45 brick systemd[1]: Stopping xendomains.service - LSB: Start/stop secondary xen domains...
Oct 22 19:48:45 brick blkdeactivate[312015]: Deactivating block devices:
Oct 22 19:48:46 brick blkdeactivate[312015]:   [SKIP]: unmount of md1 (md1) mounted on [SWAP]
Oct 22 19:48:47 brick xendomains[312070]: libxl: error: libxl_qmp.c:1334:qmp_ev_lock_aquired: Domain 5:Failed to connect to QMP socket /var/run/xen/qmp-libxl-5: No such file or directory
Oct 22 19:48:47 brick xendomains[312070]: libxl: error: libxl_qmp.c:1334:qmp_ev_lock_aquired: Domain 6:Failed to connect to QMP socket /var/run/xen/qmp-libxl-6: No such file or directory
Oct 22 19:48:47 brick xendomains[312067]: Shutting down Xen domain geborsteldstaal (1)...
Oct 22 19:48:47 brick xendomains[312103]: Shutting down domain 1
Oct 22 19:48:47 brick xendomains[312067]: done.
Oct 22 19:48:47 brick xendomains[312067]: Shutting down Xen domain gold (2)...
Oct 22 19:48:47 brick xendomains[312105]: Shutting down domain 2
Oct 22 19:48:47 brick xendomains[312067]: done.
Oct 22 19:48:47 brick xendomains[312067]: Shutting down Xen domain meel (3)...
Oct 22 19:48:47 brick xendomains[312107]: Shutting down domain 3
Oct 22 19:48:47 brick xendomains[312067]: done.
Oct 22 19:48:47 brick xendomains[312067]: Shutting down Xen domain wood (4)...
Oct 22 19:48:47 brick xendomains[312109]: Shutting down domain 4
Oct 22 19:48:47 brick xendomains[312067]: done.
Oct 22 19:48:47 brick blkdeactivate[312015]:   [UMOUNT]: unmounting big-decrypted (dm-12) mounted on /mnt/big... done
Oct 22 19:48:48 brick xendomains[312113]: libxl: error: libxl_qmp.c:1334:qmp_ev_lock_aquired: Domain 5:Failed to connect to QMP socket /var/run/xen/qmp-libxl-5: No such file or directory
Oct 22 19:48:48 brick xendomains[312113]: libxl: error: libxl_qmp.c:1334:qmp_ev_lock_aquired: Domain 6:Failed to connect to QMP socket /var/run/xen/qmp-libxl-6: No such file or directory
Oct 22 19:48:49 brick blkdeactivate[312015]:   [UMOUNT]: unmounting md0 (md0) mounted on /boot... done
Oct 22 19:48:49 brick blkdeactivate[312015]:   [SKIP]: unmount of md2 (md2) mounted on /
Oct 22 19:48:49 brick blkdeactivate[312015]:   [MD]: deactivating raid1 device md0... done
Oct 22 19:49:00 brick blkdeactivate[312015]:   [DM]: deactivating crypt device big-decrypted (dm-12)... done
Oct 22 19:49:02 brick blkdeactivate[312015]:   [LVM]: deactivating Volume Group universe2... skipping
Oct 22 19:49:03 brick blkdeactivate[312015]:   [LVM]: deactivating Volume Group universe... skipping
Oct 22 19:50:20 brick xendomains[312112]: Waiting for Xen domain geborsteldstaal (1) to shut down.................................................................................................................................................................................................................................................................................done.
Oct 22 19:50:20 brick xendomains[312112]: Waiting for Xen domain gold (2) to shut down...done.
Oct 22 19:50:20 brick xendomains[312112]: Waiting for Xen domain meel (3) to shut down...done.
Oct 22 19:50:20 brick xendomains[312112]: Waiting for Xen domain wood (4) to shut down...done.
Oct 22 19:50:20 brick systemd[1]: xendomains.service: Deactivated successfully.
Oct 22 19:50:20 brick systemd[1]: xendomains.service: Unit process 1769 (xl) remains running after unit stopped.
Oct 22 19:50:20 brick systemd[1]: xendomains.service: Unit process 312720 (xl) remains running after unit stopped.
Oct 22 19:50:20 brick systemd[1]: Stopped xendomains.service - LSB: Start/stop secondary xen domains.
Oct 22 19:50:20 brick systemd[1]: xendomains.service: Consumed 4min 48.836s CPU time.
-- Boot 7cd2b4335f3d4f8aa735a24b9b57dae6 --


Note that the array in question was /dev/md3, which is not mentioned in
this excerpt before the reboot. I'm not sure why.

Note the errors about /var/run/xen/qmp-libxl-*. I happen to know which
domains had IDs 5 and 6, and they were indeed online and operating
normally before the shutdown.
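
A minimal sketch of how one could check for this state before a
shutdown: it compares the domain IDs in `xl list` output against the
qmp-libxl-<id> entries in the socket directory. The function name
missing_qmp_sockets is made up for illustration, and the column layout
(name first, ID second, one header line) is an assumption about the
`xl list` output. As far as I understand, domains without a QEMU device
model have no QMP socket at all, so a hit is only a hint, not proof of
the bug.

```shell
#!/bin/sh
# Hypothetical helper (not part of xen-utils-common): read `xl list`-style
# output on stdin and print "<id> <name>" for every guest that has no
# qmp-libxl-<id> entry in the given directory (default: /run/xen).
missing_qmp_sockets() {
    dir=${1:-/run/xen}
    # Skip the header line and Domain-0 (ID 0).
    # Assumed layout: column 1 is the domain name, column 2 the domain ID.
    awk 'NR > 1 && $2 != 0 { print $2, $1 }' |
    while read -r id name; do
        # Report the domain when the expected socket path does not exist.
        [ -e "$dir/qmp-libxl-$id" ] || printf '%s %s\n' "$id" "$name"
    done
}

# On a live dom0, as root, this would be used as:
#   xl list | missing_qmp_sockets /run/xen
```

Empty output would mean every listed guest still has its socket entry.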

Because they were still online, the md+LVM stack could not be stopped,
and the shutdown proceeded anyway. On the next boot, the array was
marked as dirty and started resyncing, as the kernel log shows:


# journalctl --since '2025-10-22 18:00:00' | grep -E 'md3'
Oct 22 20:14:07 brick kernel: md/raid1:md3: not clean -- starting background reconstruction
Oct 22 20:14:07 brick kernel: md/raid1:md3: active with 2 out of 2 mirrors
Oct 22 20:14:07 brick kernel: md3: detected capacity change from 0 to 3800903680
Oct 22 20:14:12 brick lvm[655]: PV /dev/md3 online, VG universe is complete.
Oct 22 20:14:47 brick kernel: md: resync of RAID array md3
Oct 23 01:02:16 brick kernel: md: md3: resync done.
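
To put a number on the cost of the resync, the two timestamps above can
be subtracted. This is only a sketch: the helper name elapsed_hms is
made up, the year 2025 is assumed because the journal lines do not carry
one, and `date -d` is the GNU coreutils form.

```shell
#!/bin/sh
# Hypothetical helper: print the time between two timestamps as XhYYmZZs.
# Both timestamps are parsed as UTC so that no DST change can skew the result.
elapsed_hms() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    secs=$((end - start))
    printf '%dh%02dm%02ds\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

# "resync of RAID array md3" until "md3: resync done." (year assumed):
elapsed_hms '2025-10-22 20:14:47' '2025-10-23 01:02:16'
```

This prints 4h47m29s, so the array spent close to five hours resyncing
after the unclean stop.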


I have no idea how to reproduce it, and because this is a production
server, I can't really experiment with it.

Perhaps there is also a bug elsewhere, in blkdeactivate: should it
force all volume groups to deactivate, so that the md arrays can be
stopped?
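
For context, the "skipping" lines are presumably printed because
logical volumes in those volume groups were still held open by the
running domains. A rough sketch for listing such volumes follows; it
relies on the sixth character of lv_attr being 'o' for an open device,
and the function name open_lvs is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `lvs --noheadings -o vg_name,lv_name,lv_attr`
# output on stdin and print "vg/lv" for every logical volume whose device
# is open (sixth character of lv_attr is "o").
open_lvs() {
    awk 'substr($3, 6, 1) == "o" { print $1 "/" $2 }'
}

# On the affected host, as root, this would be used as:
#   lvs --noheadings -o vg_name,lv_name,lv_attr | open_lvs
```

Any volume listed this way would keep its volume group, and the md
array underneath it, from being deactivated cleanly.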




-- System Information:
Debian Release: 12.12
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable-security'), (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-40-amd64 (SMP w/8 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages xen-utils-common depends on:
ii  libc6                      2.36-9+deb12u13
ii  libxenhypfs1               4.17.5+23-ga4e5191dc0-1+deb12u1
ii  libxenstore4               4.17.5+23-ga4e5191dc0-1+deb12u1
ii  lsb-base                   11.6
ii  python3                    3.11.2-1+b1
ii  sysvinit-utils [lsb-base]  3.06-4
ii  ucf                        3.0043+nmu1+deb12u1
ii  udev                       252.39-1~deb12u1
ii  xenstore-utils             4.17.5+23-ga4e5191dc0-1+deb12u1

xen-utils-common recommends no packages.

Versions of packages xen-utils-common suggests:
pn  xen-doc  <none>

-- Configuration Files:
/etc/default/xendomains changed [not included]
/etc/xen/xend-config.sxp changed [not included]

-- no debconf information


