[Pkg-libvirt-maintainers] Bug#1120389: qemu: Deleting snapshot results in unexpected disk usage when detect-zeroes=off (Was: Re: Bug#1120389: libvirt-daemon: libvirt disables detect_zeroes, resulting in denial of service during blockcommit/snapshot-delete)
Andrea Bolognani
eof at kiyuko.org
Thu Nov 13 22:55:09 GMT 2025
Control: notfound -1 libvirt/13.3.0-3
Control: reassign -1 qemu
Control: retitle -1 qemu: Deleting snapshot results in unexpected disk usage when detect-zeroes=off
Control: found -1 qemu/1:10.0.3+ds-0+deb13u1
Control: notfound -1 qemu/1:10.1.2+ds-3
I have discussed the matter with the upstream developers for libvirt,
and the consensus seems to be that there are no security implications
for the behavior you reported: any user who is authorized to create
snapshots can easily completely fill up the underlying filesystem by
creating an arbitrarily large number of them, without resorting to
snapshot deletion shenanigans. The appropriate approach is to prevent
untrusted users from creating snapshots via the usual ACL mechanism.
Regardless of the security impact, or lack thereof, the behavior is
unquestionaly surprising and undesirable.
I was able to trivially reproduce the issue on trixie by importing a
Fedora Cloud image and following your instructions. However, on a
different machine running sid, as well as on a Fedora 43 machine,
things work as one would expect instead.
After some tests, I have identified QEMU rather than libvirt as the
culprit: on the sid machine, downgrading QEMU to the version from
trixie causes the issue to show up, whereas downgrading libvirt while
leaving QEMU alone doesn't make that happen.
It looks like the problem has been addressed upstream at some point
between QEMU 10.0 and 10.1. I'll let you and the QEMU maintainer
track down the fix and decide whether it's feasible to backport it to
trixie.
On Sat, Nov 08, 2025 at 12:21:09PM -0500, zrm wrote:
> Package: libvirt-daemon
> Version: 11.3.0-3
> Severity: important
> Tags: security
> X-Debbugs-Cc: debbug at trustiosity.com, Debian Security Team <team at security.debian.org>
>
> Context: quick review of what happens if you 'virsh snapshot-delete' an external disk snapshot when the virtual disk has the properties discard='unmap' and detect_zeroes='off':
>
> # # We're going to make this disk image small so we don't have a Bad Day
> # qemu-img create -f qcow2 /var/lib/libvirt/images/trogdor.qcow2 5G
> # # [put a base install of Debian or similar into this disk image]
> # grep -A3 "device='disk'" trogdor.xml
> <disk type='file' device='disk'>
> <driver name='qemu' type='qcow2' discard='unmap' detect_zeroes='off'/>
> <source file='/var/lib/libvirt/images/trogdor.qcow2'/>
> <target dev='vda' bus='virtio'/>
> # virsh define trogdor.xml
> Domain 'trogdor' defined from trogdor.xml
> # virsh start trogdor
> Domain 'trogdor' started
> # # Make a few snapshots and have the guest TRIM the filesystem in the second one
> # N=1; virsh snapshot-create-as --domain trogdor snap$N --diskspec vda,file=/var/lib/libvirt/images/trogdor-snap${N}.qcow2 --disk-only
> Domain snapshot snap1 created
> # N=2; virsh snapshot-create-as --domain trogdor snap$N --diskspec vda,file=/var/lib/libvirt/images/trogdor-snap${N}.qcow2 --disk-only
> Domain snapshot snap2 created
> # virsh domfstrim trogdor
> # N=3; virsh snapshot-create-as --domain trogdor snap$N --diskspec vda,file=/var/lib/libvirt/images/trogdor-snap${N}.qcow2 --disk-only
> Domain snapshot snap3 created
> # # Now delete the second one
> # du -h /var/lib/libvirt/images/trogdor*.qcow2
> 1.4G /var/lib/libvirt/images/trogdor.qcow2
> 324K /var/lib/libvirt/images/trogdor-snap1.qcow2
> 2.1M /var/lib/libvirt/images/trogdor-snap2.qcow2
> 456K /var/lib/libvirt/images/trogdor-snap3.qcow2
> # virsh snapshot-delete trogdor snap2
> Domain snapshot snap2 deleted
> # du -h /var/lib/libvirt/images/trogdor*.qcow2
> 1.4G /var/lib/libvirt/images/trogdor.qcow2
> 3.0G /var/lib/libvirt/images/trogdor-snap1.qcow2
> 456K /var/lib/libvirt/images/trogdor-snap3.qcow2
>
> Oops. The blockcommit implicit in deleting the snapshot turned all the TRIM'd zeroes into actual zeros because detect_zeroes is off. Which is probably a bug in itself; it would make sense to have discard='unmap' set detect_zeroes='unmap' during blockcommit/blockpull even if it doesn't the rest of the time. This is obviously a much bigger problem when VMs have 3TB of free space instead of 3GB. We must never do that. Whereas this is what happens (or should happen) if you do the same thing with detect_zeroes='unmap':
>
> # grep -A3 "device='disk'" trogdor.xml
> <disk type='file' device='disk'>
> <driver name='qemu' type='qcow2' discard='unmap' detect_zeroes='unmap'/>
> <source file='/var/lib/libvirt/images/trogdor.qcow2'/>
> <target dev='vda' bus='virtio'/>
> # virsh define trogdor.xml
> Domain 'trogdor' defined from trogdor.xml
> # virsh start trogdor
> Domain 'trogdor' started
> # # Make a few snapshots and have the guest TRIM the filesystem in the second one
> # N=1; virsh snapshot-create-as --domain trogdor snap$N --diskspec vda,file=/var/lib/libvirt/images/trogdor-snap${N}.qcow2 --disk-only
> Domain snapshot snap1 created
> # # IT said try turning it off and on again
> # virsh shutdown trogdor
> Domain 'trogdor' is being shutdown
> # # wait for it to shut down
> # virsh start trogdor
> Domain 'trogdor' started
> # N=2; virsh snapshot-create-as --domain trogdor snap$N --diskspec vda,file=/var/lib/libvirt/images/trogdor-snap${N}.qcow2 --disk-only
> Domain snapshot snap2 created
> # virsh domfstrim trogdor
> # N=3; virsh snapshot-create-as --domain trogdor snap$N --diskspec vda,file=/var/lib/libvirt/images/trogdor-snap${N}.qcow2 --disk-only
> Domain snapshot snap3 created
> # du -h /var/lib/libvirt/images/trogdor*.qcow2
> 1.4G /var/lib/libvirt/images/trogdor.qcow2
> 5.4M /var/lib/libvirt/images/trogdor-snap1.qcow2
> 3.9M /var/lib/libvirt/images/trogdor-snap2.qcow2
> 200K /var/lib/libvirt/images/trogdor-snap3.qcow2
> # virsh snapshot-delete trogdor snap2
> Domain snapshot snap2 deleted
> # du -h /var/lib/libvirt/images/trogdor*.qcow2
> 1.4G /var/lib/libvirt/images/trogdor.qcow2
> 5.8M /var/lib/libvirt/images/trogdor-snap1.qcow2
> 1.6M /var/lib/libvirt/images/trogdor-snap3.qcow2
>
> Now the merged snapshot is a much more appropriate size. Notice the VM shutdown and start between creating the first snapshot and the second. That's to work around the bug, because the bug is that detect_zeroes isn't enabled when it ought to be. It's related to these lines in qemuDomainPrepareDiskSourceData() in src/qemu/qemu_domain.c:
>
> /* transfer properties valid only for the top level image */
> if (src == disk->src || src == disk->src->dataFileStore)
> src->detect_zeroes = disk->detect_zeroes;
>
> The conditional here might have been crafted from a place of love, but it ain't right. The function is meant to copy properties from 'disk' to 'src'. What the detect_zeroes property does is check if data being written is all zeroes and then if it is, uses unmap instead of actually writing the zeroes. Typically writes only go to the top-level image so if none of the others ever get written to then they don't need detect_zeroes to be enabled, right? That's the theory?
>
> There are three problems with this. The first is that it's apparently fruitless. Enabling detect_zeroes is somewhat computationally expensive so the theory might be to turn it off for read-only images where it isn't needed, but if they're actually read-only then they shouldn't be getting any writes anyway and then there shouldn't be any computation to be saved by turning it off.
>
> The second is that the backing files aren't always read-only, and in particular live blockcommit writes to them and very much wants detect_zeroes='unmap'. This is where you get the denial of service, because it happens during live snapshot deletion. A user could create arbitrarily many snapshots of trivial size and TRIM them without exceeding a storage quota and the disk usage would only increase by a factor of a thousand when deleting, which renders adversarial the operation that should normally be expected to free up space. Your storage gets filled with zeroes even when you set detect_zeroes='unmap' and your VMs crash because your host filesystem is completely full.
>
> The third is that the check has a TOCTOU fail which is triggered during live snapshot creation. It seems to do what it claims to during guest boot but during snapshot creation that function is called against a new snapshot which is about to be the top level image but 'disk->src' isn't set to 'src' until qemuSnapshotDiskUpdateSource() which is after the call to qemuDomainPrepareDiskSourceData(). With the result that detect_zeroes for what is about to be the top-level image is left as its default value, which is 'off'. So detect_zeroes gets erroneously disabled for the live top-level image whenever you create a live snapshot. It also stays enabled for what was the top-level image when the guest booted. This is why snapshot-delete "worked" after shutting down the guest so it could boot with the top-level snapshot as the one we would ultimately merge the deleted one into -- detect_zeroes at that point was still 'unmap' for snap1 instead of the then-top level snap3.
>
> All of this seems to be sorted by removing the conditional in qemuDomainPrepareDiskSourceData() so that 'src->detect_zeroes = disk->detect_zeroes;' happens unconditionally. That and updating a few tests in tests/qemublocktestdata/xml2json to expect detect_zeroes to be present in the relevant JSON causes detect_zeroes='unmap' to work as expected with blockcommit/snapshot-delete.
>
> If for some reason that change isn't desirable, the most irksome of the trouble could be eliminated by having qemuBlockCommit(), and therefore snapshot-delete, temporarily set detect_zeroes='unmap' on the commit target destination image whenever discard='unmap' is set. This would be good to do even if the conditional in qemuDomainPrepareDiskSourceData() *is* removed, because blockcommit causes disk images to explode with zeroes whenever discard='unmap' and detect_zeroes isn't, and that isn't desirable even if you don't want to constantly pay the cost of enabling detect_zeroes on the live top image. Or to put it another way, you shouldn't have to enable it on the live top image if you only want it during blockcommit/snapshot-delete, but you will want it during blockcommit whenever discard is enabled.
>
> -- System Information:
> Debian Release: 13.1
> APT prefers stable-updates
> APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
> Architecture: amd64 (x86_64)
>
> Kernel: Linux 6.12.48+deb13-amd64 (SMP w/4 CPU threads; PREEMPT)
> Kernel taint flags: TAINT_FIRMWARE_WORKAROUND
> Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
> Shell: /bin/sh linked to /usr/bin/dash
> Init: systemd (via /run/systemd/system)
> LSM: AppArmor: enabled
>
> Versions of packages libvirt-daemon depends on:
> ii libc6 2.41-12
> ii libgcc-s1 14.2.0-19
> ii libglib2.0-0t64 2.84.4-3~deb13u1
> ii libtirpc3t64 1.3.6+ds-1
> ii libvirt-common 11.3.0-3
> ii libvirt-daemon-common 11.3.0-3
> ii libvirt-daemon-log 11.3.0-3
> ii libvirt0 11.3.0-3
> ii libxml2 2.12.7+dfsg+really2.9.14-2.1+deb13u1
> ii logrotate 3.22.0-1
>
> Versions of packages libvirt-daemon recommends:
> ii libvirt-daemon-driver-interface 11.3.0-3
> ii libvirt-daemon-driver-network 11.3.0-3
> ii libvirt-daemon-driver-nodedev 11.3.0-3
> ii libvirt-daemon-driver-nwfilter 11.3.0-3
> ii libvirt-daemon-driver-qemu 11.3.0-3
> ii libvirt-daemon-driver-secret 11.3.0-3
> ii libvirt-daemon-driver-storage 11.3.0-3
> ii libvirt-daemon-driver-storage-disk 11.3.0-3
> ii libvirt-daemon-driver-storage-iscsi 11.3.0-3
> ii libvirt-daemon-driver-storage-logical 11.3.0-3
> ii libvirt-daemon-driver-storage-mpath 11.3.0-3
> ii libvirt-daemon-driver-storage-scsi 11.3.0-3
> ii libvirt-daemon-lock 11.3.0-3
> ii libvirt-daemon-plugin-lockd 11.3.0-3
>
> Versions of packages libvirt-daemon suggests:
> ii libvirt-daemon-driver-lxc 11.3.0-3
> pn libvirt-daemon-driver-storage-gluster <none>
> pn libvirt-daemon-driver-storage-iscsi-direct <none>
> pn libvirt-daemon-driver-storage-rbd <none>
> pn libvirt-daemon-driver-storage-zfs <none>
> ii libvirt-daemon-driver-vbox 11.3.0-3
> ii libvirt-daemon-driver-xen 11.3.0-3
> ii libvirt-daemon-system 11.3.0-3
>
> -- no debconf information
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-libvirt-maintainers/attachments/20251113/a70adfae/attachment.sig>
More information about the Pkg-libvirt-maintainers
mailing list