[Pkg-zfsonlinux-devel] Bug#1086617: bookworm-pu: package zfs-linux/2.1.11-1+deb12u1
Shengqi Chen
harry-chen at outlook.com
Sat Nov 2 08:04:31 GMT 2024
Package: release.debian.org
Severity: normal
Tags: bookworm
User: release.debian.org at packages.debian.org
Usertags: pu
X-Debbugs-Cc: zfs-linux at packages.debian.org, aron at debian.org
Control: affects -1 + src:zfs-linux
Control: block 1063497 by -1
Control: block 1069125 by -1
[ Reason ]
zfs in bookworm (2.1.11) suffers from CVE-2023-49298 (a data corruption
issue) and CVE-2013-20001 (an NFS sharing security issue).
[ Impact ]
Bookworm systems are still affected by these CVEs, although the risks are not grave.
[ Tests ]
These patches have been in unstable / bookworm-backports for a long time.
No problems have been reported against them.
[ Risks ]
Risks are minimal: I only cherry-pick fixes, with no functional changes.
[ Checklist ]
[x] *all* changes are documented in the d/changelog
[x] I reviewed all changes and I approve them
[x] attach debdiff against the package in (old)stable
[x] the issue is verified as fixed in unstable
[ Changes ]
* dch: typo fix
* New symbols for libzfs4linux and libzpool5linux
(missing in last upload)
* d/patches: cherry-pick upstream fixes for stability issues
+ fix dnode dirty test (Closes: #1056752, #1063497, CVE-2023-49298)
+ fix sharenfs IPv6 address parsing (Closes: CVE-2013-20001)
+ and some fixes related to NULL pointer, memory allocation, etc.
[ Other info ]
This request is similar to #1042730, but removes many non-essential
patches. The remaining patches contain ~100 LOC of pure code changes;
most of the other changes are commit messages.
--
Thanks,
Shengqi Chen
-------------- next part --------------
diff -Nru zfs-linux-2.1.11/debian/changelog zfs-linux-2.1.11/debian/changelog
--- zfs-linux-2.1.11/debian/changelog 2023-04-23 17:29:38.000000000 +0800
+++ zfs-linux-2.1.11/debian/changelog 2024-11-02 15:34:23.000000000 +0800
@@ -1,3 +1,14 @@
+zfs-linux (2.1.11-1+deb12u1) UNRELEASED; urgency=medium
+
+ * dch: typo fix
+ * New symbols for libzfs4linux and libzpool5linux
+ * d/patches: cherry-pick upstream fixes for stability issues
+ + fix dnode dirty test (Closes: #1056752, #1063497, CVE-2023-49298)
+ + fix sharenfs IPv6 address parsing (Closes: CVE-2013-20001)
+ + and some fixes related to NULL pointer, memory allocation, etc.
+
+ -- Shengqi Chen <harry-chen at outlook.com> Sat, 02 Nov 2024 15:34:23 +0800
+
zfs-linux (2.1.11-1) unstable; urgency=medium
[ Mo Zhou ]
@@ -5,7 +16,7 @@
[ Aron Xu ]
* New upstream stable point release version 2.1.11
- * Drop patches that are alreay in upstream stable release
+ * Drop patches that are already in upstream stable release
-- Aron Xu <aron at debian.org> Sun, 23 Apr 2023 17:29:38 +0800
diff -Nru zfs-linux-2.1.11/debian/libzfs4linux.symbols zfs-linux-2.1.11/debian/libzfs4linux.symbols
--- zfs-linux-2.1.11/debian/libzfs4linux.symbols 2023-04-17 12:44:44.000000000 +0800
+++ zfs-linux-2.1.11/debian/libzfs4linux.symbols 2024-11-02 15:27:19.000000000 +0800
@@ -102,6 +102,7 @@
snapshot_namecheck at Base 2.0
spa_feature_table at Base 0.8.2
unshare_one at Base 2.0
+ use_color at Base 2.1.11
zcmd_alloc_dst_nvlist at Base 0.8.2
zcmd_expand_dst_nvlist at Base 0.8.2
zcmd_free_nvlists at Base 0.8.2
@@ -386,6 +387,7 @@
zpool_vdev_path_to_guid at Base 0.8.2
zpool_vdev_remove at Base 0.8.2
zpool_vdev_remove_cancel at Base 0.8.2
+ zpool_vdev_remove_wanted at Base 2.1.11
zpool_vdev_split at Base 0.8.2
zpool_wait at Base 2.0
zpool_wait_status at Base 2.0
@@ -678,6 +680,8 @@
zfs_niceraw at Base 2.0
zfs_nicetime at Base 2.0
zfs_resolve_shortname at Base 2.0
+ zfs_setproctitle at Base 2.1.11
+ zfs_setproctitle_init at Base 2.1.11
zfs_strcmp_pathname at Base 2.0
zfs_strip_partition at Base 2.0
zfs_strip_path at Base 2.0
diff -Nru zfs-linux-2.1.11/debian/libzpool5linux.symbols zfs-linux-2.1.11/debian/libzpool5linux.symbols
--- zfs-linux-2.1.11/debian/libzpool5linux.symbols 2023-04-17 15:26:55.000000000 +0800
+++ zfs-linux-2.1.11/debian/libzpool5linux.symbols 2024-11-02 15:27:19.000000000 +0800
@@ -685,6 +685,7 @@
dnode_special_close at Base 0.8.2
dnode_special_open at Base 0.8.2
dnode_stats at Base 0.8.2
+ dnode_sums at Base 2.1.11
dnode_sync at Base 0.8.2
dnode_try_claim at Base 0.8.2
dnode_verify at Base 2.0
@@ -2095,6 +2096,7 @@
vdev_checkpoint_sm_object at Base 0.8.2
vdev_children_are_offline at Base 0.8.2
vdev_clear at Base 0.8.2
+ vdev_clear_kobj_evt at Base 2.1.11
vdev_clear_resilver_deferred at Base 0.8.3
vdev_clear_stats at Base 0.8.2
vdev_close at Base 0.8.2
@@ -2227,6 +2229,7 @@
vdev_open at Base 0.8.2
vdev_open_children at Base 0.8.2
vdev_open_children_subset at Base 2.1
+ vdev_post_kobj_evt at Base 2.1.11
vdev_probe at Base 0.8.2
vdev_propagate_state at Base 0.8.2
vdev_psize_to_asize at Base 0.8.2
@@ -2277,6 +2280,7 @@
vdev_removal_max_span at Base 0.8.2
vdev_remove_child at Base 0.8.2
vdev_remove_parent at Base 0.8.2
+ vdev_remove_wanted at Base 2.1.11
vdev_reopen at Base 0.8.2
vdev_replace_in_progress at Base 2.0
vdev_replacing_ops at Base 0.8.2
diff -Nru zfs-linux-2.1.11/debian/patches/0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch zfs-linux-2.1.11/debian/patches/0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch
--- zfs-linux-2.1.11/debian/patches/0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch 1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch 2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,91 @@
+From a68dfdb88c88fe970343e49b48bfd3bb4cef99d2 Mon Sep 17 00:00:00 2001
+From: Ameer Hamza <106930537+ixhamza at users.noreply.github.com>
+Date: Wed, 19 Apr 2023 21:04:32 +0500
+Subject: [PATCH] Fix "Detach spare vdev in case if resilvering does not
+ happen"
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Spare vdev should detach from the pool when a disk is reinserted.
+However, spare detachment depends on the completion of resilvering,
+and if resilver does not schedule, the spare vdev keeps attached to
+the pool until the next resilvering. When a zfs pool contains
+several disks (25+ mirror), resilvering does not always happen when
+a disk is reinserted. In this patch, spare vdev is manually detached
+from the pool when resilvering does not occur and it has been tested
+on both Linux and FreeBSD.
+
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Reviewed-by: Alexander Motin <mav at FreeBSD.org>
+Signed-off-by: Ameer Hamza <ahamza at ixsystems.com>
+Closes #14722
+---
+ include/sys/spa.h | 1 +
+ module/zfs/spa.c | 5 +++--
+ module/zfs/vdev.c | 12 +++++++++++-
+ 3 files changed, 15 insertions(+), 3 deletions(-)
+
+diff --git a/include/sys/spa.h b/include/sys/spa.h
+index fedadab45..07e09d1ec 100644
+--- a/include/sys/spa.h
++++ b/include/sys/spa.h
+@@ -785,6 +785,7 @@ extern int bpobj_enqueue_free_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx);
+ #define SPA_ASYNC_L2CACHE_REBUILD 0x800
+ #define SPA_ASYNC_L2CACHE_TRIM 0x1000
+ #define SPA_ASYNC_REBUILD_DONE 0x2000
++#define SPA_ASYNC_DETACH_SPARE 0x4000
+
+ /* device manipulation */
+ extern int spa_vdev_add(spa_t *spa, nvlist_t *nvroot);
+diff --git a/module/zfs/spa.c b/module/zfs/spa.c
+index 1ed79eed3..8bc51f777 100644
+--- a/module/zfs/spa.c
++++ b/module/zfs/spa.c
+@@ -6987,7 +6987,7 @@ spa_vdev_attach(spa_t *spa, uint64_t guid, nvlist_t *nvroot, int replacing,
+ * Detach a device from a mirror or replacing vdev.
+ *
+ * If 'replace_done' is specified, only detach if the parent
+- * is a replacing vdev.
++ * is a replacing or a spare vdev.
+ */
+ int
+ spa_vdev_detach(spa_t *spa, uint64_t guid, uint64_t pguid, int replace_done)
+@@ -8210,7 +8210,8 @@ spa_async_thread(void *arg)
+ * If any devices are done replacing, detach them.
+ */
+ if (tasks & SPA_ASYNC_RESILVER_DONE ||
+- tasks & SPA_ASYNC_REBUILD_DONE) {
++ tasks & SPA_ASYNC_REBUILD_DONE ||
++ tasks & SPA_ASYNC_DETACH_SPARE) {
+ spa_vdev_resilver_done(spa);
+ }
+
+diff --git a/module/zfs/vdev.c b/module/zfs/vdev.c
+index 4b9d7e7c0..ee0c1d862 100644
+--- a/module/zfs/vdev.c
++++ b/module/zfs/vdev.c
+@@ -4085,9 +4085,19 @@ vdev_online(spa_t *spa, uint64_t guid, uint64_t flags, vdev_state_t *newstate)
+
+ if (wasoffline ||
+ (oldstate < VDEV_STATE_DEGRADED &&
+- vd->vdev_state >= VDEV_STATE_DEGRADED))
++ vd->vdev_state >= VDEV_STATE_DEGRADED)) {
+ spa_event_notify(spa, vd, NULL, ESC_ZFS_VDEV_ONLINE);
+
++ /*
++ * Asynchronously detach spare vdev if resilver or
++ * rebuild is not required
++ */
++ if (vd->vdev_unspare &&
++ !dsl_scan_resilvering(spa->spa_dsl_pool) &&
++ !dsl_scan_resilver_scheduled(spa->spa_dsl_pool) &&
++ !vdev_rebuild_active(tvd))
++ spa_async_request(spa, SPA_ASYNC_DETACH_SPARE);
++ }
+ return (spa_vdev_state_exit(spa, vd, 0));
+ }
+
+--
+2.39.2
+
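For context: this patch reuses the SPA async task mechanism, which is a
plain bitmask of deferred work tested by a background thread. A minimal
self-contained sketch of that dispatch pattern (flag values and all names
here are illustrative, not the real ZFS symbols):

    /* Sketch: flag-based async dispatch, as in spa_async_thread(). */
    #include <stdio.h>

    #define ASYNC_RESILVER_DONE  0x0002   /* illustrative values */
    #define ASYNC_REBUILD_DONE   0x2000
    #define ASYNC_DETACH_SPARE   0x4000   /* the flag the patch adds */

    static void resilver_done(void) { printf("detaching finished vdevs\n"); }

    static void async_thread(unsigned tasks)
    {
        /* Any of the three flags triggers the same cleanup, which is
         * how the patch detaches a spare even when no resilver ran. */
        if (tasks & (ASYNC_RESILVER_DONE | ASYNC_REBUILD_DONE |
            ASYNC_DETACH_SPARE))
            resilver_done();
    }

    int main(void) { async_thread(ASYNC_DETACH_SPARE); return 0; }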
diff -Nru zfs-linux-2.1.11/debian/patches/0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch zfs-linux-2.1.11/debian/patches/0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch
--- zfs-linux-2.1.11/debian/patches/0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch 1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch 2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,65 @@
+From 671b1af1bc4b20ddd939c2ede22748bd027d30be Mon Sep 17 00:00:00 2001
+From: =?UTF-8?q?Lu=C3=ADs=20Henriques?=
+ <73643340+lumigch at users.noreply.github.com>
+Date: Tue, 30 May 2023 23:15:24 +0100
+Subject: [PATCH] Fix NULL pointer dereference when doing concurrent 'send'
+ operations
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+A NULL pointer will occur when doing a 'zfs send -S' on a dataset that
+is still being received. The problem is that the new 'send' will
+rightfully fail to own the datasets (i.e. dsl_dataset_own_force() will
+fail), but then dmu_send() will still do the dsl_dataset_disown().
+
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Signed-off-by: Luís Henriques <henrix at camandro.org>
+Closes #14903
+Closes #14890
+---
+ module/zfs/dmu_send.c | 8 ++++++--
+ 1 file changed, 6 insertions(+), 2 deletions(-)
+
+diff --git a/module/zfs/dmu_send.c b/module/zfs/dmu_send.c
+index cd9ecc07f..0dd1ec210 100644
+--- a/module/zfs/dmu_send.c
++++ b/module/zfs/dmu_send.c
+@@ -2797,6 +2797,7 @@ dmu_send(const char *tosnap, const char *fromsnap, boolean_t embedok,
+ }
+
+ if (err == 0) {
++ owned = B_TRUE;
+ err = zap_lookup(dspp.dp->dp_meta_objset,
+ dspp.to_ds->ds_object,
+ DS_FIELD_RESUME_TOGUID, 8, 1,
+@@ -2810,21 +2811,24 @@ dmu_send(const char *tosnap, const char *fromsnap, boolean_t embedok,
+ sizeof (dspp.saved_toname),
+ dspp.saved_toname);
+ }
+- if (err != 0)
++ /* Only disown if there was an error in the lookups */
++ if (owned && (err != 0))
+ dsl_dataset_disown(dspp.to_ds, dsflags, FTAG);
+
+ kmem_strfree(name);
+ } else {
+ err = dsl_dataset_own(dspp.dp, tosnap, dsflags,
+ FTAG, &dspp.to_ds);
++ if (err == 0)
++ owned = B_TRUE;
+ }
+- owned = B_TRUE;
+ } else {
+ err = dsl_dataset_hold_flags(dspp.dp, tosnap, dsflags, FTAG,
+ &dspp.to_ds);
+ }
+
+ if (err != 0) {
++ /* Note: dsl dataset is not owned at this point */
+ dsl_pool_rele(dspp.dp, FTAG);
+ return (err);
+ }
+--
+2.39.2
+
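The shape of this fix is a general resource-pairing rule: record
ownership only when the own call succeeds, and release only what was
recorded. A minimal sketch (hypothetical names, not the real DSL API):

    #include <stdbool.h>
    #include <stdio.h>

    static int dataset_own(bool busy) { return busy ? -1 : 0; }
    static void dataset_disown(void)  { printf("disown\n"); }

    static int do_send(bool busy)
    {
        bool owned = false;
        int err = dataset_own(busy);
        if (err == 0)
            owned = true;       /* set only on successful ownership */

        /* ... resume-token lookups may set err here ... */

        if (owned && err != 0)
            dataset_disown();   /* never release what we never took */
        return (err);
    }

    int main(void) { printf("err=%d\n", do_send(true)); return 0; }

Before the patch, the 'owned' flag was set unconditionally, so a failed
dsl_dataset_own_force() could still be followed by a disown, as the
commit message above explains.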
diff -Nru zfs-linux-2.1.11/debian/patches/0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch zfs-linux-2.1.11/debian/patches/0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch
--- zfs-linux-2.1.11/debian/patches/0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch 1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch 2024-11-02 15:34:23.000000000 +0800
@@ -0,0 +1,45 @@
+From 93a99c6daae6e8c126ead2bcf331e5772c966cc7 Mon Sep 17 00:00:00 2001
+From: Rich Ercolani <214141+rincebrain at users.noreply.github.com>
+Date: Wed, 31 May 2023 19:58:41 -0400
+Subject: [PATCH] Revert "initramfs: use `mount.zfs` instead of `mount`"
+
+This broke mounting of snapshots on / for users.
+
+See https://github.com/openzfs/zfs/issues/9461#issuecomment-1376162949 for more context.
+
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Signed-off-by: Rich Ercolani <rincebrain at gmail.com>
+Closes #14908
+---
+ contrib/initramfs/scripts/zfs | 6 +++---
+ 1 file changed, 3 insertions(+), 3 deletions(-)
+
+--- a/contrib/initramfs/scripts/zfs
++++ b/contrib/initramfs/scripts/zfs
+@@ -342,7 +342,7 @@
+
+ # Need the _original_ datasets mountpoint!
+ mountpoint=$(get_fs_value "$fs" mountpoint)
+- ZFS_CMD="mount.zfs -o zfsutil"
++ ZFS_CMD="mount -o zfsutil -t zfs"
+ if [ "$mountpoint" = "legacy" ] || [ "$mountpoint" = "none" ]; then
+ # Can't use the mountpoint property. Might be one of our
+ # clones. Check the 'org.zol:mountpoint' property set in
+@@ -359,7 +359,7 @@
+ fi
+ # Don't use mount.zfs -o zfsutils for legacy mountpoint
+ if [ "$mountpoint" = "legacy" ]; then
+- ZFS_CMD="mount.zfs"
++ ZFS_CMD="mount -t zfs"
+ fi
+ # Last hail-mary: Hope 'rootmnt' is set!
+ mountpoint=""
+@@ -930,7 +930,7 @@
+ echo " not specified on the kernel command line."
+ echo ""
+ echo "Manually mount the root filesystem on $rootmnt and then exit."
+- echo "Hint: Try: mount.zfs -o zfsutil ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
++ echo "Hint: Try: mount -o zfsutil -t zfs ${ZFS_RPOOL-rpool}/ROOT/system $rootmnt"
+ shell
+ fi
+
diff -Nru zfs-linux-2.1.11/debian/patches/0022-zil-Don-t-expect-zio_shrink-to-succeed.patch zfs-linux-2.1.11/debian/patches/0022-zil-Don-t-expect-zio_shrink-to-succeed.patch
--- zfs-linux-2.1.11/debian/patches/0022-zil-Don-t-expect-zio_shrink-to-succeed.patch 1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0022-zil-Don-t-expect-zio_shrink-to-succeed.patch 2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,31 @@
+From b01a8cc2c0fe6ee4af05bb0b1911afcbd39da64b Mon Sep 17 00:00:00 2001
+From: Alexander Motin <mav at FreeBSD.org>
+Date: Thu, 11 May 2023 17:27:12 -0400
+Subject: [PATCH] zil: Don't expect zio_shrink() to succeed.
+
+At least for RAIDZ zio_shrink() does not reduce zio size, but reduced
+wsz in that case likely results in writing uninitialized memory.
+
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Signed-off-by: Alexander Motin <mav at FreeBSD.org>
+Sponsored by: iXsystems, Inc.
+Closes #14853
+---
+ module/zfs/zil.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/module/zfs/zil.c b/module/zfs/zil.c
+index 0456d3801..cca061040 100644
+--- a/module/zfs/zil.c
++++ b/module/zfs/zil.c
+@@ -1593,6 +1593,7 @@ zil_lwb_write_issue(zilog_t *zilog, lwb_t *lwb)
+ wsz = P2ROUNDUP_TYPED(lwb->lwb_nused, ZIL_MIN_BLKSZ, uint64_t);
+ ASSERT3U(wsz, <=, lwb->lwb_sz);
+ zio_shrink(lwb->lwb_write_zio, wsz);
++ wsz = lwb->lwb_write_zio->io_size;
+
+ } else {
+ wsz = lwb->lwb_sz;
+--
+2.39.2
+
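The one-line fix encodes a defensive idiom: after requesting a size
change, re-read the authoritative field instead of assuming the request
took effect. A runnable sketch of the idiom (plain C stand-ins, not the
real zio API):

    #include <stddef.h>
    #include <stdio.h>

    struct io { size_t io_size; int shrinkable; };

    /* Stand-in for zio_shrink(): legitimately may do nothing. */
    static void io_shrink(struct io *io, size_t newsz)
    {
        if (io->shrinkable && newsz < io->io_size)
            io->io_size = newsz;
    }

    int main(void)
    {
        struct io io = { 131072, 0 };   /* e.g. RAIDZ: not shrinkable */
        size_t wsz = 4096;
        io_shrink(&io, wsz);
        wsz = io.io_size;   /* the fix: trust io_size, not the request */
        printf("write size = %zu\n", wsz);
        return 0;
    }

Without the re-read, the smaller wsz could describe a write larger than
the data actually initialized, per the commit message above.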
diff -Nru zfs-linux-2.1.11/debian/patches/0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch zfs-linux-2.1.11/debian/patches/0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch
--- zfs-linux-2.1.11/debian/patches/0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch 1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch 2024-11-02 15:34:23.000000000 +0800
@@ -0,0 +1,48 @@
+From 837e426c1f302e580a18a213fd216322f480caf8 Mon Sep 17 00:00:00 2001
+From: Brian Behlendorf <behlendorf1 at llnl.gov>
+Date: Wed, 7 Jun 2023 10:43:43 -0700
+Subject: [PATCH] Linux: Never sleep in kmem_cache_alloc(..., KM_NOSLEEP)
+ (#14926)
+
+When a kmem cache is exhausted and needs to be expanded a new
+slab is allocated. KM_SLEEP callers can block and wait for the
+allocation, but KM_NOSLEEP callers were incorrectly allowed to
+block as well.
+
+Resolve this by attempting an emergency allocation as a best
+effort. This may fail but that's fine since any KM_NOSLEEP
+consumer is required to handle an allocation failure.
+
+Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Reviewed-by: Adam Moss <c at yotes.com>
+Reviewed-by: Brian Atkinson <batkinson at lanl.gov>
+Reviewed-by: Richard Yao <richard.yao at alumni.stonybrook.edu>
+Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
+---
+ module/os/linux/spl/spl-kmem-cache.c | 12 +++++++++++-
+ 1 file changed, 11 insertions(+), 1 deletion(-)
+
+--- a/module/os/linux/spl/spl-kmem-cache.c
++++ b/module/os/linux/spl/spl-kmem-cache.c
+@@ -1017,10 +1017,20 @@
+ ASSERT0(flags & ~KM_PUBLIC_MASK);
+ ASSERT(skc->skc_magic == SKC_MAGIC);
+ ASSERT((skc->skc_flags & KMC_SLAB) == 0);
+- might_sleep();
++
+ *obj = NULL;
+
+ /*
++ * Since we can't sleep attempt an emergency allocation to satisfy
++ * the request. The only alterative is to fail the allocation but
++ * it's preferable try. The use of KM_NOSLEEP is expected to be rare.
++ */
++ if (flags & KM_NOSLEEP)
++ return (spl_emergency_alloc(skc, flags, obj));
++
++ might_sleep();
++
++ /*
+ * Before allocating a new slab wait for any reaping to complete and
+ * then return so the local magazine can be rechecked for new objects.
+ */
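Worth noting for reviewers: KM_NOSLEEP is a contract, not a hint, and the
patch makes the slab-growth path honor it by falling back to
spl_emergency_alloc(). Callers must already tolerate failure, as in this
minimal sketch (plain malloc stands in for the SPL cache):

    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in for kmem_cache_alloc(skc, KM_NOSLEEP):
     * may fail, must never block. */
    static void *alloc_nosleep(size_t sz)
    {
        return malloc(sz);
    }

    int main(void)
    {
        void *obj = alloc_nosleep(64);
        if (obj == NULL) {  /* mandatory check for KM_NOSLEEP users */
            fprintf(stderr, "allocation failed, degrading gracefully\n");
            return 1;
        }
        free(obj);
        return 0;
    }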
diff -Nru zfs-linux-2.1.11/debian/patches/0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch zfs-linux-2.1.11/debian/patches/0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
--- zfs-linux-2.1.11/debian/patches/0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch 1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch 2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,93 @@
+From 77b0c6f0403b2b7d145bf6c244b6acbc757ccdc9 Mon Sep 17 00:00:00 2001
+From: Rob N <robn at despairlabs.com>
+Date: Wed, 29 Nov 2023 04:16:49 +1100
+Subject: [PATCH] dnode_is_dirty: check dnode and its data for dirtiness
+
+Over its history this the dirty dnode test has been changed between
+checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
+`dn_dirty_record`.
+
+ de198f2d9 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
+ 2531ce372 Revert "Report holes when there are only metadata changes"
+ ec4f9b8f3 Report holes when there are only metadata changes
+ 454365bba Fix dirty check in dmu_offset_next()
+ 66aca2473 SEEK_HOLE should not block on txg_wait_synced()
+
+Also illumos/illumos-gate at c543ec060d illumos/illumos-gate at 2bcf0248e9
+
+It turns out both are actually required.
+
+In the case of appending data to a newly created file, the dnode proper
+is dirtied (at least to change the blocksize) and dirty records are
+added. Thus, a single logical operation is represented by separate
+dirty indicators, and must not be separated.
+
+The incorrect dirty check becomes a problem when the first block of a
+file is being appended to while another process is calling lseek to skip
+holes. There is a small window where the dnode part is undirtied while
+there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
+would not know that the file is dirty, and would go to
+`dnode_next_offset()`. Since the object has no data blocks yet, it
+returns `ESRCH`, indicating no data found, which results in `ENXIO`
+being returned to `lseek()`'s caller.
+
+Since coreutils 9.2, `cp` performs sparse copies by default, that is, it
+uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to
+replicate the holes in the target. When it hits the bug, its initial
+search for data fails, and it goes on to call `fallocate()` to create a
+hole over the entire destination file.
+
+This has come up more recently as users upgrade their systems, getting
+OpenZFS 2.2 as well as a newer coreutils. However, this problem has been
+reproduced against 2.1, as well as on FreeBSD 13 and 14.
+
+This change simply updates the dirty check to check both types of dirty.
+If there's anything dirty at all, we immediately go to the "wait for
+sync" stage, It doesn't really matter after that; both changes are on
+disk, so the dirty fields should be correct.
+
+Sponsored-by: Klara, Inc.
+Sponsored-by: Wasabi Technology, Inc.
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Reviewed-by: Alexander Motin <mav at FreeBSD.org>
+Reviewed-by: Rich Ercolani <rincebrain at gmail.com>
+Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
+Closes #15571
+Closes #15526
+---
+ module/zfs/dnode.c | 12 ++++++++++--
+ 1 file changed, 10 insertions(+), 2 deletions(-)
+
+diff --git a/module/zfs/dnode.c b/module/zfs/dnode.c
+index a9aaa4d21..efebc443a 100644
+--- a/module/zfs/dnode.c
++++ b/module/zfs/dnode.c
+@@ -1773,7 +1773,14 @@ dnode_try_claim(objset_t *os, uint64_t object, int slots)
+ }
+
+ /*
+- * Checks if the dnode contains any uncommitted dirty records.
++ * Checks if the dnode itself is dirty, or is carrying any uncommitted records.
++ * It is important to check both conditions, as some operations (eg appending
++ * to a file) can dirty both as a single logical unit, but they are not synced
++ * out atomically, so checking one and not the other can result in an object
++ * appearing to be clean mid-way through a commit.
++ *
++ * Do not change this lightly! If you get it wrong, dmu_offset_next() can
++ * detect a hole where there is really data, leading to silent corruption.
+ */
+ boolean_t
+ dnode_is_dirty(dnode_t *dn)
+@@ -1781,7 +1788,8 @@ dnode_is_dirty(dnode_t *dn)
+ mutex_enter(&dn->dn_mtx);
+
+ for (int i = 0; i < TXG_SIZE; i++) {
+- if (multilist_link_active(&dn->dn_dirty_link[i])) {
++ if (multilist_link_active(&dn->dn_dirty_link[i]) ||
++ !list_is_empty(&dn->dn_dirty_records[i])) {
+ mutex_exit(&dn->dn_mtx);
+ return (B_TRUE);
+ }
+--
+2.39.2
+
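Since this is the CVE-2023-49298 fix, a note on how the bug surfaces: a
hole-detecting lseek() on a file that is dirty during the small window
the commit describes can report "no data". A hedged reproducer sketch (a
real reproducer must loop and race a concurrent writer; this only shows
the syscall sequence):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("testfile", O_CREAT | O_RDWR, 0644);
        if (fd < 0) { perror("open"); return 1; }
        if (write(fd, "data", 4) != 4) { perror("write"); return 1; }
        /* With the bug, this can fail with ENXIO while the dnode is
         * mid-sync, so e.g. coreutils >= 9.2 'cp' sees only a hole. */
        off_t off = lseek(fd, 0, SEEK_DATA);
        printf("SEEK_DATA -> %lld\n", (long long)off);
        close(fd);
        return 0;
    }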
diff -Nru zfs-linux-2.1.11/debian/patches/0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch zfs-linux-2.1.11/debian/patches/0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch
--- zfs-linux-2.1.11/debian/patches/0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch 1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch 2024-11-02 15:27:19.000000000 +0800
@@ -0,0 +1,104 @@
+From 1ca531971f176ae7b8ca440e836985ae1d7fa0ec Mon Sep 17 00:00:00 2001
+From: Jason King <jasonbking at users.noreply.github.com>
+Date: Thu, 12 Oct 2023 13:01:54 -0500
+Subject: [PATCH] Zpool can start allocating from metaslab before TRIMs have
+ completed
+
+When doing a manual TRIM on a zpool, the metaslab being TRIMmed is
+potentially re-enabled before all queued TRIM zios for that metaslab
+have completed. Since TRIM zios have the lowest priority, it is
+possible to get into a situation where allocations occur from the
+just re-enabled metaslab and cut ahead of queued TRIMs to the same
+metaslab. If the ranges overlap, this will cause corruption.
+
+We were able to trigger this pretty consistently with a small single
+top-level vdev zpool (i.e. small number of metaslabs) with heavy
+parallel write activity while performing a manual TRIM against a
+somewhat 'slow' device (so TRIMs took a bit of time to complete).
+With the patch, we've not been able to recreate it since. It was on
+illumos, but inspection of the OpenZFS trim code looks like the
+relevant pieces are largely unchanged and so it appears it would be
+vulnerable to the same issue.
+
+Reviewed-by: Igor Kozhukhov <igor at dilos.org>
+Reviewed-by: Alexander Motin <mav at FreeBSD.org>
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Signed-off-by: Jason King <jking at racktopsystems.com>
+Illumos-issue: https://www.illumos.org/issues/15939
+Closes #15395
+---
+ module/zfs/vdev_trim.c | 28 +++++++++++++++++++---------
+ 1 file changed, 19 insertions(+), 9 deletions(-)
+
+diff --git a/module/zfs/vdev_trim.c b/module/zfs/vdev_trim.c
+index 92daed48f..c0ce2ac28 100644
+--- a/module/zfs/vdev_trim.c
++++ b/module/zfs/vdev_trim.c
+@@ -23,6 +23,7 @@
+ * Copyright (c) 2016 by Delphix. All rights reserved.
+ * Copyright (c) 2019 by Lawrence Livermore National Security, LLC.
+ * Copyright (c) 2021 Hewlett Packard Enterprise Development LP
++ * Copyright 2023 RackTop Systems, Inc.
+ */
+
+ #include <sys/spa.h>
+@@ -572,6 +573,7 @@ vdev_trim_ranges(trim_args_t *ta)
+ uint64_t extent_bytes_max = ta->trim_extent_bytes_max;
+ uint64_t extent_bytes_min = ta->trim_extent_bytes_min;
+ spa_t *spa = vd->vdev_spa;
++ int error = 0;
+
+ ta->trim_start_time = gethrtime();
+ ta->trim_bytes_done = 0;
+@@ -591,19 +593,32 @@ vdev_trim_ranges(trim_args_t *ta)
+ uint64_t writes_required = ((size - 1) / extent_bytes_max) + 1;
+
+ for (uint64_t w = 0; w < writes_required; w++) {
+- int error;
+-
+ error = vdev_trim_range(ta, VDEV_LABEL_START_SIZE +
+ rs_get_start(rs, ta->trim_tree) +
+ (w *extent_bytes_max), MIN(size -
+ (w * extent_bytes_max), extent_bytes_max));
+ if (error != 0) {
+- return (error);
++ goto done;
+ }
+ }
+ }
+
+- return (0);
++done:
++ /*
++ * Make sure all TRIMs for this metaslab have completed before
++ * returning. TRIM zios have lower priority over regular or syncing
++ * zios, so all TRIM zios for this metaslab must complete before the
++ * metaslab is re-enabled. Otherwise it's possible write zios to
++ * this metaslab could cut ahead of still queued TRIM zios for this
++ * metaslab causing corruption if the ranges overlap.
++ */
++ mutex_enter(&vd->vdev_trim_io_lock);
++ while (vd->vdev_trim_inflight[0] > 0) {
++ cv_wait(&vd->vdev_trim_io_cv, &vd->vdev_trim_io_lock);
++ }
++ mutex_exit(&vd->vdev_trim_io_lock);
++
++ return (error);
+ }
+
+ static void
+@@ -922,11 +937,6 @@ vdev_trim_thread(void *arg)
+ }
+
+ spa_config_exit(spa, SCL_CONFIG, FTAG);
+- mutex_enter(&vd->vdev_trim_io_lock);
+- while (vd->vdev_trim_inflight[0] > 0) {
+- cv_wait(&vd->vdev_trim_io_cv, &vd->vdev_trim_io_lock);
+- }
+- mutex_exit(&vd->vdev_trim_io_lock);
+
+ range_tree_destroy(ta.trim_tree);
+
+--
+2.39.2
+
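The moved hunk is the classic drain idiom: hold the lock, sleep on the
condition variable until the in-flight count hits zero, and only then
proceed. A user-space sketch with pthreads standing in for the kernel
mutex/cv (illustrative, not the vdev_trim API; build with cc -pthread):

    #include <pthread.h>

    static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t io_cv = PTHREAD_COND_INITIALIZER;
    static int inflight;    /* incremented when a TRIM zio is issued */

    /* Run before re-enabling the metaslab: wait out queued TRIMs. */
    static void drain_inflight(void)
    {
        pthread_mutex_lock(&io_lock);
        while (inflight > 0)
            pthread_cond_wait(&io_cv, &io_lock);
        pthread_mutex_unlock(&io_lock);
    }

    /* Completion callback: the last finisher wakes the drainer. */
    static void io_done(void)
    {
        pthread_mutex_lock(&io_lock);
        if (--inflight == 0)
            pthread_cond_broadcast(&io_cv);
        pthread_mutex_unlock(&io_lock);
    }

    int main(void)
    {
        inflight = 1;
        io_done();          /* completion arrives first here */
        drain_inflight();   /* returns immediately: nothing in flight */
        return 0;
    }

The point of the patch is where this runs: per metaslab inside
vdev_trim_ranges(), rather than once at the end of the whole TRIM thread.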
diff -Nru zfs-linux-2.1.11/debian/patches/0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch zfs-linux-2.1.11/debian/patches/0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch
--- zfs-linux-2.1.11/debian/patches/0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch 1970-01-01 08:00:00.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch 2024-11-02 15:34:23.000000000 +0800
@@ -0,0 +1,94 @@
+From 6cb5e1e7591da20af3a15793e022345a73e40fb7 Mon Sep 17 00:00:00 2001
+From: felixdoerre <felixdoerre at users.noreply.github.com>
+Date: Wed, 20 Oct 2021 19:40:00 +0200
+Subject: [PATCH] libshare: nfs: pass through ipv6 addresses in bracket
+ notation
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Recognize when the host part of a sharenfs attribute is an ipv6
+Literal and pass that through without modification.
+
+Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
+Signed-off-by: Felix Dörre <felix at dogcraft.de>
+Closes: #11171
+Closes #11939
+Closes: #1894
+---
+--- a/lib/libshare/os/linux/nfs.c
++++ b/lib/libshare/os/linux/nfs.c
+@@ -180,8 +180,9 @@
+ {
+ int error;
+ const char *access;
+- char *host_dup, *host, *next;
++ char *host_dup, *host, *next, *v6Literal;
+ nfs_host_cookie_t *udata = (nfs_host_cookie_t *)pcookie;
++ int cidr_len;
+
+ #ifdef DEBUG
+ fprintf(stderr, "foreach_nfs_host_cb: key=%s, value=%s\n", opt, value);
+@@ -204,10 +205,46 @@
+ host = host_dup;
+
+ do {
+- next = strchr(host, ':');
+- if (next != NULL) {
+- *next = '\0';
+- next++;
++ if (*host == '[') {
++ host++;
++ v6Literal = strchr(host, ']');
++ if (v6Literal == NULL) {
++ free(host_dup);
++ return (SA_SYNTAX_ERR);
++ }
++ if (v6Literal[1] == '\0') {
++ *v6Literal = '\0';
++ next = NULL;
++ } else if (v6Literal[1] == '/') {
++ next = strchr(v6Literal + 2, ':');
++ if (next == NULL) {
++ cidr_len =
++ strlen(v6Literal + 1);
++ memmove(v6Literal,
++ v6Literal + 1,
++ cidr_len);
++ v6Literal[cidr_len] = '\0';
++ } else {
++ cidr_len = next - v6Literal - 1;
++ memmove(v6Literal,
++ v6Literal + 1,
++ cidr_len);
++ v6Literal[cidr_len] = '\0';
++ next++;
++ }
++ } else if (v6Literal[1] == ':') {
++ *v6Literal = '\0';
++ next = v6Literal + 2;
++ } else {
++ free(host_dup);
++ return (SA_SYNTAX_ERR);
++ }
++ } else {
++ next = strchr(host, ':');
++ if (next != NULL) {
++ *next = '\0';
++ next++;
++ }
+ }
+
+ error = udata->callback(udata->filename,
+--- a/man/man8/zfs.8
++++ b/man/man8/zfs.8
+@@ -545,7 +545,7 @@
+ on the
+ .Ar tank/home
+ file system:
+-.Dl # Nm zfs Cm set Sy sharenfs Ns = Ns ' Ns Ar rw Ns =@123.123.0.0/16,root= Ns Ar neo Ns ' tank/home
++.Dl # Nm zfs Cm set Sy sharenfs Ns = Ns ' Ns Ar rw Ns =@123.123.0.0/16:[::1],root= Ns Ar neo Ns ' tank/home
+ .Pp
+ If you are using DNS for host name resolution,
+ specify the fully-qualified hostname.
+
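To make the parsing change concrete, take the host list from the man
page hunk above, 'rw=@123.123.0.0/16:[::1]': before the patch every ':'
was treated as a host separator, so the IPv6 literal '::1' was shredded
into bogus entries; with the patch the bracketed literal (plus an
optional /prefix suffix) is carved out first and only the remainder is
split on ':', yielding the two entries '@123.123.0.0/16' and '::1'.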
diff -Nru zfs-linux-2.1.11/debian/patches/series zfs-linux-2.1.11/debian/patches/series
--- zfs-linux-2.1.11/debian/patches/series 2023-04-19 13:37:42.000000000 +0800
+++ zfs-linux-2.1.11/debian/patches/series 2024-11-02 15:34:23.000000000 +0800
@@ -27,3 +27,12 @@
0006-rootdelay-on-zfs-should-be-adaptive.patch
0009-zdb-zero-pad-checksum-output.patch
0010-zdb-zero-pad-checksum-output-follow-up.patch
+# 2.1.11+deb12u1
+0013-Fix-Detach-spare-vdev-in-case-if-resilvering-does-no.patch
+0020-Fix-NULL-pointer-dereference-when-doing-concurrent-s.patch
+0021-Revert-initramfs-use-mount.zfs-instead-of-mount.patch
+0022-zil-Don-t-expect-zio_shrink-to-succeed.patch
+0027-Linux-Never-sleep-in-kmem_cache_alloc-.-KM_NOSLEEP-1.patch
+0028-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
+0029-Zpool-can-start-allocating-from-metaslab-before-TRIM.patch
+0030-libshare-nfs-pass-through-ipv6-addresses-in-bracket.patch