[Pkg-zfsonlinux-devel] Bug#1104724: [RESEND] RE: Bug#1104724: Bug#1104724: zfs-dkms: Kernel panic and kernel blocks with zfs 2.3.2-1 and linux 6.12.25-1

Thu May 8 09:22:21 BST 2025

Hi,

My reply on 5/7 was repeatedly rejected by the Debian BTS with the reason:

 550 Blacklisted URL in message. (openzfs . github . io) in [black]. See http://lookup.uribl.com.

So I am resending this message to the list.

> On May 7, 2025, at 02:38, Stefan Bellon <sbellon at sbellon.de> wrote:
> 
> On Wed, 07 May, Shengqi Chen wrote:
> 
>> Since the problem originates from one single directory (~/.config),
> 
> I'm not sure this is actually true. It's just that I noticed that
> at least ~/.config/fish/* is affected as I am unable to open a "fish"
> shell as that user.
> 
> On the other hand, the stack traces I provided in my bug report also
> show other processes besides "fish" to be affected (e.g. txg_sync,
> z_wr_iss, ...), so I am not sure how "local" the issue is.

Those are ZFS’s own kernel threads, so their hangs might still be related
to your recent I/O.

>> If you encountered any errors, you can try zdb -r further.
> 
> I think I would require some pointers to what to do with that, in order
> not to make things worse.

See https : / /openzfs .  github .   io / openzfs-docs/man/master/8/zdb.8.html
(Please remove the spacing, which is only there to avoid being rejected again.)
zdb -r is only used to copy objects without going through the VFS layer.
E.g. zdb -r pool path/to/file /tmp/file

> For my understanding: the kernel panic and hangs I am experiencing are
> all when accessing data in the active dataset. So, if I would get
> that clean, I could still encounter the issue when accessing the same
> affected files via a snapshot, but at least the active dataset would
> be clean and if the (automatic) snapshots get dropped over time, this
> would cure it for the whole pool?

Correct. That is probably the best outcome we could hope for. But since
ZFS uses blkptr literally everywhere, it is possible that the corrupted
data structure cannot be easily fixed or recreated by copying files.

> And another question: When booting into the kernel 6.12.25 with ZFS
> 2.3.2, would it make sense to run "zpool scrub" from there, or could
> this make things worse?

I do not see any changes to scrub in 2.3.2, so it is probably better
not to run a scrub on a version that would panic on access to your pool.

> And final question for now: Would it make sense to open an issue at
> OpenZFS on GitHub, or at least open a discussion topic there? I still
> think, this is pretty unexpected for a mere ZFS user to end in a
> situation where the system breaks if you upgrade but works "flawlessly"
> on the older patch-level version.

Sure, that would definitely help. One reason is what you mentioned: abrupt
changes like this might shock users. Also, some of the I/O hangs you got
do not seem normal either, e.g. the blocking of user programs or kthreads.

Thanks,
Shengqi Chen