[Pkg-zfsonlinux-devel] zfs-dkms and Intel QAT

Chandler admin at genome.arizona.edu
Mon Feb 27 13:05:51 GMT 2023


Aron Xu wrote on 2/25/23 1:17 AM:
> It seems that nobody here has access to Intel's QAT hardware, so I'm
> afraid it's hard to get help. But in case you find the solution, it's
> welcomed to contribute back.

Hi Aron, yes, that's understandable and what I figured as well.  I've
asked for help in several places, so thanks for following up and
reminding me here.  Most of my testing, results, and conclusions have
been posted to various issues on the ZFS GitHub page,
https://github.com/openzfs/zfs.  There is another Debian ZFS QAT user
there who was having issues too, so at least we could work together.

Regarding this particular issue: the problem seems to be that the
kernel automatically loads modules as soon as it detects hardware that
needs/uses those modules.  In this case, the kernel was detecting ZFS
partitions on the disks and auto-loading the ZFS module before the QAT
modules had loaded.  It wasn't clear to me why this happened, for at
least a couple of reasons:

1. I thought the ZFS module had a dependency on the QAT module, so
loading/inserting it should have triggered the QAT module to be
inserted first: `lsmod` states that qat_api is used by zfs, and
`modinfo` states that zfs depends on qat_api.

2. The kernel scans the PCI bus before it scans the disk partitions,
so it was actually detecting the Intel QAT card in the PCI-E slot
first.  I guess it doesn't have the code to recognize that device and
load the related modules... probably, again, because no one with
access to this hardware has contributed it.
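
For anyone who wants to check that dependency chain on their own box,
these are the queries I mean (output will vary, and qat_api only shows
up against a zfs-dkms build configured with QAT):

# lsmod | grep -E 'zfs|qat'
# modinfo -F depends zfs
#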

It took me a while to figure out how to overcome this, and I asked for
help in several places too, but in the end I worked it out after many
days, if not weeks, of trial and error.  I tried so many things in
frustration (I probably enumerated them all on some linux-modules
mailing list), but what I finally had to do was falsify the ZFS module
to the system with this configuration:

# cat /etc/modprobe.d/zfs-falsify.conf
install	zfs	/usr/bin/false
#
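
With that rule in place, any normal load attempt just runs
/usr/bin/false instead of inserting the module, and fails, which is
exactly the point:

# modprobe zfs; echo $?
1
#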

Now, whenever anything tried to load the zfs module, it would just get
false in return!  This of course also affected the systemd
zfs-load-module.service, which is the one I actually wanted loading
the module, so I had to update that service with `systemctl edit
--full zfs-load-module.service`, changing ExecStart to:

ExecStart=/sbin/modprobe -iv zfs

The "-i" option is the key one: it tells modprobe to ignore the
install command and load the module normally.  Finally ZFS was back in
charge of loading its own module, but that was only part of the
battle: the module was still loading before the QAT modules.  In the
end I had to add several "Requires" and "After" entries to a few of
the ZFS systemd services so they waited for the QAT service, which
brings up the QAT engines, to finish starting... although when I check
my current setup, I don't have any of those overrides in
/etc/systemd/system anymore, so maybe something I had messed up
previously was what broke the service start order in the first place.
That's good at least.  I must have rebooted our computer 100 times
over the past month.
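
(For reference, the Requires/After overrides were along these lines.
The QAT service name here is from memory and may differ on your
system, so check `systemctl list-units | grep -i qat` and treat this
as a sketch:

# cat /etc/systemd/system/zfs-load-module.service.d/override.conf
[Unit]
Requires=qat_service.service
After=qat_service.service
#

and similarly for the other ZFS services that need the ordering.)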
After it had been up for ~750 days previously, it deserved it!  But
everything is good now, so here's to another 750 days!  As for the
build itself, I put `echo --with-qat=</path/to/Intel/QAT/drivers>` in
/etc/dkms/zfs.conf and dkms picks it right up; version 2.1.9-1~bpo11+1
is up and running currently.
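
To spell that out in the same style as the modprobe file above, the
dkms override is just a one-line file echoing the extra configure
option (the path placeholder is wherever your QAT driver source tree
lives):

# cat /etc/dkms/zfs.conf
echo --with-qat=</path/to/Intel/QAT/drivers>
#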

Also, it seems the new zstd compression algorithm is nearly as
efficient as gzip and many times faster, without requiring
co-processors for decent performance.  I'll soon be moving our home
directories from a machine with ZFS+QAT to ZFS+zstd and will see how
it goes.  My preliminary testing indicates there won't be much
difference in performance or storage savings, so Intel QAT support may
become even less accessible in the future.
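
For anyone wanting to try the same switch, enabling zstd on a dataset
is a one-liner (the pool/dataset name here is made up, and note it
only affects newly written data; existing blocks keep their old
compression until rewritten):

# zfs set compression=zstd tank/home
# zfs get compression,compressratio tank/home
#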

Anyway I think that's it for now, take care!

Best Regards,
Chandler
