Bug#980517: debsums: Parallel checking

Axel Beckert abe at debian.org
Wed Jan 20 11:19:53 GMT 2021

Control: severity -1 wishlist

Hi Witold,

Witold Baryluk wrote:
> On my 32 core system, and a lot of packages installed (~7000), it takes
> about one hour for the debsum to check all the files. Despite ability to
> read all files from the storage in about 4 minutes.

Oh, ok. I would have expected the I/O to be the bottleneck.

> The issue is use of just one thread and most likely not optimized
> md5sum implementation.

It uses Perl's Digest::MD5 module. 

> I think it would be very useful to be able to specify number of parallel
> threads to use when doing checking manually or from cron.

Ack, though I see that as a feature request and not really as a bug.
Hence setting the severity to wishlist.

Also I'm currently not sure how to implement this properly. Splitting
up the list of files or packages to check and then starting debsums
for 1/n of these as a subprocess, gathering the output and merging it?
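
A minimal sketch of that split/run/merge idea, in Python purely for
illustration (debsums itself is Perl). It assumes the caller already has
a list of (path, expected_md5) pairs. The names md5_of_file,
split_round_robin, check_slice and check_parallel are hypothetical, and
worker threads stand in for the subprocesses mentioned above (Python's
hashlib releases the GIL while hashing larger buffers, so threads can
still hash in parallel):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_of_file(path, bufsize=1 << 20):
    """Return the hex MD5 digest of one file, read in fixed-size blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(bufsize), b""):
            digest.update(block)
    return digest.hexdigest()


def split_round_robin(items, n):
    """Split items into n slices; slice i gets items i, i+n, i+2n, ..."""
    return [items[i::n] for i in range(n)]


def check_slice(entries):
    """Check one slice of (path, expected_md5) pairs; return failed paths."""
    failed = []
    for path, expected in entries:
        try:
            actual = md5_of_file(path)
        except OSError:
            actual = None  # a missing or unreadable file counts as a failure
        if actual != expected:
            failed.append(path)
    return failed


def check_parallel(entries, workers):
    """Check 1/workers of the entries per worker and merge the results."""
    workers = max(1, workers)
    slices = [s for s in split_round_robin(entries, workers) if s]
    if not slices:
        return []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(check_slice, slices)
        # Merge step: flatten the per-worker lists into one sorted report.
        return sorted(path for failed in results for path in failed)
```

Sorting the merged list keeps the report stable regardless of which
worker finishes first, which is the part a naive "just run n copies"
approach would get wrong.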

Or trying to find a hashing tool which parallelises this for each
package. But then again, this will likely generate extra overhead for
the huge number of small packages and hence might outweigh the gain.

IO::AIO might be a potential solution. There's even an MD5 example in
https://metacpan.org/pod/IO::AIO, but it focuses on hashing huge files
via mmap and not really on parallelism (I guess).
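
For illustration only, this is roughly what "hash a file via mmap" looks
like. It is a Python stand-in, not the IO::AIO example and not IO::AIO's
API, and md5_mmap is a hypothetical name. It shows why the approach
targets single large files: the whole file is mapped and handed to the
digest in one go, with nothing running in parallel.

```python
import hashlib
import mmap
import os


def md5_mmap(path):
    """Return the hex MD5 digest of a file by hashing a read-only mapping."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        if os.fstat(fh.fileno()).st_size == 0:
            # An empty file cannot be mapped; its digest is that of b"".
            return digest.hexdigest()
        # Length 0 maps the whole file; the mapping is a bytes-like object.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
            digest.update(view)
    return digest.hexdigest()
```

For the many small files in a typical package the setup cost of a
mapping per file is unlikely to pay off, which matches the doubt above.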

> I think it would even be good to enable it by default.

Not sure about that, as a lot more factors go into choosing the right
value:

* Current load of the system
* Installation on what kind of medium? (Spinning disk, SATA SSD, NVMe
  SSD, RAID1, etc.)
* Number of threads available in the CPU (of course, too :-)
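
To make the trade-off concrete, here is a rough sketch of how such a
default could be guessed from those three factors. Everything in it is
an assumption for illustration: the function name pick_workers, the rule
"one worker on spinning disks", and subtracting the 1-minute load
average from the CPU count. It is a Python sketch, not something debsums
does. On Linux the rotational flag could come from
/sys/block/<dev>/queue/rotational.

```python
import os


def pick_workers(rotational, cpus=None, load=None):
    """Guess a worker count from medium type, CPU count and current load."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    if load is None:
        load = os.getloadavg()[0]  # 1-minute load average (Unix only)
    if rotational:
        # Assumption: parallel reads on a spinning disk mostly add seeks.
        return 1
    # Assumption: leave roughly one CPU alone per unit of existing load.
    idle = cpus - int(load)
    return max(1, idle)
```

Even this toy version needs per-device knowledge and a load threshold
that will be wrong on some systems, which is the argument for keeping
the value user-configurable instead of enabling it by default.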

> (for the case of use from cron, usage of nice / schedtool and/or
> ionice could mitigate any issues on server or laptops).

The cron jobs already use ionice if it is installed. :-)

		Regards, Axel
 ,''`.  |  Axel Beckert <abe at debian.org>, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE
