Bug#944617: python3-h5py import performance severely degraded in 2.10.0 release (due to OpenMPI?)
Drew Parsons
dparsons at debian.org
Mon Nov 18 04:29:17 GMT 2019
Source: h5py
Followup-For: Bug #944617
There is additional overhead in h5py 2.10 compared to 2.8.
Comparing 2.10 with and without mpi support shows that importing
the module with mpi is slower by a factor of only 2-3, rather
than ×7.
h5py 2.10.0 with mpi support:
$ multitime -qq -n 10 python3 -c 'import h5py'
===> multitime results
1: -qq python3 -c "import h5py"
            Mean        Std.Dev.    Min         Median      Max
real        0.696       0.123       0.637       0.659       1.065
user        0.608       0.052       0.480       0.617       0.665
sys         0.313       0.049       0.196       0.315       0.394
h5py 2.10.0 without mpi support:
$ multitime -qq -n 10 python3 -c 'import h5py'
===> multitime results
1: -qq python3 -c "import h5py"
            Mean        Std.Dev.    Min         Median      Max
real        0.293       0.048       0.260       0.270       0.414
user        0.549       0.036       0.479       0.552       0.605
sys         0.269       0.022       0.228       0.264       0.301
But note that this test only measures the time taken to load the h5py
module itself, so it is not a good measure of performance with mpi
support available. It's not fair to characterise it as ×2.5 slower,
since this is a once-off cost paid at import. The relevant quantity
here is the additional 0.4 sec of time needed to load the module.
It's a bit of a stretch to say that 0.4 sec is a severe performance
penalty, I think.
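For reference, both figures follow directly from the mean "real" times
in the two multitime runs. A quick check (the variable names are only
illustrative):

```python
# Mean wall-clock ("real") import times from the two multitime runs, in seconds.
mean_real_mpi = 0.696
mean_real_nompi = 0.293

ratio = mean_real_mpi / mean_real_nompi   # relative slowdown
extra = mean_real_mpi - mean_real_nompi   # absolute cost per import

print(f"slowdown factor: x{ratio:.2f}")        # x2.38
print(f"extra time per import: {extra:.3f} s")  # 0.403 s
```

So the ratio is a little under ×2.5, and the absolute cost is about
0.4 sec, paid once per Python process.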
To measure performance, you would need to measure the time taken to
work with the actual data files, e.g. to load a large data file (say
4-8 GB of data). It would be interesting if you could run this kind of
performance test.
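A minimal sketch of what such a test might look like, assuming the
data sits in a single dataset that fits in memory. The helper names,
the file name "large.h5" and the dataset name "data" are placeholders,
not anything from the bug report:

```python
import time


def time_call(fn, repeats=5):
    """Return wall-clock durations in seconds for `repeats` calls of fn()."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def read_dataset(path, name):
    """Read one whole dataset into memory (path and name are placeholders)."""
    import h5py  # cached by Python after the first import

    with h5py.File(path, "r") as f:
        return f[name][...]  # [...] reads the full dataset as a NumPy array


# Example use, with h5py already imported so its load time is excluded:
#   import h5py
#   times = time_call(lambda: read_dataset("large.h5", "data"))
#   print(min(times), sum(times) / len(times), max(times))
```

Running this against the mpi and non-mpi builds would give a
comparison of actual read times. Note that repeated reads of the same
file may be served from the OS page cache, so the first run is likely
to differ from the later ones.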