Bug#1053314: Depends: python3-h5py-mpi without python3-h5py

Rafael Laboissière rafael at debian.org
Mon Oct 9 07:30:41 BST 2023


Thanks for this detailed explanation, Drew. I released version 
3.1.0+dfsg2-5 of the xmds2 package before reading it. I added 
python3-h5py to Build-Depends and libhdf5-mpi-dev to Depends, as you 
suggested (although the debian/changelog entry contains a typo: it 
erroneously states that libhdf5-serial-dev has been added; I will fix 
this in the next release).

I also used H5PY_ALWAYS_USE_MPI=1, as you mentioned.
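
For code that cannot rely on the caller's shell, the variable can also be 
set from Python. The following is only a minimal sketch, assuming that the 
Debian h5py wrapper reads H5PY_ALWAYS_USE_MPI at the moment "import h5py" 
runs, so that the variable has to be in place before that import. The name 
request_h5py_mpi is a hypothetical helper, not part of h5py or xmds2.

```python
import os


def request_h5py_mpi(env=None):
    """Ask the Debian h5py wrapper for the MPI build (hypothetical helper).

    Assumption: the wrapper checks H5PY_ALWAYS_USE_MPI at import time, so
    this must be called before the first `import h5py`.
    """
    if env is None:
        env = os.environ
    # setdefault keeps any value the user has already exported.
    env.setdefault("H5PY_ALWAYS_USE_MPI", "1")
    return env["H5PY_ALWAYS_USE_MPI"]


# Intended usage (not executed here, since h5py may not be installed):
#   request_h5py_mpi()
#   import h5py
```

Using setdefault rather than a plain assignment leaves any value already 
exported by the user untouched.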

As for also adding python3-h5py-serial to Depends and putting fallback 
code in place, I will have to give it some thought. Maybe I should 
discuss this with the upstream authors to find out what they think. 
Let us see how things evolve. At least, I hope that version 3.1.0+dfsg2-5 
will really fix Bug#1053314 and the h5py transition will be completed.

Best,

Rafael

* Drew Parsons <dparsons at debian.org> [2023-10-09 02:23]:

> Nilesh explained most of the situation correctly.  I can give some 
> more background.  It made sense (to me) to have h5py built against 
> hdf5-mpi, since I figured that if you need the complexity of the hdf5 
> file format then you probably want to use it in an MPI environment.
>
> There was a complaint from a user though, who wanted to make use of a 
> massive ensemble of HDF5 (h5py) serial jobs, and the small cost of 
> loading up MPI support was interfering with their throughput.
>
> So the compromise solution was to provide both builds, with a custom 
> __init__.py to select the serial or MPI build depending on runtime 
> environment.  If an MPI environment is detected then the h5py MPI 
> build is loaded, otherwise the serial build is loaded.
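
The selection described in this thread can be sketched as a small pure 
function. This is an illustration under stated assumptions, not the actual 
code of the Debian __init__.py: choose_build is a hypothetical name, only 
the Open MPI variable OMPI_COMM_WORLD_SIZE is checked (detection of MPICH or 
other environments is left out), and any non-empty H5PY_ALWAYS_USE_MPI value 
is treated as "on". The module names are the ones shown in the traceback 
quoted further down.

```python
SERIAL = "_debian_h5py_serial"
MPI = "_debian_h5py_mpi"


def choose_build(env, installed):
    """Return the name of the build to load (hypothetical sketch).

    env       -- mapping of environment variables
    installed -- collection of build names that are actually available
    """
    # Assumption: any non-empty value enables the override.
    override = bool(env.get("H5PY_ALWAYS_USE_MPI"))
    # Assumption: only the Open MPI marker is checked in this sketch.
    in_mpi_job = "OMPI_COMM_WORLD_SIZE" in env

    if override or in_mpi_job:
        # MPI is preferred; serial is the fallback mentioned in the thread.
        order = (MPI, SERIAL)
    else:
        # Serial job: only the serial build is tried.
        order = (SERIAL,)

    for name in order:
        if name in installed:
            return name
    raise ImportError("no suitable h5py build among: " + ", ".join(order))
```

With only the MPI build installed and no MPI environment, this sketch raises 
ImportError, which mirrors the failure reported in this bug.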
>
> If you want to run h5py in a serial process, then one might say you'd 
> normally want the serial build.  As Nilesh noted, I put in a mechanism 
> to load the MPI build if you really want to access the mpi build in a 
> serial process (mpirun -n 1 is not a "serial" process as such, it's 
> still an MPI environment even though using only 1 process).
>
> The mechanism to force MPI loading is NOT to set OMPI_COMM_WORLD_SIZE. 
> I recommend NOT doing that. I couldn't promise it won't mess up other 
> things, certainly it will get in the way of an MPICH environment.  No, 
> the mechanism for handling this for h5py is described in 
> /usr/share/doc/python3-h5py/README.Debian: set H5PY_ALWAYS_USE_MPI=1
>
>> Is there a way to force h5py to import _debian_h5py_serial instead of 
>> _debian_h5py_mpi, via the generic h5py namespace?
>
> It sounds like there is some confusion about how xmds2 should be used. 
> Is it intended to be used as a serial process or MPI?  I noted in the 
> bug report that xmds2 Depends: libhdf5-serial-dev.  Is it even using 
> MPI?  If you want it to be using h5py-serial, then why does xmds2 
> depend on python3-h5py-mpi?
>
> It seems to me that xmds2's h5py dependency should be the same as its 
> hdf5 dependency.  If it uses libhdf5-serial then should it be 
> depending on just python3-h5py (implying python3-h5py-serial, make it 
> explicit if needed) and not depend on python3-h5py-mpi?
>
> If xmds2 is intended to be flexible, equally happy in serial and MPI 
> environments (and can actually make use of h5py-mpi) then perhaps the 
> dependency should cover all cases, 
>  Depends: python3-h5py, python3-h5py-serial, python3-h5py-mpi 
> all three explicitly, since otherwise one or the other of -serial or 
> -mpi would be missed.
>
> The problem raises interesting questions about h5py configuration. I 
> set it up so you could choose how you want it to work, with or without 
> MPI support.  But it looks like an edge case is missing: it's failing 
> in serial jobs if you chose to set up your installation with 
> python3-h5py-mpi and explicitly don't want python3-h5py-serial (unless 
> you always set H5PY_ALWAYS_USE_MPI). Perhaps I should add an 
> additional fallback to try h5py-mpi if h5py-serial is not found (in a 
> serial job), the same way that h5py-serial is loaded as a fallback in 
> an MPI job if h5py-mpi is not found. On the other hand maybe that just 
> hides the real problem, that h5py-serial was not installed when 
> actually it was wanted after all.  The ImportError correctly 
> identifies that case.
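
An additional fallback of this kind amounts to trying candidate modules in 
order and raising only when all of them fail. The following is a minimal 
sketch of that pattern, not a proposed patch: import_first_available is a 
hypothetical helper, and the real wrapper would pass its own serial and MPI 
module names as candidates.

```python
import importlib


def import_first_available(candidates, importer=importlib.import_module):
    """Import and return the first candidate that loads (hypothetical helper).

    Every failure is recorded, so the final ImportError still says which
    builds were missing instead of silently hiding the problem.
    """
    errors = []
    for name in candidates:
        try:
            return importer(name)
        except ImportError as exc:
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "none of the candidate modules could be imported: " + "; ".join(errors)
    )
```

Keeping the individual error messages addresses the concern that a silent 
fallback would hide a missing h5py-serial: the information is still there 
when nothing can be loaded, and a wrapper could also log it when the 
fallback is taken.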
>
>
>
>
> On 2023-10-08 17:38, Nilesh Patra wrote:
>> Hello,
>>
>> On 10/8/23 17:22, Rafael Laboissière wrote:
>>> Ok, I tried to fix the building problem by including python3-h5py,
>>> alongside with python3-h5py-mpi, into Build-Depends, as suggested
>>> by Drew, but the xmds2 package FTBFS.
>>>
>>> Here is a way to reproduce the problem without building the package:
>>>
>>>   $ dpkg -l python3-h5py\*
>>>   Desired=Unknown/Install/Remove/Purge/Hold
>>>   | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
>>>   |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
>>>   ||/ Name                Version      Architecture Description
>>>   +++-===================-============-============-=======================================================
>>>   ii  python3-h5py        3.9.0-3      all          general-purpose Python interface to hdf5
>>>   ii  python3-h5py-mpi    3.9.0-3      amd64        general-purpose Python interface to hdf5 (Python 3 MPI)
>>>   un  python3-h5py-serial <none>       <none>       (no description available)
>>>   $ echo 'import h5py' | python3 
>>>   Traceback (most recent call last): 
>>>     File "<stdin>", line 1, in <module> 
>>>     File "/usr/lib/python3/dist-packages/h5py/__init__.py", line 21, in <module>
>>>       from . import _debian_h5py_serial as _h5py
>>>   ImportError: cannot import name '_debian_h5py_serial' from partially initialized module 'h5py' (most likely due to a circular import) (/usr/lib/python3/dist-packages/h5py/__init__.py)
>>>
>>> Is there a way to force h5py to import _debian_h5py_serial instead 
>>> of _debian_h5py_mpi, via the generic h5py namespace?
>>
>> Drew would probably answer that question better but from taking a 
>> brief look, it seems to be on expected lines. 
>> This should work if you run it explicitly with mpi.
>>
>> $ mpirun -n 1 python3 -c "import h5py" && echo "true" 
>> true
>>
>> or with setting the MPI var manually.
>>
>> $ OMPI_COMM_WORLD_SIZE=1 python3 -c "import h5py" && echo "true" 
>> true
>>
>> If you want the _debian_h5py_serial interface then you need 
>> python3-h5py-serial and the B-D (and Depends) on h5py-mpi 
>> should be dropped which would mean this package does not need the 
>> -mpi package.
>>
>> Otherwise, an (unreliable) hack you could use is to add a B-D on 
>> h5py *before* mpi, and then -serial should also be installed (at least 
>> on my env).
>>
>> If the code really needs h5py-mpi, then it should be running the 
>> build/tests with mpi enabled (via openmpi). 
>> At least that's the impression I get from reading.
>>
>> 	https://sources.debian.org/src/h5py/3.9.0-3/debian/README.Debian/
>>
>> This patch gets the package building for me with h5py-mpi+h5py, but 
>> not sure if it is the right thing to do -- please verify for yourself 
>> as package maintainer :)
>>
>> --- a/xpdeint/XSILFile.py 
>> +++ b/xpdeint/XSILFile.py 
>> @@ -31,6 +31,9 @@ 
>>  numpy = None 
>> +# Set env var to use h5py-mpi 
>> +os.environ['OMPI_COMM_WORLD_SIZE'] = '1' 
>> + 
>> def require_h5py(): 
>>   global h5py 
>>   if not h5py:
>



More information about the debian-science-maintainers mailing list