Bug#1053314: Depends: python3-h5py-mpi without python3-h5py
Rafael Laboissière
rafael at debian.org
Mon Oct 9 07:30:41 BST 2023
Thanks for this detailed explanation, Drew. I released version
3.1.0+dfsg2-5 of the xmds2 package before reading it. I added
python3-h5py to Build-Depends and libhdf5-mpi-dev to Depends, as you
suggested (although the debian/changelog entry contains a typo,
stating erroneously that libhdf5-serial-dev has been added; I will fix
this in the next release).
I also used H5PY_ALWAYS_USE_MPI=1, as you mentioned.
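A minimal sketch of the idea from the Python side, not the actual change in the package: it assumes the variable is read when h5py is imported, so it has to be set beforehand. The helper name import_h5py_forcing_mpi is hypothetical.

```python
import importlib
import os


def import_h5py_forcing_mpi():
    """Hypothetical helper: request the MPI build, then import h5py."""
    # Variable name taken from /usr/share/doc/python3-h5py/README.Debian.
    # Assumption: it must be in the environment before the import happens.
    os.environ["H5PY_ALWAYS_USE_MPI"] = "1"
    try:
        return importlib.import_module("h5py")
    except ImportError:
        # No usable h5py build is installed at all.
        return None


h5py = import_h5py_forcing_mpi()
```

The same effect should be obtainable from a shell or debian/rules by exporting the variable before python3 is started.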
As for also adding python3-h5py-serial to Depends and putting fallback
code in place, I will have to give it some thought. Maybe I should
discuss this with the upstream authors, to find out what they think.
Let us see how things evolve. At least, I hope that version 3.1.0+dfsg2-5
will really fix Bug#1053314 and the h5py transition will be completed.
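For reference, here is my reading of the selection logic you describe below, as a minimal sketch. It is not the code in h5py's __init__.py: the function name choose_build is hypothetical, detecting MPI only through OMPI_COMM_WORLD_SIZE is a simplification (it is the only marker mentioned in this thread), and treating any non-empty H5PY_ALWAYS_USE_MPI as "forced" is an assumption.

```python
def choose_build(env, installed):
    """Return "mpi" or "serial" for the given environment.

    env       -- mapping of environment variables
    installed -- set of available builds, e.g. {"serial", "mpi"}
    Raises ImportError when no suitable build is available.
    """
    # Assumption: any non-empty value forces the MPI build.
    force_mpi = bool(env.get("H5PY_ALWAYS_USE_MPI"))
    # Simplification: the real detection may look at more than this.
    in_mpi_env = "OMPI_COMM_WORLD_SIZE" in env

    if force_mpi or in_mpi_env:
        if "mpi" in installed:
            return "mpi"
        # Serial is loaded as a fallback when the MPI build is missing.
        # Assumption: the forced case falls back in the same way.
        if "serial" in installed:
            return "serial"
        raise ImportError("no h5py build installed")

    # Serial job: there is currently no fallback to the MPI build,
    # which is the ImportError seen in this bug.
    if "serial" in installed:
        return "serial"
    raise ImportError("cannot import name '_debian_h5py_serial'")
```

With only python3-h5py-mpi installed, choose_build({}, {"mpi"}) raises ImportError, whereas choose_build({"H5PY_ALWAYS_USE_MPI": "1"}, {"mpi"}) returns "mpi".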
Best,
Rafael
* Drew Parsons <dparsons at debian.org> [2023-10-09 02:23]:
> Nilesh explained most of the situation correctly. I can give some
> more background. It made sense (to me) to have h5py built against
> hdf5-mpi, since I figured that if you need the complexity of the hdf5
> file format then you probably want to use it in an MPI environment.
>
> There was a complaint from a user though, who wanted to make use of a
> massive ensemble of HDF5 (h5py) serial jobs, and the small cost of
> loading up MPI support was interfering with their throughput.
>
> So the compromise solution was to provide both builds, with a custom
> __init__.py to select the serial or MPI build depending on runtime
> environment. If an MPI environment is detected then the h5py MPI
> build is loaded, otherwise the serial build is loaded.
>
> If you want to run h5py in a serial process, then one might say you'd
> normally want the serial build. As Nilesh noted, I put in a mechanism
> to load the MPI build if you really want to access the mpi build in a
> serial process (mpirun -n 1 is not a "serial" process as such, it's
> still an MPI environment even though using only 1 process).
>
> The mechanism to force MPI loading is NOT to set OMPI_COMM_WORLD_SIZE.
> I recommend NOT doing that. I couldn't promise it won't mess up other
> things, certainly it will get in the way of an MPICH environment. No,
> the mechanism for handling this for h5py is described in
> /usr/share/doc/python3-h5py/README.Debian: set H5PY_ALWAYS_USE_MPI=1
>
>> Is there a way to force h5py to import _debian_h5py_serial instead of
>> _debian_h5py_mpi, via the generic h5py namespace?
>
> It sounds like there is some confusion about how xmds2 should be used.
> Is it intended to be used as a serial process or MPI? I noted in the
> bug report that xmds2 Depends: libhdf5-serial-dev. Is it even using
> MPI? If you want it to be using h5py-serial, then why does xmds2
> depend on python3-h5py-mpi?
>
> It seems to me that xmds2's h5py dependency should be the same as its
> hdf5 dependency. If it uses libhdf5-serial then should it be
> depending on just python3-h5py (implying python3-h5py-serial, make it
> explicit if needed) and not depend on python3-h5py-mpi?
>
> If xmds2 is intended to be flexible, equally happy in serial and MPI
> environments (and can actually make use of h5py-mpi) then perhaps the
> dependency should cover all cases,
> Depends: python3-h5py, python3-h5py-serial, python3-h5py-mpi
> all three explicitly, since otherwise one or the other of -serial or
> -mpi would be missed.
>
> The problem raises interesting questions about h5py configuration. I
> set it up so you could choose how you want it to work, with or without
> MPI support. But it looks like an edge case is missing: it's failing
> in serial jobs if you chose to set up your installation with
> python3-h5py-mpi and explicitly don't want python3-h5py-serial (unless
> you always set H5PY_ALWAYS_USE_MPI). Perhaps I should add an
> additional fallback to try h5py-mpi if h5py-serial is not found (in a
> serial job), the same way that h5py-serial is loaded as a fallback in
> an MPI job if h5py-mpi is not found. On the other hand maybe that just
> hides the real problem, that h5py-serial was not installed when
> actually it was wanted after all. The ImportError correctly
> identifies that case.
>
> On 2023-10-08 17:38, Nilesh Patra wrote:
>> Hello,
>>
>> On 10/8/23 17:22, Rafael Laboissière wrote:
>>> Ok, I tried to fix the building problem by including python3-h5py,
>>> alongside with python3-h5py-mpi, into Build-Depends, as suggested
>>> by Drew, but the xmds2 package FTBFS.
>>>
>>> Here is a way to reproduce the problem without building the package:
>>>
>>> $ dpkg -l python3-h5py\*
>>> Desired=Unknown/Install/Remove/Purge/Hold
>>> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
>>> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
>>> ||/ Name Version Architecture Description
>>> +++-===================-============-============-=======================================================
>>> ii python3-h5py 3.9.0-3 all general-purpose Python interface to hdf5
>>> ii python3-h5py-mpi 3.9.0-3 amd64 general-purpose Python interface to hdf5 (Python 3 MPI)
>>> un python3-h5py-serial <none> <none> (no description available)
>>> $ echo 'import h5py' | python3
>>> Traceback (most recent call last):
>>> File "<stdin>", line 1, in <module>
>>> File "/usr/lib/python3/dist-packages/h5py/__init__.py", line 21, in <module>
>>> from . import _debian_h5py_serial as _h5py
>>> ImportError: cannot import name '_debian_h5py_serial' from partially initialized module 'h5py' (most likely due to a circular import) (/usr/lib/python3/dist-packages/h5py/__init__.py)
>>>
>>> Is there a way to force h5py to import _debian_h5py_serial instead
>>> of _debian_h5py_mpi, via the generic h5py namespace?
>>
>> Drew would probably answer that question better but from taking a
>> brief look, it seems to be on expected lines.
>> This should work if you run it explicitly with mpi.
>>
>> $ mpirun -n 1 python3 -c "import h5py" && echo "true"
>> true
>>
>> or with setting the MPI var manually.
>>
>> $ OMPI_COMM_WORLD_SIZE=1 python3 -c "import h5py" && echo "true"
>> true
>>
>> If you want the _debian_h5py_serial interface then you need
>> python3-h5py-serial and the B-D (and Depends) on h5py-mpi
>> should be dropped which would mean this package does not need the
>> -mpi package.
>>
>> Otherwise, an (unreliable) hack you could use is to add a B-D on
>> h5py *before* mpi; then -serial should also be installed (at least
>> on my env).
>>
>> If the code really needs h5py-mpi, then it should be running the
>> build/tests with mpi enabled (via openmpi).
>> At least that's the impression I get from reading.
>>
>> https://sources.debian.org/src/h5py/3.9.0-3/debian/README.Debian/
>>
>> This patch gets the package building for me with h5py-mpi+h5py, but
>> not sure if it is the right thing to do -- please verify for yourself
>> as package maintainer :)
>>
>> --- a/xpdeint/XSILFile.py
>> +++ b/xpdeint/XSILFile.py
>> @@ -31,6 +31,9 @@
>> numpy = None
>> +# Set env var to use h5py-mpi
>> +os.environ['OMPI_COMM_WORLD_SIZE'] = '1'
>> +
>> def require_h5py():
>> global h5py
>> if not h5py:
>