Bug#1043968: How to handle upstream dependencies on PyPI tzdata?
Rebecca N. Palmer
rebecca_palmer at zoho.com
Sun Aug 13 14:33:12 BST 2023
Package: python3-pandas
Version: 2.0.3+dfsg-1
Control: notfound -1 1.5.3+dfsg-4
Control: block 1043240 by -1
python3-pip and similar Python tools usually count system python3-X as
satisfying Python dependencies on X. Hence, upstream build scripts that
attempt to pip install X can usually work without network access (e.g.
in a Debian package build) if python3-X is already installed.
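As a rough illustration of the mechanism above (a sketch, not pip's
actual code): pip-style tools decide whether X is installed by looking
for Python packaging metadata on sys.path, which Debian python3-X
packages typically ship. The standard library exposes the same lookup
through importlib.metadata. The helper name dist_is_visible is
hypothetical.

```python
import importlib.metadata


def dist_is_visible(name: str) -> bool:
    """Return True if packaging metadata for distribution `name` is installed."""
    try:
        importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        return False
    return True


if __name__ == "__main__":
    # The result depends on what is installed on the machine running this.
    for name in ("pandas", "tzdata"):
        print(name, dist_is_visible(name))
```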
PyPI has a tzdata package [0]. This is mostly intended for Windows, and
the Python standard library defaults to using the system tzdata rather
than the PyPI tzdata if both are installed [1]. However, because the system
tzdata is not a Python package, it does not participate in the above
mechanism: python3-pip does *not* treat system tzdata as satisfying
Python dependencies on tzdata.
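A minimal sketch of the lookup order described in [1], assuming Python
3.9 or later: the standard library searches the directories in
zoneinfo.TZPATH (the system tzdata) first, and only falls back to the
PyPI tzdata package if the key is not found there. The function name
system_file_for is hypothetical, and this only approximates what
zoneinfo does internally.

```python
import os
import zoneinfo


def system_file_for(key: str):
    """Return the system tzdata file that would supply `key`, or None.

    None means no directory in zoneinfo.TZPATH has the key, in which
    case the standard library would try the PyPI tzdata package instead.
    """
    for base in zoneinfo.TZPATH:
        candidate = os.path.join(base, key)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print("search path:", zoneinfo.TZPATH)
    print("Europe/London from:", system_file_for("Europe/London"))
```

None of this is visible to pip, which only looks at Python packaging
metadata and not at zoneinfo.TZPATH.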
pandas 2.0 declares an unconditional dependency on Python tzdata [2].
As far as I know, it only actually uses tzdata via the Python standard
library, and hence works without Python tzdata. However, mlpack,
macsyfinder and emperor attempt to pip install pandas as part of their
build-time tests. With pandas 2.0, pip notices the unsatisfied tzdata
dependency, can't download tzdata from PyPI because package builds are
blocked from accessing the network, and hence the build fails [3].
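To see which declared dependencies pip would consider unsatisfied, one
could compare a distribution's metadata against what is installed. This
is a simplified sketch: it ignores version specifiers, treats every
requirement with an environment marker as conditional, and the function
names are hypothetical. The requirement strings in the usage example are
illustrative, not copied from pandas.

```python
import importlib.metadata
import re

# Leading project name of a requirement string such as "tzdata>=2022.1".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def requirement_name(req: str) -> str:
    """Extract and normalise the project name from a requirement string."""
    match = _NAME.match(req)
    if match is None:
        raise ValueError(f"cannot parse requirement: {req!r}")
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def unconditional_requirements(requires) -> set:
    """Names of requirements that carry no environment marker (no ';')."""
    return {requirement_name(r) for r in requires if ";" not in r}


def missing_requirements(dist_name: str) -> set:
    """Unconditional requirements of `dist_name` with no installed metadata."""
    try:
        requires = importlib.metadata.requires(dist_name) or []
    except importlib.metadata.PackageNotFoundError:
        return set()
    missing = set()
    for name in unconditional_requirements(requires):
        try:
            importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            missing.add(name)
    return missing


if __name__ == "__main__":
    example = ["tzdata>=2022.1", "numpy>=1.21; python_version >= '3.10'"]
    print(unconditional_requirements(example))
    print(missing_requirements("pandas"))
```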
python-django-ca also declares an unconditional dependency on Python
tzdata. According to codesearch, nothing else in Debian currently does.
How should this be handled?
One option is for Debian python3-pandas to patch out the declaration of
a Python tzdata dependency. (It already does declare a Debian
dependency on python3-tz.) This is my current plan for pandas, but it
only works for software that uses tzdata solely via the Python standard
library, and it would be easy to not notice if that changed.
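One way to reduce the risk of not noticing would be a check, run at
package build time, that flags direct imports of the tzdata module. The
following is only a sketch under that assumption: the function names are
hypothetical, and it would miss indirect uses such as
importlib.import_module("tzdata") or importlib.resources lookups.

```python
import ast
from pathlib import Path


def imports_tzdata(source: str) -> bool:
    """Return True if `source` directly imports the top-level tzdata module."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] == "tzdata" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # level == 0 excludes relative imports like "from .tzdata import x".
            if node.level == 0 and (node.module or "").split(".")[0] == "tzdata":
                return True
    return False


def files_importing_tzdata(root) -> list:
    """List the .py files under `root` that directly import tzdata."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            if imports_tzdata(path.read_text(encoding="utf-8")):
                hits.append(path)
        except (SyntaxError, UnicodeDecodeError, ValueError):
            # Skip files that are not valid UTF-8 Python source.
            continue
    return hits


if __name__ == "__main__":
    for hit in files_importing_tzdata("."):
        print(hit)
```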
Another option would be to package PyPI tzdata, but that would duplicate
the time zone data already shipped in the system tzdata. It may be
possible to reduce the wasted space using symlinks.
[0] https://pypi.org/project/tzdata/
[1] https://peps.python.org/pep-0615/#sources-for-time-zone-data
[2] https://github.com/pandas-dev/pandas/pull/51247
[3]
https://launchpad.net/~rebecca-palmer/+archive/ubuntu/pandas2p0/+build/26483272/+files/buildlog_ubuntu-mantic-amd64.mlpack_4.1.0-1ubuntu1_BUILDING.txt.gz
https://launchpad.net/~rebecca-palmer/+archive/ubuntu/pandas2p0/+build/26483264/+files/buildlog_ubuntu-mantic-amd64.macsyfinder_2.0-2_BUILDING.txt.gz
https://launchpad.net/~rebecca-palmer/+archive/ubuntu/pandas2p0/+build/26483248/+files/buildlog_ubuntu-mantic-amd64.emperor_1.0.3+ds-7_BUILDING.txt.gz