Bug#1043968: How to handle upstream dependencies on PyPI tzdata?

Rebecca N. Palmer rebecca_palmer at zoho.com
Sun Aug 13 14:33:12 BST 2023


Package: python3-pandas
Version: 2.0.3+dfsg-1
Control: notfound -1 1.5.3+dfsg-4
Control: block 1043240 by -1

python3-pip and similar Python tools usually count system python3-X as 
satisfying Python dependencies on X.  Hence, upstream build scripts that 
attempt to pip install X can usually work without network access (e.g. 
in a Debian package build) if python3-X is already installed.

PyPI has a tzdata package [0].  This is mostly intended for Windows, and
the Python standard library (zoneinfo) defaults to using the system
tzdata, not the PyPI tzdata, if both are installed [1].  However, because
the system tzdata is not a Python package, it does not participate in the
above mechanism: python3-pip does *not* treat system tzdata as satisfying
Python dependencies on tzdata.
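
A rough way to see this distinction from Python, as a sketch rather than
a description of pip's internals: pip checks dependencies against
installed Python distribution metadata, which importlib.metadata also
reads.  The helper name has_python_dist is made up for illustration.

```python
from importlib import metadata


def has_python_dist(name: str) -> bool:
    """Return True if a Python distribution called `name` has installed metadata."""
    try:
        metadata.version(name)
    except metadata.PackageNotFoundError:
        return False
    return True


if __name__ == "__main__":
    # With only the system tzdata package installed (no PyPI tzdata), this is
    # expected to report False: the zone files themselves carry no Python
    # distribution metadata.
    print("Python tzdata distribution present:", has_python_dist("tzdata"))
```

A python3-X package normally ships such metadata, which is why it can
satisfy a dependency on X; the system tzdata package does not.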

pandas 2.0 declares an unconditional dependency on Python tzdata [2].
As far as I know, it only actually uses tzdata via the Python standard
library, and hence works without Python tzdata.  However, mlpack,
macsyfinder and emperor attempt to pip install pandas as part of their
build-time tests.  With pandas 2.0, pip notices the unsatisfied tzdata
dependency, can't download tzdata from PyPI because package builds are
blocked from accessing the network, and hence fails [3].
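
To illustrate why code that goes through the standard library can work
without Python tzdata, here is a minimal sketch of the lookup order
described in [1]: zoneinfo (Python 3.9+) searches the directories in
zoneinfo.TZPATH first and only falls back to the tzdata Python package
if no system file is found.  The two helper names are hypothetical, and
this mirrors the documented behaviour rather than zoneinfo's actual code.

```python
import importlib.util
import os
import zoneinfo


def system_zone_file(key: str):
    """Return the first file for `key` under zoneinfo.TZPATH, or None."""
    for root in zoneinfo.TZPATH:
        candidate = os.path.join(root, key)
        if os.path.isfile(candidate):
            return candidate
    return None


def python_tzdata_importable() -> bool:
    """Return True if a top-level `tzdata` Python package can be imported."""
    return importlib.util.find_spec("tzdata") is not None


if __name__ == "__main__":
    # Results depend on the machine: a system zone file is enough for
    # zoneinfo even when the Python tzdata package is absent.
    print("system file for Europe/London:", system_zone_file("Europe/London"))
    print("Python tzdata importable:", python_tzdata_importable())
```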

python-django-ca also declares an unconditional dependency on Python 
tzdata.  According to codesearch, nothing else in Debian currently does.

How should this be handled?

One option is for Debian python3-pandas to patch out the declaration of 
a Python tzdata dependency.  (It already does declare a Debian 
dependency on python3-tz.)  This is my current plan for pandas, but only 
works for software that only uses tzdata via the Python standard 
library, and it would be easy to not notice if that changed.
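
One way to guard against that last risk, purely as a sketch: a packaging
test could scan the upstream source for direct imports of tzdata.  The
helper imports_module is hypothetical, and it only catches plain import
statements, not dynamic access such as importlib.resources or
importlib.import_module.

```python
import ast


def imports_module(source: str, module: str) -> bool:
    """Return True if `source` contains a plain import of `module` or a submodule."""
    prefix = module + "."
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import tzdata" or "import tzdata.zoneinfo as z"
            if any(a.name == module or a.name.startswith(prefix) for a in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # Absolute "from tzdata import ..." only; relative imports are skipped.
            name = node.module or ""
            if node.level == 0 and (name == module or name.startswith(prefix)):
                return True
    return False


if __name__ == "__main__":
    print(imports_module("import zoneinfo\n", "tzdata"))
    print(imports_module("from tzdata import zoneinfo\n", "tzdata"))
```

Running this over each .py file in the source tree would flag a new
direct use of Python tzdata, at which point the patch would need
revisiting.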

Another option would be to package PyPI tzdata, but that involves 
duplication.  It may be possible to reduce the wasted space using symlinks.

[0] https://pypi.org/project/tzdata/
[1] https://peps.python.org/pep-0615/#sources-for-time-zone-data
[2] https://github.com/pandas-dev/pandas/pull/51247
[3] 
https://launchpad.net/~rebecca-palmer/+archive/ubuntu/pandas2p0/+build/26483272/+files/buildlog_ubuntu-mantic-amd64.mlpack_4.1.0-1ubuntu1_BUILDING.txt.gz 
https://launchpad.net/~rebecca-palmer/+archive/ubuntu/pandas2p0/+build/26483264/+files/buildlog_ubuntu-mantic-amd64.macsyfinder_2.0-2_BUILDING.txt.gz 
https://launchpad.net/~rebecca-palmer/+archive/ubuntu/pandas2p0/+build/26483248/+files/buildlog_ubuntu-mantic-amd64.emperor_1.0.3+ds-7_BUILDING.txt.gz



More information about the debian-science-maintainers mailing list