Bug#923707: statsmodels FTBFS: array |= frame now returns frame
Rebecca N. Palmer
rebecca_palmer at zoho.com
Mon Mar 4 07:50:47 GMT 2019
Source: pandas
Version: 0.23+dfsg-2
Severity: serious
Control: affects -1 src:statsmodels
Control: tags -1 patch
The fix for #918206 (setting __array_priority__ to make np.array @
DataFrame work) is technically API breaking: in-place operators
    arr = np.array(...)
    df = pd.DataFrame(...)
    arr += df
used to leave arr as an np.array (though this does not appear to have
been documented), but now turn it into a DataFrame.
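
A toy sketch of the mechanism, for illustration only: ToyArray and ToyFrame are hypothetical stand-ins (not the real numpy/pandas classes), the priority values are arbitrary, and the sketch assumes that the array side steps aside for operands advertising a higher __array_priority__. When the left operand's __iadd__ returns NotImplemented, Python falls back to the right operand's __radd__ and rebinds the name to whatever that returns.

```python
class ToyArray:
    """Hypothetical stand-in for np.ndarray (not the real class)."""
    __array_priority__ = 0.0

    def __init__(self, values):
        self.values = list(values)

    def _defers_to(self, other):
        # Assumed rule: step aside if the other operand has a higher priority.
        return getattr(other, "__array_priority__", -1.0) > self.__array_priority__

    def __iadd__(self, other):
        if self._defers_to(other):
            return NotImplemented  # Python then tries __add__, then other.__radd__
        self.values = [a + b for a, b in zip(self.values, other.values)]
        return self

    def __add__(self, other):
        if self._defers_to(other):
            return NotImplemented
        return ToyArray(a + b for a, b in zip(self.values, other.values))


class ToyFrame:
    """Hypothetical stand-in for pd.DataFrame with __array_priority__ set."""
    __array_priority__ = 1000

    def __init__(self, values):
        self.values = list(values)

    def __radd__(self, other):
        # Reached only because ToyArray declined both __iadd__ and __add__.
        return ToyFrame(a + b for a, b in zip(other.values, self.values))


arr = ToyArray([1, 2])
original = arr            # second reference to the same object
arr += ToyFrame([10, 20])

print(type(arr).__name__)  # the name arr now refers to a ToyFrame
print(original.values)     # the original object was not modified in place
```

The second point matters for callers: any other reference to the original array no longer sees the update, because nothing was changed in place.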
The only known [0] place where this fails a test is in statsmodels: as
[1] now returns a DataFrame rather than a Series, passing data that
contains both missing (NaN) values and duplicate index labels to
statsmodels.formula.api.OLS now raises an exception. As
statsmodels.regression.tests.test_regression.test_missing_formula_predict
contains such data, statsmodels 0.8.0-9 failed to build on the
architectures where pandas was built first [2].
pandas upstream don't appear to have noticed this: they don't mention it
in the discussion or release notes of the fix [3]. statsmodels
upstream's response was to make this test stop using duplicate index
labels (without explicit comment) [4].
An alternative fix for #918206 that doesn't do this is
pandas/core/generic.py
     def __array_wrap__(self, result, context=None):
         d = self._construct_axes_dict(self._AXIS_ORDERS, copy=False)
+        if (context is not None and context[0] == np.matmul
+                and not hasattr(context[1][0], 'index')):
+            del d['index']
         return self._constructor(result, **d).__finalize__(self)

     # ideally we would define this to avoid the getattr checks, but
but this has not yet been tested in a full build.
[0] autopkgtest results - https://release.debian.org/britney/excuses.yaml
[1]
https://sources.debian.org/src/patsy/0.5.0+git13-g54dcf7b-1/patsy/missing.py/#L136
[2]
https://buildd.debian.org/status/fetch.php?pkg=statsmodels&arch=amd64&ver=0.8.0-9&stamp=1551564937&raw=0
[3] https://github.com/pandas-dev/pandas/pull/23114
https://github.com/pandas-dev/pandas/commit/ad2a14f4bec8a004b2972c12f12ed3e4ce37ff52
[4] (please *do not upload* this without discussion, or we may lose my
other statsmodels work to the freeze)
https://github.com/statsmodels/statsmodels/commit/30c9ddbff8a072cbc1bebc7550b667e760cb386a#diff-2708847815406a7890933a960465d8e8