Bug#933366: scikit-learn: Please upgrade to 0.21.0 or later

Sandro Tosi morph at debian.org
Wed Jan 8 23:52:03 GMT 2020


Package: src:scikit-learn
Followup-For: Bug #933366

Hello,
I was trying to rebuild 0.20.3+dfsg-0.1 to disable the python2 autopkgtests, but
that version now FTBFS; I believe it's because it's rather old (March 2019) and
several of its dependencies have been upgraded recently (namely scipy and numpy,
probably others as well).

So I'm going to start working on packaging 0.22.1, which has just been released;
hopefully I'll complete this work soon.

FTR the failure is:

```
=================================== FAILURES ===================================
___________________________ test_scale_and_stability ___________________________

    def test_scale_and_stability():
        # We test scale=True parameter
        # This allows to check numerical stability over platforms as well
    
        d = load_linnerud()
        X1 = d.data
        Y1 = d.target
        # causes X[:, -1].std() to be zero
        X1[:, -1] = 1.0
    
        # From bug #2821
        # Test with X2, T2 s.t. clf.x_score[:, 1] == 0, clf.y_score[:, 1] == 0
        # This test robustness of algorithm when dealing with value close to 0
        X2 = np.array([[0., 0., 1.],
                       [1., 0., 0.],
                       [2., 2., 2.],
                       [3., 5., 4.]])
        Y2 = np.array([[0.1, -0.2],
                       [0.9, 1.1],
                       [6.2, 5.9],
                       [11.9, 12.3]])
    
        for (X, Y) in [(X1, Y1), (X2, Y2)]:
            X_std = X.std(axis=0, ddof=1)
            X_std[X_std == 0] = 1
            Y_std = Y.std(axis=0, ddof=1)
            Y_std[Y_std == 0] = 1
    
            X_s = (X - X.mean(axis=0)) / X_std
            Y_s = (Y - Y.mean(axis=0)) / Y_std
    
            for clf in [CCA(), pls_.PLSCanonical(), pls_.PLSRegression(),
                        pls_.PLSSVD()]:
                clf.set_params(scale=True)
                X_score, Y_score = clf.fit_transform(X, Y)
                clf.set_params(scale=False)
                X_s_score, Y_s_score = clf.fit_transform(X_s, Y_s)
>               assert_array_almost_equal(X_s_score, X_score)
E               AssertionError: 
E               Arrays are not almost equal to 6 decimals
E               
E               Mismatch: 50%
E               Max absolute difference: 5.15227746e-06
E               Max relative difference: 0.00011717
E                x: array([[-1.337317, -0.041709],
E                      [-1.108472,  0.098156],
E                      [ 0.407632, -0.10308 ],
E                      [ 2.038158,  0.046633]])
E                y: array([[-1.337317, -0.041713],
E                      [-1.108472,  0.098159],
E                      [ 0.407632, -0.103075],
E                      [ 2.038158,  0.04663 ]])

sklearn/cross_decomposition/tests/test_pls.py:360: AssertionError
____________________________ test_unsorted_indices _____________________________

    def test_unsorted_indices():
        # test that the result with sorted and unsorted indices in csr is the same
        # we use a subset of digits as iris, blobs or make_classification didn't
        # show the problem
        digits = load_digits()
        X, y = digits.data[:50], digits.target[:50]
        X_test = sparse.csr_matrix(digits.data[50:100])
    
        X_sparse = sparse.csr_matrix(X)
        coef_dense = svm.SVC(kernel='linear', probability=True,
                             random_state=0).fit(X, y).coef_
        sparse_svc = svm.SVC(kernel='linear', probability=True,
                             random_state=0).fit(X_sparse, y)
        coef_sorted = sparse_svc.coef_
        # make sure dense and sparse SVM give the same result
        assert_array_almost_equal(coef_dense, coef_sorted.toarray())
    
        X_sparse_unsorted = X_sparse[np.arange(X.shape[0])]
        X_test_unsorted = X_test[np.arange(X_test.shape[0])]
    
        # make sure we scramble the indices
>       assert_false(X_sparse_unsorted.has_sorted_indices)

sklearn/svm/tests/test_sparse.py:118: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <sklearn.utils._unittest_backport.TestCase testMethod=__init__>, expr = 1
msg = '1 is not false'

    def assertFalse(self, expr, msg=None):
        """Check that the expression is false."""
        if expr:
            msg = self._formatMessage(msg, "%s is not false" % safe_repr(expr))
>           raise self.failureException(msg)
E           AssertionError: 1 is not false

/usr/lib/python3.8/unittest/case.py:759: AssertionError
=========================== short test summary info ============================
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: This test is failing on the buildbot, but cannot reproduce. Temporarily disabling it until it can be reproduced and  fixed.
SKIPPED [3] /usr/lib/python3/dist-packages/_pytest/nose.py:32: Download 20 newsgroups to run this test
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: California housing dataset can not be loaded.
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: Covertype dataset can not be loaded.
SKIPPED [2] /usr/lib/python3/dist-packages/_pytest/nose.py:32: kddcup99 dataset can not be loaded.
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: Download RCV1 dataset to run this test.
SKIPPED [2] /usr/lib/python3/dist-packages/_pytest/nose.py:32: skipping mini_batch_fit_transform.
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: test_bayesian_on_diabetes is broken
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: pyamg not available.
SKIPPED [4] /build/scikit-learn-0.20.3+dfsg/.pybuild/cpython3_3.8/build/sklearn/preprocessing/tests/test_data.py:732: 'with_mean=True' cannot be used with sparse matrix.
SKIPPED [2] /build/scikit-learn-0.20.3+dfsg/.pybuild/cpython3_3.8/build/sklearn/preprocessing/tests/test_data.py:939: RobustScaler cannot center sparse matrix
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: score_samples of BernoulliRBM is not invariant when applied to a subset.
SKIPPED [3] /usr/lib/python3/dist-packages/_pytest/nose.py:32: Skipping check_estimators_data_not_an_array for cross decomposition module as estimators are not deterministic.
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: transform of MiniBatchSparsePCA is not invariant when applied to a subset.
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: Not testing NuSVC class weight as it is ignored.
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: decision_function of SVC is not invariant when applied to a subset.
SKIPPED [1] /usr/lib/python3/dist-packages/_pytest/nose.py:32: transform of SparsePCA is not invariant when applied to a subset.
SKIPPED [1] /build/scikit-learn-0.20.3+dfsg/.pybuild/cpython3_3.8/build/sklearn/tests/test_site_joblib.py:18: joblib is physically unvendored (e.g. as in debian)
SKIPPED [1] .pybuild/cpython3_3.8/build/sklearn/utils/tests/test_show_versions.py:28: https://buildd.debian.org/status/fetch.php?pkg=scikit-learn&arch=ppc64el&ver=0.18-3&stamp=1475603905
= 2 failed, 10160 passed, 29 skipped, 1 deselected, 1 xfailed, 986 warnings in 414.08 seconds =
E: pybuild pybuild:341: test: plugin distutils failed with: exit code=1: cd /build/scikit-learn-0.20.3+dfsg/.pybuild/cpython3_3.8/build; python3.8 -m pytest -m "not network" -v
```
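A note on my reading of the log (not verified against upstream): the
test_scale_and_stability mismatch (~5.2e-06) just exceeds the 1.5e-06 tolerance
that assert_array_almost_equal enforces at decimal=6, and test_unsorted_indices
appears to break because a recent scipy keeps CSR indices sorted after fancy
indexing, so the test's assert_false(has_sorted_indices) premise no longer
holds. A minimal sketch illustrating both, assuming the numpy/scipy currently in
unstable:

```python
import numpy as np
from scipy import sparse

# 1) assert_array_almost_equal(decimal=6) requires |actual - desired| < 1.5e-06;
#    the logged max absolute difference of ~5.15e-06 falls just outside that.
try:
    np.testing.assert_array_almost_equal(1.0, 1.0 + 5.15e-6, decimal=6)
except AssertionError as exc:
    print("tolerance exceeded:", exc)

# 2) The svm test assumes fancy indexing of a CSR matrix leaves the indices
#    unsorted; with a recent scipy the result is expected to report
#    has_sorted_indices == 1, which matches the "1 is not false" failure above.
X = sparse.csr_matrix(np.random.RandomState(0).rand(50, 64))
X_unsorted = X[np.arange(X.shape[0])]
print("has_sorted_indices:", X_unsorted.has_sorted_indices)
```

Both look like test assumptions invalidated by newer dependencies rather than
real regressions, which is another reason to move to 0.22.1 instead of patching
0.20.3.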

Cheers,
Sandro

-- System Information:
Debian Release: 10.0
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (1, 'experimental-debug'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.19.0-5-amd64 (SMP w/8 CPU cores)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE= (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
