Bug#1015805: scikit-learn tries to access network during documentation build

M. Zhou lumin at debian.org
Thu Jul 21 16:37:13 BST 2022


Source: scikit-learn
Version: 1.1.1-1
Severity: serious
Justification: Policy section 4.9 violation

There are loads of similar traceback messages saying that the documentation
build failed to retrieve some URL, like this:

```
generating gallery for auto_examples/decomposition... [ 30%] plot_faces_decomposition.py
WARNING: /<<PKGBUILDDIR>>/examples/decomposition/plot_faces_decomposition.py failed to execute correctly: Traceback (most recent call last):
  File "/<<PKGBUILDDIR>>/examples/decomposition/plot_faces_decomposition.py", line 36, in <module>
    faces, _ = fetch_olivetti_faces(return_X_y=True, shuffle=True, random_state=rng)
  File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.10/build/sklearn/datasets/_olivetti_faces.py", line 117, in fetch_olivetti_faces
    mat_path = _fetch_remote(FACES, dirname=data_home)
  File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.10/build/sklearn/datasets/_base.py", line 1511, in _fetch_remote
    urlretrieve(remote.url, file_path)
  File "/usr/lib/python3.10/urllib/request.py", line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/usr/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/usr/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>
```

This is clearly a policy violation and should be patched.
This issue was found during a QEMU build for the armel architecture on a ppc64el machine, and it slows the build
down extremely, likely because each URL access has to time out.

As a result, the URL access timeouts took the whole night, and the doc build is still less than half finished.
Well, I guess I will have to wait two or three days to see the armel segfault under discussion in QEMU as long as this problem remains unfixed.
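
For anyone who wants to find the offending examples without waiting for every timeout, here is a minimal sketch (not the proposed patch) that makes hostname resolution fail immediately. It assumes the code under test resolves hosts through Python's `socket.getaddrinfo`, as `urllib` does in the traceback above. The names `no_network` and `NetworkAccessBlocked` and the example URL are hypothetical and not part of scikit-learn, Sphinx, or the packaging.

```python
import contextlib
import socket
import urllib.request


class NetworkAccessBlocked(OSError):
    """Raised instead of performing a hostname lookup (hypothetical name)."""


@contextlib.contextmanager
def no_network():
    """Temporarily make lookups via socket.getaddrinfo fail immediately.

    Illustrative helper only. It covers code paths that resolve hosts
    through socket.getaddrinfo (urllib and http.client do); code that
    connects through other means is not affected.
    """
    original_getaddrinfo = socket.getaddrinfo

    def blocked_getaddrinfo(host, *args, **kwargs):
        raise NetworkAccessBlocked(f"blocked network lookup for {host!r}")

    socket.getaddrinfo = blocked_getaddrinfo
    try:
        yield
    finally:
        # Always restore the real resolver, even if the body raised.
        socket.getaddrinfo = original_getaddrinfo


if __name__ == "__main__":
    with no_network():
        try:
            # urlretrieve() goes through urlopen(), as in the traceback.
            # The URL is a placeholder, not the real dataset location.
            urllib.request.urlretrieve("https://example.invalid/faces.mat")
        except OSError as exc:  # URLError is a subclass of OSError
            print(f"failed fast: {exc}")
```

Wrapping the example run in something like this would turn each slow timeout into an immediate error naming the host. Separately, scikit-learn fetchers such as `fetch_olivetti_faces` accept `download_if_missing=False`, which raises an error instead of downloading when the data is not already cached; whether that suits the packaged examples is a separate question.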


