scipy and pooch fetching data during build

PICCA Frederic-Emmanuel frederic-emmanuel.picca at synchrotron-soleil.fr
Sun Jun 16 09:44:25 BST 2024


Hello,

I am trying to package the next version of silx.

I get the error message below during the build.

Indeed, silx uses pooch, via scipy, to download its test data.

So my question is: what is the best way to solve this issue?

I want to run the unit tests with this data, so at some point it should be packaged.

I can download this dataset and embed it in my package, putting it in the right place so that pooch knows it already has the data.
Or should we propose a scipy-dataset package with a collection of datasets useful for other packages' tests,
and find a way to register in pooch the fact that these data are already on the system?
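
To make the first option concrete, here is a minimal sketch of seeding the cache before the tests run. It assumes that pooch skips the download when the file is already present in its cache directory with a hash matching the registry, and that the cache directory is the "scipy-data" one shown in the captured stderr below. The helper names (seed_cache, default_cache_dir) and the source directory /usr/share/scipy-datasets are hypothetical, used only for illustration.

```python
import os
import shutil
from pathlib import Path


def default_cache_dir():
    """Guess the cache directory used for the scipy datasets.

    Assumption: on Linux the cache lives in "scipy-data" under
    $XDG_CACHE_HOME, falling back to ~/.cache when it is unset.
    """
    base = os.environ.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    return Path(base) / "scipy-data"


def seed_cache(packaged_dir, cache_dir, filenames):
    """Copy already-packaged data files into the cache directory.

    Files missing from packaged_dir, or already present in cache_dir,
    are skipped. Returns the list of destination paths that were written.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in filenames:
        src = Path(packaged_dir) / name
        dst = cache_dir / name
        if src.is_file() and not dst.exists():
            shutil.copyfile(src, dst)
            copied.append(dst)
    return copied


# Hypothetical usage, run before the test suite:
# seed_cache("/usr/share/scipy-datasets", default_cache_dir(), ["ascent.dat"])
```

The copied file would have to be byte-identical to the upstream one; otherwise the hash check would presumably fail and a download would be attempted again.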

I do not know how many packages will use pooch like this in the future.

Nevertheless, it would be great to have the point of view of the scipy and pooch maintainers.


Thanks,

Frederic

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib/python3/dist-packages/scipy/datasets/_fetchers.py:71: in ascent
    fname = fetch_data("ascent.dat")
/usr/lib/python3/dist-packages/scipy/datasets/_fetchers.py:31: in fetch_data
    return data_fetcher.fetch(dataset_name)
/usr/lib/python3/dist-packages/pooch/core.py:589: in fetch
    stream_download(
/usr/lib/python3/dist-packages/pooch/core.py:807: in stream_download
    downloader(url, tmp, pooch)
/usr/lib/python3/dist-packages/pooch/downloaders.py:220: in __call__
    response = requests.get(url, timeout=timeout, **kwargs)
/usr/lib/python3/dist-packages/requests/api.py:73: in get
    return request("get", url, params=params, **kwargs)
/usr/lib/python3/dist-packages/requests/api.py:59: in request
    return session.request(method=method, url=url, **kwargs)
/usr/lib/python3/dist-packages/requests/sessions.py:589: in request
    resp = self.send(prep, **send_kwargs)
/usr/lib/python3/dist-packages/requests/sessions.py:703: in send
    r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <requests.adapters.HTTPAdapter object at 0x7f9d07db24e0>
request = <PreparedRequest [GET]>, stream = True
timeout = Timeout(connect=30, read=30, total=None), verify = True, cert = None
proxies = OrderedDict({'no': 'localhost', 'https': 'https://127.0.0.1:9/', 'http': 'http://127.0.0.1:9/'})

    def send(
        self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
    ):
        """Sends PreparedRequest object. Returns Response object.
    
        :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
        :param stream: (optional) Whether to stream the request content.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple or urllib3 Timeout object
        :param verify: (optional) Either a boolean, in which case it controls whether
            we verify the server's TLS certificate, or a string, in which case it
            must be a path to a CA bundle to use
        :param cert: (optional) Any user-provided SSL certificate to be trusted.
        :param proxies: (optional) The proxies dictionary to apply to the request.
        :rtype: requests.Response
        """
    
        try:
            conn = self.get_connection_with_tls_context(
                request, verify, proxies=proxies, cert=cert
            )
        except LocationValueError as e:
            raise InvalidURL(e, request=request)
    
        self.cert_verify(conn, request.url, verify, cert)
        url = self.request_url(request, proxies)
        self.add_headers(
            request,
            stream=stream,
            timeout=timeout,
            verify=verify,
            cert=cert,
            proxies=proxies,
        )
    
        chunked = not (request.body is None or "Content-Length" in request.headers)
    
        if isinstance(timeout, tuple):
            try:
                connect, read = timeout
                timeout = TimeoutSauce(connect=connect, read=read)
            except ValueError:
                raise ValueError(
                    f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
                    f"or a single float to set both timeouts to the same value."
                )
        elif isinstance(timeout, TimeoutSauce):
            pass
        else:
            timeout = TimeoutSauce(connect=timeout, read=timeout)
    
        try:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout,
                chunked=chunked,
            )
    
        except (ProtocolError, OSError) as err:
            raise ConnectionError(err, request=request)
    
        except MaxRetryError as e:
            if isinstance(e.reason, ConnectTimeoutError):
                # TODO: Remove this in 3.0.0: see #2811
                if not isinstance(e.reason, NewConnectionError):
                    raise ConnectTimeout(e, request=request)
    
            if isinstance(e.reason, ResponseError):
                raise RetryError(e, request=request)
    
            if isinstance(e.reason, _ProxyError):
>               raise ProxyError(e, request=request)
E               requests.exceptions.ProxyError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /scipy/dataset-ascent/main/ascent.dat (Caused by ProxyError('Unable to connect to proxy', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f9d07db3320>: Failed to establish a new connection: [Errno 111] Connection refused')))

/usr/lib/python3/dist-packages/requests/adapters.py:694: ProxyError
----------------------------- Captured stderr call -----------------------------
Downloading file 'ascent.dat' from 'https://raw.githubusercontent.com/scipy/dataset-ascent/main/ascent.dat' to '/builds/science-team/silx/debian/output/source_dir/.pybuild/cpython3_3.12_silx/.cache/scipy-data'.



More information about the debian-science-maintainers mailing list