scipy and pooch fetching data during build
PICCA Frederic-Emmanuel
frederic-emmanuel.picca at synchrotron-soleil.fr
Sun Jun 16 09:44:25 BST 2024
Hello,
I am trying to package the next version of silx.
During the build I get the error message below.
Indeed, silx uses pooch via scipy in order to download its test data.
So my question is: what is the best way to solve this issue?
I want to run the unit tests with these data, so at some point we should have them packaged.
I could download this dataset and embed it in my package, putting it at the right place so that pooch knows it already has the data.
Or should we propose a scipy-datasets package with a collection of datasets useful for other packages' tests,
and find a way to register with pooch the fact that these data are already on the system?
I do not know how many packages are going to use pooch like this in the future.
Nevertheless, it would be great to have the point of view of the scipy and pooch maintainers.
thanks
Frederic
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/scipy/datasets/_fetchers.py:71: in ascent
fname = fetch_data("ascent.dat")
/usr/lib/python3/dist-packages/scipy/datasets/_fetchers.py:31: in fetch_data
return data_fetcher.fetch(dataset_name)
/usr/lib/python3/dist-packages/pooch/core.py:589: in fetch
stream_download(
/usr/lib/python3/dist-packages/pooch/core.py:807: in stream_download
downloader(url, tmp, pooch)
/usr/lib/python3/dist-packages/pooch/downloaders.py:220: in __call__
response = requests.get(url, timeout=timeout, **kwargs)
/usr/lib/python3/dist-packages/requests/api.py:73: in get
return request("get", url, params=params, **kwargs)
/usr/lib/python3/dist-packages/requests/api.py:59: in request
return session.request(method=method, url=url, **kwargs)
/usr/lib/python3/dist-packages/requests/sessions.py:589: in request
resp = self.send(prep, **send_kwargs)
/usr/lib/python3/dist-packages/requests/sessions.py:703: in send
r = adapter.send(request, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <requests.adapters.HTTPAdapter object at 0x7f9d07db24e0>
request = <PreparedRequest [GET]>, stream = True
timeout = Timeout(connect=30, read=30, total=None), verify = True, cert = None
proxies = OrderedDict({'no': 'localhost', 'https': 'https://127.0.0.1:9/', 'http': 'http://127.0.0.1:9/'})
def send(
self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None
):
"""Sends PreparedRequest object. Returns Response object.
:param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
:param stream: (optional) Whether to stream the request content.
:param timeout: (optional) How long to wait for the server to send
data before giving up, as a float, or a :ref:`(connect timeout,
read timeout) <timeouts>` tuple.
:type timeout: float or tuple or urllib3 Timeout object
:param verify: (optional) Either a boolean, in which case it controls whether
we verify the server's TLS certificate, or a string, in which case it
must be a path to a CA bundle to use
:param cert: (optional) Any user-provided SSL certificate to be trusted.
:param proxies: (optional) The proxies dictionary to apply to the request.
:rtype: requests.Response
"""
try:
conn = self.get_connection_with_tls_context(
request, verify, proxies=proxies, cert=cert
)
except LocationValueError as e:
raise InvalidURL(e, request=request)
self.cert_verify(conn, request.url, verify, cert)
url = self.request_url(request, proxies)
self.add_headers(
request,
stream=stream,
timeout=timeout,
verify=verify,
cert=cert,
proxies=proxies,
)
chunked = not (request.body is None or "Content-Length" in request.headers)
if isinstance(timeout, tuple):
try:
connect, read = timeout
timeout = TimeoutSauce(connect=connect, read=read)
except ValueError:
raise ValueError(
f"Invalid timeout {timeout}. Pass a (connect, read) timeout tuple, "
f"or a single float to set both timeouts to the same value."
)
elif isinstance(timeout, TimeoutSauce):
pass
else:
timeout = TimeoutSauce(connect=timeout, read=timeout)
try:
resp = conn.urlopen(
method=request.method,
url=url,
body=request.body,
headers=request.headers,
redirect=False,
assert_same_host=False,
preload_content=False,
decode_content=False,
retries=self.max_retries,
timeout=timeout,
chunked=chunked,
)
except (ProtocolError, OSError) as err:
raise ConnectionError(err, request=request)
except MaxRetryError as e:
if isinstance(e.reason, ConnectTimeoutError):
# TODO: Remove this in 3.0.0: see #2811
if not isinstance(e.reason, NewConnectionError):
raise ConnectTimeout(e, request=request)
if isinstance(e.reason, ResponseError):
raise RetryError(e, request=request)
if isinstance(e.reason, _ProxyError):
> raise ProxyError(e, request=request)
E requests.exceptions.ProxyError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /scipy/dataset-ascent/main/ascent.dat (Caused by ProxyError('Unable to connect to proxy', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f9d07db3320>: Failed to establish a new connection: [Errno 111] Connection refused')))
/usr/lib/python3/dist-packages/requests/adapters.py:694: ProxyError
----------------------------- Captured stderr call -----------------------------
Downloading file 'ascent.dat' from 'https://raw.githubusercontent.com/scipy/dataset-ascent/main/ascent.dat' to '/builds/science-team/silx/debian/output/source_dir/.pybuild/cpython3_3.12_silx/.cache/scipy-data'.