[Debian-med-packaging] Bug#1026344: insilicoseq: autopkgtest needs update for new version of numpy: EOFError
Paul Gevers
elbrus at debian.org
Sun Dec 18 20:15:56 GMT 2022
Source: insilicoseq
Version: 1.5.4-3
Severity: serious
X-Debbugs-CC: numpy at packages.debian.org
Tags: sid bookworm
User: debian-ci at lists.debian.org
Usertags: needs-update
Control: affects -1 src:numpy
Dear maintainer(s),
With a recent upload of numpy the autopkgtest of insilicoseq fails in
testing when that autopkgtest is run with the binary packages of numpy
from unstable. It passes when run with only packages from testing. In
tabular form:
                       pass            fail
numpy                  from testing    1:1.23.5-2
insilicoseq            from testing    1.5.4-3
all others             from testing    from testing
I copied some of the output at the bottom of this report.
Currently this regression is blocking the migration of numpy to testing
[1]. Of course, numpy shouldn't just break your autopkgtest (or even
worse, your package), but it seems to me that the change in numpy was
intended and your package needs updating for the new situation.
If this is a real problem in your package (and not only in your
autopkgtest), the right binary package(s) from numpy should really add a
versioned Breaks on the unfixed version of (one of your) package(s).
Note: the Breaks is nice even if the issue is only in the autopkgtest as
it helps the migration software to figure out the right versions to
combine in the tests.
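For illustration only, such a versioned Breaks would be a stanza entry in
the numpy source's debian/control for the affected binary package; the
insilicoseq version shown below is hypothetical (whatever upload carries
the fix):

```
Package: python3-numpy
...
Breaks: insilicoseq (<< 1.5.4-4~)
```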
More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation
Paul
[1] https://qa.debian.org/excuses.php?package=numpy
https://ci.debian.net/data/autopkgtest/testing/amd64/i/insilicoseq/29465792/log.gz
=================================== FAILURES ===================================
_______________________________ test_bad_err_mod _______________________________
file = 'data/empty_file', mmap_mode = None, allow_pickle = True
fix_imports = True, encoding = 'ASCII'
@set_module('numpy')
def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True,
         encoding='ASCII', *, max_header_size=format._MAX_HEADER_SIZE):
    """
    Load arrays or pickled objects from ``.npy``, ``.npz`` or pickled files.

    .. warning:: Loading files that contain object arrays uses the ``pickle``
                 module, which is not secure against erroneous or maliciously
                 constructed data. Consider passing ``allow_pickle=False`` to
                 load data that is known not to contain object arrays for the
                 safer handling of untrusted sources.

    Parameters
    ----------
    file : file-like object, string, or pathlib.Path
        The file to read. File-like objects must support the
        ``seek()`` and ``read()`` methods and must always
        be opened in binary mode.  Pickled files require that the
        file-like object support the ``readline()`` method as well.
    mmap_mode : {None, 'r+', 'r', 'w+', 'c'}, optional
        If not None, then memory-map the file, using the given mode (see
        `numpy.memmap` for a detailed description of the modes).  A
        memory-mapped array is kept on disk. However, it can be accessed
        and sliced like any ndarray.  Memory mapping is especially useful
        for accessing small fragments of large files without reading the
        entire file into memory.
    allow_pickle : bool, optional
        Allow loading pickled object arrays stored in npy files. Reasons for
        disallowing pickles include security, as loading pickled data can
        execute arbitrary code. If pickles are disallowed, loading object
        arrays will fail. Default: False

        .. versionchanged:: 1.16.3
            Made default False in response to CVE-2019-6446.

    fix_imports : bool, optional
        Only useful when loading Python 2 generated pickled files on Python 3,
        which includes npy/npz files containing object arrays. If `fix_imports`
        is True, pickle will try to map the old Python 2 names to the new names
        used in Python 3.
    encoding : str, optional
        What encoding to use when reading Python 2 strings. Only useful when
        loading Python 2 generated pickled files in Python 3, which includes
        npy/npz files containing object arrays. Values other than 'latin1',
        'ASCII', and 'bytes' are not allowed, as they can corrupt numerical
        data. Default: 'ASCII'
    max_header_size : int, optional
        Maximum allowed size of the header.  Large headers may not be safe
        to load securely and thus require explicitly passing a larger value.
        See :py:meth:`ast.literal_eval()` for details.
        This option is ignored when `allow_pickle` is passed.  In that case
        the file is by definition trusted and the limit is unnecessary.

    Returns
    -------
    result : array, tuple, dict, etc.
        Data stored in the file. For ``.npz`` files, the returned instance
        of NpzFile class must be closed to avoid leaking file descriptors.

    Raises
    ------
    OSError
        If the input file does not exist or cannot be read.
    UnpicklingError
        If ``allow_pickle=True``, but the file cannot be loaded as a pickle.
    ValueError
        The file contains an object array, but ``allow_pickle=False`` given.

    See Also
    --------
    save, savez, savez_compressed, loadtxt
    memmap : Create a memory-map to an array stored in a file on disk.
    lib.format.open_memmap : Create or load a memory-mapped ``.npy`` file.

    Notes
    -----
    - If the file contains pickle data, then whatever object is stored
      in the pickle is returned.
    - If the file is a ``.npy`` file, then a single array is returned.
    - If the file is a ``.npz`` file, then a dictionary-like object is
      returned, containing ``{filename: array}`` key-value pairs, one for
      each file in the archive.
    - If the file is a ``.npz`` file, the returned value supports the
      context manager protocol in a similar fashion to the open function::

          with load('foo.npz') as data:
              a = data['a']

      The underlying file descriptor is closed when exiting the 'with'
      block.

    Examples
    --------
    Store data to disk, and load it again:

    >>> np.save('/tmp/123', np.array([[1, 2, 3], [4, 5, 6]]))
    >>> np.load('/tmp/123.npy')
    array([[1, 2, 3],
           [4, 5, 6]])

    Store compressed data to disk, and load it again:

    >>> a=np.array([[1, 2, 3], [4, 5, 6]])
    >>> b=np.array([1, 2])
    >>> np.savez('/tmp/123.npz', a=a, b=b)
    >>> data = np.load('/tmp/123.npz')
    >>> data['a']
    array([[1, 2, 3],
           [4, 5, 6]])
    >>> data['b']
    array([1, 2])
    >>> data.close()

    Mem-map the stored array, and then access the second row
    directly from disk:

    >>> X = np.load('/tmp/123.npy', mmap_mode='r')
    >>> X[1, :]
    memmap([4, 5, 6])
    """
    if encoding not in ('ASCII', 'latin1', 'bytes'):
        # The 'encoding' value for pickle also affects what encoding
        # the serialized binary data of NumPy arrays is loaded
        # in. Pickle does not pass on the encoding information to
        # NumPy. The unpickling code in numpy.core.multiarray is
        # written to assume that unicode data appearing where binary
        # should be is in 'latin1'. 'bytes' is also safe, as is
        # 'ASCII'.
        #
        # Other encoding values can corrupt binary data, and we
        # purposefully disallow them. For the same reason, the errors=
        # argument is not exposed, as values other than 'strict'
        # result can similarly silently corrupt numerical data.
        raise ValueError("encoding must be 'ASCII', 'latin1', or "
                         "'bytes'")

    pickle_kwargs = dict(encoding=encoding, fix_imports=fix_imports)

    with contextlib.ExitStack() as stack:
        if hasattr(file, 'read'):
            fid = file
            own_fid = False
        else:
            fid = stack.enter_context(open(os_fspath(file), "rb"))
            own_fid = True

        # Code to distinguish from NumPy binary files and pickles.
        _ZIP_PREFIX = b'PK\x03\x04'
        _ZIP_SUFFIX = b'PK\x05\x06'  # empty zip files start with this
        N = len(format.MAGIC_PREFIX)
        magic = fid.read(N)
        # If the file size is less than N, we need to make sure not
        # to seek past the beginning of the file
        fid.seek(-min(N, len(magic)), 1)  # back-up
        if magic.startswith(_ZIP_PREFIX) or magic.startswith(_ZIP_SUFFIX):
            # zip-file (assume .npz)
            # Potentially transfer file ownership to NpzFile
            stack.pop_all()
            ret = NpzFile(fid, own_fid=own_fid,
                          allow_pickle=allow_pickle,
                          pickle_kwargs=pickle_kwargs,
                          max_header_size=max_header_size)
            return ret
        elif magic == format.MAGIC_PREFIX:
            # .npy file
            if mmap_mode:
                if allow_pickle:
                    max_header_size = 2**64
                return format.open_memmap(file, mode=mmap_mode,
                                          max_header_size=max_header_size)
            else:
                return format.read_array(fid,
                                         allow_pickle=allow_pickle,
                                         pickle_kwargs=pickle_kwargs,
                                         max_header_size=max_header_size)
        else:
            # Try a pickle
            if not allow_pickle:
                raise ValueError("Cannot load file containing pickled data "
                                 "when allow_pickle=False")
            try:
>               return pickle.load(fid, **pickle_kwargs)
E               EOFError: Ran out of input

/usr/lib/python3/dist-packages/numpy/lib/npyio.py:441: EOFError
The above exception was the direct cause of the following exception:
def test_bad_err_mod():
with pytest.raises(SystemExit):
> err_mod = kde.KDErrorModel('data/empty_file')
test/test_error_model.py:93:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/iss/error_models/kde.py:34: in __init__
self.error_profile = self.load_npz(npz_path, 'kde')
/usr/lib/python3/dist-packages/iss/error_models/__init__.py:37: in load_npz
error_profile = np.load(npz_path, allow_pickle=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
file = 'data/empty_file', mmap_mode = None, allow_pickle = True
fix_imports = True, encoding = 'ASCII'
@set_module('numpy')
def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True,
         encoding='ASCII', *, max_header_size=format._MAX_HEADER_SIZE):

    [... docstring and body identical to the first listing above ...]
            try:
                return pickle.load(fid, **pickle_kwargs)
            except Exception as e:
>               raise pickle.UnpicklingError(
                    f"Failed to interpret file {file!r} as a pickle") from e
E               _pickle.UnpicklingError: Failed to interpret file 'data/empty_file' as a pickle
/usr/lib/python3/dist-packages/numpy/lib/npyio.py:443: UnpicklingError
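The behavior change is easy to reproduce outside the test suite. A minimal
sketch, assuming only that `np.load` is handed a zero-byte file like the
`data/empty_file` fixture:

```python
import pickle
import tempfile

import numpy as np

# Create a zero-byte file, standing in for the data/empty_file fixture.
with tempfile.NamedTemporaryFile(delete=False) as fh:
    path = fh.name

exc_name = None
try:
    np.load(path, allow_pickle=True)
except (pickle.UnpicklingError, EOFError) as exc:
    # Newer numpy wraps the underlying EOFError in pickle.UnpicklingError
    # (kept as __cause__); older releases let the bare EOFError propagate,
    # which is the behavior the insilicoseq test implicitly relied on.
    exc_name = type(exc).__name__

print(exc_name)
```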
=============================== warnings summary ===============================
../../../../usr/lib/python3/dist-packages/joblib/backports.py:22
/usr/lib/python3/dist-packages/joblib/backports.py:22:
DeprecationWarning: The distutils package is deprecated and slated for
removal in Python 3.12. Use setuptools or check PEP 632 for potential
alternatives
import distutils # noqa
test/test_bam.py: 12 warnings
/usr/lib/python3/dist-packages/iss/modeller.py:56:
DeprecationWarning: `np.float` is a deprecated alias for the builtin
`float`. To silence this warning, use `float` by itself. Doing this will
not modify any behavior and is safe. If you specifically wanted the
numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
read = np.fromiter((q[0] for q in quality), dtype=np.float)
test/test_bam.py::test_to_model
/usr/lib/python3/dist-packages/numpy/lib/npyio.py:716:
VisibleDeprecationWarning: Creating an ndarray from ragged nested
sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with
different lengths or shapes) is deprecated. If you meant to do this, you
must specify 'dtype=object' when creating the ndarray.
val = np.asanyarray(val)
test/test_generator.py::test_simulate_and_save
test/test_generator.py::test_simulate_and_save_short
/usr/lib/python3/dist-packages/Bio/SeqUtils/__init__.py:144:
BiopythonDeprecationWarning: GC is deprecated; please use gc_fraction
instead.
warnings.warn(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED test/test_error_model.py::test_bad_err_mod - _pickle.UnpicklingError: ...
============= 1 failed, 40 passed, 1 skipped, 16 warnings in 5.81s =============
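Given the traceback, the update on the insilicoseq side is to catch the new
exception type where the model file is read. A hedged sketch of what the
`load_npz` helper in `iss/error_models/__init__.py` could look like; the
function shape is inferred from the traceback and the exact message and exit
code are assumptions, not the package's actual implementation:

```python
import pickle
import sys

import numpy as np


def load_npz(npz_path, model_name):
    """Load an error-model .npz file, exiting cleanly on unreadable input.

    Newer numpy wraps failures on non-.npy/.npz input in
    pickle.UnpicklingError instead of letting EOFError escape, so both
    are caught here alongside OSError.
    """
    try:
        error_profile = np.load(npz_path, allow_pickle=True)
    except (OSError, EOFError, pickle.UnpicklingError) as e:
        print(f'Failed to read {model_name} model from {npz_path}: {e}',
              file=sys.stderr)
        sys.exit(1)
    return error_profile
```

With a handler of this shape, `np.load` on the empty fixture raises either
exception type, the handler calls `sys.exit(1)`, and the test's
`pytest.raises(SystemExit)` would pass on both old and new numpy.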
autopkgtest [09:24:52]: test run-unit-test