[Debian-med-packaging] Bug#1026344: insilicoseq: autopkgtest needs update for new version of numpy: EOFError

Paul Gevers elbrus at debian.org
Sun Dec 18 20:15:56 GMT 2022


Source: insilicoseq
Version: 1.5.4-3
Severity: serious
X-Debbugs-CC: numpy at packages.debian.org
Tags: sid bookworm
User: debian-ci at lists.debian.org
Usertags: needs-update
Control: affects -1 src:numpy

Dear maintainer(s),

With a recent upload of numpy, the autopkgtest of insilicoseq fails in
testing when that autopkgtest is run with the binary packages of numpy
from unstable. It passes when run with only packages from testing. In
tabular form:

                        pass            fail
numpy                  from testing    1:1.23.5-2
insilicoseq            from testing    1.5.4-3
all others             from testing    from testing

I copied some of the output at the bottom of this report.

Currently this regression is blocking the migration of numpy to testing
[1]. Of course, numpy shouldn't just break your autopkgtest (or, even
worse, your package), but it seems to me that the change in numpy was
intended and your package needs to be updated for the new situation.
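
Judging from the log below, the immediate cause seems to be that numpy
1.23 now wraps the EOFError it hits on an empty/truncated file in a
pickle.UnpicklingError, where the numpy currently in testing apparently
raised an OSError from the same code path. As a rough illustration only
(the load_npz name and the np.load call are taken from the traceback
below; the surrounding error handling is my assumption about how the
package keeps the SystemExit its test expects, not its actual code),
the loader could catch both the old and the new exception types:

    import pickle
    import sys

    import numpy as np

    def load_npz(npz_path, model):
        """Illustrative sketch only, not the package's actual code."""
        try:
            # Same call as iss/error_models/__init__.py:37 in the
            # traceback below.
            return np.load(npz_path, allow_pickle=True)
        # numpy >= 1.23 wraps read failures such as the EOFError from an
        # empty file in pickle.UnpicklingError; older numpy apparently
        # raised OSError here, so catching both keeps the clean
        # SystemExit that test_bad_err_mod expects.
        except (OSError, EOFError, pickle.UnpicklingError) as e:
            print('Failed to read %s error profile from %s: %s'
                  % (model, npz_path, e), file=sys.stderr)
            sys.exit(1)

Equivalently, the autopkgtest itself could be relaxed to accept the new
exception type.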

If this is a real problem in your package (and not only in your
autopkgtest), the right binary package(s) from numpy should add a
versioned Breaks on the unfixed version of (one of) your package(s).
Note: a Breaks is useful even if the issue is only in the autopkgtest,
as it helps the migration software figure out the right versions to
combine in the tests.
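
For illustration only, such a Breaks in numpy's debian/control might
look roughly like the fragment below; the binary package carrying the
Breaks and the fixed insilicoseq version are placeholders for the
maintainers to adjust:

    Package: python3-numpy
    Breaks: insilicoseq (<< 1.5.4-4~)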

More information about this bug and the reason for filing it can be found at
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation

Paul

[1] https://qa.debian.org/excuses.php?package=numpy

https://ci.debian.net/data/autopkgtest/testing/amd64/i/insilicoseq/29465792/log.gz

=================================== FAILURES ===================================
_______________________________ test_bad_err_mod _______________________________

file = 'data/empty_file', mmap_mode = None, allow_pickle = True
fix_imports = True, encoding = 'ASCII'

     @set_module('numpy')
     def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True,
              encoding='ASCII', *, max_header_size=format._MAX_HEADER_SIZE):
         """
         Load arrays or pickled objects from ``.npy``, ``.npz`` or pickled files.

         .. warning:: Loading files that contain object arrays uses the ``pickle``
                      module, which is not secure against erroneous or maliciously
                      constructed data. Consider passing ``allow_pickle=False`` to
                      load data that is known not to contain object arrays for the
                      safer handling of untrusted sources.

         Parameters
         ----------
         file : file-like object, string, or pathlib.Path
             The file to read. File-like objects must support the
             ``seek()`` and ``read()`` methods and must always
             be opened in binary mode.  Pickled files require that the
             file-like object support the ``readline()`` method as well.
         mmap_mode : {None, 'r+', 'r', 'w+', 'c'}, optional
             If not None, then memory-map the file, using the given mode (see
             `numpy.memmap` for a detailed description of the modes).  A
             memory-mapped array is kept on disk. However, it can be accessed
             and sliced like any ndarray.  Memory mapping is especially useful
             for accessing small fragments of large files without reading the
             entire file into memory.
         allow_pickle : bool, optional
             Allow loading pickled object arrays stored in npy files. Reasons for
             disallowing pickles include security, as loading pickled data can
             execute arbitrary code. If pickles are disallowed, loading object
             arrays will fail. Default: False

             .. versionchanged:: 1.16.3
                 Made default False in response to CVE-2019-6446.

         fix_imports : bool, optional
             Only useful when loading Python 2 generated pickled files on Python 3,
             which includes npy/npz files containing object arrays. If `fix_imports`
             is True, pickle will try to map the old Python 2 names to the new names
             used in Python 3.
         encoding : str, optional
             What encoding to use when reading Python 2 strings. Only useful when
             loading Python 2 generated pickled files in Python 3, which includes
             npy/npz files containing object arrays. Values other than 'latin1',
             'ASCII', and 'bytes' are not allowed, as they can corrupt numerical
             data. Default: 'ASCII'
         max_header_size : int, optional
             Maximum allowed size of the header.  Large headers may not be safe
             to load securely and thus require explicitly passing a larger value.
             See :py:meth:`ast.literal_eval()` for details.
             This option is ignored when `allow_pickle` is passed.  In that case
             the file is by definition trusted and the limit is unnecessary.

         Returns
         -------
         result : array, tuple, dict, etc.
             Data stored in the file. For ``.npz`` files, the returned instance
             of NpzFile class must be closed to avoid leaking file descriptors.

         Raises
         ------
         OSError
             If the input file does not exist or cannot be read.
         UnpicklingError
             If ``allow_pickle=True``, but the file cannot be loaded as a pickle.
         ValueError
             The file contains an object array, but ``allow_pickle=False`` given.

         See Also
         --------
         save, savez, savez_compressed, loadtxt
         memmap : Create a memory-map to an array stored in a file on disk.
         lib.format.open_memmap : Create or load a memory-mapped ``.npy`` file.

         Notes
         -----
         - If the file contains pickle data, then whatever object is stored
           in the pickle is returned.
         - If the file is a ``.npy`` file, then a single array is returned.
         - If the file is a ``.npz`` file, then a dictionary-like object is
           returned, containing ``{filename: array}`` key-value pairs, one for
           each file in the archive.
         - If the file is a ``.npz`` file, the returned value supports the
           context manager protocol in a similar fashion to the open function::

               with load('foo.npz') as data:
                   a = data['a']

           The underlying file descriptor is closed when exiting the 'with'
           block.

         Examples
         --------
         Store data to disk, and load it again:

         >>> np.save('/tmp/123', np.array([[1, 2, 3], [4, 5, 6]]))
         >>> np.load('/tmp/123.npy')
         array([[1, 2, 3],
                [4, 5, 6]])

         Store compressed data to disk, and load it again:

         >>> a=np.array([[1, 2, 3], [4, 5, 6]])
         >>> b=np.array([1, 2])
         >>> np.savez('/tmp/123.npz', a=a, b=b)
         >>> data = np.load('/tmp/123.npz')
         >>> data['a']
         array([[1, 2, 3],
                [4, 5, 6]])
         >>> data['b']
         array([1, 2])
         >>> data.close()

         Mem-map the stored array, and then access the second row
         directly from disk:

         >>> X = np.load('/tmp/123.npy', mmap_mode='r')
         >>> X[1, :]
         memmap([4, 5, 6])

         """
         if encoding not in ('ASCII', 'latin1', 'bytes'):
             # The 'encoding' value for pickle also affects what encoding
             # the serialized binary data of NumPy arrays is loaded
             # in. Pickle does not pass on the encoding information to
             # NumPy. The unpickling code in numpy.core.multiarray is
             # written to assume that unicode data appearing where binary
             # should be is in 'latin1'. 'bytes' is also safe, as is 'ASCII'.
             #
             # Other encoding values can corrupt binary data, and we
             # purposefully disallow them. For the same reason, the errors=
             # argument is not exposed, as values other than 'strict'
             # result can similarly silently corrupt numerical data.
             raise ValueError("encoding must be 'ASCII', 'latin1', or 'bytes'")

         pickle_kwargs = dict(encoding=encoding, fix_imports=fix_imports)

         with contextlib.ExitStack() as stack:
             if hasattr(file, 'read'):
                 fid = file
                 own_fid = False
             else:
                 fid = stack.enter_context(open(os_fspath(file), "rb"))
                 own_fid = True

             # Code to distinguish from NumPy binary files and pickles.
             _ZIP_PREFIX = b'PK\x03\x04'
             _ZIP_SUFFIX = b'PK\x05\x06' # empty zip files start with this
             N = len(format.MAGIC_PREFIX)
             magic = fid.read(N)
             # If the file size is less than N, we need to make sure not
             # to seek past the beginning of the file
             fid.seek(-min(N, len(magic)), 1)  # back-up
             if magic.startswith(_ZIP_PREFIX) or magic.startswith(_ZIP_SUFFIX):
                 # zip-file (assume .npz)
                 # Potentially transfer file ownership to NpzFile
                 stack.pop_all()
                 ret = NpzFile(fid, own_fid=own_fid, allow_pickle=allow_pickle,
                               pickle_kwargs=pickle_kwargs,
                               max_header_size=max_header_size)
                 return ret
             elif magic == format.MAGIC_PREFIX:
                 # .npy file
                 if mmap_mode:
                     if allow_pickle:
                         max_header_size = 2**64
                     return format.open_memmap(file, mode=mmap_mode,
                                               max_header_size=max_header_size)
                 else:
                     return format.read_array(fid, allow_pickle=allow_pickle,
                                              pickle_kwargs=pickle_kwargs,
                                              max_header_size=max_header_size)
             else:
                 # Try a pickle
                 if not allow_pickle:
                     raise ValueError("Cannot load file containing pickled data "
                                      "when allow_pickle=False")
                 try:
>                   return pickle.load(fid, **pickle_kwargs)
E                   EOFError: Ran out of input

/usr/lib/python3/dist-packages/numpy/lib/npyio.py:441: EOFError

The above exception was the direct cause of the following exception:

     def test_bad_err_mod():
         with pytest.raises(SystemExit):
>           err_mod = kde.KDErrorModel('data/empty_file')

test/test_error_model.py:93: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/iss/error_models/kde.py:34: in __init__
     self.error_profile = self.load_npz(npz_path, 'kde')
/usr/lib/python3/dist-packages/iss/error_models/__init__.py:37: in load_npz
     error_profile = np.load(npz_path, allow_pickle=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
file = 'data/empty_file', mmap_mode = None, allow_pickle = True
fix_imports = True, encoding = 'ASCII'

     @set_module('numpy')
     def load(file, mmap_mode=None, allow_pickle=False, fix_imports=True,
              encoding='ASCII', *, max_header_size=format._MAX_HEADER_SIZE):
         [docstring and function body identical to the numpy.load listing above]
             else:
                 # Try a pickle
                 if not allow_pickle:
                     raise ValueError("Cannot load file containing pickled data "
                                      "when allow_pickle=False")
                 try:
                     return pickle.load(fid, **pickle_kwargs)
                 except Exception as e:
>                   raise pickle.UnpicklingError(
                         f"Failed to interpret file {file!r} as a 
pickle") from e
E                   _pickle.UnpicklingError: Failed to interpret file 
'data/empty_file' as a pickle

/usr/lib/python3/dist-packages/numpy/lib/npyio.py:443: UnpicklingError
=============================== warnings summary ===============================
../../../../usr/lib/python3/dist-packages/joblib/backports.py:22
   /usr/lib/python3/dist-packages/joblib/backports.py:22: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
     import distutils  # noqa

test/test_bam.py: 12 warnings
   /usr/lib/python3/dist-packages/iss/modeller.py:56: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
   Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
     read = np.fromiter((q[0] for q in quality), dtype=np.float)

test/test_bam.py::test_to_model
   /usr/lib/python3/dist-packages/numpy/lib/npyio.py:716: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
     val = np.asanyarray(val)

test/test_generator.py::test_simulate_and_save
test/test_generator.py::test_simulate_and_save_short
   /usr/lib/python3/dist-packages/Bio/SeqUtils/__init__.py:144: BiopythonDeprecationWarning: GC is deprecated; please use gc_fraction instead.
     warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED test/test_error_model.py::test_bad_err_mod - _pickle.UnpicklingError: ...
============= 1 failed, 40 passed, 1 skipped, 16 warnings in 5.81s =============
autopkgtest [09:24:52]: test run-unit-test
