[med-svn] [Git][med-team/python-xopen][upstream] New upstream version 1.0.0
Nilesh Patra
gitlab at salsa.debian.org
Sun Nov 8 18:34:04 GMT 2020
Nilesh Patra pushed to branch upstream at Debian Med / python-xopen
Commits:
52425f70 by Nilesh Patra at 2020-11-09T00:01:07+05:30
New upstream version 1.0.0
- - - - -
9 changed files:
- .travis.yml
- PKG-INFO
- README.rst
- setup.py
- src/xopen.egg-info/PKG-INFO
- src/xopen/__init__.py
- src/xopen/_version.py
- tests/test_xopen.py
- tox.ini
Changes:
=====================================
.travis.yml
=====================================
@@ -1,6 +1,6 @@
language: python
-dist: xenial
+dist: focal
cache:
directories:
@@ -11,6 +11,7 @@ python:
- "3.6"
- "3.7"
- "3.8"
+ - "3.9"
- "pypy3"
install:
@@ -45,7 +46,16 @@ jobs:
ls -l dist/
python3 -m twine upload dist/xopen-*
- - name: flake8
+ - stage: test
+ name: flake8
python: "3.6"
install: python3 -m pip install flake8
script: flake8 src/ tests/
+
+ - stage: test
+ name: igzip
+ python: "3.6"
+ install:
+ - sudo apt-get update && sudo apt-get install -y pigz isal
+ - pip install --upgrade coverage codecov
+ - pip install .
=====================================
PKG-INFO
=====================================
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: xopen
-Version: 0.9.0
+Version: 1.0.0
Summary: Open compressed files transparently
Home-page: https://github.com/marcelm/xopen/
Author: Marcel Martin
@@ -35,14 +35,15 @@ Description: .. image:: https://travis-ci.org/marcelm/xopen.svg?branch=master
to open ``.gz`` files, which is faster than using the built-in ``gzip.open``
function. ``pigz`` can use multiple threads when compressing, but is also faster
when reading ``.gz`` files, so it is used both for reading and writing if it is
- available.
+ available. For gzip compression levels 1 to 3,
+ `igzip <https://github.com/intel/isa-l/>`_ is used for an even greater speedup.
This module has originally been developed as part of the `Cutadapt
tool <https://cutadapt.readthedocs.io/>`_ that is used in bioinformatics to
manipulate sequencing data. It has been in successful use within that software
for a few years.
- ``xopen`` is compatible with Python versions 3.5 to 3.8.
+ ``xopen`` is compatible with Python versions 3.5 and later.
Usage
@@ -82,7 +83,7 @@ Description: .. image:: https://travis-ci.org/marcelm/xopen.svg?branch=master
appending to files.
Ruben Vorderman <https://github.com/rhpvorderman/> contributed improvements to
- make reading gzipped files faster.
+ make reading and writing gzipped files faster.
Benjamin Vaisvil <https://github.com/bvaisvil> contributed support for
format detection from content.
@@ -94,9 +95,15 @@ Description: .. image:: https://travis-ci.org/marcelm/xopen.svg?branch=master
Changes
-------
- v0.9.0
+ v1.0.0
~~~~~~
+ * If installed, the ``igzip`` program (part of
+ `Intel ISA-L <https://github.com/intel/isa-l/>`_) is now used for reading
+ and writing gzip-compressed files at compression levels 1-3, which results
+ in a significant speedup.
+ v0.9.0
+ ~~~~~~
* When the file name extension of a file to be opened for reading is not
available, the content is inspected (if possible) and used to determine
which compression format applies.
@@ -136,10 +143,13 @@ Description: .. image:: https://travis-ci.org/marcelm/xopen.svg?branch=master
* xopen now accepts pathlib.Path objects.
- Author
- ------
+ Contributors
+ ------------
+
+ * Marcel Martin
+ * Ruben Vorderman
+ * For more contributors, see <https://github.com/marcelm/xopen/graphs/contributors>
- Marcel Martin <mail at marcelm.net> (`@marcelm_ on Twitter <https://twitter.com/marcelm_>`_)
Links
-----
@@ -152,9 +162,5 @@ Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.5
-Classifier: Programming Language :: Python :: 3.6
-Classifier: Programming Language :: Python :: 3.7
-Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.5
Provides-Extra: dev
=====================================
README.rst
=====================================
@@ -27,14 +27,15 @@ For example, ``xopen`` uses ``pigz``, which is a parallel version of ``gzip``,
to open ``.gz`` files, which is faster than using the built-in ``gzip.open``
function. ``pigz`` can use multiple threads when compressing, but is also faster
when reading ``.gz`` files, so it is used both for reading and writing if it is
-available.
+available. For gzip compression levels 1 to 3,
+`igzip <https://github.com/intel/isa-l/>`_ is used for an even greater speedup.
This module has originally been developed as part of the `Cutadapt
tool <https://cutadapt.readthedocs.io/>`_ that is used in bioinformatics to
manipulate sequencing data. It has been in successful use within that software
for a few years.
-``xopen`` is compatible with Python versions 3.5 to 3.8.
+``xopen`` is compatible with Python versions 3.5 and later.
Usage
@@ -74,7 +75,7 @@ Kyle Beauchamp <https://github.com/kyleabeauchamp/> has contributed support for
appending to files.
Ruben Vorderman <https://github.com/rhpvorderman/> contributed improvements to
-make reading gzipped files faster.
+make reading and writing gzipped files faster.
Benjamin Vaisvil <https://github.com/bvaisvil> contributed support for
format detection from content.
@@ -86,9 +87,15 @@ If you also want to open S3 files, you may want to use that module instead.
Changes
-------
-v0.9.0
+v1.0.0
~~~~~~
+* If installed, the ``igzip`` program (part of
+ `Intel ISA-L <https://github.com/intel/isa-l/>`_) is now used for reading
+ and writing gzip-compressed files at compression levels 1-3, which results
+ in a significant speedup.
+v0.9.0
+~~~~~~
* When the file name extension of a file to be opened for reading is not
available, the content is inspected (if possible) and used to determine
which compression format applies.
@@ -128,10 +135,13 @@ v0.5.0
* xopen now accepts pathlib.Path objects.
-Author
-------
+Contributors
+------------
+
+* Marcel Martin
+* Ruben Vorderman
+* For more contributors, see <https://github.com/marcelm/xopen/graphs/contributors>
-Marcel Martin <mail at marcelm.net> (`@marcelm_ on Twitter <https://twitter.com/marcelm_>`_)
Links
-----
=====================================
setup.py
=====================================
@@ -1,10 +1,6 @@
import sys
from setuptools import setup, find_packages
-if sys.version_info < (3, 5):
- sys.stdout.write("At least Python 3.5 is required.\n")
- sys.exit(1)
-
with open('README.rst') as f:
long_description = f.read()
@@ -28,9 +24,5 @@ setup(
"Development Status :: 5 - Production/Stable",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3",
- "Programming Language :: Python :: 3.5",
- "Programming Language :: Python :: 3.6",
- "Programming Language :: Python :: 3.7",
- "Programming Language :: Python :: 3.8",
]
)
=====================================
src/xopen.egg-info/PKG-INFO
=====================================
@@ -1,6 +1,6 @@
Metadata-Version: 2.1
Name: xopen
-Version: 0.9.0
+Version: 1.0.0
Summary: Open compressed files transparently
Home-page: https://github.com/marcelm/xopen/
Author: Marcel Martin
@@ -35,14 +35,15 @@ Description: .. image:: https://travis-ci.org/marcelm/xopen.svg?branch=master
to open ``.gz`` files, which is faster than using the built-in ``gzip.open``
function. ``pigz`` can use multiple threads when compressing, but is also faster
when reading ``.gz`` files, so it is used both for reading and writing if it is
- available.
+ available. For gzip compression levels 1 to 3,
+ `igzip <https://github.com/intel/isa-l/>`_ is used for an even greater speedup.
This module has originally been developed as part of the `Cutadapt
tool <https://cutadapt.readthedocs.io/>`_ that is used in bioinformatics to
manipulate sequencing data. It has been in successful use within that software
for a few years.
- ``xopen`` is compatible with Python versions 3.5 to 3.8.
+ ``xopen`` is compatible with Python versions 3.5 and later.
Usage
@@ -82,7 +83,7 @@ Description: .. image:: https://travis-ci.org/marcelm/xopen.svg?branch=master
appending to files.
Ruben Vorderman <https://github.com/rhpvorderman/> contributed improvements to
- make reading gzipped files faster.
+ make reading and writing gzipped files faster.
Benjamin Vaisvil <https://github.com/bvaisvil> contributed support for
format detection from content.
@@ -94,9 +95,15 @@ Description: .. image:: https://travis-ci.org/marcelm/xopen.svg?branch=master
Changes
-------
- v0.9.0
+ v1.0.0
~~~~~~
+ * If installed, the ``igzip`` program (part of
+ `Intel ISA-L <https://github.com/intel/isa-l/>`_) is now used for reading
+ and writing gzip-compressed files at compression levels 1-3, which results
+ in a significant speedup.
+ v0.9.0
+ ~~~~~~
* When the file name extension of a file to be opened for reading is not
available, the content is inspected (if possible) and used to determine
which compression format applies.
@@ -136,10 +143,13 @@ Description: .. image:: https://travis-ci.org/marcelm/xopen.svg?branch=master
* xopen now accepts pathlib.Path objects.
- Author
- ------
+ Contributors
+ ------------
+
+ * Marcel Martin
+ * Ruben Vorderman
+ * For more contributors, see <https://github.com/marcelm/xopen/graphs/contributors>
- Marcel Martin <mail at marcelm.net> (`@marcelm_ on Twitter <https://twitter.com/marcelm_>`_)
Links
-----
@@ -152,9 +162,5 @@ Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.5
-Classifier: Programming Language :: Python :: 3.6
-Classifier: Programming Language :: Python :: 3.7
-Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.5
Provides-Extra: dev
=====================================
src/xopen/__init__.py
=====================================
@@ -13,7 +13,10 @@ import time
import stat
import signal
import pathlib
+import subprocess
+import tempfile
from subprocess import Popen, PIPE
+from typing import Optional
from ._version import version as __version__
@@ -23,6 +26,22 @@ try:
except ImportError:
lzma = None
+try:
+ import fcntl
+ # fcntl.F_SETPIPE_SZ will be available in python 3.10.
+ # https://github.com/python/cpython/pull/21921
+ # If not available: set it to the correct value for known platforms.
+ if not hasattr(fcntl, "F_SETPIPE_SZ") and sys.platform == "linux":
+ setattr(fcntl, "F_SETPIPE_SZ", 1031)
+except ImportError:
+ fcntl = None
+
+_MAX_PIPE_SIZE_PATH = pathlib.Path("/proc/sys/fs/pipe-max-size")
+if _MAX_PIPE_SIZE_PATH.exists():
+ _MAX_PIPE_SIZE = int(_MAX_PIPE_SIZE_PATH.read_text())
+else:
+ _MAX_PIPE_SIZE = None
+
try:
from os import fspath # Exists in Python 3.6+
@@ -66,6 +85,39 @@ def _available_cpu_count():
return 1
+def _set_pipe_size_to_max(fd: int):
+ """
+ Set pipe size to maximum on platforms that support it.
+ :param fd: The file descriptor to increase the pipe size for.
+ """
+ if not hasattr(fcntl, "F_SETPIPE_SZ") or not _MAX_PIPE_SIZE:
+ return
+ fcntl.fcntl(fd, fcntl.F_SETPIPE_SZ, _MAX_PIPE_SIZE)
+
+
+def _can_read_concatenated_gz(program: str) -> bool:
+ """
+ Check if a concatenated gzip file can be read properly. Not all deflate
+ programs handle this properly.
+ """
+ fd, temp_path = tempfile.mkstemp(suffix=".gz", prefix="xopen.")
+ try:
+ # Create a concatenated gzip file. gzip.compress recreates the contents
+ # of a gzip file including header and trailer.
+ with open(temp_path, "wb") as temp_file:
+ temp_file.write(gzip.compress(b"AB") + gzip.compress(b"CD"))
+ try:
+ result = subprocess.run([program, "-c", "-d", temp_path],
+ check=True, stderr=PIPE, stdout=PIPE)
+ return result.stdout == b"ABCD"
+ except subprocess.CalledProcessError:
+ # Program can't read zip
+ return False
+ finally:
+ os.close(fd)
+ os.remove(temp_path)
+
+
class Closing:
"""
Inherit from this class and implement a close() method to offer context
@@ -85,25 +137,22 @@ class Closing:
pass
-class PipedGzipWriter(Closing):
+class PipedCompressionWriter(Closing):
"""
- Write gzip-compressed files by running an external gzip or pigz process and
- piping into it. pigz is tried first. It is fast because it can compress using
- multiple cores.
-
- If pigz is not available, a gzip subprocess is used. On Python 2, this saves
- CPU time because gzip.GzipFile is slower. On Python 3, gzip.GzipFile is on
- par with gzip itself, but running an external gzip can still reduce wall-clock
- time because the compression happens in a separate process.
+ Write Compressed files by running an external process and piping into it.
"""
-
- def __init__(self, path, mode='wt', compresslevel=6, threads=None):
+ def __init__(self, path, program, mode='wt',
+ compresslevel: Optional[int] = None,
+ threads_flag: str = None,
+ threads: Optional[int] = None):
"""
mode -- one of 'w', 'wt', 'wb', 'a', 'at', 'ab'
- compresslevel -- gzip compression level
- threads (int) -- number of pigz threads. If this is set to None, a reasonable default is
+ compresslevel -- compression level
+ threads_flag -- which flag is used to denote the number of threads in the program.
+ If set to none, program will be called without threads flag.
+ threads (int) -- number of threads. If this is set to None, a reasonable default is
used. At the moment, this means that the number of available CPU cores is used, capped
- at four to avoid creating too many threads. Use 0 to let pigz use all available cores.
+ at four to avoid creating too many threads. Use 0 to use all available cores.
"""
if mode not in ('w', 'wt', 'wb', 'a', 'at', 'ab'):
raise ValueError(
@@ -114,29 +163,33 @@ class PipedGzipWriter(Closing):
self.devnull = open(os.devnull, mode)
self.closed = False
self.name = path
+ self._mode = mode
+ self._program = program
+ self._threads_flag = threads_flag
if threads is None:
threads = min(_available_cpu_count(), 4)
try:
- self.process, self.program = self._open_process(
+ self.process = self._open_process(
mode, compresslevel, threads, self.outfile, self.devnull)
except OSError:
self.outfile.close()
self.devnull.close()
raise
+ _set_pipe_size_to_max(self.process.stdin.fileno())
+
if 'b' not in mode:
self._file = io.TextIOWrapper(self.process.stdin)
else:
self._file = self.process.stdin
- @staticmethod
- def _open_process(mode, compresslevel, threads, outfile, devnull):
- pigz_args = ['pigz']
- if threads != 0:
- pigz_args += ['-p', str(threads)]
+ def _open_process(self, mode, compresslevel, threads, outfile, devnull):
+ program_args = [self._program]
+ if threads != 0 and self._threads_flag is not None:
+ program_args += [self._threads_flag, str(threads)]
extra_args = []
- if 'w' in mode and compresslevel != 6:
+ if 'w' in mode and compresslevel is not None:
extra_args += ['-' + str(compresslevel)]
kwargs = dict(stdin=PIPE, stdout=outfile, stderr=devnull)
@@ -148,14 +201,9 @@ class PipedGzipWriter(Closing):
if sys.platform != 'win32':
kwargs['close_fds'] = True
- try:
- process = Popen(pigz_args + extra_args, **kwargs)
- program = 'pigz'
- except OSError: # TODO Use FileNotFound instead (Python 3)
- # pigz not found, try regular gzip
- process = Popen(['gzip'] + extra_args, **kwargs)
- program = 'gzip'
- return process, program
+ process = Popen(program_args + extra_args, **kwargs)
+
+ return process
def write(self, arg):
self._file.write(arg)
@@ -170,7 +218,7 @@ class PipedGzipWriter(Closing):
self.devnull.close()
if retcode != 0:
raise OSError(
- "Output {} process terminated with exit code {}".format(self.program, retcode))
+ "Output {} process terminated with exit code {}".format(self._program, retcode))
def __iter__(self):
return self
@@ -179,36 +227,36 @@ class PipedGzipWriter(Closing):
raise io.UnsupportedOperation('not readable')
-class PipedGzipReader(Closing):
+class PipedCompressionReader(Closing):
"""
- Open a pipe to pigz for reading a gzipped file. Even though pigz is mostly
- used to speed up writing by using many compression threads, it is
- also faster when reading, even when forced to use a single thread
- (ca. 2x speedup).
+ Open a pipe to a process for reading a compressed file.
"""
- def __init__(self, path, mode='r', threads=None):
+ def __init__(self, path, program, mode='r', threads_flag=None, threads=None):
"""
Raise an OSError when pigz could not be found.
"""
if mode not in ('r', 'rt', 'rb'):
raise ValueError("Mode is '{}', but it must be 'r', 'rt' or 'rb'".format(mode))
- pigz_args = ['pigz', '-cd', path]
-
- if threads is None:
- # Single threaded behaviour by default because:
- # - Using a single thread to read a file is the least unexpected
- # behaviour. (For users of xopen, who do not know which backend is used.)
- # - There is quite a substantial overhead (+25% CPU time) when
- # using multiple threads while there is only a 10% gain in wall
- # clock time.
- threads = 1
+ program_args = [program, '-cd', path]
- pigz_args += ['-p', str(threads)]
+ if threads_flag is not None:
+ if threads is None:
+ # Single threaded behaviour by default because:
+ # - Using a single thread to read a file is the least unexpected
+ # behaviour. (For users of xopen, who do not know which backend is used.)
+ # - There is quite a substantial overhead (+25% CPU time) when
+ # using multiple threads while there is only a 10% gain in wall
+ # clock time.
+ threads = 1
+ program_args += [threads_flag, str(threads)]
- self.process = Popen(pigz_args, stdout=PIPE, stderr=PIPE)
+ self.process = Popen(program_args, stdout=PIPE, stderr=PIPE)
self.name = path
+
+ _set_pipe_size_to_max(self.process.stdout.fileno())
+
if 'b' not in mode:
self._file = io.TextIOWrapper(self.process.stdout)
else:
@@ -283,6 +331,84 @@ class PipedGzipReader(Closing):
return None
+class PipedGzipReader(PipedCompressionReader):
+ """
+ Open a pipe to pigz for reading a gzipped file. Even though pigz is mostly
+ used to speed up writing by using many compression threads, it is
+ also faster when reading, even when forced to use a single thread
+ (ca. 2x speedup).
+ """
+ def __init__(self, path, mode='r', threads=None):
+ try:
+ super().__init__(path, "pigz", mode, "-p", threads)
+ except OSError:
+ super().__init__(path, "gzip", mode, None, threads)
+
+
+class PipedGzipWriter(PipedCompressionWriter):
+ """
+ Write gzip-compressed files by running an external gzip or pigz process and
+ piping into it. pigz is tried first. It is fast because it can compress using
+ multiple cores. Also it is more efficient on one core.
+ If pigz is not available, a gzip subprocess is used. On Python 3, gzip.GzipFile is on
+ par with gzip itself, but running an external gzip can still reduce wall-clock
+ time because the compression happens in a separate process.
+ """
+ def __init__(self, path, mode='wt', compresslevel=None, threads=None):
+ """
+ mode -- one of 'w', 'wt', 'wb', 'a', 'at', 'ab'
+ compresslevel -- compression level
+ threads (int) -- number of pigz threads. If this is set to None, a reasonable default is
+ used. At the moment, this means that the number of available CPU cores is used, capped
+ at four to avoid creating too many threads. Use 0 to let pigz use all available cores.
+ """
+ if compresslevel is not None and compresslevel not in range(1, 10):
+ raise ValueError("compresslevel must be between 1 and 9")
+ try:
+ super().__init__(path, "pigz", mode, compresslevel, "-p", threads)
+ except OSError:
+ super().__init__(path, "gzip", mode, compresslevel, None, threads)
+
+
+class PipedIGzipReader(PipedCompressionReader):
+ """
+ Uses igzip for reading of a gzipped file. This is much faster than either
+ gzip or pigz which were written to run on a wide array of systems. igzip
+ can only run on x86 and ARM architectures, but is able to use more
+ architecture-specific optimizations as a result.
+ """
+ def __init__(self, path, mode="r"):
+ if not _can_read_concatenated_gz("igzip"):
+ # Instead of elaborate version string checking once the problem is
+ # fixed, it is much easier to use this, "proof in the pudding" type
+ # of evaluation.
+ raise ValueError(
+ "This version of igzip does not support reading "
+ "concatenated gzip files and is therefore not "
+ "safe to use. See: https://github.com/intel/isa-l/issues/143")
+ super().__init__(path, "igzip", mode)
+
+
+class PipedIGzipWriter(PipedCompressionWriter):
+ """
+ Uses igzip for writing a gzipped file. This is much faster than either
+ gzip or pigz which were written to run on a wide array of systems. igzip
+ can only run on x86 and ARM architectures, but is able to use more
+ architecture-specific optimizations as a result.
+
+ Threads are supported by a flag, but do not add any speed. Also on some
+ distro version (isal package in debian buster) the thread flag is not
+ present. For these reason threads are omitted from the interface.
+ Only compresslevel 0-3 are supported and these output slightly different
+ filesizes from their pigz/gzip counterparts.
+ See: https://gist.github.com/rhpvorderman/4f1201c3f39518ff28dde45409eb696b
+ """
+ def __init__(self, path, mode="wt", compresslevel=None):
+ if compresslevel is not None and compresslevel not in range(0, 4):
+ raise ValueError("compresslevel must be between 0 and 3")
+ super().__init__(path, "igzip", mode, compresslevel)
+
+
def _open_stdin_or_out(mode):
# Do not return sys.stdin or sys.stdout directly as we want the returned object
# to be closable without closing sys.stdout.
@@ -305,16 +431,27 @@ def _open_gz(filename, mode, compresslevel, threads):
if threads != 0:
try:
if 'r' in mode:
- return PipedGzipReader(filename, mode, threads=threads)
+ try:
+ return PipedIGzipReader(filename, mode)
+ except (OSError, ValueError):
+ # No igzip installed or version does not support reading
+ # concatenated files.
+ return PipedGzipReader(filename, mode, threads=threads)
else:
- return PipedGzipWriter(filename, mode, compresslevel, threads=threads)
- except FileNotFoundError:
+ try:
+ return PipedIGzipWriter(filename, mode, compresslevel)
+ except (OSError, ValueError):
+ # No igzip installed or compression level higher than 3
+ return PipedGzipWriter(filename, mode, compresslevel, threads=threads)
+ except OSError:
pass # We try without threads.
if 'r' in mode:
return gzip.open(filename, mode)
else:
- return gzip.open(filename, mode, compresslevel=compresslevel)
+ # Override gzip.open's default of 9 for consistency with command-line gzip.
+ return gzip.open(filename, mode,
+ compresslevel=6 if compresslevel is None else compresslevel)
def _detect_format_from_content(filename):
@@ -354,7 +491,7 @@ def _detect_format_from_extension(filename):
return None
-def xopen(filename, mode='r', compresslevel=6, threads=None):
+def xopen(filename, mode='r', compresslevel=None, threads=None):
"""
A replacement for the "open" function that can also read and write
compressed files transparently. The supported compression formats are gzip,
@@ -373,7 +510,8 @@ def xopen(filename, mode='r', compresslevel=6, threads=None):
will raise an error.
compresslevel is the compression level for writing to gzip files.
- This parameter is ignored for the other compression formats.
+ This parameter is ignored for the other compression formats. If set to
+ None (default), level 6 is used.
threads only has a meaning when reading or writing gzip files.
@@ -387,8 +525,6 @@ def xopen(filename, mode='r', compresslevel=6, threads=None):
if mode not in ('rt', 'rb', 'wt', 'wb', 'at', 'ab'):
raise ValueError("Mode '{}' not supported".format(mode))
filename = fspath(filename)
- if compresslevel not in range(1, 10):
- raise ValueError("compresslevel must be between 1 and 9")
if filename == '-':
return _open_stdin_or_out(mode)
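Note on the `_open_gz` hunk above: the order in which the gzip backends are tried can be summarised as a small pure function. This is only a sketch for illustration. `choose_gzip_backend` and its `installed` parameter are hypothetical and not part of xopen, and "installed" is assumed to also mean that igzip passed the concatenated-gzip check.

```python
def choose_gzip_backend(mode, compresslevel=None, threads=None,
                        installed=frozenset()):
    """
    Hypothetical helper (not part of xopen): name the backend that the
    try/except chain in _open_gz would end up using, given the set of
    external programs assumed to be installed and usable.
    """
    if threads != 0:
        if "r" in mode:
            # Reading: igzip first, then pigz, then plain gzip.
            for program in ("igzip", "pigz", "gzip"):
                if program in installed:
                    return program
        else:
            # Writing: igzip only accepts levels 0-3 (or no explicit level).
            igzip_level_ok = compresslevel is None or compresslevel in range(0, 4)
            if igzip_level_ok and "igzip" in installed:
                return "igzip"
            # PipedGzipWriter validates the level before starting pigz/gzip.
            if compresslevel is not None and compresslevel not in range(1, 10):
                raise ValueError("compresslevel must be between 1 and 9")
            for program in ("pigz", "gzip"):
                if program in installed:
                    return program
    # threads == 0, or no external program could be started.
    return "gzip.open"


print(choose_gzip_backend("wb", 2, installed={"igzip", "pigz"}))
print(choose_gzip_backend("wb", 6, installed={"igzip", "pigz"}))
```

Under these assumptions the first call selects igzip and the second falls through to pigz, because level 6 is outside the range igzip accepts.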
=====================================
src/xopen/_version.py
=====================================
@@ -1,4 +1,4 @@
# coding: utf-8
# file generated by setuptools_scm
# don't change, don't track in version control
-version = '0.9.0'
+version = '1.0.0'
=====================================
tests/test_xopen.py
=====================================
@@ -1,13 +1,15 @@
import io
import os
import random
+import shutil
import signal
+import sys
import time
import pytest
from pathlib import Path
-from xopen import xopen, PipedGzipReader, PipedGzipWriter
-
+from xopen import xopen, PipedCompressionWriter, PipedGzipReader, \
+ PipedGzipWriter, _MAX_PIPE_SIZE, _can_read_concatenated_gz
extensions = ["", ".gz", ".bz2"]
@@ -17,6 +19,13 @@ try:
except ImportError:
lzma = None
+try:
+ import fcntl
+ if not hasattr(fcntl, "F_GETPIPE_SZ") and sys.platform == "linux":
+ setattr(fcntl, "F_GETPIPE_SZ", 1032)
+except ImportError:
+ fcntl = None
+
base = "tests/file.txt"
files = [base + ext for ext in extensions]
CONTENT_LINES = ['Testing, testing ...\n', 'The second line.\n']
@@ -33,6 +42,23 @@ def fname(request):
return request.param
+@pytest.fixture
+def lacking_pigz_permissions(tmp_path):
+ """
+ Set PATH to a directory that contains a pigz binary with permissions set to 000.
+ If no suitable pigz binary could be found, PATH is set to an empty directory
+ """
+ pigz_path = shutil.which("pigz")
+ if pigz_path:
+ shutil.copy(pigz_path, str(tmp_path))
+ os.chmod(str(tmp_path / "pigz"), 0)
+
+ path = os.environ["PATH"]
+ os.environ["PATH"] = str(tmp_path)
+ yield
+ os.environ["PATH"] = path
+
+
@pytest.fixture
def large_gzip(tmpdir):
path = str(tmpdir.join("large.gz"))
@@ -394,3 +420,25 @@ if lzma is not None:
def test_detect_xz_file_format_from_content():
with xopen("tests/file.txt.xz.test", "rb") as fh:
assert fh.readline() == CONTENT_LINES[0].encode("utf-8")
+
+
+def test_concatenated_gzip_function():
+ assert _can_read_concatenated_gz("gzip") is True
+ assert _can_read_concatenated_gz("pigz") is True
+ assert _can_read_concatenated_gz("xz") is False
+
+
+@pytest.mark.skipif(
+ not hasattr(fcntl, "F_GETPIPE_SZ") and _MAX_PIPE_SIZE is not None,
+ reason="Pipe size modifications not available on this platform.")
+def test_pipesize_changed(tmpdir):
+ path = Path(str(tmpdir), "hello.gz")
+ with xopen(path, "wb") as f:
+ assert isinstance(f, PipedCompressionWriter)
+ assert fcntl.fcntl(f._file.fileno(),
+ fcntl.F_GETPIPE_SZ) == _MAX_PIPE_SIZE
+
+
+def test_xopen_falls_back_to_gzip_open(lacking_pigz_permissions):
+ with xopen("tests/file.txt.gz", "rb") as f:
+ assert f.readline() == CONTENT_LINES[0].encode("utf-8")
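Note on `test_concatenated_gzip_function` above: a concatenated gzip file is several complete gzip members written back to back, and a reader that stops after the first member silently drops data. The sketch below shows the difference using only the Python standard library. The helper names are made up for illustration, and the single-member decoder merely stands in for a program that lacks multi-member support; it is not how igzip works internally.

```python
import gzip
import zlib


def concatenated_gz(parts):
    """Build a multi-member gzip stream, one member per input chunk."""
    return b"".join(gzip.compress(part) for part in parts)


def read_first_member_only(data):
    """
    Decode a single gzip member and return (payload, leftover_bytes).
    wbits=31 tells zlib to expect a gzip header and trailer.
    """
    decoder = zlib.decompressobj(wbits=31)
    payload = decoder.decompress(data)
    return payload, decoder.unused_data


blob = concatenated_gz([b"AB", b"CD"])

# gzip.decompress handles multi-member data and returns every member.
all_members = gzip.decompress(blob)

# A single-member decoder stops after "AB"; the second member is left over.
first, rest = read_first_member_only(blob)

print(all_members, first, len(rest) > 0)
```

The check in `_can_read_concatenated_gz` is the same idea applied to an external program: write two members, decompress, and compare against the full payload.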
=====================================
tox.ini
=====================================
@@ -1,5 +1,5 @@
[tox]
-envlist = flake8,py35,py36,py37,py38,pypy3
+envlist = flake8,py35,py36,py37,py38,py39,pypy3
[testenv]
deps = pytest
View it on GitLab: https://salsa.debian.org/med-team/python-xopen/-/commit/52425f7003b165eac6e7793cea7c058cbaa42b3b