[med-svn] [Git][med-team/python-pyfaidx][master] 10 commits: Fix watchfile to detect new versions on github
Andreas Tille (@tille)
gitlab at salsa.debian.org
Thu Feb 17 14:41:48 GMT 2022
Andreas Tille pushed to branch master at Debian Med / python-pyfaidx
Commits:
b8f1f6ce by Andreas Tille at 2022-02-17T15:12:31+01:00
Fix watchfile to detect new versions on github
- - - - -
dc77492a by Andreas Tille at 2022-02-17T15:12:33+01:00
New upstream version 0.6.4
- - - - -
dc8b3889 by Andreas Tille at 2022-02-17T15:12:33+01:00
routine-update: New upstream version
- - - - -
974af803 by Andreas Tille at 2022-02-17T15:12:34+01:00
Update upstream source from tag 'upstream/0.6.4'
Update to upstream version '0.6.4'
with Debian dir 3709265088b7cac48b673087e41510e8ad72c0c8
- - - - -
ce89f09d by Andreas Tille at 2022-02-17T15:12:42+01:00
Apply multi-arch hints.
+ python-pyfaidx-examples: Add Multi-Arch: foreign.
Changes-By: apply-multiarch-hints
- - - - -
a816fc72 by Andreas Tille at 2022-02-17T15:13:59+01:00
Use tags instead of releases
- - - - -
8c6f2831 by Andreas Tille at 2022-02-17T15:16:11+01:00
Update patches
- - - - -
7cc447e9 by Andreas Tille at 2022-02-17T15:22:43+01:00
Do not use setuptools-scm
- - - - -
16fe592f by Andreas Tille at 2022-02-17T15:25:35+01:00
Build-Depends: python3-pytest
- - - - -
707ff65c by Andreas Tille at 2022-02-17T15:41:10+01:00
Simplify build time test
- - - - -
29 changed files:
- README.rst
- debian/changelog
- debian/control
- + debian/patches/no_setuptools-scm.patch
- debian/patches/series
- − debian/patches/test-locale-to-c.patch
- debian/rules
- debian/watch
- dev-requirements.txt
- pyfaidx/__init__.py
- setup.py
- tests/data/download_gene_fasta.py
- tests/test_FastaRecord.py
- tests/test_FastaRecord_iter.py
- tests/test_FastaVariant.py
- tests/test_Fasta_bgzip.py
- tests/test_Fasta_integer_index.py
- tests/test_Fasta_synchronization.py
- tests/test_bio_seqio.py
- tests/test_faidx.py
- tests/test_feature_bounds_check.py
- tests/test_feature_default_seq.py
- tests/test_feature_indexing.py
- tests/test_feature_key_function.py
- tests/test_feature_read_ahead_buffer.py
- tests/test_feature_sequence_as_raw.py
- tests/test_feature_spliced_seq.py
- tests/test_feature_split_char.py
- tests/test_sequence_class.py
Changes:
=====================================
README.rst
=====================================
@@ -1,4 +1,4 @@
-|Travis| |PyPI| |Coverage| |Depsy|
+|CI| |Package| |PyPI| |Coverage| |Depsy|
Description
-----------
@@ -26,7 +26,7 @@ If you use pyfaidx in your publication, please cite:
Installation
------------
-This package is tested under Linux, MacOS, and Windows using Python 3.2-3.4, 2.7, 2.6, and pypy and is available from the PyPI:
+This package is tested under Linux and macOS using Python 3.6+, and and is available from the PyPI:
::
@@ -38,7 +38,7 @@ or download a `release <https://github.com/mdshw5/pyfaidx/releases>`_ and:
python setup.py install
-If using ``pip install --user`` make sure to add ``/home/$(whoami)/.local/bin`` to your ``$PATH`` if you want to run the ``faidx`` script.
+If using ``pip install --user`` make sure to add ``/home/$USER/.local/bin`` to your ``$PATH`` (on linux) or ``/Users/$USER/Library/Python/{python version}/bin`` (on macOS) if you want to run the ``faidx`` script.
Usage
-----
@@ -575,13 +575,16 @@ create also the relevant test.
To get test running on your machine:
- Create a new virtualenv and install the `dev-requirements.txt`.
+
+ pip install -r dev-requirements.txt
+
- Download the test data running:
python tests/data/download_gene_fasta.py
- Run the tests with
- nosetests --with-coverage --cover-package=pyfaidx
+ pytests
Acknowledgements
----------------
@@ -595,6 +598,9 @@ Comprehensive Cancer Center in the Department of Oncology.
.. |Travis| image:: https://travis-ci.com/mdshw5/pyfaidx.svg?branch=master
:target: https://travis-ci.com/mdshw5/pyfaidx
+
+.. |CI| image:: https://github.com/mdshw5/pyfaidx/actions/workflows/main.yml/badge.svg?branch=master
+ :target: https://github.com/mdshw5/pyfaidx/actions/workflows/main.yml
.. |PyPI| image:: https://img.shields.io/pypi/v/pyfaidx.svg?branch=master
:target: https://pypi.python.org/pypi/pyfaidx
@@ -611,3 +617,6 @@ Comprehensive Cancer Center in the Department of Oncology.
.. |Appveyor| image:: https://ci.appveyor.com/api/projects/status/80ihlw30a003596w?svg=true
:target: https://ci.appveyor.com/project/mdshw5/pyfaidx
+
+.. |Package| image:: https://github.com/mdshw5/pyfaidx/actions/workflows/pypi.yml/badge.svg
+ :target: https://github.com/mdshw5/pyfaidx/actions/workflows/pypi.yml
=====================================
debian/changelog
=====================================
@@ -1,3 +1,16 @@
+python-pyfaidx (0.6.4-1) UNRELEASED; urgency=medium
+
+ * Fix watchfile to detect new versions on github
+ * New upstream version
+ * Apply multi-arch hints.
+ + python-pyfaidx-examples: Add Multi-Arch: foreign.
+ * Do not use setuptools-scm
+ * Build-Depends: python3-pytest, python3-pkg-resources;
+ Remove python3-nose from Build-Depends
+ * Simplify build time test
+
+ -- Andreas Tille <tille@debian.org> Thu, 17 Feb 2022 15:12:33 +0100
+
python-pyfaidx (0.6.2-1) unstable; urgency=medium
* Team upload.
=====================================
debian/control
=====================================
@@ -9,11 +9,12 @@ Build-Depends: debhelper-compat (= 13),
python3-all,
python3-coverage,
python3-setuptools,
- python3-nose,
- python3-numpy,
- python3-six,
- python3-mock,
- samtools,
+ python3-numpy <!nocheck>,
+ python3-pkg-resources <!nocheck>,
+ python3-pytest <!nocheck>,
+ python3-six <!nocheck>,
+ python3-mock <!nocheck>,
+ samtools <!nocheck>,
tabix <!nocheck>
Standards-Version: 4.6.0
Vcs-Browser: https://salsa.debian.org/med-team/python-pyfaidx
@@ -42,6 +43,7 @@ Package: python-pyfaidx-examples
Architecture: all
Depends: ${misc:Depends}
Suggests: python3-pyfaidx
+Multi-Arch: foreign
Description: example data for efficient random access to fasta subsequences for Python
Samtools provides a function "faidx" (FAsta InDeX), which creates a
small flat index file ".fai" allowing for fast random access to any
=====================================
debian/patches/no_setuptools-scm.patch
=====================================
@@ -0,0 +1,15 @@
+Description: Do not use setuptools-scm
+Author: Andreas Tille <tille@debian.org>
+Last-Update: Thu, 17 Feb 2022 15:12:33 +0100
+
+--- a/setup.py
++++ b/setup.py
+@@ -18,8 +18,6 @@ setup(
+ license='BSD',
+ packages=['pyfaidx'],
+ install_requires=install_requires,
+- use_scm_version={"local_scheme": "no-local-version"},
+- setup_requires=['setuptools_scm'],
+ entry_points={'console_scripts': ['faidx = pyfaidx.cli:main']},
+ classifiers=[
+ "Development Status :: 5 - Production/Stable",
=====================================
debian/patches/series
=====================================
@@ -1,2 +1,2 @@
-test-locale-to-c.patch
remove-ignore-docstring.patch
+no_setuptools-scm.patch
=====================================
debian/patches/test-locale-to-c.patch deleted
=====================================
@@ -1,18 +0,0 @@
-Description: Test locale to C
- Change the testing locale "C", for avoiding the case in which the
- additional locale set "en_US.utf8" is not available.
-Author: Sao I Kuan <saoikuan@gmail.com>
-Forwarded: not-needed
-Last-Update: 2020-09-30
-
---- python-pyfaidx-0.5.9.1.orig/tests/test_feature_indexing.py
-+++ python-pyfaidx-0.5.9.1/tests/test_feature_indexing.py
-@@ -322,7 +322,7 @@ class TestIndexing(TestCase):
- import locale
- old_locale = locale.getlocale(locale.LC_NUMERIC)
- try:
-- locale.setlocale(locale.LC_NUMERIC, 'en_US.utf8')
-+ locale.setlocale(locale.LC_NUMERIC, 'C')
- faidx = Faidx('data/genes.fasta')
- faidx.write_fai()
- faidx = Faidx('data/genes.fasta', build_index=False)
=====================================
debian/rules
=====================================
@@ -12,8 +12,7 @@ export PYBUILD_NAME=pyfaidx
override_dh_auto_test:
ifeq (,$(filter nocheck,$(DEB_BUILD_OPTIONS)))
bgzip -c tests/data/genes.fasta > tests/data/genes.fasta.gz
- dh_auto_test -- --test --system=custom --test-args='set -e; \
- {interpreter} -m "nose" --with-coverage --cover-package=pyfaidx'
+ dh_auto_test
endif
override_dh_compress:
=====================================
debian/watch
=====================================
@@ -1,2 +1,2 @@
version=4
-https://github.com/mdshw5/pyfaidx/releases .*/archive/.*/v(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)
+https://github.com/mdshw5/pyfaidx/tags .*/v?(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)
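To illustrate what the revised watch pattern accepts, here is a small sketch that runs both the old and the new regular expression through Python's `re` module. The two hrefs are hypothetical examples of tag archive links, not output captured from the real GitHub page, and uscan evaluates the pattern as a Perl regex, so this only approximates its behaviour.

```python
import re

# Patterns copied from the old and new debian/watch lines.
OLD = r".*/archive/.*/v(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)"
NEW = r".*/v?(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)"

# Hypothetical hrefs, assumed for illustration only.
HREFS = [
    "/mdshw5/pyfaidx/archive/refs/tags/v0.6.4.tar.gz",
    "/mdshw5/pyfaidx/archive/refs/tags/0.6.4.tar.gz",
]


def extract_version(pattern, href):
    """Return the captured version string, or None if the href does not match."""
    match = re.fullmatch(pattern, href)
    return match.group(1) if match else None


for href in HREFS:
    print(href, extract_version(OLD, href), extract_version(NEW, href))
```

Because of the optional `v?`, the new pattern yields the same version string whether or not the tag name carries a leading `v`, whereas the old pattern requires it.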
=====================================
dev-requirements.txt
=====================================
@@ -1,5 +1,13 @@
six
-nose
+pytest
+pytest-cov
+setuptools
+mock
+cython
+pysam
+requests
+coverage
+pyfasta
+pyvcf
+numpy
biopython
-setuptools >= 0.7
-mock; python_version < '3.3'
\ No newline at end of file
=====================================
pyfaidx/__init__.py
=====================================
@@ -16,6 +16,7 @@ from itertools import islice
from math import ceil
from os.path import getmtime
from threading import Lock
+from pkg_resources import get_distribution
from six import PY2, PY3, integer_types, string_types
from six.moves import zip_longest
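The import added in this hunk is used further down to set `__version__` from the installed package metadata instead of a hard-coded string. `pkg_resources.get_distribution` raises `DistributionNotFound` when that metadata is absent, for example in a source tree that was never installed. The sketch below shows the same idea with the standard-library `importlib.metadata` (Python 3.8+) plus a fallback. It is an alternative shown for illustration, not what this commit does, and `installed_version` and the default string are assumed names.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name, default="0+unknown"):
    """Return the installed version of dist_name, or default if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Raised when no metadata for dist_name is found, e.g. when running
        # from a checkout that was never installed.
        return default


print(installed_version("pyfaidx"))
```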
@@ -24,15 +25,14 @@ try:
from collections import OrderedDict
except ImportError: #python 2.6
from ordereddict import OrderedDict
+
+__version__ = get_distribution("pyfaidx").version
if sys.version_info > (3, ):
buffer = memoryview
dna_bases = re.compile(r'([ACTGNactgnYRWSKMDVHBXyrwskmdvhbx]+)')
-__version__ = '0.6.2'
-
-
class KeyFunctionError(ValueError):
"""Raised if the key_function argument is invalid."""
@@ -45,6 +45,10 @@ class IndexNotFoundError(IOError):
"""Raised if read_fai cannot open the index file."""
+class VcfIndexNotFoundError(IOError):
+ """Raised if vcf cannot find a tbi file."""
+
+
class FastaNotFoundError(IOError):
"""Raised if the fasta file cannot be opened."""
@@ -348,8 +352,8 @@ class Faidx(object):
try:
from Bio import bgzf
from Bio import __version__ as bgzf_version
- from distutils.version import LooseVersion
- if LooseVersion(bgzf_version) < LooseVersion('1.73'):
+ from packaging.version import Version
+ if Version(bgzf_version) < Version('1.73'):
raise ImportError
except ImportError:
raise ImportError(
@@ -1130,6 +1134,8 @@ class FastaVariant(Fasta):
self.vcf = vcf.Reader(filename=vcf_file)
else:
raise IOError("File {0} does not exist.".format(vcf_file))
+ if not os.path.exists(vcf_file + '.tbi'):
+ raise VcfIndexNotFoundError("File {0} has not tabix index.".format(vcf_file))
if sample is not None:
self.sample = sample
else:
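The check added in this hunk fails early when the `.tbi` index that tabix writes next to a VCF file is missing. A minimal standalone sketch of that pattern follows. `require_tabix_index` is a hypothetical helper that does not exist in pyfaidx, and the files are empty placeholders created in a temporary directory.

```python
import os
import tempfile


class VcfIndexNotFoundError(IOError):
    """Raised if no .tbi index sits next to the VCF file."""


def require_tabix_index(vcf_file):
    """Hypothetical helper: return the index path or raise if it is missing."""
    index_path = vcf_file + ".tbi"
    if not os.path.exists(index_path):
        raise VcfIndexNotFoundError(
            "File {0} has no tabix index.".format(vcf_file))
    return index_path


# Demonstration with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    vcf = os.path.join(tmp, "example.vcf.gz")
    open(vcf, "wb").close()
    try:
        require_tabix_index(vcf)
    except VcfIndexNotFoundError as err:
        print("missing:", err)
    open(vcf + ".tbi", "wb").close()
    print("found:", os.path.basename(require_tabix_index(vcf)))
```

Since the exception derives from `IOError`, callers that already catch `IOError` keep working.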
@@ -1163,14 +1169,24 @@ class FastaVariant(Fasta):
else:
seq_mut = list(seq.seq)
del seq.seq
- var = self.vcf.fetch(name, start - 1, end)
- for record in var:
- if record.is_snp: # skip indels
- sample = record.genotype(self.sample)
- if sample.gt_type in self.gt_type and eval(self.filter):
- alt = record.ALT[0]
- i = (record.POS - 1) - (start - 1)
- seq_mut[i:i + len(alt)] = str(alt)
+ try:
+ var = self.vcf.fetch(name, start - 1, end)
+ for record in var:
+ if record.is_snp: # skip indels
+ sample = record.genotype(self.sample)
+ if sample.gt_type in self.gt_type and eval(self.filter):
+ alt = record.ALT[0]
+ i = (record.POS - 1) - (start - 1)
+ seq_mut[i:i + len(alt)] = str(alt)
+ except ValueError as e: # Can be raised if name is not part of tabix for vcf
+ if self.vcf._tabix is not None and name not in self.vcf._tabix.contigs:
+ # The chromosome name is not part of the vcf
+ # The sequence returned is the same as the reference
+ pass
+ else:
+ # This is something else
+ raise e
+
# slice the list in case we added an MNP in last position
if self.faidx.as_raw:
return ''.join(seq_mut[:end - start + 1])
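The last hunk above returns the unmodified reference when the requested contig is absent from the VCF index, and re-raises any other `ValueError`. The sketch below reproduces that control flow without pysam or PyVCF. `FakeTabixReader` and `apply_snps` are hypothetical stand-ins written for this illustration, and the real code inspects `self.vcf._tabix.contigs` instead of a dictionary.

```python
class FakeTabixReader:
    """Stand-in for a VCF reader; only knows the contigs given at construction."""

    def __init__(self, snps_by_contig):
        # Maps contig name -> list of (1-based position, alternate base).
        self.snps_by_contig = snps_by_contig

    def fetch(self, name, start, end):
        # start is 0-based, end is 1-based inclusive, as in the fetch call above.
        if name not in self.snps_by_contig:
            raise ValueError("unknown contig: {0}".format(name))
        return [(pos, alt) for pos, alt in self.snps_by_contig[name]
                if start < pos <= end]


def apply_snps(reader, name, start, end, reference):
    """Return reference (covering 1-based start..end) with SNPs substituted."""
    seq = list(reference)
    try:
        for pos, alt in reader.fetch(name, start - 1, end):
            seq[pos - start] = alt
    except ValueError:
        if name in reader.snps_by_contig:
            raise  # a genuine error, not just a missing contig
        # Contig absent from the variant data: keep the reference as-is.
    return "".join(seq)


reader = FakeTabixReader({"22": [(3, "T")]})
print(apply_snps(reader, "22", 1, 4, "ACGA"))
print(apply_snps(reader, "fake", 1, 4, "ATCG"))
```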
=====================================
setup.py
=====================================
@@ -6,19 +6,9 @@ install_requires = ['six', 'setuptools >= 0.7']
if sys.version_info[0] == 2 and sys.version_info[1] == 6:
install_requires.extend(['ordereddict', 'argparse'])
-
-def get_version(string):
- """ Parse the version number variable __version__ from a script. """
- import re
- version_re = r"^__version__ = ['\"]([^'\"]*)['\"]"
- version_str = re.search(version_re, string, re.M).group(1)
- return version_str
-
-
setup(
name='pyfaidx',
provides='pyfaidx',
- version=get_version(open('pyfaidx/__init__.py', encoding='utf-8').read()),
author='Matthew Shirley',
 author_email='mdshw5@gmail.com',
url='http://mattshirley.com',
@@ -28,6 +18,8 @@ setup(
license='BSD',
packages=['pyfaidx'],
install_requires=install_requires,
+ use_scm_version={"local_scheme": "no-local-version"},
+ setup_requires=['setuptools_scm'],
entry_points={'console_scripts': ['faidx = pyfaidx.cli:main']},
classifiers=[
"Development Status :: 5 - Production/Stable",
=====================================
tests/data/download_gene_fasta.py
=====================================
@@ -36,21 +36,30 @@ def fetch_genes(filename, suffix=None):
def fetch_chr22(filename):
import requests
import gzip
+ import io
with requests.get('https://ftp-trace.ncbi.nih.gov/1000genomes/ftp/pilot_data/technical/reference/human_b36_male.fa.gz') as compressed:
- with open(filename, 'w') as fasta, gzip.GzipFile(fileobj=compressed.raw) as gz:
+ with open(filename, 'w') as fasta, gzip.GzipFile(fileobj=io.BytesIO(compressed.content)) as gz:
chr22 = False
for line in gz:
- if line[0:3] == '>22':
- fasta.write(line)
+ line_content = line.decode()
+ if line_content[0:3] == '>22':
+ fasta.write(line_content)
chr22 = True
elif not chr22:
continue
- elif chr22 and line[0] == '>':
- curl.kill()
+ elif chr22 and line_content[0] == '>':
break
elif chr22:
- fasta.write(line)
+ fasta.write(line_content)
+
+def add_fake_chr(existing_fasta, filename):
+ with open(filename, 'w') as fasta:
+ with open(existing_fasta, 'r') as old_fa:
+ for line in old_fa:
+ fasta.write(line)
+ fasta.write('>fake chromosome not in vcf\n')
+ fasta.write('ATCG\n')
def fake_chr22(filename):
""" Fake up some data """
@@ -84,10 +93,16 @@ if __name__ == "__main__":
path = os.path.dirname(__file__)
os.chdir(path)
if not os.path.isfile("genes.fasta") or not os.path.isfile("genes.fasta.lower"):
+ print("GETTING genes")
fetch_genes("genes.fasta")
if not os.path.isfile("chr22.vcf.gz"):
+ print("GETTING vcf")
fetch_chr22_vcf("chr22.vcf.gz")
if not os.path.isfile("chr22.fasta"):
+ print("GETTING chr22.fasta")
fetch_chr22("chr22.fasta")
+ if not os.path.isfile("chr22andfake.fasta"):
+ print("adding fake chr")
+ add_fake_chr("chr22.fasta", "chr22andfake.fasta")
bgzip_compress_fasta("genes.fasta")
bgzip_compress_fasta("chr22.fasta")
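The reworked `fetch_chr22` wraps the downloaded bytes in `io.BytesIO`, lets `gzip.GzipFile` decompress them, and decodes each line before comparing it to the `>22` header prefix, since `GzipFile` yields bytes. A self-contained sketch of that loop follows, using made-up in-memory data instead of a network download. `extract_record` is an assumed name, and like the original it matches headers by prefix.

```python
import gzip
import io


def extract_record(gz_bytes, name):
    """Return the FASTA record whose header starts with '>' + name, as one string."""
    wanted = ">" + name
    lines = []
    inside = False
    with gzip.GzipFile(fileobj=io.BytesIO(gz_bytes)) as gz:
        for raw in gz:
            line = raw.decode()  # GzipFile yields bytes, so decode before comparing
            if line.startswith(">"):
                if inside:
                    break  # the next record begins; stop reading
                inside = line.startswith(wanted)
            if inside:
                lines.append(line)
    return "".join(lines)


# Tiny made-up multi-record FASTA, compressed in memory.
data = gzip.compress(b">21 x\nAAAA\n>22 y\nCCCC\nGGGG\n>X z\nTTTT\n")
print(extract_record(data, "22"))
```

Buffering the whole response in memory is simple but costs RAM proportional to the compressed file size.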
=====================================
tests/test_FastaRecord.py
=====================================
@@ -1,137 +1,131 @@
import os
import sys
+import pytest
from pyfaidx import Fasta
from tempfile import NamedTemporaryFile
-from unittest import TestCase
-from nose.tools import raises
from difflib import Differ
path = os.path.dirname(__file__)
os.chdir(path)
-
-class TestFastaRecord(TestCase):
- def setUp(self):
- pass
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- def test_sequence_uppercase(self):
- """Test that the sequence is always returned in
- uppercase, even if it is in lowercase in the
- reference genome.
- """
- filename = "data/genes.fasta.lower"
- reference_upper = Fasta(filename, sequence_always_upper=True)
- reference_normal = Fasta(filename)
- os.remove('data/genes.fasta.lower.fai')
- assert reference_upper['gi|557361099|gb|KF435150.1|'][
- 1:100].seq == reference_normal['gi|557361099|gb|KF435150.1|'][
- 1:100].seq.upper()
-
- def test_long_names(self):
- """ Test that deflines extracted using FastaRecord.long_name are
- identical to deflines in the actual file.
- """
- deflines = []
- with open('data/genes.fasta') as fasta_file:
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+def test_sequence_uppercase(remove_index):
+ """Test that the sequence is always returned in
+ uppercase, even if it is in lowercase in the
+ reference genome.
+ """
+ filename = "data/genes.fasta.lower"
+ reference_upper = Fasta(filename, sequence_always_upper=True)
+ reference_normal = Fasta(filename)
+ os.remove('data/genes.fasta.lower.fai')
+ assert reference_upper['gi|557361099|gb|KF435150.1|'][
+ 1:100].seq == reference_normal['gi|557361099|gb|KF435150.1|'][
+ 1:100].seq.upper()
+
+def test_long_names(remove_index):
+ """ Test that deflines extracted using FastaRecord.long_name are
+ identical to deflines in the actual file.
+ """
+ deflines = []
+ with open('data/genes.fasta') as fasta_file:
+ for line in fasta_file:
+ if line[0] == '>':
+ deflines.append(line[1:-1])
+ fasta = Fasta('data/genes.fasta')
+ long_names = []
+ for record in fasta:
+ long_names.append(record.long_name)
+ print(tuple(zip(deflines, long_names)))
+ assert deflines == long_names
+
+def test_issue_62(remove_index):
+ """ Check for pathogenic FastaRecord.long_name behavior in mdshw5/pyfaidx#62 """
+ deflines = []
+ line_len = None
+ with open('data/genes.fasta', 'rb') as fasta_file:
+ with open('data/issue_62.fa', 'wb') as fasta_uniform_len:
for line in fasta_file:
- if line[0] == '>':
- deflines.append(line[1:-1])
- fasta = Fasta('data/genes.fasta')
- long_names = []
- for record in fasta:
- long_names.append(record.long_name)
- print(tuple(zip(deflines, long_names)))
- assert deflines == long_names
-
- def test_issue_62(self):
- """ Check for pathogenic FastaRecord.long_name behavior in mdshw5/pyfaidx#62 """
- deflines = []
- line_len = None
- with open('data/genes.fasta', 'rb') as fasta_file:
- with open('data/issue_62.fa', 'wb') as fasta_uniform_len:
- for line in fasta_file:
- if line.startswith(b'>'):
- deflines.append(line[1:-1].decode('ascii'))
- fasta_uniform_len.write(line)
- elif line_len is None:
- line_len = len(line)
- fasta_uniform_len.write(line)
- elif line_len > len(line):
- fasta_uniform_len.write(line.rstrip() + b'N' *
- (line_len - len(line)) + b'\n')
- else:
- fasta_uniform_len.write(line)
- fasta = Fasta('data/issue_62.fa', as_raw=True)
- long_names = []
- for record in fasta:
- long_names.append(record.long_name)
- try:
- os.remove('data/issue_62.fa')
- os.remove('data/issue_62.fa.fai')
- except EnvironmentError:
- pass
- sys.stdout.writelines(tuple(Differ().compare(deflines, long_names)))
- assert deflines == long_names
-
- def test_unpadded_length(self):
- filename = "data/padded.fasta"
- with open(filename, 'w') as padded:
- padded.write(">test_padded\n")
- for n in range(10):
- padded.write("N" * 80)
- padded.write("\n")
- padded.write("N" * 30)
- padded.write("A" * 20)
- padded.write("N" * 30)
+ if line.startswith(b'>'):
+ deflines.append(line[1:-1].decode('ascii'))
+ fasta_uniform_len.write(line)
+ elif line_len is None:
+ line_len = len(line)
+ fasta_uniform_len.write(line)
+ elif line_len > len(line):
+ fasta_uniform_len.write(line.rstrip() + b'N' *
+ (line_len - len(line)) + b'\n')
+ else:
+ fasta_uniform_len.write(line)
+ fasta = Fasta('data/issue_62.fa', as_raw=True)
+ long_names = []
+ for record in fasta:
+ long_names.append(record.long_name)
+ try:
+ os.remove('data/issue_62.fa')
+ os.remove('data/issue_62.fa.fai')
+ except EnvironmentError:
+ pass
+ sys.stdout.writelines(tuple(Differ().compare(deflines, long_names)))
+ assert deflines == long_names
+
+def test_unpadded_length(remove_index):
+ filename = "data/padded.fasta"
+ with open(filename, 'w') as padded:
+ padded.write(">test_padded\n")
+ for n in range(10):
+ padded.write("N" * 80)
+ padded.write("\n")
+ padded.write("N" * 30)
+ padded.write("A" * 20)
+ padded.write("N" * 30)
+ padded.write("\n")
+ for n in range(10):
+ padded.write("N" * 80)
padded.write("\n")
- for n in range(10):
- padded.write("N" * 80)
- padded.write("\n")
-
- fasta = Fasta(filename)
- expect = 20
- result = fasta["test_padded"].unpadded_len
- print(expect, result)
- assert expect == result
- os.remove('data/padded.fasta')
- os.remove('data/padded.fasta.fai')
-
- def test_numpy_array(self):
- """ Test the __array_interface__ """
- import numpy
- filename = "data/genes.fasta.lower"
- reference = Fasta(filename)
- np_array = numpy.asarray(reference[0])
- assert isinstance(np_array, numpy.ndarray)
-
-
-class TestMutableFastaRecord(TestCase):
- def setUp(self):
- with open('data/genes_mutable.fasta', 'wb') as mutable:
- mutable.write(open('data/genes.fasta', 'rb').read())
- self.mutable_fasta = Fasta('data/genes_mutable.fasta', mutable=True)
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
- try:
- os.remove('data/genes_mutable.fasta')
- except EnvironmentError:
- pass # some tests may delete this file
- try:
- os.remove('data/genes_mutable.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
- def test_mutate_fasta_to_same(self):
+ fasta = Fasta(filename)
+ expect = 20
+ result = fasta["test_padded"].unpadded_len
+ print(expect, result)
+ assert expect == result
+ os.remove('data/padded.fasta')
+ os.remove('data/padded.fasta.fai')
+
+def test_numpy_array(remove_index):
+ """ Test the __array_interface__ """
+ import numpy
+ filename = "data/genes.fasta.lower"
+ reference = Fasta(filename)
+ np_array = numpy.asarray(reference[0])
+ assert isinstance(np_array, numpy.ndarray)
+
+@pytest.fixture
+def remove_index_mutable():
+ with open('data/genes_mutable.fasta', 'wb') as mutable:
+ mutable.write(open('data/genes.fasta', 'rb').read())
+ mutable_fasta = Fasta('data/genes_mutable.fasta', mutable=True)
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+ try:
+ os.remove('data/genes_mutable.fasta')
+ except EnvironmentError:
+ pass # some tests may delete this file
+ try:
+ os.remove('data/genes_mutable.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+ def test_mutate_fasta_to_same(remove_index_mutable):
mutable = Fasta('data/genes_mutable.fasta', mutable=True)
fasta = Fasta('data/genes.fasta', mutable=False)
chunk = fasta['gi|557361099|gb|KF435150.1|'][0:100]
@@ -139,26 +133,26 @@ class TestMutableFastaRecord(TestCase):
assert str(fasta['gi|557361099|gb|KF435150.1|']) == str(
mutable['gi|557361099|gb|KF435150.1|'])
- def test_mutate_fasta_to_N(self):
+ def test_mutate_fasta_to_N(remove_index_mutable):
mutable = Fasta('data/genes_mutable.fasta', mutable=True)
chunk = 100 * 'N'
mutable['gi|557361099|gb|KF435150.1|'][0:100] = chunk
assert mutable['gi|557361099|gb|KF435150.1|'][0:100].seq == chunk
- def test_mutate_single_position(self):
+ def test_mutate_single_position(remove_index_mutable):
mutable = Fasta('data/genes_mutable.fasta', mutable=True)
chunk = 'N'
mutable['gi|557361099|gb|KF435150.1|'][0] = chunk
assert mutable['gi|557361099|gb|KF435150.1|'][0].seq == chunk
- @raises(TypeError)
- def test_mutate_immutable_fasta(self):
+ @pytest.mark.xfail(raises=TypeError)
+ def test_mutate_immutable_fasta(remove_index_mutable):
mutable = Fasta('data/genes_mutable.fasta', mutable=False)
chunk = 100 * 'N'
mutable['gi|557361099|gb|KF435150.1|'][0:100] = chunk
- @raises(IOError)
- def test_mutate_too_long(self):
+ @pytest.mark.xfail(raises=IOError)
+ def test_mutate_too_long(remove_index_mutable):
mutable = Fasta('data/genes_mutable.fasta', mutable=True)
chunk = 101 * 'N'
mutable['gi|557361099|gb|KF435150.1|'][0:100] = chunk
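The nose `@raises(...)` decorators are replaced here by `@pytest.mark.xfail(raises=...)`. The two are not strictly equivalent: an `xfail` test that raises nothing is reported as XPASS, which does not fail the run unless strict mode is enabled, whereas `pytest.raises` fails the test when the expected exception is missing. The sketch below contrasts the two forms. `ReadOnlySequence` is a hypothetical stand-in and not part of pyfaidx.

```python
import pytest


class ReadOnlySequence:
    """Hypothetical stand-in for an immutable record; not part of pyfaidx."""

    def __init__(self, seq):
        self.seq = seq

    def __setitem__(self, key, value):
        raise TypeError("sequence is read-only")


def test_assignment_is_rejected():
    record = ReadOnlySequence("ACGT")
    # pytest.raises fails the test if the block does NOT raise TypeError.
    with pytest.raises(TypeError):
        record[0] = "N"


@pytest.mark.xfail(raises=TypeError)
def test_assignment_expected_to_fail():
    # Reported as xfailed when TypeError is raised. If nothing is raised the
    # result is XPASS, which is not an error unless strict mode is on.
    record = ReadOnlySequence("ACGT")
    record[0] = "N"
```

Which form fits depends on whether the exception is the behaviour under test or a known defect.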
=====================================
tests/test_FastaRecord_iter.py
=====================================
@@ -1,34 +1,32 @@
import os
+import pytest
from pyfaidx import Fasta
from itertools import chain
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFastaRecordIter(TestCase):
- def setUp(self):
- pass
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
+def test_fetch_whole_fasta(remove_index):
+ expect = [line.rstrip('\n') for line in open('data/genes.fasta') if line[0] != '>']
+ result = list(chain(*([line for line in record] for record in Fasta('data/genes.fasta', as_raw=True))))
+ assert expect == result
- def test_fetch_whole_fasta(self):
- expect = [line.rstrip('\n') for line in open('data/genes.fasta') if line[0] != '>']
- result = list(chain(*([line for line in record] for record in Fasta('data/genes.fasta', as_raw=True))))
- assert expect == result
+def test_line_len(remove_index):
+ fasta = Fasta('data/genes.fasta')
+ for record in fasta:
+ assert len(next(iter(record))) == fasta.faidx.index[record.name].lenc
- def test_line_len(self):
- fasta = Fasta('data/genes.fasta')
- for record in fasta:
- assert len(next(iter(record))) == fasta.faidx.index[record.name].lenc
-
- def test_reverse_iter(self):
- expect = list(chain(*([line[::-1] for line in record][::-1] for record in Fasta('data/genes.fasta', as_raw=True))))
- result = list(chain(*([line for line in reversed(record)] for record in Fasta('data/genes.fasta', as_raw=True))))
- for a, b in zip(expect, result):
- print(a, b)
- assert expect == result
+def test_reverse_iter(remove_index):
+ expect = list(chain(*([line[::-1] for line in record][::-1] for record in Fasta('data/genes.fasta', as_raw=True))))
+ result = list(chain(*([line for line in reversed(record)] for record in Fasta('data/genes.fasta', as_raw=True))))
+ for a, b in zip(expect, result):
+ print(a, b)
+ assert expect == result
=====================================
tests/test_FastaVariant.py
=====================================
@@ -1,63 +1,66 @@
import os
+import pytest
from pyfaidx import FastaVariant, Fasta
-from unittest import TestCase
-from nose.plugins.skip import SkipTest
-
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFastaVariant(TestCase):
-
- def setUp(self):
- raise SkipTest
-
- def tearDown(self):
- try:
- os.remove('data/chr22.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/chr22.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
- def test_fetch_variant(self):
- try:
- import pysam
- fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=True, het=True, as_raw=True)
- assert fasta['22'][32330458:32330462] == 'CAGG' # het
- assert fasta['22'][32352282:32352286] == 'CAGC' # hom
- except (ImportError, IOError):
- raise SkipTest
+def test_fetch_variant(remove_index):
+ try:
+ import pysam
+ fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=True, het=True, as_raw=True)
+ assert fasta['22'][32330458:32330462] == 'CAGG' # het
+ assert fasta['22'][32352282:32352286] == 'CAGC' # hom
+ except (ImportError, IOError):
+ pytest.skip("pysam not installed.")
- def test_fetch_hom_variant(self):
- try:
- import pysam
- fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=True, het=False, as_raw=True)
- assert fasta['22'][32330458:32330462] == 'CGGG' # het
- assert fasta['22'][32352282:32352286] == 'CAGC' # hom
- except (ImportError, IOError):
- raise SkipTest
+def test_fetch_hom_variant(remove_index):
+ try:
+ import pysam
+ fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=True, het=False, as_raw=True)
+ assert fasta['22'][32330458:32330462] == 'CGGG' # het
+ assert fasta['22'][32352282:32352286] == 'CAGC' # hom
+ except (ImportError, IOError):
+ pytest.skip("pysam not installed.")
- def test_fetch_het_variant(self):
- try:
- import pysam
- fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=False, het=True, as_raw=True)
- assert fasta['22'][32330458:32330462] == 'CAGG' # het
- assert fasta['22'][32352282:32352286] == 'CGGC' # hom
- except (ImportError, IOError):
- raise SkipTest
+def test_fetch_het_variant(remove_index):
+ try:
+ import pysam
+ fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=False, het=True, as_raw=True)
+ assert fasta['22'][32330458:32330462] == 'CAGG' # het
+ assert fasta['22'][32352282:32352286] == 'CGGC' # hom
+ except (ImportError, IOError):
+ pytest.skip("pysam not installed.")
- def test_all_pos(self):
- try:
- import pysam
- fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=True, het=True, as_raw=True)
- assert fasta['22'].variant_sites == (16042793, 21833121, 29153196, 29187373, 29187448, 29194610, 29821295, 29821332, 29993842, 32330460, 32352284)
- except (ImportError, IOError):
- raise SkipTest
+def test_fetch_chr_not_in_vcf(remove_index):
+ try:
+ import pysam
+ fasta = FastaVariant('data/chr22andfake.fasta', 'data/chr22.vcf.gz', hom=True, het=True, as_raw=True)
+ assert fasta['fake'][:10] == 'ATCG' # fake is not in vcf
+ except (ImportError, IOError):
+ pytest.skip("pysam not installed.")
+
+def test_all_pos(remove_index):
+ try:
+ import pysam
+ fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=True, het=True, as_raw=True)
+ assert fasta['22'].variant_sites == (16042793, 21833121, 29153196, 29187373, 29187448, 29194610, 29821295, 29821332, 29993842, 32330460, 32352284)
+ except (ImportError, IOError):
+ pytest.skip("pysam not installed.")
- def test_all_diff(self):
- try:
- fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=True, het=True, as_raw=True)
- ref = Fasta('data/chr22.fasta', as_raw=True)
- print([(ref['22'][pos-1], fasta['22'][pos-1]) for pos in fasta['22'].variant_sites])
- assert all(ref['22'][pos-1] != fasta['22'][pos-1] for pos in fasta['22'].variant_sites)
- except (ImportError, IOError):
- raise SkipTest
+def test_all_diff(remove_index):
+ try:
+ fasta = FastaVariant('data/chr22.fasta', 'data/chr22.vcf.gz', hom=True, het=True, as_raw=True)
+ ref = Fasta('data/chr22.fasta', as_raw=True)
+ print([(ref['22'][pos-1], fasta['22'][pos-1]) for pos in fasta['22'].variant_sites])
+ assert all(ref['22'][pos-1] != fasta['22'][pos-1] for pos in fasta['22'].variant_sites)
+ except (ImportError, IOError):
+ pytest.skip("pysam not installed.")
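The converted tests wrap the whole body in `try`/`except (ImportError, IOError)` and skip with the message "pysam not installed.", so a missing data file is reported with the same reason as a missing module. One possible way to separate the two cases is sketched below with `pytest.importorskip` and an explicit file check. `require_optional` is a hypothetical helper, not something this commit introduces.

```python
import os

import pytest


def require_optional(module_name, data_file):
    """Skip the calling test with a specific reason for each missing prerequisite."""
    # importorskip returns the imported module, or skips if it cannot be imported.
    module = pytest.importorskip(module_name)
    if not os.path.isfile(data_file):
        pytest.skip("test data {0} not found".format(data_file))
    return module


def test_fetch_variant_prerequisites():
    # Module and path mirror the tests above; the test skips unless both exist.
    require_optional("pysam", "data/chr22.vcf.gz")
```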
=====================================
tests/test_Fasta_bgzip.py
=====================================
@@ -1,258 +1,241 @@
import os
+import pytest
from pyfaidx import Fasta, Faidx, UnsupportedCompressionFormat, FetchError
from itertools import chain
-try:
- from unittest import TestCase, expectedFailure
-except ImportError:
- from unittest import TestCase
- from nose.plugins.skip import SkipTest as expectedFailure # python2.6
-from nose.tools import raises
-from nose.plugins.skip import SkipTest
path = os.path.dirname(__file__)
os.chdir(path)
-class TestIndexing(TestCase):
- def setUp(self):
- try:
- from Bio import SeqIO
- except ImportError:
- raise SkipTest
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.gz.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- @expectedFailure
- def test_build_issue_126(self):
- """ Samtools BGZF index should be identical to pyfaidx BGZF index """
- expect_index = ("gi|563317589|dbj|AB821309.1| 3510 114 70 71\n"
- "gi|557361099|gb|KF435150.1| 481 3789 70 71\n"
- "gi|557361097|gb|KF435149.1| 642 4368 70 71\n"
- "gi|543583796|ref|NR_104216.1| 4573 5141 70 71\n"
- "gi|543583795|ref|NR_104215.1| 5317 9901 70 71\n"
- "gi|543583794|ref|NR_104212.1| 5374 15415 70 71\n"
- "gi|543583788|ref|NM_001282545.1| 4170 20980 70 71\n"
- "gi|543583786|ref|NM_001282543.1| 5466 25324 70 71\n"
- "gi|543583785|ref|NM_000465.3| 5523 30980 70 71\n"
- "gi|543583740|ref|NM_001282549.1| 3984 36696 70 71\n"
- "gi|543583738|ref|NM_001282548.1| 4113 40851 70 71\n"
- "gi|530384540|ref|XM_005249645.1| 2752 45151 70 71\n"
- "gi|530384538|ref|XM_005249644.1| 3004 48071 70 71\n"
- "gi|530384536|ref|XM_005249643.1| 3109 51246 70 71\n"
- "gi|530384534|ref|XM_005249642.1| 3097 54528 70 71\n"
- "gi|530373237|ref|XM_005265508.1| 2794 57830 70 71\n"
- "gi|530373235|ref|XM_005265507.1| 2848 60824 70 71\n"
- "gi|530364726|ref|XR_241081.1| 1009 63849 70 71\n"
- "gi|530364725|ref|XR_241080.1| 4884 65009 70 71\n"
- "gi|530364724|ref|XR_241079.1| 2819 70099 70 71\n")
- index_file = Faidx('data/genes.fasta.gz').indexname
- result_index = open(index_file).read()
- assert result_index == expect_index
-
-class TestFastaBGZF(TestCase):
- def setUp(self):
- try:
- from Bio import SeqIO
- except ImportError:
- raise SkipTest
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.gz.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- def test_integer_slice(self):
- fasta = Fasta('data/genes.fasta.gz')
- expect = fasta['gi|563317589|dbj|AB821309.1|'][:100].seq
- result = fasta[0][:100].seq
- assert expect == result
-
- def test_integer_index(self):
- fasta = Fasta('data/genes.fasta.gz')
- expect = fasta['gi|563317589|dbj|AB821309.1|'][100].seq
- result = fasta[0][100].seq
- assert expect == result
-
- def test_fetch_whole_fasta(self):
- expect = [line.rstrip('\n') for line in open('data/genes.fasta') if line[0] != '>']
- result = list(chain(*([line for line in record] for record in Fasta('data/genes.fasta.gz', as_raw=True))))
- assert expect == result
-
- def test_line_len(self):
- fasta = Fasta('data/genes.fasta.gz')
- for record in fasta:
- assert len(next(iter(record))) == fasta.faidx.index[record.name].lenc
-
- @raises(UnsupportedCompressionFormat)
- def test_mutable_bgzf(self):
- fasta = Fasta('data/genes.fasta.gz', mutable=True)
-
- @raises(NotImplementedError)
- def test_long_names(self):
- """ Test that deflines extracted using FastaRecord.long_name are
- identical to deflines in the actual file.
- """
- deflines = []
- with open('data/genes.fasta') as fasta_file:
- for line in fasta_file:
- if line[0] == '>':
- deflines.append(line[1:-1])
- fasta = Fasta('data/genes.fasta.gz')
- long_names = []
- for record in fasta:
- long_names.append(record.long_name)
- assert deflines == long_names
-
- def test_fetch_whole_entry(self):
- faidx = Faidx('data/genes.fasta.gz')
- expect = ('ATGACATCATTTTCCACCTCTGCTCAGTGTTCAACATCTGA'
- 'CAGTGCTTGCAGGATCTCTCCTGGACAAATCAATCAGGTACGACCA'
- 'AAACTGCCGCTTTTGAAGATTTTGCATGCAGCAGGTGCGCAAGG'
- 'TGAAATGTTCACTGTTAAAGAGGTCATGCACTATTTAGGTCAGTACAT'
- 'AATGGTGAAGCAACTTTATGATCAGCAGGAGCAGCATATGGTATATTG'
- 'TGGTGGAGATCTTTTGGGAGAACTACTGGGACGTCAGAGCTTCTCCGTG'
- 'AAAGACCCAAGCCCTCTCTATGATATGCTAAGAAAGAATCTTGTCACTTT'
- 'AGCCACTGCTACTACAGCAAAGTGCAGAGGAAAGTTCCACTTCCAGAAAAA'
- 'GAACTACAGAAGACGATATCCCCACACTGCCTACCTCAGAGCATAAATGCA'
- 'TACATTCTAGAGAAGGTGATTGAAGTGGGAAAAAATGATGACCTGGAGGACTC')
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 1, 481)
- assert str(result) == expect
-
- def test_fetch_middle(self):
- faidx = Faidx('data/genes.fasta.gz')
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 100, 150)
- assert str(result) == expect
-
- def test_fetch_end(self):
- faidx = Faidx('data/genes.fasta.gz')
- expect = 'TC'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 481)
- assert str(result) == expect
-
- @raises(FetchError)
- def test_fetch_border(self):
- """ Fetch past the end of a gene entry """
- faidx = Faidx('data/genes.fasta.gz')
- expect = 'TC'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 500)
- print(result)
- assert str(result) == expect
-
- def test_rev(self):
- faidx = Faidx('data/genes.fasta.gz')
- expect = 'GA'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 481)
- assert str(-result) == expect, result
-
- @raises(FetchError)
- def test_fetch_past_bounds(self):
- """ Fetch past the end of a gene entry """
- faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 5000)
-
- @raises(FetchError)
- def test_fetch_negative(self):
- """ Fetch starting with a negative coordinate """
- faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- -10, 10)
-
- @raises(FetchError)
- def test_fetch_reversed_coordinates(self):
- """ Fetch starting with a negative coordinate """
- faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 50, 10)
-
- @raises(FetchError)
- def test_fetch_keyerror(self):
- """ Fetch a key that does not exist """
- faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
- result = faidx.fetch('gi|joe|gb|KF435150.1|',
- 1, 10)
-
- def test_blank_string(self):
- """ seq[0:0] should return a blank string mdshw5/pyfaidx#53 """
- fasta = Fasta('data/genes.fasta.gz', as_raw=True)
- assert fasta['gi|557361099|gb|KF435150.1|'][0:0] == ''
-
- def test_slice_from_beginning(self):
- fasta = Fasta('data/genes.fasta.gz', as_raw=True)
- assert fasta['gi|557361099|gb|KF435150.1|'][:4] == 'ATGA'
-
- def test_slice_from_end(self):
- fasta = Fasta('data/genes.fasta.gz', as_raw=True)
- assert fasta['gi|557361099|gb|KF435150.1|'][-4:] == 'ACTC'
-
- def test_issue_74_start(self):
- f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
- f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
- assert f0['gi|557361099|gb|KF435150.1|'][0:90].start == f1['gi|557361099|gb|KF435150.1|'][0:90].start - 1
-
- def test_issue_74_consistency(self):
- f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
- f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
- assert str(f0['gi|557361099|gb|KF435150.1|'][0:90]) == str(f1['gi|557361099|gb|KF435150.1|'][0:90])
-
- def test_issue_74_end_faidx(self):
- f0 = Faidx('data/genes.fasta.gz', one_based_attributes=False)
- f1 = Faidx('data/genes.fasta.gz', one_based_attributes=True)
- end0 = f0.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
- end1 = f1.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
- assert end0 == end1
-
- def test_issue_74_end_fasta(self):
- f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
- f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
- end0 = f0['gi|557361099|gb|KF435150.1|'][1:90].end
- end1 = f1['gi|557361099|gb|KF435150.1|'][1:90].end
- print((end0, end1))
- assert end0 == end1
-
- def test_issue_79_fix(self):
- f = Fasta('data/genes.fasta.gz')
- s = f['gi|557361099|gb|KF435150.1|'][100:105]
- print((s.start, s.end))
- assert (101, 105) == (s.start, s.end)
-
- def test_issue_79_fix_negate(self):
- f = Fasta('data/genes.fasta.gz')
- s = f['gi|557361099|gb|KF435150.1|'][100:105]
- s = -s
- print((s.start, s.end))
- assert (105, 101) == (s.start, s.end)
-
- def test_issue_79_fix_one_based_false(self):
- f = Fasta('data/genes.fasta.gz', one_based_attributes=False)
- s = f['gi|557361099|gb|KF435150.1|'][100:105]
- print((s.start, s.end))
- assert (100, 105) == (s.start, s.end)
-
- def test_issue_79_fix_one_based_false_negate(self):
- f = Fasta('data/genes.fasta.gz', one_based_attributes=False)
- s = f['gi|557361099|gb|KF435150.1|'][100:105]
- print(s.__dict__)
- s = -s
- print(s.__dict__)
- assert (105, 100) == (s.start, s.end)
-
- @raises(FetchError)
- def test_fetch_border_padded(self):
- """ Fetch past the end of a gene entry """
- faidx = Faidx('data/genes.fasta.gz', default_seq='N')
- expect = 'TCNNNNNNNNNNNNNNNNNNN'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 500)
- print(result)
- assert str(result) == expect
+try:
+ from Bio import SeqIO
+ bio = True
+except ImportError:
+ bio = False
+
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.gz.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+@pytest.mark.skipif(not bio, reason="Biopython is not installed.")
+@pytest.mark.xfail
+def test_build_issue_126(remove_index):
+ """ Samtools BGZF index should be identical to pyfaidx BGZF index """
+ expect_index = ("gi|563317589|dbj|AB821309.1| 3510 114 70 71\n"
+ "gi|557361099|gb|KF435150.1| 481 3789 70 71\n"
+ "gi|557361097|gb|KF435149.1| 642 4368 70 71\n"
+ "gi|543583796|ref|NR_104216.1| 4573 5141 70 71\n"
+ "gi|543583795|ref|NR_104215.1| 5317 9901 70 71\n"
+ "gi|543583794|ref|NR_104212.1| 5374 15415 70 71\n"
+ "gi|543583788|ref|NM_001282545.1| 4170 20980 70 71\n"
+ "gi|543583786|ref|NM_001282543.1| 5466 25324 70 71\n"
+ "gi|543583785|ref|NM_000465.3| 5523 30980 70 71\n"
+ "gi|543583740|ref|NM_001282549.1| 3984 36696 70 71\n"
+ "gi|543583738|ref|NM_001282548.1| 4113 40851 70 71\n"
+ "gi|530384540|ref|XM_005249645.1| 2752 45151 70 71\n"
+ "gi|530384538|ref|XM_005249644.1| 3004 48071 70 71\n"
+ "gi|530384536|ref|XM_005249643.1| 3109 51246 70 71\n"
+ "gi|530384534|ref|XM_005249642.1| 3097 54528 70 71\n"
+ "gi|530373237|ref|XM_005265508.1| 2794 57830 70 71\n"
+ "gi|530373235|ref|XM_005265507.1| 2848 60824 70 71\n"
+ "gi|530364726|ref|XR_241081.1| 1009 63849 70 71\n"
+ "gi|530364725|ref|XR_241080.1| 4884 65009 70 71\n"
+ "gi|530364724|ref|XR_241079.1| 2819 70099 70 71\n")
+ index_file = Faidx('data/genes.fasta.gz').indexname
+ result_index = open(index_file).read()
+ assert result_index == expect_index
+
+def test_integer_slice(remove_index):
+ fasta = Fasta('data/genes.fasta.gz')
+ expect = fasta['gi|563317589|dbj|AB821309.1|'][:100].seq
+ result = fasta[0][:100].seq
+ assert expect == result
+
+def test_integer_index(remove_index):
+ fasta = Fasta('data/genes.fasta.gz')
+ expect = fasta['gi|563317589|dbj|AB821309.1|'][100].seq
+ result = fasta[0][100].seq
+ assert expect == result
+
+def test_fetch_whole_fasta(remove_index):
+ expect = [line.rstrip('\n') for line in open('data/genes.fasta') if line[0] != '>']
+ result = list(chain(*([line for line in record] for record in Fasta('data/genes.fasta.gz', as_raw=True))))
+ assert expect == result
+
+def test_line_len(remove_index):
+ fasta = Fasta('data/genes.fasta.gz')
+ for record in fasta:
+ assert len(next(iter(record))) == fasta.faidx.index[record.name].lenc
+
+@pytest.mark.xfail(raises=UnsupportedCompressionFormat)
+def test_mutable_bgzf(remove_index):
+ fasta = Fasta('data/genes.fasta.gz', mutable=True)
+
+@pytest.mark.xfail(raises=NotImplementedError)
+def test_long_names(remove_index):
+ """ Test that deflines extracted using FastaRecord.long_name are
+ identical to deflines in the actual file.
+ """
+ deflines = []
+ with open('data/genes.fasta') as fasta_file:
+ for line in fasta_file:
+ if line[0] == '>':
+ deflines.append(line[1:-1])
+ fasta = Fasta('data/genes.fasta.gz')
+ long_names = []
+ for record in fasta:
+ long_names.append(record.long_name)
+ assert deflines == long_names
+
+def test_fetch_whole_entry(remove_index):
+ faidx = Faidx('data/genes.fasta.gz')
+ expect = ('ATGACATCATTTTCCACCTCTGCTCAGTGTTCAACATCTGA'
+ 'CAGTGCTTGCAGGATCTCTCCTGGACAAATCAATCAGGTACGACCA'
+ 'AAACTGCCGCTTTTGAAGATTTTGCATGCAGCAGGTGCGCAAGG'
+ 'TGAAATGTTCACTGTTAAAGAGGTCATGCACTATTTAGGTCAGTACAT'
+ 'AATGGTGAAGCAACTTTATGATCAGCAGGAGCAGCATATGGTATATTG'
+ 'TGGTGGAGATCTTTTGGGAGAACTACTGGGACGTCAGAGCTTCTCCGTG'
+ 'AAAGACCCAAGCCCTCTCTATGATATGCTAAGAAAGAATCTTGTCACTTT'
+ 'AGCCACTGCTACTACAGCAAAGTGCAGAGGAAAGTTCCACTTCCAGAAAAA'
+ 'GAACTACAGAAGACGATATCCCCACACTGCCTACCTCAGAGCATAAATGCA'
+ 'TACATTCTAGAGAAGGTGATTGAAGTGGGAAAAAATGATGACCTGGAGGACTC')
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 1, 481)
+ assert str(result) == expect
+
+def test_fetch_middle(remove_index):
+ faidx = Faidx('data/genes.fasta.gz')
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 100, 150)
+ assert str(result) == expect
+
+def test_fetch_end(remove_index):
+ faidx = Faidx('data/genes.fasta.gz')
+ expect = 'TC'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 481)
+ assert str(result) == expect
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_border(remove_index):
+ """ Fetch past the end of a gene entry """
+ faidx = Faidx('data/genes.fasta.gz')
+ expect = 'TC'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 500)
+ print(result)
+ assert str(result) == expect
+
+def test_rev(remove_index):
+ faidx = Faidx('data/genes.fasta.gz')
+ expect = 'GA'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 481)
+ assert str(-result) == expect, result
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_past_bounds(remove_index):
+ """ Fetch past the end of a gene entry """
+ faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 5000)
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_negative(remove_index):
+ """ Fetch starting with a negative coordinate """
+ faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ -10, 10)
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_reversed_coordinates(remove_index):
+ """ Fetch starting with a negative coordinate """
+ faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 50, 10)
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_keyerror(remove_index):
+ """ Fetch a key that does not exist """
+ faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
+ result = faidx.fetch('gi|joe|gb|KF435150.1|',
+ 1, 10)
+
+def test_blank_string(remove_index):
+ """ seq[0:0] should return a blank string mdshw5/pyfaidx#53 """
+ fasta = Fasta('data/genes.fasta.gz', as_raw=True)
+ assert fasta['gi|557361099|gb|KF435150.1|'][0:0] == ''
+
+def test_slice_from_beginning(remove_index):
+ fasta = Fasta('data/genes.fasta.gz', as_raw=True)
+ assert fasta['gi|557361099|gb|KF435150.1|'][:4] == 'ATGA'
+
+def test_slice_from_end(remove_index):
+ fasta = Fasta('data/genes.fasta.gz', as_raw=True)
+ assert fasta['gi|557361099|gb|KF435150.1|'][-4:] == 'ACTC'
+
+def test_issue_74_start(remove_index):
+ f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+ f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
+ assert f0['gi|557361099|gb|KF435150.1|'][0:90].start == f1['gi|557361099|gb|KF435150.1|'][0:90].start - 1
+
+def test_issue_74_consistency(remove_index):
+ f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+ f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
+ assert str(f0['gi|557361099|gb|KF435150.1|'][0:90]) == str(f1['gi|557361099|gb|KF435150.1|'][0:90])
+
+def test_issue_74_end_faidx(remove_index):
+ f0 = Faidx('data/genes.fasta.gz', one_based_attributes=False)
+ f1 = Faidx('data/genes.fasta.gz', one_based_attributes=True)
+ end0 = f0.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
+ end1 = f1.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
+ assert end0 == end1
+
+def test_issue_74_end_fasta(remove_index):
+ f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+ f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
+ end0 = f0['gi|557361099|gb|KF435150.1|'][1:90].end
+ end1 = f1['gi|557361099|gb|KF435150.1|'][1:90].end
+ print((end0, end1))
+ assert end0 == end1
+
+def test_issue_79_fix(remove_index):
+ f = Fasta('data/genes.fasta.gz')
+ s = f['gi|557361099|gb|KF435150.1|'][100:105]
+ print((s.start, s.end))
+ assert (101, 105) == (s.start, s.end)
+
+def test_issue_79_fix_negate(remove_index):
+ f = Fasta('data/genes.fasta.gz')
+ s = f['gi|557361099|gb|KF435150.1|'][100:105]
+ s = -s
+ print((s.start, s.end))
+ assert (105, 101) == (s.start, s.end)
+
+def test_issue_79_fix_one_based_false(remove_index):
+ f = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+ s = f['gi|557361099|gb|KF435150.1|'][100:105]
+ print((s.start, s.end))
+ assert (100, 105) == (s.start, s.end)
+
+def test_issue_79_fix_one_based_false_negate(remove_index):
+ f = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+ s = f['gi|557361099|gb|KF435150.1|'][100:105]
+ print(s.__dict__)
+ s = -s
+ print(s.__dict__)
+ assert (105, 100) == (s.start, s.end)
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_border_padded(remove_index):
+ """ Fetch past the end of a gene entry """
+ faidx = Faidx('data/genes.fasta.gz', default_seq='N')
+ expect = 'TCNNNNNNNNNNNNNNNNNNN'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 500)
+ print(result)
+ assert str(result) == expect
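
Note on the decorators in this file: nose's `@raises(X)` fails a test when X is not raised, whereas `@pytest.mark.xfail(raises=X)` reports the test as xfailed when X is raised and, with the default non-strict setting, as XPASS (not a failure) when nothing is raised. If the stricter behaviour is wanted, `pytest.raises` is one option. A minimal sketch follows; `FetchError` and `fetch` here are hypothetical stand-ins defined only for illustration, not pyfaidx's own implementation.

```python
import pytest


class FetchError(Exception):
    """Hypothetical stand-in for pyfaidx.FetchError."""


def fetch(length, start, end):
    """Hypothetical 1-based, inclusive bounds check on a sequence of `length`."""
    if start < 1 or end > length or start > end:
        raise FetchError("coordinates %d-%d out of bounds" % (start, end))
    return (start, end)


def test_fetch_past_bounds_strict():
    # The test passes only if FetchError is actually raised inside the block.
    with pytest.raises(FetchError):
        fetch(481, 480, 5000)
```

With this form, a regression that stops raising the exception shows up as a test failure rather than as an unexpected pass.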
=====================================
tests/test_Fasta_integer_index.py
=====================================
@@ -1,28 +1,26 @@
import os
+import pytest
from pyfaidx import Fasta
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFastaIntIndex(TestCase):
- def setUp(self):
- pass
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+def test_integer_slice(remove_index):
+ fasta = Fasta('data/genes.fasta')
+ expect = fasta['gi|563317589|dbj|AB821309.1|'][:100].seq
+ result = fasta[0][:100].seq
+ assert expect == result
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- def test_integer_slice(self):
- fasta = Fasta('data/genes.fasta')
- expect = fasta['gi|563317589|dbj|AB821309.1|'][:100].seq
- result = fasta[0][:100].seq
- assert expect == result
-
- def test_integer_index(self):
- fasta = Fasta('data/genes.fasta')
- expect = fasta['gi|563317589|dbj|AB821309.1|'][100].seq
- result = fasta[0][100].seq
- assert expect == result
+def test_integer_index(remove_index):
+ fasta = Fasta('data/genes.fasta')
+ expect = fasta['gi|563317589|dbj|AB821309.1|'][100].seq
+ result = fasta[0][100].seq
+ assert expect == result
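
The `remove_index` fixture above replaces the old `tearDown` method: everything after `yield` runs once the test has finished. The sketch below mimics that flow with `contextlib.contextmanager` so it can run without pytest. The `try/finally` is needed only in this stand-in, and the `index_path` parameter and temporary directory are assumptions made for the demonstration.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def remove_index(index_path):
    """Stand-in for the yield-style fixture: no setup, cleanup afterwards."""
    try:
        yield
    finally:
        # EnvironmentError is an alias of OSError on Python 3.
        with contextlib.suppress(OSError):
            os.remove(index_path)


def demo():
    workdir = tempfile.mkdtemp()
    index_path = os.path.join(workdir, "genes.fasta.fai")
    with remove_index(index_path):
        # The "test body": create the index file the cleanup will delete.
        with open(index_path, "w") as fh:
            fh.write("placeholder index\n")
        existed_during = os.path.exists(index_path)
    exists_after = os.path.exists(index_path)
    os.rmdir(workdir)
    return existed_during, exists_after
```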
=====================================
tests/test_Fasta_synchronization.py
=====================================
@@ -1,4 +1,5 @@
import os
+import pytest
try:
from collections import OrderedDict
except ImportError: #python 2.6
@@ -21,7 +22,7 @@ class _ThreadReadSequence(threading.Thread):
seq_len = len(seq)
sub_seq_slices = list(slice(i, min(i + 20, seq_len)) for i in range(0, seq_len, 20))
- random.shuffle(sub_seq_slices, rand.random)
+ random.shuffle(sub_seq_slices)
self.result_map = result_map
self.result_lock = result_lock
@@ -51,7 +52,7 @@ class _ThreadWriteSequence(threading.Thread):
seq_len = len(seq)
sub_seq_slices = list(slice(i, min(i + 20, seq_len)) for i in range(0, seq_len, 20))
- random.shuffle(sub_seq_slices, rand.random)
+ random.shuffle(sub_seq_slices)
self.name = name
self.seq = seq
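
Background for the two `random.shuffle` hunks above: the optional second argument of `random.shuffle` was deprecated in Python 3.9 and removed in 3.11, so the one-argument call is the portable form. If a reproducible order is still wanted, a dedicated `random.Random` instance can do the shuffling. The following is a sketch under that assumption; `shuffled_slices` and its `seed` parameter are hypothetical names, not part of the test suite.

```python
import random


def shuffled_slices(seq_len, step=20, seed=None):
    """Split range(seq_len) into step-sized slices and shuffle them."""
    slices = [slice(i, min(i + step, seq_len)) for i in range(0, seq_len, step)]
    # A private generator keeps the order reproducible for a given seed
    # without touching the module-level random state.
    rng = random.Random(seed)
    rng.shuffle(slices)
    return slices
```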
=====================================
tests/test_bio_seqio.py
=====================================
@@ -1,50 +1,42 @@
import os
+import pytest
from pyfaidx import Fasta, FetchError
-from nose.plugins.skip import Skip, SkipTest
-from unittest import TestCase
-try:
- from Bio import SeqIO
- test_bio = True
-except ImportError:
- test_bio = False
path = os.path.dirname(__file__)
os.chdir(path)
-class TestBioSeqIO(TestCase):
- def setUp(self):
- pass
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
+try:
+ from Bio import SeqIO
+ bio = True
+except ImportError:
+ bio = False
+
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
- def test_fetch_whole_entry(self):
- fasta = Fasta('data/genes.fasta')
- if test_bio:
- with open('data/genes.fasta', "rU") as fh:
- seqio = SeqIO.to_dict(SeqIO.parse(fh, "fasta"))
- assert str(fasta['gi|557361099|gb|KF435150.1|']) == str(seqio['gi|557361099|gb|KF435150.1|'].seq)
- assert fasta['gi|557361099|gb|KF435150.1|'].name == str(seqio['gi|557361099|gb|KF435150.1|'].name)
- else:
- raise SkipTest
+@pytest.mark.skipif(not bio, reason="Biopython is not installed.")
+def test_fetch_whole_entry(remove_index):
+ fasta = Fasta('data/genes.fasta')
+ with open('data/genes.fasta', "r") as fh:
+ seqio = SeqIO.to_dict(SeqIO.parse(fh, "fasta"))
+ assert str(fasta['gi|557361099|gb|KF435150.1|']) == str(seqio['gi|557361099|gb|KF435150.1|'].seq)
+ assert fasta['gi|557361099|gb|KF435150.1|'].name == str(seqio['gi|557361099|gb|KF435150.1|'].name)
- def test_slice_whole_entry(self):
- fasta = Fasta('data/genes.fasta')
- if test_bio:
- with open('data/genes.fasta', "rU") as fh:
- seqio = SeqIO.to_dict(SeqIO.parse(fh, "fasta"))
- assert str(fasta['gi|557361099|gb|KF435150.1|'][::3]) == str(seqio['gi|557361099|gb|KF435150.1|'].seq[::3])
- else:
- raise SkipTest
+@pytest.mark.skipif(not bio, reason="Biopython is not installed.")
+def test_slice_whole_entry(remove_index):
+ fasta = Fasta('data/genes.fasta')
+ with open('data/genes.fasta', "r") as fh:
+ seqio = SeqIO.to_dict(SeqIO.parse(fh, "fasta"))
+ assert str(fasta['gi|557361099|gb|KF435150.1|'][::3]) == str(seqio['gi|557361099|gb|KF435150.1|'].seq[::3])
- def test_revcomp_whole_entry(self):
- fasta = Fasta('data/genes.fasta')
- if test_bio:
- with open('data/genes.fasta', "rU") as fh:
- seqio = SeqIO.to_dict(SeqIO.parse(fh, "fasta"))
- assert str(fasta['gi|557361099|gb|KF435150.1|'][:].reverse.complement) == str(seqio['gi|557361099|gb|KF435150.1|'].reverse_complement().seq)
- else:
- raise SkipTest
+@pytest.mark.skipif(not bio, reason="Biopython is not installed.")
+def test_revcomp_whole_entry(remove_index):
+ fasta = Fasta('data/genes.fasta')
+ with open('data/genes.fasta', "r") as fh:
+ seqio = SeqIO.to_dict(SeqIO.parse(fh, "fasta"))
+ assert str(fasta['gi|557361099|gb|KF435150.1|'][:].reverse.complement) == str(seqio['gi|557361099|gb|KF435150.1|'].reverse_complement().seq)
\ No newline at end of file
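
On the `"rU"` to `"r"` change in this file: the `U` mode flag was deprecated for a long time and removed in Python 3.11. Plain text mode with the default `newline=None` already translates `\r\n` and `\r` to `\n`. A small self-contained check, in which the helper names and the temporary file are assumptions for illustration:

```python
import os
import tempfile


def read_text_lines(path):
    # Text mode "r" with the default newline=None performs universal
    # newline translation, which is what the old "U" flag requested.
    with open(path, "r") as fh:
        return fh.readlines()


def demo():
    fd, path = tempfile.mkstemp(suffix=".fasta")
    try:
        # Write Windows-style line endings as raw bytes.
        with os.fdopen(fd, "wb") as fh:
            fh.write(b">A\r\nATCG\r\n")
        return read_text_lines(path)
    finally:
        os.remove(path)
```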
=====================================
tests/test_faidx.py
=====================================
@@ -1,60 +1,56 @@
import os
import filecmp
-from pyfaidx import FastaIndexingError, BedError, FetchError
+import pytest
+from pyfaidx import BedError, FetchError
from pyfaidx.cli import main
-from nose.tools import raises
-from unittest import TestCase
from tempfile import NamedTemporaryFile
path = os.path.dirname(__file__)
os.chdir(path)
-
-class TestCLI(TestCase):
- def setUp(self):
- pass
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- @raises(BedError)
- def test_short_line_lengths(self):
- main(['data/genes.fasta', '--bed', 'data/malformed.bed'])
-
- def test_fetch_whole_file(self):
- main(['data/genes.fasta'])
-
- def test_split_entry(self):
- main(['--split-files', 'data/genes.fasta', 'gi|557361099|gb|KF435150.1|'])
- assert os.path.exists('gi557361099gbKF435150.1.fasta')
- os.remove('gi557361099gbKF435150.1.fasta')
-
- @raises(FetchError)
- def test_fetch_error(self):
- main(['data/genes.fasta', 'gi|557361099|gb|KF435150.1|:1-1000'])
-
- def test_key_warning(self):
- main(['data/genes.fasta', 'foo'])
-
- def test_auto_strand(self):
- """ Test that --auto-strand produces the same output as --reverse --complement"""
- with NamedTemporaryFile() as auto_strand:
- with NamedTemporaryFile() as noto_strand:
- main(['--auto-strand', '-o', auto_strand.name, 'data/genes.fasta', 'gi|557361099|gb|KF435150.1|:100-1'])
- main(['--reverse', '--complement', '-o', noto_strand.name, 'data/genes.fasta', 'gi|557361099|gb|KF435150.1|:1-100'])
- print(auto_strand.read())
- print()
- print(noto_strand.read())
- self.assertTrue(filecmp.cmp(auto_strand.name, noto_strand.name))
-
- def test_regexp(self):
- main(['data/genes.fasta', '-g', 'XR'])
-
- def test_not_regexp(self):
- main(['data/genes.fasta', '-g', 'XR','-v'])
-
- def test_not_regexp_multi(self):
- main(['data/genes.fasta', '-g', 'XR', '-g', 'XM', '-v'])
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+@pytest.mark.xfail(raises=BedError)
+def test_short_line_lengths(remove_index):
+ main(['data/genes.fasta', '--bed', 'data/malformed.bed'])
+
+def test_fetch_whole_file(remove_index):
+ main(['data/genes.fasta'])
+
+def test_split_entry(remove_index):
+ main(['--split-files', 'data/genes.fasta', 'gi|557361099|gb|KF435150.1|'])
+ assert os.path.exists('gi557361099gbKF435150.1.fasta')
+ os.remove('gi557361099gbKF435150.1.fasta')
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_error(remove_index):
+ main(['data/genes.fasta', 'gi|557361099|gb|KF435150.1|:1-1000'])
+
+def test_key_warning(remove_index):
+ main(['data/genes.fasta', 'foo'])
+
+def test_auto_strand(remove_index):
+ """ Test that --auto-strand produces the same output as --reverse --complement"""
+ with NamedTemporaryFile() as auto_strand:
+ with NamedTemporaryFile() as noto_strand:
+ main(['--auto-strand', '-o', auto_strand.name, 'data/genes.fasta', 'gi|557361099|gb|KF435150.1|:100-1'])
+ main(['--reverse', '--complement', '-o', noto_strand.name, 'data/genes.fasta', 'gi|557361099|gb|KF435150.1|:1-100'])
+ print(auto_strand.read())
+ print()
+ print(noto_strand.read())
+ assert filecmp.cmp(auto_strand.name, noto_strand.name)
+
+def test_regexp(remove_index):
+ main(['data/genes.fasta', '-g', 'XR'])
+
+def test_not_regexp(remove_index):
+ main(['data/genes.fasta', '-g', 'XR','-v'])
+
+def test_not_regexp_multi(remove_index):
+ main(['data/genes.fasta', '-g', 'XR', '-g', 'XM', '-v'])
=====================================
tests/test_feature_bounds_check.py
=====================================
@@ -1,16 +1,14 @@
import os
+import pytest
from pyfaidx import Faidx, Fasta, FetchError
-from nose.tools import raises
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFeatureZeroLength:
- """Tests for handling zero-length entries, added in #155"""
- def setUp(self):
- with open('data/zero_length.fasta', 'w') as fasta:
- fasta.write(""">A
+@pytest.fixture
+def setup_zero():
+ with open('data/zero_length.fasta', 'w') as fasta:
+ fasta.write(""">A
ATCG
>B
>C
@@ -18,185 +16,172 @@ ATCG
>D
GTA
GC""")
-
- def tearDown(self):
- os.remove('data/zero_length.fasta')
- os.remove('data/zero_length.fasta.fai')
+ yield
+ os.remove('data/zero_length.fasta')
+ os.remove('data/zero_length.fasta.fai')
- def test_index_zero_length(self):
- fasta = Fasta('data/zero_length.fasta')
-
- def test_fetch_zero_length(self):
- fasta = Fasta('data/zero_length.fasta')
- b = fasta["B"]
- assert str(b) == ''
+def test_index_zero_length(setup_zero):
+ fasta = Fasta('data/zero_length.fasta')
+
+def test_fetch_zero_length(setup_zero):
+ fasta = Fasta('data/zero_length.fasta')
+ b = fasta["B"]
+ assert str(b) == ''
-class TestZeroLengthSequenceSubRange(TestCase):
- def setUp(self):
- pass
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
- def test_as_raw_zero_length_subsequence(self):
- fasta = Fasta('data/genes.fasta', as_raw=True, strict_bounds=True)
- expect = ''
- result = fasta['gi|557361099|gb|KF435150.1|'][100:100]
- assert result == expect
-
- def test_zero_length_subsequence(self):
- fasta = Fasta('data/genes.fasta', strict_bounds=True)
- expect = ''
- result = fasta['gi|557361099|gb|KF435150.1|'][100:100]
- assert result.seq == expect
-
-class TestFeatureBoundsCheck:
- def setUp(self):
- pass
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- def test_fetch_whole_entry(self):
- faidx = Faidx('data/genes.fasta')
- expect = ('ATGACATCATTTTCCACCTCTGCTCAGTGTTCAACATCTGA'
- 'CAGTGCTTGCAGGATCTCTCCTGGACAAATCAATCAGGTACGACCA'
- 'AAACTGCCGCTTTTGAAGATTTTGCATGCAGCAGGTGCGCAAGG'
- 'TGAAATGTTCACTGTTAAAGAGGTCATGCACTATTTAGGTCAGTACAT'
- 'AATGGTGAAGCAACTTTATGATCAGCAGGAGCAGCATATGGTATATTG'
- 'TGGTGGAGATCTTTTGGGAGAACTACTGGGACGTCAGAGCTTCTCCGTG'
- 'AAAGACCCAAGCCCTCTCTATGATATGCTAAGAAAGAATCTTGTCACTTT'
- 'AGCCACTGCTACTACAGCAAAGTGCAGAGGAAAGTTCCACTTCCAGAAAAA'
- 'GAACTACAGAAGACGATATCCCCACACTGCCTACCTCAGAGCATAAATGCA'
- 'TACATTCTAGAGAAGGTGATTGAAGTGGGAAAAAATGATGACCTGGAGGACTC')
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 1, 481)
- assert str(result) == expect
-
- def test_fetch_middle(self):
- faidx = Faidx('data/genes.fasta')
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 100, 150)
- assert str(result) == expect
-
- def test_fetch_end(self):
- faidx = Faidx('data/genes.fasta')
- expect = 'TC'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 481)
- assert str(result) == expect
-
- def test_fetch_border(self):
- """ Fetch past the end of a gene entry """
- faidx = Faidx('data/genes.fasta')
- expect = 'TC'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 500)
- assert str(result) == expect
-
- def test_rev(self):
- faidx = Faidx('data/genes.fasta')
- expect = 'GA'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 481)
- assert str(-result) == expect, result
-
- @raises(FetchError)
- def test_fetch_past_bounds(self):
- """ Fetch past the end of a gene entry """
- faidx = Faidx('data/genes.fasta', strict_bounds=True)
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 5000)
-
- @raises(FetchError)
- def test_fetch_negative(self):
- """ Fetch starting with a negative coordinate """
- faidx = Faidx('data/genes.fasta', strict_bounds=True)
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- -10, 10)
-
- @raises(FetchError)
- def test_fetch_reversed_coordinates(self):
- """ Fetch starting with a negative coordinate """
- faidx = Faidx('data/genes.fasta', strict_bounds=True)
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 50, 10)
-
- @raises(FetchError)
- def test_fetch_keyerror(self):
- """ Fetch a key that does not exist """
- faidx = Faidx('data/genes.fasta', strict_bounds=True)
- result = faidx.fetch('gi|joe|gb|KF435150.1|',
- 1, 10)
-
- def test_blank_string(self):
- """ seq[0:0] should return a blank string mdshw5/pyfaidx#53 """
- fasta = Fasta('data/genes.fasta', as_raw=True)
- assert fasta['gi|557361099|gb|KF435150.1|'][0:0] == ''
-
- def test_slice_from_beginning(self):
- fasta = Fasta('data/genes.fasta', as_raw=True)
- assert fasta['gi|557361099|gb|KF435150.1|'][:4] == 'ATGA'
-
- def test_slice_from_end(self):
- fasta = Fasta('data/genes.fasta', as_raw=True)
- assert fasta['gi|557361099|gb|KF435150.1|'][-4:] == 'ACTC'
-
- def test_issue_74_start(self):
- f0 = Fasta('data/genes.fasta', one_based_attributes=False)
- f1 = Fasta('data/genes.fasta', one_based_attributes=True)
- assert f0['gi|557361099|gb|KF435150.1|'][0:90].start == f1['gi|557361099|gb|KF435150.1|'][0:90].start - 1
-
- def test_issue_74_consistency(self):
- f0 = Fasta('data/genes.fasta', one_based_attributes=False)
- f1 = Fasta('data/genes.fasta', one_based_attributes=True)
- assert str(f0['gi|557361099|gb|KF435150.1|'][0:90]) == str(f1['gi|557361099|gb|KF435150.1|'][0:90])
-
- def test_issue_74_end_faidx(self):
- f0 = Faidx('data/genes.fasta', one_based_attributes=False)
- f1 = Faidx('data/genes.fasta', one_based_attributes=True)
- end0 = f0.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
- end1 = f1.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
- assert end0 == end1
-
- def test_issue_74_end_fasta(self):
- f0 = Fasta('data/genes.fasta', one_based_attributes=False)
- f1 = Fasta('data/genes.fasta', one_based_attributes=True)
- end0 = f0['gi|557361099|gb|KF435150.1|'][1:90].end
- end1 = f1['gi|557361099|gb|KF435150.1|'][1:90].end
- print((end0, end1))
- assert end0 == end1
-
- def test_issue_79_fix(self):
- f = Fasta('data/genes.fasta')
- s = f['gi|557361099|gb|KF435150.1|'][100:105]
- print((s.start, s.end))
- assert (101, 105) == (s.start, s.end)
-
- def test_issue_79_fix_negate(self):
- f = Fasta('data/genes.fasta')
- s = f['gi|557361099|gb|KF435150.1|'][100:105]
- s = -s
- print((s.start, s.end))
- assert (105, 101) == (s.start, s.end)
-
- def test_issue_79_fix_one_based_false(self):
- f = Fasta('data/genes.fasta', one_based_attributes=False)
- s = f['gi|557361099|gb|KF435150.1|'][100:105]
- print((s.start, s.end))
- assert (100, 105) == (s.start, s.end)
-
- def test_issue_79_fix_one_based_false_negate(self):
- f = Fasta('data/genes.fasta', one_based_attributes=False)
- s = f['gi|557361099|gb|KF435150.1|'][100:105]
- print(s.__dict__)
- s = -s
- print(s.__dict__)
- assert (105, 100) == (s.start, s.end)
+def test_as_raw_zero_length_subsequence(remove_index):
+ fasta = Fasta('data/genes.fasta', as_raw=True, strict_bounds=True)
+ expect = ''
+ result = fasta['gi|557361099|gb|KF435150.1|'][100:100]
+ assert result == expect
+
+def test_zero_length_subsequence(remove_index):
+ fasta = Fasta('data/genes.fasta', strict_bounds=True)
+ expect = ''
+ result = fasta['gi|557361099|gb|KF435150.1|'][100:100]
+ assert result.seq == expect
+
+def test_fetch_whole_entry(remove_index):
+ faidx = Faidx('data/genes.fasta')
+ expect = ('ATGACATCATTTTCCACCTCTGCTCAGTGTTCAACATCTGA'
+ 'CAGTGCTTGCAGGATCTCTCCTGGACAAATCAATCAGGTACGACCA'
+ 'AAACTGCCGCTTTTGAAGATTTTGCATGCAGCAGGTGCGCAAGG'
+ 'TGAAATGTTCACTGTTAAAGAGGTCATGCACTATTTAGGTCAGTACAT'
+ 'AATGGTGAAGCAACTTTATGATCAGCAGGAGCAGCATATGGTATATTG'
+ 'TGGTGGAGATCTTTTGGGAGAACTACTGGGACGTCAGAGCTTCTCCGTG'
+ 'AAAGACCCAAGCCCTCTCTATGATATGCTAAGAAAGAATCTTGTCACTTT'
+ 'AGCCACTGCTACTACAGCAAAGTGCAGAGGAAAGTTCCACTTCCAGAAAAA'
+ 'GAACTACAGAAGACGATATCCCCACACTGCCTACCTCAGAGCATAAATGCA'
+ 'TACATTCTAGAGAAGGTGATTGAAGTGGGAAAAAATGATGACCTGGAGGACTC')
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 1, 481)
+ assert str(result) == expect
+
+def test_fetch_middle(remove_index):
+ faidx = Faidx('data/genes.fasta')
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 100, 150)
+ assert str(result) == expect
+
+def test_fetch_end(remove_index):
+ faidx = Faidx('data/genes.fasta')
+ expect = 'TC'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 481)
+ assert str(result) == expect
+
+def test_fetch_border(remove_index):
+ """ Fetch past the end of a gene entry """
+ faidx = Faidx('data/genes.fasta')
+ expect = 'TC'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 500)
+ assert str(result) == expect
+
+def test_rev(remove_index):
+ faidx = Faidx('data/genes.fasta')
+ expect = 'GA'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 481)
+ assert str(-result) == expect, result
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_past_bounds(remove_index):
+ """ Fetch past the end of a gene entry """
+ faidx = Faidx('data/genes.fasta', strict_bounds=True)
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 5000)
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_negative(remove_index):
+ """ Fetch starting with a negative coordinate """
+ faidx = Faidx('data/genes.fasta', strict_bounds=True)
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ -10, 10)
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_reversed_coordinates(remove_index):
+ """ Fetch starting with a negative coordinate """
+ faidx = Faidx('data/genes.fasta', strict_bounds=True)
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 50, 10)
+
+@pytest.mark.xfail(raises=FetchError)
+def test_fetch_keyerror(remove_index):
+ """ Fetch a key that does not exist """
+ faidx = Faidx('data/genes.fasta', strict_bounds=True)
+ result = faidx.fetch('gi|joe|gb|KF435150.1|',
+ 1, 10)
+
+def test_blank_string(remove_index):
+ """ seq[0:0] should return a blank string mdshw5/pyfaidx#53 """
+ fasta = Fasta('data/genes.fasta', as_raw=True)
+ assert fasta['gi|557361099|gb|KF435150.1|'][0:0] == ''
+
+def test_slice_from_beginning(remove_index):
+ fasta = Fasta('data/genes.fasta', as_raw=True)
+ assert fasta['gi|557361099|gb|KF435150.1|'][:4] == 'ATGA'
+
+def test_slice_from_end(remove_index):
+ fasta = Fasta('data/genes.fasta', as_raw=True)
+ assert fasta['gi|557361099|gb|KF435150.1|'][-4:] == 'ACTC'
+
+def test_issue_74_start(remove_index):
+ f0 = Fasta('data/genes.fasta', one_based_attributes=False)
+ f1 = Fasta('data/genes.fasta', one_based_attributes=True)
+ assert f0['gi|557361099|gb|KF435150.1|'][0:90].start == f1['gi|557361099|gb|KF435150.1|'][0:90].start - 1
+
+def test_issue_74_consistency(remove_index):
+ f0 = Fasta('data/genes.fasta', one_based_attributes=False)
+ f1 = Fasta('data/genes.fasta', one_based_attributes=True)
+ assert str(f0['gi|557361099|gb|KF435150.1|'][0:90]) == str(f1['gi|557361099|gb|KF435150.1|'][0:90])
+
+def test_issue_74_end_faidx(remove_index):
+ f0 = Faidx('data/genes.fasta', one_based_attributes=False)
+ f1 = Faidx('data/genes.fasta', one_based_attributes=True)
+ end0 = f0.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
+ end1 = f1.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
+ assert end0 == end1
+
+def test_issue_74_end_fasta(remove_index):
+ f0 = Fasta('data/genes.fasta', one_based_attributes=False)
+ f1 = Fasta('data/genes.fasta', one_based_attributes=True)
+ end0 = f0['gi|557361099|gb|KF435150.1|'][1:90].end
+ end1 = f1['gi|557361099|gb|KF435150.1|'][1:90].end
+ print((end0, end1))
+ assert end0 == end1
+
+def test_issue_79_fix(remove_index):
+ f = Fasta('data/genes.fasta')
+ s = f['gi|557361099|gb|KF435150.1|'][100:105]
+ print((s.start, s.end))
+ assert (101, 105) == (s.start, s.end)
+
+def test_issue_79_fix_negate(remove_index):
+ f = Fasta('data/genes.fasta')
+ s = f['gi|557361099|gb|KF435150.1|'][100:105]
+ s = -s
+ print((s.start, s.end))
+ assert (105, 101) == (s.start, s.end)
+
+def test_issue_79_fix_one_based_false(remove_index):
+ f = Fasta('data/genes.fasta', one_based_attributes=False)
+ s = f['gi|557361099|gb|KF435150.1|'][100:105]
+ print((s.start, s.end))
+ assert (100, 105) == (s.start, s.end)
+
+def test_issue_79_fix_one_based_false_negate(remove_index):
+ f = Fasta('data/genes.fasta', one_based_attributes=False)
+ s = f['gi|557361099|gb|KF435150.1|'][100:105]
+ print(s.__dict__)
+ s = -s
+ print(s.__dict__)
+ assert (105, 100) == (s.start, s.end)
\ No newline at end of file
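
Note on the coordinate conventions that the test_issue_74_* and test_issue_79_* cases above exercise: a zero-based, half-open slice such as [100:105] is reported as (101, 105) when one_based_attributes is true and as (100, 105) when it is false, and negating the record swaps the two values. The following is only a minimal sketch of that arithmetic, not pyfaidx code; the function names slice_to_attributes and negate are hypothetical.

```python
def slice_to_attributes(start, stop, one_based=True):
    """Map a zero-based, half-open slice [start:stop] to reported (start, end).

    Assumption: only the start shifts by one in the one-based convention,
    because a half-open stop already equals the one-based inclusive end.
    """
    return (start + 1 if one_based else start, stop)


def negate(coords):
    """Swap start and end, mirroring what the tests expect after `-s`."""
    start, end = coords
    return (end, start)


print(slice_to_attributes(100, 105))                   # one-based attributes
print(slice_to_attributes(100, 105, one_based=False))  # zero-based attributes
print(negate(slice_to_attributes(100, 105)))           # reverse-strand view
```

Under this reading the end coordinate is identical in both conventions, which is what test_issue_74_end_fasta asserts.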
=====================================
tests/test_feature_default_seq.py
=====================================
@@ -1,24 +1,22 @@
import os
+import pytest
from pyfaidx import Faidx
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFeatureDefaultSeq(TestCase):
- def setUp(self):
- pass
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- def test_fetch_border_padded(self):
- """ Fetch past the end of a gene entry """
- faidx = Faidx('data/genes.fasta', default_seq='N')
- expect = 'TCNNNNNNNNNNNNNNNNNNN'
- result = faidx.fetch('gi|557361099|gb|KF435150.1|',
- 480, 500)
- assert str(result) == expect
+def test_fetch_border_padded(remove_index):
+ """ Fetch past the end of a gene entry """
+ faidx = Faidx('data/genes.fasta', default_seq='N')
+ expect = 'TCNNNNNNNNNNNNNNNNNNN'
+ result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+ 480, 500)
+ assert str(result) == expect
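
Note on test_fetch_border_padded above: the request covers 1-based positions 480 to 500 of a 481-base entry, so only two bases exist and the remainder is filled with the default_seq character. A minimal sketch of that padding rule on a plain string follows; fetch_padded is a hypothetical helper written for illustration and does not reproduce the pyfaidx implementation or its strict_bounds handling.

```python
def fetch_padded(sequence, start, end, default_seq=None):
    """Return the 1-based, inclusive range start..end of `sequence`.

    Positions past the end of the sequence are truncated, or filled with
    `default_seq` when one is given (assumed to be a single character).
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    found = sequence[start - 1:end]          # slicing clips at the sequence end
    if default_seq is not None:
        missing = (end - start + 1) - len(found)
        found += default_seq * missing       # pad up to the requested length
    return found


print(fetch_padded("ACGT", 3, 6))        # truncated at the end of the entry
print(fetch_padded("ACGT", 3, 6, "N"))   # padded to the requested length
```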
=====================================
tests/test_feature_indexing.py
=====================================
@@ -1,9 +1,7 @@
import os
+import pytest
from os.path import getmtime
from pyfaidx import Faidx, FastaIndexingError, IndexNotFoundError, FastaNotFoundError
-from nose.tools import raises
-from nose.plugins.skip import Skip, SkipTest
-from unittest import TestCase
from tempfile import NamedTemporaryFile, mkdtemp
import time
import platform
@@ -19,333 +17,336 @@ import six.moves.builtins as builtins
path = os.path.dirname(__file__)
os.chdir(path)
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
-class TestIndexing(TestCase):
- def setUp(self):
- pass
+def test_build(remove_index):
+ expect_index = ("gi|563317589|dbj|AB821309.1| 3510 114 70 71\n"
+ "gi|557361099|gb|KF435150.1| 481 3789 70 71\n"
+ "gi|557361097|gb|KF435149.1| 642 4368 70 71\n"
+ "gi|543583796|ref|NR_104216.1| 4573 5141 70 71\n"
+ "gi|543583795|ref|NR_104215.1| 5317 9901 70 71\n"
+ "gi|543583794|ref|NR_104212.1| 5374 15415 70 71\n"
+ "gi|543583788|ref|NM_001282545.1| 4170 20980 70 71\n"
+ "gi|543583786|ref|NM_001282543.1| 5466 25324 70 71\n"
+ "gi|543583785|ref|NM_000465.3| 5523 30980 70 71\n"
+ "gi|543583740|ref|NM_001282549.1| 3984 36696 70 71\n"
+ "gi|543583738|ref|NM_001282548.1| 4113 40851 70 71\n"
+ "gi|530384540|ref|XM_005249645.1| 2752 45151 70 71\n"
+ "gi|530384538|ref|XM_005249644.1| 3004 48071 70 71\n"
+ "gi|530384536|ref|XM_005249643.1| 3109 51246 70 71\n"
+ "gi|530384534|ref|XM_005249642.1| 3097 54528 70 71\n"
+ "gi|530373237|ref|XM_005265508.1| 2794 57830 70 71\n"
+ "gi|530373235|ref|XM_005265507.1| 2848 60824 70 71\n"
+ "gi|530364726|ref|XR_241081.1| 1009 63849 70 71\n"
+ "gi|530364725|ref|XR_241080.1| 4884 65009 70 71\n"
+ "gi|530364724|ref|XR_241079.1| 2819 70099 70 71\n")
+ index_file = Faidx('data/genes.fasta').indexname
+ result_index = open(index_file).read()
+ assert result_index == expect_index
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
+def test_build_issue_141(remove_index):
+ expect_index = ("gi|563317589|dbj|AB821309.1| 3510 115 70 72\n"
+ "gi|557361099|gb|KF435150.1| 481 3842 70 72\n"
+ "gi|557361097|gb|KF435149.1| 642 4429 70 72\n"
+ "gi|543583796|ref|NR_104216.1| 4573 5213 70 72\n"
+ "gi|543583795|ref|NR_104215.1| 5317 10040 70 72\n"
+ "gi|543583794|ref|NR_104212.1| 5374 15631 70 72\n"
+ "gi|543583788|ref|NM_001282545.1| 4170 21274 70 72\n"
+ "gi|543583786|ref|NM_001282543.1| 5466 25679 70 72\n"
+ "gi|543583785|ref|NM_000465.3| 5523 31415 70 72\n"
+ "gi|543583740|ref|NM_001282549.1| 3984 37211 70 72\n"
+ "gi|543583738|ref|NM_001282548.1| 4113 41424 70 72\n"
+ "gi|530384540|ref|XM_005249645.1| 2752 45784 70 72\n"
+ "gi|530384538|ref|XM_005249644.1| 3004 48745 70 72\n"
+ "gi|530384536|ref|XM_005249643.1| 3109 51964 70 72\n"
+ "gi|530384534|ref|XM_005249642.1| 3097 55292 70 72\n"
+ "gi|530373237|ref|XM_005265508.1| 2794 58640 70 72\n"
+ "gi|530373235|ref|XM_005265507.1| 2848 61675 70 72\n"
+ "gi|530364726|ref|XR_241081.1| 1009 64742 70 72\n"
+ "gi|530364725|ref|XR_241080.1| 4884 65918 70 72\n"
+ "gi|530364724|ref|XR_241079.1| 2819 71079 70 72\n")
+ index_file = Faidx('data/issue_141.fasta').indexname
+ result_index = open(index_file).read()
+ os.remove('data/issue_141.fasta.fai')
+ print(result_index)
+ assert result_index == expect_index
- def test_build(self):
- expect_index = ("gi|563317589|dbj|AB821309.1| 3510 114 70 71\n"
- "gi|557361099|gb|KF435150.1| 481 3789 70 71\n"
- "gi|557361097|gb|KF435149.1| 642 4368 70 71\n"
- "gi|543583796|ref|NR_104216.1| 4573 5141 70 71\n"
- "gi|543583795|ref|NR_104215.1| 5317 9901 70 71\n"
- "gi|543583794|ref|NR_104212.1| 5374 15415 70 71\n"
- "gi|543583788|ref|NM_001282545.1| 4170 20980 70 71\n"
- "gi|543583786|ref|NM_001282543.1| 5466 25324 70 71\n"
- "gi|543583785|ref|NM_000465.3| 5523 30980 70 71\n"
- "gi|543583740|ref|NM_001282549.1| 3984 36696 70 71\n"
- "gi|543583738|ref|NM_001282548.1| 4113 40851 70 71\n"
- "gi|530384540|ref|XM_005249645.1| 2752 45151 70 71\n"
- "gi|530384538|ref|XM_005249644.1| 3004 48071 70 71\n"
- "gi|530384536|ref|XM_005249643.1| 3109 51246 70 71\n"
- "gi|530384534|ref|XM_005249642.1| 3097 54528 70 71\n"
- "gi|530373237|ref|XM_005265508.1| 2794 57830 70 71\n"
- "gi|530373235|ref|XM_005265507.1| 2848 60824 70 71\n"
- "gi|530364726|ref|XR_241081.1| 1009 63849 70 71\n"
- "gi|530364725|ref|XR_241080.1| 4884 65009 70 71\n"
- "gi|530364724|ref|XR_241079.1| 2819 70099 70 71\n")
- index_file = Faidx('data/genes.fasta').indexname
- result_index = open(index_file).read()
- assert result_index == expect_index
+def test_build_issue_111(remove_index):
+ expect_index = ("gi|563317589|dbj|AB821309 3510 114 70 71\n"
+ "gi|557361099|gb|KF435150 481 3789 70 71\n"
+ "gi|557361097|gb|KF435149 642 4368 70 71\n"
+ "gi|543583796|ref|NR_104216 4573 5141 70 71\n"
+ "gi|543583795|ref|NR_104215 5317 9901 70 71\n"
+ "gi|543583794|ref|NR_104212 5374 15415 70 71\n"
+ "gi|543583788|ref|NM_001282545 4170 20980 70 71\n"
+ "gi|543583786|ref|NM_001282543 5466 25324 70 71\n"
+ "gi|543583785|ref|NM_000465 5523 30980 70 71\n"
+ "gi|543583740|ref|NM_001282549 3984 36696 70 71\n"
+ "gi|543583738|ref|NM_001282548 4113 40851 70 71\n"
+ "gi|530384540|ref|XM_005249645 2752 45151 70 71\n"
+ "gi|530384538|ref|XM_005249644 3004 48071 70 71\n"
+ "gi|530384536|ref|XM_005249643 3109 51246 70 71\n"
+ "gi|530384534|ref|XM_005249642 3097 54528 70 71\n"
+ "gi|530373237|ref|XM_005265508 2794 57830 70 71\n"
+ "gi|530373235|ref|XM_005265507 2848 60824 70 71\n"
+ "gi|530364726|ref|XR_241081 1009 63849 70 71\n"
+ "gi|530364725|ref|XR_241080 4884 65009 70 71\n"
+ "gi|530364724|ref|XR_241079 2819 70099 70 71\n")
+ index = Faidx(
+ 'data/genes.fasta',
+ read_long_names=True,
+ key_function=lambda x: x.split('.')[0])
+ result_index = ''.join(index._index_as_string())
+ assert result_index == expect_index
- def test_build_issue_141(self):
- expect_index = ("gi|563317589|dbj|AB821309.1| 3510 115 70 72\n"
- "gi|557361099|gb|KF435150.1| 481 3842 70 72\n"
- "gi|557361097|gb|KF435149.1| 642 4429 70 72\n"
- "gi|543583796|ref|NR_104216.1| 4573 5213 70 72\n"
- "gi|543583795|ref|NR_104215.1| 5317 10040 70 72\n"
- "gi|543583794|ref|NR_104212.1| 5374 15631 70 72\n"
- "gi|543583788|ref|NM_001282545.1| 4170 21274 70 72\n"
- "gi|543583786|ref|NM_001282543.1| 5466 25679 70 72\n"
- "gi|543583785|ref|NM_000465.3| 5523 31415 70 72\n"
- "gi|543583740|ref|NM_001282549.1| 3984 37211 70 72\n"
- "gi|543583738|ref|NM_001282548.1| 4113 41424 70 72\n"
- "gi|530384540|ref|XM_005249645.1| 2752 45784 70 72\n"
- "gi|530384538|ref|XM_005249644.1| 3004 48745 70 72\n"
- "gi|530384536|ref|XM_005249643.1| 3109 51964 70 72\n"
- "gi|530384534|ref|XM_005249642.1| 3097 55292 70 72\n"
- "gi|530373237|ref|XM_005265508.1| 2794 58640 70 72\n"
- "gi|530373235|ref|XM_005265507.1| 2848 61675 70 72\n"
- "gi|530364726|ref|XR_241081.1| 1009 64742 70 72\n"
- "gi|530364725|ref|XR_241080.1| 4884 65918 70 72\n"
- "gi|530364724|ref|XR_241079.1| 2819 71079 70 72\n")
- index_file = Faidx('data/issue_141.fasta').indexname
- result_index = open(index_file).read()
- os.remove('data/issue_141.fasta.fai')
- print(result_index)
- assert result_index == expect_index
+def test_order(remove_index):
+ order = ("gi|563317589|dbj|AB821309.1|", "gi|557361099|gb|KF435150.1|",
+ "gi|557361097|gb|KF435149.1|",
+ "gi|543583796|ref|NR_104216.1|",
+ "gi|543583795|ref|NR_104215.1|",
+ "gi|543583794|ref|NR_104212.1|",
+ "gi|543583788|ref|NM_001282545.1|",
+ "gi|543583786|ref|NM_001282543.1|",
+ "gi|543583785|ref|NM_000465.3|",
+ "gi|543583740|ref|NM_001282549.1|",
+ "gi|543583738|ref|NM_001282548.1|",
+ "gi|530384540|ref|XM_005249645.1|",
+ "gi|530384538|ref|XM_005249644.1|",
+ "gi|530384536|ref|XM_005249643.1|",
+ "gi|530384534|ref|XM_005249642.1|",
+ "gi|530373237|ref|XM_005265508.1|",
+ "gi|530373235|ref|XM_005265507.1|",
+ "gi|530364726|ref|XR_241081.1|",
+ "gi|530364725|ref|XR_241080.1|",
+ "gi|530364724|ref|XR_241079.1|")
+ result = tuple(Faidx('data/genes.fasta').index.keys())
+ assert result == order
- def test_build_issue_111(self):
- expect_index = ("gi|563317589|dbj|AB821309 3510 114 70 71\n"
- "gi|557361099|gb|KF435150 481 3789 70 71\n"
- "gi|557361097|gb|KF435149 642 4368 70 71\n"
- "gi|543583796|ref|NR_104216 4573 5141 70 71\n"
- "gi|543583795|ref|NR_104215 5317 9901 70 71\n"
- "gi|543583794|ref|NR_104212 5374 15415 70 71\n"
- "gi|543583788|ref|NM_001282545 4170 20980 70 71\n"
- "gi|543583786|ref|NM_001282543 5466 25324 70 71\n"
- "gi|543583785|ref|NM_000465 5523 30980 70 71\n"
- "gi|543583740|ref|NM_001282549 3984 36696 70 71\n"
- "gi|543583738|ref|NM_001282548 4113 40851 70 71\n"
- "gi|530384540|ref|XM_005249645 2752 45151 70 71\n"
- "gi|530384538|ref|XM_005249644 3004 48071 70 71\n"
- "gi|530384536|ref|XM_005249643 3109 51246 70 71\n"
- "gi|530384534|ref|XM_005249642 3097 54528 70 71\n"
- "gi|530373237|ref|XM_005265508 2794 57830 70 71\n"
- "gi|530373235|ref|XM_005265507 2848 60824 70 71\n"
- "gi|530364726|ref|XR_241081 1009 63849 70 71\n"
- "gi|530364725|ref|XR_241080 4884 65009 70 71\n"
- "gi|530364724|ref|XR_241079 2819 70099 70 71\n")
- index = Faidx(
- 'data/genes.fasta',
- read_long_names=True,
- key_function=lambda x: x.split('.')[0])
- result_index = ''.join(index._index_as_string())
- assert result_index == expect_index
-
- def test_order(self):
- order = ("gi|563317589|dbj|AB821309.1|", "gi|557361099|gb|KF435150.1|",
- "gi|557361097|gb|KF435149.1|",
- "gi|543583796|ref|NR_104216.1|",
- "gi|543583795|ref|NR_104215.1|",
- "gi|543583794|ref|NR_104212.1|",
- "gi|543583788|ref|NM_001282545.1|",
- "gi|543583786|ref|NM_001282543.1|",
- "gi|543583785|ref|NM_000465.3|",
- "gi|543583740|ref|NM_001282549.1|",
- "gi|543583738|ref|NM_001282548.1|",
- "gi|530384540|ref|XM_005249645.1|",
- "gi|530384538|ref|XM_005249644.1|",
- "gi|530384536|ref|XM_005249643.1|",
- "gi|530384534|ref|XM_005249642.1|",
- "gi|530373237|ref|XM_005265508.1|",
- "gi|530373235|ref|XM_005265507.1|",
- "gi|530364726|ref|XR_241081.1|",
- "gi|530364725|ref|XR_241080.1|",
- "gi|530364724|ref|XR_241079.1|")
- result = tuple(Faidx('data/genes.fasta').index.keys())
- assert result == order
-
- def test_valgrind_short_lines(self):
- """ Makes all full-length lines short and checks that error is raised
- in all appropriate circumstances.
- """
- # http://stackoverflow.com/a/23212515/717419
- if platform.system() == 'Windows':
- raise SkipTest
- indexed = []
- with open('data/genes.fasta') as genes:
- fasta = genes.readlines()
- n_lines = sum(1 for line in fasta)
- for n in range(n_lines):
- with NamedTemporaryFile(mode='w') as lines:
- for i, line in enumerate(fasta):
- if i == n and line[0] != '>' and len(line) == 71:
- line = line[:-3] + '\n'
- full_line = True
- elif i == n:
- full_line = False
- lines.write(line)
- lines.flush()
- name = lines.name
- if full_line:
- try:
- Faidx(name)
- indexed.append(True)
- except FastaIndexingError:
- indexed.append(False)
- assert not any(indexed)
-
- def test_valgrind_long_lines(self):
- """ Makes all full-length lines long and checks that error is raised
- in all appropriate circumstances.
- """
- # http://stackoverflow.com/a/23212515/717419
- if platform.system() == 'Windows':
- raise SkipTest
- indexed = []
- with open('data/genes.fasta') as genes:
- fasta = genes.readlines()
- n_lines = sum(1 for line in fasta)
- for n in range(n_lines):
- with NamedTemporaryFile(mode='w') as lines:
- for i, line in enumerate(fasta):
- if i == n and line[0] != '>' and len(line) == 71:
- line = line.rstrip('\n') + 'NNN' + '\n'
- full_line = True
- elif i == n:
- full_line = False
- lines.write(line)
- lines.flush()
- name = lines.name
- if full_line:
- try:
- Faidx(name)
- indexed.append(True)
- except FastaIndexingError:
- indexed.append(False)
- assert not any(indexed)
+def test_valgrind_short_lines(remove_index):
+ """ Makes all full-length lines short and checks that error is raised
+ in all appropriate circumstances.
+ """
+ # http://stackoverflow.com/a/23212515/717419
+ if platform.system() == 'Windows':
+ raise SkipTest
+ indexed = []
+ with open('data/genes.fasta') as genes:
+ fasta = genes.readlines()
+ n_lines = sum(1 for line in fasta)
+ for n in range(n_lines):
+ with NamedTemporaryFile(mode='w') as lines:
+ for i, line in enumerate(fasta):
+ if i == n and line[0] != '>' and len(line) == 71:
+ line = line[:-3] + '\n'
+ full_line = True
+ elif i == n:
+ full_line = False
+ lines.write(line)
+ lines.flush()
+ name = lines.name
+ if full_line:
+ try:
+ Faidx(name)
+ indexed.append(True)
+ except FastaIndexingError:
+ indexed.append(False)
+ assert not any(indexed)
- def test_valgrind_blank_lines(self):
- """ Makes all full-length lines blank and checks that error is raised
- in all appropriate circumstances.
- """
- # http://stackoverflow.com/a/23212515/717419
- if platform.system() == 'Windows':
- raise SkipTest
- indexed = []
- with open('data/genes.fasta') as genes:
- fasta = genes.readlines()
- n_lines = sum(1 for line in fasta)
- for n in range(n_lines):
- with NamedTemporaryFile(mode='w') as lines:
- for i, line in enumerate(fasta):
- if i == n and line[0] != '>' and len(line) == 71:
- line = '\n'
- full_line = True
- elif i == n:
- full_line = False
- lines.write(line)
- lines.flush()
- name = lines.name
- if full_line:
- try:
- Faidx(name)
- indexed.append(True)
- except FastaIndexingError:
- indexed.append(False)
- assert not any(indexed)
+def test_valgrind_long_lines(remove_index):
+ """ Makes all full-length lines long and checks that error is raised
+ in all appropriate circumstances.
+ """
+ # http://stackoverflow.com/a/23212515/717419
+ if platform.system() == 'Windows':
+ raise SkipTest
+ indexed = []
+ with open('data/genes.fasta') as genes:
+ fasta = genes.readlines()
+ n_lines = sum(1 for line in fasta)
+ for n in range(n_lines):
+ with NamedTemporaryFile(mode='w') as lines:
+ for i, line in enumerate(fasta):
+ if i == n and line[0] != '>' and len(line) == 71:
+ line = line.rstrip('\n') + 'NNN' + '\n'
+ full_line = True
+ elif i == n:
+ full_line = False
+ lines.write(line)
+ lines.flush()
+ name = lines.name
+ if full_line:
+ try:
+ Faidx(name)
+ indexed.append(True)
+ except FastaIndexingError:
+ indexed.append(False)
+ assert not any(indexed)
- def test_reindex_on_modification(self):
- """ This test ensures that the index is regenerated when the FASTA
- modification time is newer than the index modification time.
- mdshw5/pyfaidx#50 """
- faidx = Faidx('data/genes.fasta')
- index_mtime = getmtime(faidx.indexname)
- faidx.close()
- os.utime('data/genes.fasta', (index_mtime + 10, ) * 2)
- time.sleep(2)
- faidx = Faidx('data/genes.fasta')
- assert getmtime(faidx.indexname) > index_mtime
+def test_valgrind_blank_lines(remove_index):
+ """ Makes all full-length lines blank and checks that error is raised
+ in all appropriate circumstances.
+ """
+ # http://stackoverflow.com/a/23212515/717419
+ if platform.system() == 'Windows':
+ raise SkipTest
+ indexed = []
+ with open('data/genes.fasta') as genes:
+ fasta = genes.readlines()
+ n_lines = sum(1 for line in fasta)
+ for n in range(n_lines):
+ with NamedTemporaryFile(mode='w') as lines:
+ for i, line in enumerate(fasta):
+ if i == n and line[0] != '>' and len(line) == 71:
+ line = '\n'
+ full_line = True
+ elif i == n:
+ full_line = False
+ lines.write(line)
+ lines.flush()
+ name = lines.name
+ if full_line:
+ try:
+ Faidx(name)
+ indexed.append(True)
+ except FastaIndexingError:
+ indexed.append(False)
+ assert not any(indexed)
- def test_build_issue_83(self):
- """ Ensure that blank lines between entries are treated in the
- same way as samtools 1.2. See mdshw5/pyfaidx#83.
- """
- expect_index = ("MT 119 4 70 71\nGL000207.1 60 187 60 61\n")
- index_file = Faidx('data/issue_83.fasta').indexname
- result_index = open(index_file).read()
- os.remove('data/issue_83.fasta.fai')
- assert result_index == expect_index
+def test_reindex_on_modification(remove_index):
+ """ This test ensures that the index is regenerated when the FASTA
+ modification time is newer than the index modification time.
+ mdshw5/pyfaidx#50 """
+ faidx = Faidx('data/genes.fasta')
+ index_mtime = getmtime(faidx.indexname)
+ faidx.close()
+ os.utime('data/genes.fasta', (index_mtime + 10, ) * 2)
+ time.sleep(2)
+ faidx = Faidx('data/genes.fasta')
+ assert getmtime(faidx.indexname) > index_mtime
- def test_build_issue_96_fail_build_faidx(self):
- """ Ensure that the fasta file is closed if construction of the 'Faidx' file
- when attempting to build an index.
- See mdshw5/pyfaidx#96
- """
- tmp_dir = mkdtemp()
- try:
- fasta_path = os.path.join(tmp_dir, 'issue_96.fasta')
- # Write simple fasta file with inconsistent sequence line lengths,
- # so building an index raises a 'FastaIndexingError'
- with open(fasta_path, 'w') as fasta_out:
- fasta_out.write(
- ">seq1\nCTCCGGGCCCAT\nAACACTTGGGGGTAGCTAAAGTGAA\nATAAAGCCTAAA\n"
- )
+def test_build_issue_83(remove_index):
+ """ Ensure that blank lines between entries are treated in the
+ same way as samtools 1.2. See mdshw5/pyfaidx#83.
+ """
+ expect_index = ("MT 119 4 70 71\nGL000207.1 60 187 60 61\n")
+ index_file = Faidx('data/issue_83.fasta').indexname
+ result_index = open(index_file).read()
+ os.remove('data/issue_83.fasta.fai')
+ assert result_index == expect_index
- builtins_open = builtins.open
+def test_build_issue_96_fail_build_faidx(remove_index):
+ """ Ensure that the fasta file is closed if construction of the 'Faidx' file
+ when attempting to build an index.
+ See mdshw5/pyfaidx#96
+ """
+ tmp_dir = mkdtemp()
+ try:
+ fasta_path = os.path.join(tmp_dir, 'issue_96.fasta')
+ # Write simple fasta file with inconsistent sequence line lengths,
+ # so building an index raises a 'FastaIndexingError'
+ with open(fasta_path, 'w') as fasta_out:
+ fasta_out.write(
+ ">seq1\nCTCCGGGCCCAT\nAACACTTGGGGGTAGCTAAAGTGAA\nATAAAGCCTAAA\n"
+ )
- opened_files = []
+ builtins_open = builtins.open
- def test_open(*args, **kwargs):
- f = builtins_open(*args, **kwargs)
- opened_files.append(f)
- return f
+ opened_files = []
- with mock.patch('six.moves.builtins.open', side_effect=test_open):
- try:
- Faidx(fasta_path)
- self.assertFail(
- "Faidx construction should fail with 'FastaIndexingError'."
- )
- except FastaIndexingError:
- pass
- self.assertTrue(all(f.closed for f in opened_files))
- finally:
- shutil.rmtree(tmp_dir)
+ def test_open(*args, **kwargs):
+ f = builtins_open(*args, **kwargs)
+ opened_files.append(f)
+ return f
- def test_build_issue_96_fail_read_malformed_index_duplicate_key(self):
- """ Ensure that the fasta file is closed if construction of the 'Faidx' file
- fails when attempting to read a pre-existing index. The index is malformed because
- it contains mulitple occurrences of the same index.
- See mdshw5/pyfaidx#96
- """
- tmp_dir = mkdtemp()
- try:
- fasta_path = os.path.join(tmp_dir, 'issue_96.fasta')
- faidx_path = os.path.join(tmp_dir, 'issue_96.fasta.fai')
- # Write simple fasta file
- with open(fasta_path, 'w') as fasta_out:
- fasta_out.write(">seq1\nCTCCGGGCCCAT\nATAAAGCCTAAA\n")
- with open(faidx_path, 'w') as faidx_out:
- faidx_out.write("seq1\t24\t6\t12\t13\nseq1\t24\t6\t12\t13\n")
+ with mock.patch('six.moves.builtins.open', side_effect=test_open):
+ try:
+ Faidx(fasta_path)
+ remove_index.assertFail(
+ "Faidx construction should fail with 'FastaIndexingError'."
+ )
+ except FastaIndexingError:
+ pass
+ assert all(f.closed for f in opened_files)
+ finally:
+ shutil.rmtree(tmp_dir)
- builtins_open = builtins.open
+def test_build_issue_96_fail_read_malformed_index_duplicate_key(remove_index):
+ """ Ensure that the fasta file is closed if construction of the 'Faidx' file
+ fails when attempting to read a pre-existing index. The index is malformed because
+ it contains mulitple occurrences of the same index.
+ See mdshw5/pyfaidx#96
+ """
+ tmp_dir = mkdtemp()
+ try:
+ fasta_path = os.path.join(tmp_dir, 'issue_96.fasta')
+ faidx_path = os.path.join(tmp_dir, 'issue_96.fasta.fai')
+ # Write simple fasta file
+ with open(fasta_path, 'w') as fasta_out:
+ fasta_out.write(">seq1\nCTCCGGGCCCAT\nATAAAGCCTAAA\n")
+ with open(faidx_path, 'w') as faidx_out:
+ faidx_out.write("seq1\t24\t6\t12\t13\nseq1\t24\t6\t12\t13\n")
- opened_files = []
+ builtins_open = builtins.open
- def test_open(*args, **kwargs):
- f = builtins_open(*args, **kwargs)
- opened_files.append(f)
- return f
+ opened_files = []
- with mock.patch('six.moves.builtins.open', side_effect=test_open):
- try:
- Faidx(fasta_path)
- self.assertFail(
- "Faidx construction should fail with 'ValueError'.")
- except ValueError:
- pass
- self.assertTrue(all(f.closed for f in opened_files))
- finally:
- shutil.rmtree(tmp_dir)
+ def test_open(*args, **kwargs):
+ f = builtins_open(*args, **kwargs)
+ opened_files.append(f)
+ return f
- def test_read_back_index(self):
- """Ensure that index files written with write_fai() can be read back"""
- import locale
- old_locale = locale.getlocale(locale.LC_NUMERIC)
- try:
- locale.setlocale(locale.LC_NUMERIC, 'en_US.utf8')
- faidx = Faidx('data/genes.fasta')
- faidx.write_fai()
- faidx = Faidx('data/genes.fasta', build_index=False)
- finally:
- locale.setlocale(locale.LC_NUMERIC, old_locale)
+ with mock.patch('six.moves.builtins.open', side_effect=test_open):
+ try:
+ Faidx(fasta_path)
+ remove_index.assertFail(
+ "Faidx construction should fail with 'ValueError'.")
+ except ValueError:
+ pass
+ assert all(f.closed for f in opened_files)
+ finally:
+ shutil.rmtree(tmp_dir)
- @raises(IndexNotFoundError)
- def test_issue_134_no_build_index(self):
- """ Ensure that index file is not built when build_index=False. See mdshw5/pyfaidx#134.
- """
+def test_read_back_index(remove_index):
+ """Ensure that index files written with write_fai() can be read back"""
+ import locale
+ import platform
+
+ if platform.system() == "Linux":
+ new_locale = 'en_US.utf8'
+ elif platform.system() == "Darwin":
+ new_locale = 'en_US.UTF-8'
+
+ old_locale = locale.getlocale(locale.LC_NUMERIC)
+ try:
+ locale.setlocale(locale.LC_NUMERIC, new_locale)
+ faidx = Faidx('data/genes.fasta')
+ faidx.write_fai()
faidx = Faidx('data/genes.fasta', build_index=False)
+ finally:
+ locale.setlocale(locale.LC_NUMERIC, old_locale)
+
+@pytest.mark.xfail(raises=IndexNotFoundError)
+def test_issue_134_no_build_index(remove_index):
+ """ Ensure that index file is not built when build_index=False. See mdshw5/pyfaidx#134.
+ """
+ faidx = Faidx('data/genes.fasta', build_index=False)
- @raises(FastaIndexingError)
- def test_issue_144_no_defline(self):
- """ Ensure that an exception is raised when a file contains no deflines. See mdshw5/pyfaidx#144.
- """
- tmp_dir = mkdtemp()
- try:
- fasta_path = os.path.join(tmp_dir, 'issue_144.fasta')
- # Write simple fasta file
- with open(fasta_path, 'w') as fasta_out:
- fasta_out.write("CTCCGGGCCCAT\nATAAAGCCTAAA\n")
- faidx = Faidx(fasta_path)
- finally:
- shutil.rmtree(tmp_dir)
-
+@pytest.mark.xfail(raises=FastaIndexingError)
+def test_issue_144_no_defline(remove_index):
+ """ Ensure that an exception is raised when a file contains no deflines. See mdshw5/pyfaidx#144.
+ """
+ tmp_dir = mkdtemp()
+ try:
+ fasta_path = os.path.join(tmp_dir, 'issue_144.fasta')
+ # Write simple fasta file
+ with open(fasta_path, 'w') as fasta_out:
+ fasta_out.write("CTCCGGGCCCAT\nATAAAGCCTAAA\n")
+ faidx = Faidx(fasta_path)
+ finally:
+ shutil.rmtree(tmp_dir)
\ No newline at end of file
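
Note on the expected index strings in the tests above: each .fai record holds five fields, namely sequence name, sequence length, byte offset of the first base, bases per line, and bytes per line including the line terminator. The sketch below derives those fields for well-formed input held in a string. It is an illustration only: build_fai is a hypothetical name, and none of the validation that makes pyfaidx raise FastaIndexingError (inconsistent line lengths, missing deflines) is attempted.

```python
def build_fai(fasta_text):
    """Return one (name, length, offset, linebases, linewidth) tuple per record."""
    records = []
    current = None
    offset = 0  # byte offset of the line being read
    for line in fasta_text.splitlines(keepends=True):
        width = len(line.encode())
        if line.startswith('>'):
            if current is not None:
                records.append(tuple(current))
            name = line[1:].split()[0]       # first whitespace-delimited token
            # The sequence starts right after the header line.
            current = [name, 0, offset + width, None, None]
        elif current is not None and line.strip():
            bases = len(line.rstrip('\r\n'))
            current[1] += bases
            if current[3] is None:           # first sequence line sets the layout
                current[3], current[4] = bases, width
        offset += width
    if current is not None:
        records.append(tuple(current))
    return records


# Same toy FASTA as in the issue_96 duplicate-key test.
print(build_fai(">seq1\nCTCCGGGCCCAT\nATAAAGCCTAAA\n"))
```

For that input the sketch yields the same five values the issue_96 test writes into its index file (seq1, 24, 6, 12, 13). A line width two greater than the base count, as in the issue_141 expectation (70 and 72), corresponds to a two-byte line terminator.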
=====================================
tests/test_feature_key_function.py
=====================================
@@ -1,7 +1,6 @@
import os
+import pytest
from pyfaidx import Faidx, Fasta
-from nose.tools import raises
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
@@ -32,49 +31,47 @@ def get_duplicated_gene_name(accession):
return ACCESSION_TO_DUPLICATED_GENE_NAME_DICT.get(accession, accession)
-class TestFeatureKeyFunction(TestCase):
- def setUp(self):
- genes = Fasta('data/genes.fasta')
- del genes # Support feature introduced in #111
- pass
+@pytest.fixture
+def remove_index():
+ genes = Fasta('data/genes.fasta')
+ del genes # Support feature introduced in #111
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
+def test_keys(remove_index):
+ genes = Fasta('data/genes.fasta', key_function=get_gene_name)
+ expect = ['BARD1', 'FGFR2', 'MDM4', 'gi|530364724|ref|XR_241079.1|', 'gi|530364725|ref|XR_241080.1|', 'gi|530364726|ref|XR_241081.1|', 'gi|530373235|ref|XM_005265507.1|', 'gi|530373237|ref|XM_005265508.1|', 'gi|530384534|ref|XM_005249642.1|', 'gi|530384536|ref|XM_005249643.1|', 'gi|530384538|ref|XM_005249644.1|', 'gi|530384540|ref|XM_005249645.1|', 'gi|543583738|ref|NM_001282548.1|', 'gi|543583740|ref|NM_001282549.1|', 'gi|543583785|ref|NM_000465.3|', 'gi|543583786|ref|NM_001282543.1|', 'gi|543583788|ref|NM_001282545.1|', 'gi|543583794|ref|NR_104212.1|', 'gi|543583795|ref|NR_104215.1|', 'gi|557361097|gb|KF435149.1|']
+ result = sorted(genes.keys())
+ assert result == expect
- def test_keys(self):
- genes = Fasta('data/genes.fasta', key_function=get_gene_name)
- expect = ['BARD1', 'FGFR2', 'MDM4', 'gi|530364724|ref|XR_241079.1|', 'gi|530364725|ref|XR_241080.1|', 'gi|530364726|ref|XR_241081.1|', 'gi|530373235|ref|XM_005265507.1|', 'gi|530373237|ref|XM_005265508.1|', 'gi|530384534|ref|XM_005249642.1|', 'gi|530384536|ref|XM_005249643.1|', 'gi|530384538|ref|XM_005249644.1|', 'gi|530384540|ref|XM_005249645.1|', 'gi|543583738|ref|NM_001282548.1|', 'gi|543583740|ref|NM_001282549.1|', 'gi|543583785|ref|NM_000465.3|', 'gi|543583786|ref|NM_001282543.1|', 'gi|543583788|ref|NM_001282545.1|', 'gi|543583794|ref|NR_104212.1|', 'gi|543583795|ref|NR_104215.1|', 'gi|557361097|gb|KF435149.1|']
- result = sorted(genes.keys())
- assert result == expect
+def test_key_function_by_dictionary_get_key(remove_index):
+ genes = Fasta('data/genes.fasta', key_function=get_gene_name)
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
+ result = genes['MDM4'][100-1:150]
+ assert str(result) == expect
- def test_key_function_by_dictionary_get_key(self):
- genes = Fasta('data/genes.fasta', key_function=get_gene_name)
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
- result = genes['MDM4'][100-1:150]
- assert str(result) == expect
+def test_key_function_by_fetch(remove_index):
+ faidx = Faidx('data/genes.fasta', key_function=get_gene_name)
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
+ result = faidx.fetch('MDM4',
+ 100, 150)
+ assert str(result) == expect
- def test_key_function_by_fetch(self):
- faidx = Faidx('data/genes.fasta', key_function=get_gene_name)
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
- result = faidx.fetch('MDM4',
- 100, 150)
- assert str(result) == expect
+@pytest.mark.xfail(raises=ValueError)
+def test_duplicated_keys(remove_index):
+ genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name)
- @raises(ValueError)
- def test_duplicated_keys(self):
- genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name)
+def test_duplicated_keys_shortest(remove_index):
+ genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name, duplicate_action="shortest")
+ expect = 4573
+ result = len(genes["BARD1"])
+ assert expect == result
- def test_duplicated_keys_shortest(self):
- genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name, duplicate_action="shortest")
- expect = 4573
- result = len(genes["BARD1"])
- assert expect == result
-
- def test_duplicated_keys_longest(self):
- genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name, duplicate_action="longest")
- expect = 5317
- result = len(genes["BARD1"])
- assert expect == result
+def test_duplicated_keys_longest(remove_index):
+ genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name, duplicate_action="longest")
+ expect = 5317
+ result = len(genes["BARD1"])
+ assert expect == result
=====================================
tests/test_feature_read_ahead_buffer.py
=====================================
@@ -1,44 +1,41 @@
import os
+import pytest
from pyfaidx import Faidx, Fasta, FetchError
-from nose.tools import raises
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFeatureBuffer(TestCase):
- def setUp(self):
- pass
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- def test_buffer_false(self):
- fasta = Fasta('data/genes.fasta', strict_bounds=True)
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'.lower()
- result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].seq.lower()
- assert result == expect
-
- def test_buffer_true(self):
- fasta = Fasta('data/genes.fasta', read_ahead=300, strict_bounds=True)
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'.lower()
- result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].seq.lower()
- assert result == expect
-
- def test_buffer_exceed(self):
- fasta = Fasta('data/genes.fasta', read_ahead=300, strict_bounds=True)
- expect = 'atgacatcattttccacctctgctcagtgttcaacatctgacagtgcttgcaggatctctcctggacaaatcaatcaggtacgaccaaaactgccgcttttgaagattttgcatgcagcaggtgcgcaaggtgaaatgttcactgttaaagaggtcatgcactatttaggtcagtacataatggtgaagcaactttatgatcagcaggagcagcatatggtatattgtggtggagatcttttgggagaactactgggacgtcagagcttctccgtgaaagacccaagccctctctatgatatgctaagaaagaatcttgtcactttagccactgctactacagcaaagtgcagaggaaagttccacttccagaaaaagaactacagaagacgatatcccc'
- result = fasta['gi|557361099|gb|KF435150.1|'][0:400].seq.lower()
- assert result == expect
-
- @raises(FetchError)
- def test_bounds_error(self):
- fasta = Fasta('data/genes.fasta', read_ahead=300, strict_bounds=True)
- result = fasta['gi|557361099|gb|KF435150.1|'][100-1:15000].seq.lower()
-
- @raises(ValueError)
- def test_buffer_value(self):
- Fasta('data/genes.fasta', read_ahead=0.5)
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+def test_buffer_false(remove_index):
+ fasta = Fasta('data/genes.fasta', strict_bounds=True)
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'.lower()
+ result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].seq.lower()
+ assert result == expect
+
+def test_buffer_true(remove_index):
+ fasta = Fasta('data/genes.fasta', read_ahead=300, strict_bounds=True)
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'.lower()
+ result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].seq.lower()
+ assert result == expect
+
+def test_buffer_exceed(remove_index):
+ fasta = Fasta('data/genes.fasta', read_ahead=300, strict_bounds=True)
+ expect = 'atgacatcattttccacctctgctcagtgttcaacatctgacagtgcttgcaggatctctcctggacaaatcaatcaggtacgaccaaaactgccgcttttgaagattttgcatgcagcaggtgcgcaaggtgaaatgttcactgttaaagaggtcatgcactatttaggtcagtacataatggtgaagcaactttatgatcagcaggagcagcatatggtatattgtggtggagatcttttgggagaactactgggacgtcagagcttctccgtgaaagacccaagccctctctatgatatgctaagaaagaatcttgtcactttagccactgctactacagcaaagtgcagaggaaagttccacttccagaaaaagaactacagaagacgatatcccc'
+ result = fasta['gi|557361099|gb|KF435150.1|'][0:400].seq.lower()
+ assert result == expect
+
+@pytest.mark.xfail(raises=FetchError)
+def test_bounds_error(remove_index):
+ fasta = Fasta('data/genes.fasta', read_ahead=300, strict_bounds=True)
+ result = fasta['gi|557361099|gb|KF435150.1|'][100-1:15000].seq.lower()
+
+@pytest.mark.xfail(raises=ValueError)
+def test_buffer_value(remove_index):
+ Fasta('data/genes.fasta', read_ahead=0.5)
\ No newline at end of file
=====================================
tests/test_feature_sequence_as_raw.py
=====================================
@@ -1,45 +1,42 @@
import os
+import pytest
from pyfaidx import Faidx, Fasta
-from nose.tools import raises
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFeatureSequenceAsRaw(TestCase):
- def setUp(self):
- pass
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- def test_as_raw_false(self):
- fasta = Fasta('data/genes.fasta')
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'.lower()
- result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].seq.lower()
- assert result == expect
-
- def test_as_raw_true(self):
- fasta = Fasta('data/genes.fasta', as_raw=True)
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'.lower()
- result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].lower()
- assert result == expect
-
- @raises(AttributeError)
- def test_as_raw_false_error(self):
- fasta = Fasta('data/genes.fasta')
- result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].lower()
-
- @raises(AttributeError)
- def test_as_raw_true_error(self):
- fasta = Fasta('data/genes.fasta', as_raw=True)
- result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].seq.lower()
-
- def test_as_raw_type_when_blen_lt_0(self):
- fasta = Fasta('data/genes.fasta', as_raw=True)
- expect = ''
- result = fasta.faidx.fetch('gi|557361099|gb|KF435150.1|', 10, 0)
- assert result == expect
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+def test_as_raw_false(remove_index):
+ fasta = Fasta('data/genes.fasta')
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'.lower()
+ result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].seq.lower()
+ assert result == expect
+
+def test_as_raw_true(remove_index):
+ fasta = Fasta('data/genes.fasta', as_raw=True)
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'.lower()
+ result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].lower()
+ assert result == expect
+
+@pytest.mark.xfail(raises=AttributeError)
+def test_as_raw_false_error(remove_index):
+ fasta = Fasta('data/genes.fasta')
+ result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].lower()
+
+@pytest.mark.xfail(raises=AttributeError)
+def test_as_raw_true_error(remove_index):
+ fasta = Fasta('data/genes.fasta', as_raw=True)
+ result = fasta['gi|557361099|gb|KF435150.1|'][100-1:150].seq.lower()
+
+def test_as_raw_type_when_blen_lt_0(remove_index):
+ fasta = Fasta('data/genes.fasta', as_raw=True)
+ expect = ''
+ result = fasta.faidx.fetch('gi|557361099|gb|KF435150.1|', 10, 0)
+ assert result == expect
=====================================
tests/test_feature_spliced_seq.py
=====================================
@@ -1,74 +1,72 @@
import os
+import pytest
from pyfaidx import Fasta
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFeatureSplicedSeq(TestCase):
- def setUp(self):
- pass
+@pytest.fixture
+def remove_index():
+ yield
+ fais = [
+ "data/gene.bed12.fasta.fai",
+ "data/chr17.hg19.part.fa.fai"
+ ]
+ for fai in fais:
+ try:
+ os.remove(fai)
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+def test_split_seq(remove_index):
+ """ Fetch sequence by blocks """
+ fa = Fasta('data/chr17.hg19.part.fa')
+
+ gene = Fasta("data/gene.bed12.fasta")
+ expect = gene[list(gene.keys())[0]][:].seq
+
+ bed = "data/gene.bed12"
+ with open(bed) as fi:
+ record = fi.readline().strip().split("\t")
- def tearDown(self):
- fais = [
- "data/gene.bed12.fasta.fai",
- "data/chr17.hg19.part.fa.fai"
- ]
- for fai in fais:
- try:
- os.remove(fai)
- except EnvironmentError:
- pass # some tests may delete this file
+ chrom = record[0]
+ start = int(record[1])
+ strand = record[5]
- def test_split_seq(self):
- """ Fetch sequence by blocks """
- fa = Fasta('data/chr17.hg19.part.fa')
-
- gene = Fasta("data/gene.bed12.fasta")
- expect = gene[list(gene.keys())[0]][:].seq
-
- bed = "data/gene.bed12"
- with open(bed) as fi:
- record = fi.readline().strip().split("\t")
+ # parse bed12 format
+ starts = [int(x) for x in record[11].split(",")[:-1]]
+ sizes = [int(x) for x in record[10].split(",")[:-1]]
+ starts = [start + x for x in starts]
+ ends = [start + size for start,size in zip(starts, sizes)]
+
+ # bed half-open
+ if strand == "-":
+ starts = [start + 1 for start in starts]
+ else:
+ ends = [end - 1 for end in ends]
+
+ intervals = zip(starts, ends)
+ result = fa.get_spliced_seq(chrom, intervals, rc=True)
+ print(result.seq)
+ print("====")
+ print(expect)
- chrom = record[0]
- start = int(record[1])
- strand = record[5]
+ assert result.seq == expect
- # parse bed12 format
- starts = [int(x) for x in record[11].split(",")[:-1]]
- sizes = [int(x) for x in record[10].split(",")[:-1]]
- starts = [start + x for x in starts]
- ends = [start + size for start,size in zip(starts, sizes)]
-
- # bed half-open
- if strand == "-":
- starts = [start + 1 for start in starts]
- else:
- ends = [end - 1 for end in ends]
-
- intervals = zip(starts, ends)
- result = fa.get_spliced_seq(chrom, intervals, rc=True)
- print(result.seq)
- print("====")
- print(expect)
-
- assert result.seq == expect
-
- def test_get_seq_rc(self):
- """ Check get_seq with rc argument """
- fa = Fasta('data/chr17.hg19.part.fa')
-
- result = fa.get_seq("chr17", 11, 20, rc=False)
- expect = "CCCTGTTCCT"
- print("normal")
- print(result.seq)
- print(expect)
- assert result.seq == expect
-
- result = fa.get_seq("chr17", 11, 20, rc=True)
- expect = "AGGAACAGGG"
- assert result.seq == expect
- print("rc")
- print(result.seq)
- print(expect)
+def test_get_seq_rc(remove_index):
+ """ Check get_seq with rc argument """
+ fa = Fasta('data/chr17.hg19.part.fa')
+
+ result = fa.get_seq("chr17", 11, 20, rc=False)
+ expect = "CCCTGTTCCT"
+ print("normal")
+ print(result.seq)
+ print(expect)
+ assert result.seq == expect
+
+ result = fa.get_seq("chr17", 11, 20, rc=True)
+ expect = "AGGAACAGGG"
+ assert result.seq == expect
+ print("rc")
+ print(result.seq)
+ print(expect)
=====================================
tests/test_feature_split_char.py
=====================================
@@ -1,41 +1,38 @@
import os
+import pytest
from pyfaidx import Faidx, Fasta
-from nose.tools import raises
-from unittest import TestCase
path = os.path.dirname(__file__)
os.chdir(path)
-class TestFeatureSplitChar(TestCase):
- def setUp(self):
- pass
-
- def tearDown(self):
- try:
- os.remove('data/genes.fasta.fai')
- except EnvironmentError:
- pass # some tests may delete this file
-
- def test_keys(self):
- fasta = Fasta('data/genes.fasta', split_char='|', duplicate_action="drop")
- expect = ['530364724', '530364725', '530364726', '530373235', '530373237', '530384534', '530384536', '530384538', '530384540', '543583738', '543583740', '543583785', '543583786', '543583788', '543583794', '543583795', '543583796', '557361097', '557361099', '563317589', 'AB821309.1', 'KF435149.1', 'KF435150.1', 'NM_000465.3', 'NM_001282543.1', 'NM_001282545.1', 'NM_001282548.1', 'NM_001282549.1', 'NR_104212.1', 'NR_104215.1', 'NR_104216.1', 'XM_005249642.1', 'XM_005249643.1', 'XM_005249644.1', 'XM_005249645.1', 'XM_005265507.1', 'XM_005265508.1', 'XR_241079.1', 'XR_241080.1', 'XR_241081.1', 'dbj']
- result = sorted(fasta.keys())
- assert result == expect
-
- def test_key_function_by_dictionary_get_key(self):
- fasta = Fasta('data/genes.fasta', split_char='|', duplicate_action="drop")
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
- result = fasta['KF435150.1'][100-1:150]
- assert str(result) == expect
-
- def test_key_function_by_fetch(self):
- faidx = Faidx('data/genes.fasta', split_char='|', duplicate_action="drop")
- expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
- result = faidx.fetch('KF435150.1',
- 100, 150)
- assert str(result) == expect
-
- @raises(ValueError)
- def test_stop(self):
- fasta = Fasta('data/genes.fasta', split_char='|')
+@pytest.fixture
+def remove_index():
+ yield
+ try:
+ os.remove('data/genes.fasta.fai')
+ except EnvironmentError:
+ pass # some tests may delete this file
+
+def test_keys(remove_index):
+ fasta = Fasta('data/genes.fasta', split_char='|', duplicate_action="drop")
+ expect = ['530364724', '530364725', '530364726', '530373235', '530373237', '530384534', '530384536', '530384538', '530384540', '543583738', '543583740', '543583785', '543583786', '543583788', '543583794', '543583795', '543583796', '557361097', '557361099', '563317589', 'AB821309.1', 'KF435149.1', 'KF435150.1', 'NM_000465.3', 'NM_001282543.1', 'NM_001282545.1', 'NM_001282548.1', 'NM_001282549.1', 'NR_104212.1', 'NR_104215.1', 'NR_104216.1', 'XM_005249642.1', 'XM_005249643.1', 'XM_005249644.1', 'XM_005249645.1', 'XM_005265507.1', 'XM_005265508.1', 'XR_241079.1', 'XR_241080.1', 'XR_241081.1', 'dbj']
+ result = sorted(fasta.keys())
+ assert result == expect
+
+def test_key_function_by_dictionary_get_key(remove_index):
+ fasta = Fasta('data/genes.fasta', split_char='|', duplicate_action="drop")
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
+ result = fasta['KF435150.1'][100-1:150]
+ assert str(result) == expect
+
+def test_key_function_by_fetch(remove_index):
+ faidx = Faidx('data/genes.fasta', split_char='|', duplicate_action="drop")
+ expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
+ result = faidx.fetch('KF435150.1',
+ 100, 150)
+ assert str(result) == expect
+
+@pytest.mark.xfail(raises=ValueError)
+def test_stop(remove_index):
+ fasta = Fasta('data/genes.fasta', split_char='|')
=====================================
tests/test_sequence_class.py
=====================================
@@ -1,5 +1,5 @@
+import pytest
from pyfaidx import Sequence, complement
-from nose.tools import assert_raises, raises
seq = Sequence(name='gi|557361099|gb|KF435150.1|', seq='TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA',
start=100, end=150)
@@ -19,8 +19,9 @@ def test_negate_metadata():
seq_neg = -seq
assert seq_neg.__repr__() == seq.complement[::-1].__repr__()
+@pytest.mark.xfail(raises=ValueError)
def test_seq_invalid():
- assert_raises(ValueError, lambda: seq_invalid.complement)
+ seq_invalid.complement()
def test_integer_index():
assert seq[1].seq == 'T'
@@ -28,11 +29,11 @@ def test_integer_index():
def test_slice_index():
assert seq[0:10].seq == 'TTGAAGATTT'
-@raises(ValueError)
+@pytest.mark.xfail(raises=ValueError)
def test_comp_invalid():
complement(comp_invalid)
-@raises(ValueError)
+@pytest.mark.xfail(raises=ValueError)
def test_check_coordinates():
x = Sequence(name='gi|557361099|gb|KF435150.1|', seq='TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA',
start=100, end=110)
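
Note on coordinates: the converted tests above request the same 51 bases in two ways, `genes['MDM4'][100-1:150]` (a 0-based, half-open Python slice) and `faidx.fetch('MDM4', 100, 150)`. Both are compared against the same expected string, which suggests that `fetch` takes 1-based, inclusive coordinates. The sketch below illustrates that relationship on a toy string. `fetch_1based` and `record` are hypothetical names introduced for illustration only and are not part of pyfaidx.

```python
def fetch_1based(sequence: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive range [start, end] of `sequence`.

    Hypothetical helper for illustration; not the pyfaidx implementation.
    """
    if start < 1:
        raise ValueError("start must be >= 1 for 1-based coordinates")
    # A 1-based inclusive range [start, end] maps to the slice [start-1:end].
    return sequence[start - 1:end]


# Toy stand-in for a FASTA record: 200 characters cycling through A..Z.
record = "".join(chr(ord("A") + i % 26) for i in range(200))

by_slice = record[100 - 1:150]             # slice form used in the tests
by_fetch = fetch_1based(record, 100, 150)  # fetch-style form

print(by_slice == by_fetch)
print(len(by_fetch))
```

With this mapping, an end coordinate smaller than the start yields an empty string, which matches the shape of the `fetch(..., 10, 0)` expectation in `test_as_raw_type_when_blen_lt_0`. Whether pyfaidx reaches that result the same way is not shown in this diff.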
View it on GitLab: https://salsa.debian.org/med-team/python-pyfaidx/-/compare/8a5e6010c218cd89a51a03b575ab6c7730a14cca...707ff65c7f1d0a8320fa8dd428b0873f931c92f9