[med-svn] [Git][med-team/python-pyfaidx][master] 8 commits: New upstream version 0.5.3.1

Andreas Tille gitlab at salsa.debian.org
Mon Apr 16 12:21:04 BST 2018


Andreas Tille pushed to branch master at Debian Med / python-pyfaidx


Commits:
e3d8d1b3 by Andreas Tille at 2018-04-16T11:50:28+02:00
New upstream version 0.5.3.1
- - - - -
5fc8481b by Andreas Tille at 2018-04-16T11:50:28+02:00
Update upstream source from tag 'upstream/0.5.3.1'

Update to upstream version '0.5.3.1'
with Debian dir 00eb618e67ca3115f09e4153b232f1ce0e751606
- - - - -
65c5e1b2 by Andreas Tille at 2018-04-16T11:50:29+02:00
New upstream version

- - - - -
8547bc0a by Andreas Tille at 2018-04-16T11:51:25+02:00
Point Vcs fields to salsa.debian.org

- - - - -
f4e25e3e by Andreas Tille at 2018-04-16T11:51:25+02:00
Standards-Version: 4.1.4

- - - - -
3cc39157 by Andreas Tille at 2018-04-16T11:51:25+02:00
debhelper 11

- - - - -
aa5007d6 by Andreas Tille at 2018-04-16T11:54:50+02:00
Testsuite: autopkgtest-pkg-python

- - - - -
6ab6bd1f by Andreas Tille at 2018-04-16T13:15:06+02:00
Upload to unstable

- - - - -


25 changed files:

- + .codecov.yml
- .travis.yml
- + CODE_OF_CONDUCT.md
- README.rst
- + debian/TODO
- debian/changelog
- debian/compat
- debian/control
- pyfaidx/__init__.py
- pyfaidx/cli.py
- scripts/benchmark.py
- + tests/data/chr17.hg19.part.fa
- tests/data/download_gene_fasta.py
- + tests/data/gene.bed12
- + tests/data/gene.bed12.fasta
- tests/test_FastaRecord.py
- tests/test_FastaRecord_iter.py
- tests/test_FastaVariant.py
- + tests/test_Fasta_bgzip.py
- tests/test_faidx.py
- tests/test_feature_bounds_check.py
- tests/test_feature_indexing.py
- tests/test_feature_key_function.py
- + tests/test_feature_spliced_seq.py
- tests/test_feature_split_char.py


Changes:

=====================================
.codecov.yml
=====================================
--- /dev/null
+++ b/.codecov.yml
@@ -0,0 +1,4 @@
+comment: false
+coverage:
+  status:
+    patch: false


=====================================
.travis.yml
=====================================
--- a/.travis.yml
+++ b/.travis.yml
@@ -2,6 +2,7 @@ language: python
 sudo: false
 python:
     - 'nightly'
+    - '3.6'
     - '3.5'
     - '3.4'
     - '3.3'
@@ -13,14 +14,19 @@ install:
     - pip wheel -f wheelhouse coverage biopython cython pysam pyvcf || true
     - pip install -f wheelhouse biopython cython pysam pyfasta coverage pyvcf || true
     - python setup.py install
-    - if [ ! -f samtools-1.2 ]; then wget -q -O - https://github.com/samtools/samtools/releases/download/1.2/samtools-1.2.tar.bz2 | tar -xjv; fi
+    - if [ ! -f samtools-1.2 ]; then curl -sL https://github.com/samtools/samtools/releases/download/1.2/samtools-1.2.tar.bz2 | tar -xjv; fi
     - cd samtools-1.2
     - make
     - export PATH=$PATH:$PWD
     - cd ..
+    - if [ ! -f htslib-1.4 ]; then curl -sL https://github.com/samtools/htslib/releases/download/1.4/htslib-1.4.tar.bz2 | tar -xjv; fi
+    - cd htslib-1.4
+    - make
+    - export PATH=$PATH:$PWD
+    - cd ..
 before_script:
-    - /usr/local/bin/pip install --user biopython
-    - /usr/bin/python tests/data/download_gene_fasta.py
+    - if [ $(python --version) -gt 2.6 ]; then env pip install biopython; fi
+    - env python tests/data/download_gene_fasta.py
 script: nosetests --with-coverage --cover-package=pyfaidx
 deploy:
   provider: pypi
@@ -29,18 +35,20 @@ deploy:
     secure: MbSaeuitkVTZqxa0PJ3RcR1aMf+B/sMbcx2sWOo9xfLlRFDFpYWJZ0EfXWEhrVu2YWXpBsasgunTDWSi0jNcZMH92MzOC+UTVYr45LO5sy6hm4iSiAgm/DPgYWdjP0SFKr7eL/HWPS+gHvgkXL1upleX21O358bxaezoasuKFvs=
   on:
     all_branches: true
-    python: 2.6
+    python: 2.7
     tags: true
     repo: mdshw5/pyfaidx
 matrix:
   allow_failures:
     - python: 'nightly'
     - python: 'pypy3'
+    - python: 'pypy'
   fast_finish: true
 cache:
     directories:
         - tests/data
         - samtools-1.2
+        - htslib-1.4
         - wheelhouse
 after_success:
     - bash <(curl -s https://codecov.io/bash)


=====================================
CODE_OF_CONDUCT.md
=====================================
--- /dev/null
+++ b/CODE_OF_CONDUCT.md
@@ -0,0 +1,46 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.
+
+## Our Standards
+
+Examples of behavior that contributes to creating a positive environment include:
+
+* Using welcoming and inclusive language
+* Being respectful of differing viewpoints and experiences
+* Gracefully accepting constructive criticism
+* Focusing on what is best for the community
+* Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+* The use of sexualized language or imagery and unwelcome sexual attention or advances
+* Trolling, insulting/derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic address, without explicit permission
+* Other conduct which could reasonably be considered inappropriate in a professional setting
+
+## Our Responsibilities
+
+Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
+
+## Scope
+
+This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at mdshw5 at gmail.com. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]
+
+[homepage]: http://contributor-covenant.org
+[version]: http://contributor-covenant.org/version/1/4/


=====================================
README.rst
=====================================
--- a/README.rst
+++ b/README.rst
@@ -1,4 +1,4 @@
-|Travis| |PyPI| |Landscape| |Coverage| |Depsy| |Appveyor| |Flattr|
+|Travis| |PyPI| |Landscape| |Coverage| |Depsy| |Appveyor|
 
 Description
 -----------
@@ -38,6 +38,8 @@ or download a `release <https://github.com/mdshw5/pyfaidx/releases>`_ and:
 
     python setup.py install
 
+If using ``pip install --user`` make sure to add ``/home/$(whoami)/.local/bin`` to your ``$PATH`` if you want to run the ``faidx`` script.
+
 Usage
 -----
 
@@ -72,7 +74,7 @@ Acts like a dictionary.
     >>> genes['NM_001282543.1'][200:230].end
     230
 
-    >>> genes['NM_001282543.1'][200:230].longname
+    >>> genes['NM_001282543.1'][200:230].fancy_name
     'NM_001282543.1:201-230'
 
     >>> len(genes['NM_001282543.1'])
@@ -108,27 +110,7 @@ Slices just like a string:
     >NM_001282543.1:1-5466
     CCCCGCCCCT........
 
-- Slicing start and end coordinates are 0-based, just like Python sequences.
-
-Sequence can be buffered in memory using a read-ahead buffer
-for fast sequential access:
-
-.. code:: python
-
-    >>> from timeit import timeit
-    >>> fetch = "genes['NM_001282543.1'][200:230]"
-    >>> read_ahead = "import pyfaidx; genes = pyfaidx.Fasta('tests/data/genes.fasta', read_ahead=10000)"
-    >>> no_read_ahead = "import pyfaidx; genes = pyfaidx.Fasta('tests/data/genes.fasta')"
-    >>> string_slicing = "genes = {}; genes['NM_001282543.1'] = 'N'*10000"
-
-    >>> timeit(fetch, no_read_ahead, number=10000)
-    0.2204863309962093
-    >>> timeit(fetch, read_ahead, number=10000)
-    0.1121859749982832
-    >>> timeit(fetch, string_slicing, number=10000)
-    0.0033553699977346696
-
-Read-ahead buffering can reduce runtime by 1/2 for sequential accesses to buffered regions.
+- Slicing start and end coordinates are 0-based, just like Python sequences.  
 
 Complements and reverse complements just like DNA
 
@@ -145,6 +127,31 @@ Complements and reverse complements just like DNA
     >>> -genes['NM_001282543.1'][200:230]
     >NM_001282543.1 (complement):230-201
     CATCCGGTTCCATGGCGGGCGCGGAACGAG
+    
+``Fasta`` objects can also be accessed using method calls:
+
+.. code:: python
+
+    >>> genes.get_seq('NM_001282543.1', 201, 210)
+    >NM_001282543.1:201-210
+    CTCGTTCCGC
+    
+    >>> genes.get_seq('NM_001282543.1', 201, 210, rc=True)
+    >NM_001282543.1 (complement):210-201
+    GCGGAACGAG
+    
+Spliced sequences can be retrieved from a list of [start, end] coordinates:
+**TODO** update this section
+
+.. code:: python
+    
+    # new in v0.5.1
+    segments = [[1, 10], [50, 70]]
+    >>> genes.get_spliced_seq('NM_001282543.1', segments)
+    >gi|543583786|ref|NM_001282543.1|:1-70
+    CCCCGCCCCTGGTTTCGAGTCGCTGGCCTGC
+
+.. _keyfn:
 
 Custom key functions provide cleaner access:
 
@@ -158,6 +165,24 @@ Custom key functions provide cleaner access:
     >NR_104212:1-10
     CCCCGCCCCT
 
+You can specify a character to split names on, which will generate additional entries:
+
+.. code:: python
+
+    >>> from pyfaidx import Fasta
+    >>> genes = Fasta('tests/data/genes.fasta', split_char='.', duplicate_action="first") # default duplicate_action="stop"
+    >>> genes.keys()
+    dict_keys(['.1', 'NR_104212', 'NM_001282543', 'XM_005249644', 'XM_005249645', 'NR_104216', 'XM_005249643', 'NR_104215', 'KF435150', 'AB821309', 'NM_001282549', 'XR_241081', 'KF435149', 'XR_241079', 'NM_000465', 'XM_005265508', 'XR_241080', 'XM_005249642', 'NM_001282545', 'XM_005265507', 'NM_001282548'])
+
+If your `key_function` or `split_char` generates duplicate entries, you can choose what action to take:
+
+.. code:: python
+
+    # new in v0.4.9
+    >>> genes = Fasta('tests/data/genes.fasta', split_char="|", duplicate_action="longest")
+    >>> genes.keys()
+    dict_keys(['gi', '563317589', 'dbj', 'AB821309.1', '', '557361099', 'gb', 'KF435150.1', '557361097', 'KF435149.1', '543583796', 'ref', 'NR_104216.1', '543583795', 'NR_104215.1', '543583794', 'NR_104212.1', '543583788', 'NM_001282545.1', '543583786', 'NM_001282543.1', '543583785', 'NM_000465.3', '543583740', 'NM_001282549.1', '543583738', 'NM_001282548.1', '530384540', 'XM_005249645.1', '530384538', 'XM_005249644.1', '530384536', 'XM_005249643.1', '530384534', 'XM_005249642.1', '530373237','XM_005265508.1', '530373235', 'XM_005265507.1', '530364726', 'XR_241081.1', '530364725', 'XR_241080.1', '530364724', 'XR_241079.1'])
+
 Filter functions (returning True) limit the index:
 
 .. code:: python
@@ -226,6 +251,36 @@ Sequence names are truncated on any whitespace. This is a limitation of the inde
     gi|557361097|gb|KF435149.1| Homo sapiens MDM4 protein variant G (MDM4) mRNA, complete cds
     ...
 
+    # new in v0.4.9
+    >>> from pyfaidx import Fasta
+    >>> genes = Fasta('tests/data/genes.fasta', read_long_names=True)
+    >>> for record in genes:
+    ...   print(record.name)
+    ...
+    gi|563317589|dbj|AB821309.1| Homo sapiens FGFR2-AHCYL1 mRNA for FGFR2-AHCYL1 fusion kinase protein, complete cds
+    gi|557361099|gb|KF435150.1| Homo sapiens MDM4 protein variant Y (MDM4) mRNA, complete cds, alternatively spliced
+    gi|557361097|gb|KF435149.1| Homo sapiens MDM4 protein variant G (MDM4) mRNA, complete cds
+
+Sequence can be buffered in memory using a read-ahead buffer
+for fast sequential access:
+
+.. code:: python
+
+    >>> from timeit import timeit
+    >>> fetch = "genes['NM_001282543.1'][200:230]"
+    >>> read_ahead = "import pyfaidx; genes = pyfaidx.Fasta('tests/data/genes.fasta', read_ahead=10000)"
+    >>> no_read_ahead = "import pyfaidx; genes = pyfaidx.Fasta('tests/data/genes.fasta')"
+    >>> string_slicing = "genes = {}; genes['NM_001282543.1'] = 'N'*10000"
+
+    >>> timeit(fetch, no_read_ahead, number=10000)
+    0.2204863309962093
+    >>> timeit(fetch, read_ahead, number=10000)
+    0.1121859749982832
+    >>> timeit(fetch, string_slicing, number=10000)
+    0.0033553699977346696
+
+Read-ahead buffering can reduce runtime by 1/2 for sequential accesses to buffered regions.
+
 .. role:: red
 
 If you want to modify the contents of your FASTA file in-place, you can use the `mutable` argument.
@@ -304,12 +359,18 @@ cli script: faidx
                             default base for missing positions and masking. default: N
       -d DELIMITER, --delimiter DELIMITER
                             delimiter for splitting names to multiple values (duplicate names will be discarded). default: None
+      -e HEADER_FUNCTION, --header-function HEADER_FUNCTION
+                            python function to modify header lines e.g: "lambda x: x.split("|")[0]". default: lambda x: x.split()[0]
+      -u {stop,first,last,longest,shortest}, --duplicates-action {stop,first,last,longest,shortest}
+                            entry to take when duplicate sequence names are encountered. default: stop
       -g REGEX, --regex REGEX
                             selected sequences are those matching regular expression. default: .*
       -v, --invert-match    selected sequences are those not matching 'regions' argument. default: False
       -m, --mask-with-default-seq
                             mask the FASTA file using --default-seq default: False
       -M, --mask-by-case    mask the FASTA file by changing to lowercase. default: False
+      -e HEADER_FUNCTION, --header-function HEADER_FUNCTION
+                            python function to modify header lines e.g: "lambda x: x.split("|")[0]". default: None
       --no-rebuild          do not rebuild the .fai index even if it is out of date. default: False
       --version             print pyfaidx version number
 
@@ -424,8 +485,6 @@ Examples:
     AGCTTCCCTGTGGTTTCCCGAGGCTTCCTTGCTTCCCGCTCTGCGAGGAGCCTTTCATCCGAAGGCGGGA
     .......
 
-
-
     $ faidx --size-range 5500,6000 -i chromsizes tests/data/genes.fasta
     NM_000465.3	5523
 
@@ -435,6 +494,14 @@ Examples:
     $ faidx -M --bed regions.bed tests/data/genes.fasta
     ### Modifies tests/data/genes.fasta by masking regions using lowercase characters ###
 
+    $ faidx -e "lambda x: x.split('.')[0]" tests/data/genes.fasta -i bed
+    AB821309	1	3510
+    KF435150	1	481
+    KF435149	1	642
+    NR_104216	1	4573
+    NR_104215	1	5317
+    .......
+
 
 Similar syntax as ``samtools faidx``
 
@@ -462,6 +529,19 @@ A lower-level Faidx class is also available:
    where "filename.fa" is the original FASTA file.
 -  Start and end coordinates are 1-based.
 
+Support for compressed FASTA
+----------------------------
+
+``pyfaidx`` can create and read ``.fai`` indices for FASTA files that have
+been compressed using the `bgzip <http://www.htslib.org/doc/tabix.html>`_
+tool from `samtools <http://www.htslib.org/>`_. ``bgzip`` writes compressed
+data in a ``BGZF`` format. ``BGZF`` is ``gzip`` compatible, consisting of
+multiple concatenated ``gzip`` blocks, each with an additional ``gzip``
+header making it possible to build an index for rapid random access. I.e.,
+files compressed with ``bgzip`` are valid ``gzip`` and so can be read by
+``gunzip``.  See `this description
+<http://pydoc.net/Python/biopython/1.66/Bio.bgzf/>`_ for more details on
+``bgzip``.
 
 Changelog
 ---------
@@ -478,7 +558,7 @@ I try to fix as many bugs as possible, but most of this work is supported by a s
 Contributing
 ------------
 
-Create a new Pull Request with one feauture. If you add a new feature, please
+Create a new Pull Request with one feature. If you add a new feature, please
 create also the relevant test.
 
 To get test running on your machine:
@@ -519,8 +599,3 @@ Comprehensive Cancer Center in the Department of Oncology.
 
 .. |Appveyor| image:: https://ci.appveyor.com/api/projects/status/80ihlw30a003596w?svg=true
    :target: https://ci.appveyor.com/project/mdshw5/pyfaidx
-   
-.. |Flattr| image:: http://button.flattr.com/flattr-badge-large.png
-   :target: https://flattr.com/submit/auto?fid=po00kq&url=https%3A%2F%2Fgithub.com%2Fmdshw5%2Fpyfaidx
-   
-


=====================================
debian/TODO
=====================================
--- /dev/null
+++ b/debian/TODO
@@ -0,0 +1,5 @@
+Replace
+
+    Testsuite: autopkgtest-pkg-python
+
+by a real test suite running build time tests


=====================================
debian/changelog
=====================================
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,13 @@
+python-pyfaidx (0.5.3.1-1) unstable; urgency=medium
+
+  * New upstream version
+  * Point Vcs fields to salsa.debian.org
+  * Standards-Version: 4.1.4
+  * debhelper 11
+  * Testsuite: autopkgtest-pkg-python
+
+ -- Andreas Tille <tille at debian.org>  Mon, 16 Apr 2018 13:12:29 +0200
+
 python-pyfaidx (0.4.8.1-1) unstable; urgency=medium
 
   * New upstream version


=====================================
debian/compat
=====================================
--- a/debian/compat
+++ b/debian/compat
@@ -1 +1 @@
-10
+11


=====================================
debian/control
=====================================
--- a/debian/control
+++ b/debian/control
@@ -2,8 +2,9 @@ Source: python-pyfaidx
 Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
 Uploaders: Andreas Tille <tille at debian.org>
 Section: python
+Testsuite: autopkgtest-pkg-python
 Priority: optional
-Build-Depends: debhelper (>= 10),
+Build-Depends: debhelper (>= 11~),
                dh-python,
                python-all,
                python-coverage,
@@ -19,9 +20,9 @@ Build-Depends: debhelper (>= 10),
                python3-numpy,
                python3-six,
                python3-mock
-Standards-Version: 3.9.8
-Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/python-pyfaidx.git
-Vcs-Git: https://anonscm.debian.org/git/debian-med/python-pyfaidx.git
+Standards-Version: 4.1.4
+Vcs-Browser: https://salsa.debian.org/med-team/python-pyfaidx
+Vcs-Git: https://salsa.debian.org/med-team/python-pyfaidx.git
 Homepage: https://github.com/mdshw5/pyfaidx
 
 Package: python-pyfaidx


=====================================
pyfaidx/__init__.py
=====================================
--- a/pyfaidx/__init__.py
+++ b/pyfaidx/__init__.py
@@ -1,5 +1,4 @@
 # pylint: disable=R0913, R0914, C0301
-
 """
 Fasta file -> Faidx -> Fasta -> FastaRecord -> Sequence
 """
@@ -11,7 +10,7 @@ from six import PY2, PY3, string_types, integer_types
 from six.moves import zip_longest
 try:
     from collections import OrderedDict
-except ImportError: #python 2.6
+except ImportError:  #python 2.6
     from ordereddict import OrderedDict
 from collections import namedtuple
 import re
@@ -22,31 +21,44 @@ from threading import Lock
 
 dna_bases = re.compile(r'([ACTGNactgnYRWSKMDVHBXyrwskmdvhbx]+)')
 
-__version__ = '0.4.8.1'
+__version__ = '0.5.3.1'
+
+
+class KeyFunctionError(ValueError):
+    """Raised if the key_function argument is invalid."""
 
 
 class FastaIndexingError(Exception):
-    def __init__(self, msg=None):
-        self.msg = msg
-        super(FastaIndexingError, self).__init__(self.msg)
+    """Raised if we encounter malformed FASTA that prevents indexing."""
+
+
+class IndexNotFoundError(IOError):
+    """Raised if read_fai cannot open the index file."""
+
 
+class FastaNotFoundError(IOError):
+    """Raised if the fasta file cannot be opened."""
 
-class FetchError(Exception):
-    def __init__(self, msg=None):
-        self.msg = msg
-        super(FetchError, self).__init__(self.msg)
 
+class FetchError(IndexError):
+    """Raised if a request to fetch a FASTA sequence cannot be fulfilled."""
 
-class BedError(Exception):
-    def __init__(self, msg=None):
-        self.msg = 'Malformed BED entry!\n' if not msg else msg
-        super(BedError, self).__init__(self.msg)
+
+class BedError(ValueError):
+    """Indicates a malformed BED entry."""
 
 
 class RegionError(Exception):
-    def __init__(self, msg=None):
-        self.msg = 'Malformed region! Format = rname:start-end.\n' if not msg else msg
-        super(RegionError, self).__init__(self.msg)
+    # This exception class is currently unused, but has been retained for
+    # backwards compatibility.
+    """A region error occurred."""
+
+
+class UnsupportedCompressionFormat(IOError):
+    """
+    Raised when a FASTA file is given with a recognized but unsupported
+    compression extension.
+    """
 
 
 class Sequence(object):
@@ -56,6 +68,7 @@ class Sequence(object):
     start, end = coordinates of subsequence (optional)
     comp = boolean switch for complement property
     """
+
     def __init__(self, name='', seq='', start=None, end=None, comp=False):
         self.name = name
         self.seq = seq
@@ -108,22 +121,26 @@ class Sequence(object):
         """
         if self.start is None or self.end is None:
             correction_factor = 0
-        elif len(self.seq) == abs(self.end - self.start) + 1:  # determine coordinate system
+        elif len(
+                self.seq
+        ) == abs(self.end - self.start) + 1:  # determine coordinate system
             one_based = True
             correction_factor = -1
         elif len(self.seq) == abs(self.end - self.start):
             one_based = False
             correction_factor = 0
         elif len(self.seq) != abs(self.end - self.start):
-            raise ValueError("Coordinates start=%s and end=%s imply a diffent length than sequence (length %s)." % (self.start, self.end, len(self.seq)))
+            raise ValueError(
+                "Coordinates (Sequence.start=%s and Sequence.end=%s) imply a different length than Sequence.seq (len=%s). Did you modify Sequence.seq?"
+                % (self.start, self.end, len(self.seq)))
 
         if isinstance(n, slice):
             slice_start, slice_stop, slice_step = n.indices(len(self))
             if self.start is None or self.end is None:  # there should never be self.start != self.end == None
                 start = None
                 end = None
-                return self.__class__(self.name, self.seq[n],
-                                      start, end, self.comp)
+                return self.__class__(self.name, self.seq[n], start, end,
+                                      self.comp)
             self_end, self_start = (self.end, self.start)
             if abs(slice_step) > 1:
                 start = None
@@ -137,7 +154,8 @@ class Sequence(object):
             else:
                 start = self_start + slice_start
                 end = self_start + slice_stop + correction_factor
-            return self.__class__(self.name, self.seq[n], start, end, self.comp)
+            return self.__class__(self.name, self.seq[n], start, end,
+                                  self.comp)
         elif isinstance(n, integer_types):
             if n < 0:
                 n = len(self) + n
@@ -167,7 +185,7 @@ class Sequence(object):
         return self[::-1].complement
 
     def __repr__(self):
-        return '\n'.join([''.join(['>', self.longname]), self.seq])
+        return '\n'.join([''.join(['>', self.fancy_name]), self.seq])
 
     def __len__(self):
         """
@@ -177,10 +195,10 @@ class Sequence(object):
         return len(self.seq)
 
     @property
-    def longname(self):
+    def fancy_name(self):
         """ Return the fancy name for the sequence, including start, end, and complementation.
         >>> x = Sequence(name='chr1', seq='ATCGTA', start=1, end=6, comp=True)
-        >>> x.longname
+        >>> x.fancy_name
         'chr1:1-6 (complement)'
         """
         name = self.name
@@ -190,6 +208,17 @@ class Sequence(object):
             name += ' (complement)'
         return name
 
+    @property
+    def long_name(self):
+        """ DEPRECATED: Use fancy_name instead.
+        Return the fancy name for the sequence, including start, end, and complementation.
+        >>> x = Sequence(name='chr1', seq='ATCGTA', start=1, end=6, comp=True)
+        >>> x.long_name
+        'chr1:1-6 (complement)'
+        """
+        msg = "The `Sequence.long_name` property is deprecated, and will be removed in future versions. Please use `Sequence.fancy_name` instead."
+        warnings.warn(msg, DeprecationWarning, stacklevel=2)
+        return self.fancy_name
 
     @property
     def complement(self):
@@ -199,8 +228,8 @@ class Sequence(object):
         >chr1 (complement)
         TAGCAT
         """
-        comp = self.__class__(self.name, complement(self.seq),
-                              start=self.start, end=self.end)
+        comp = self.__class__(
+            self.name, complement(self.seq), start=self.start, end=self.end)
         comp.comp = False if self.comp else True
         return comp
 
@@ -235,7 +264,6 @@ class Sequence(object):
         else:
             return None
 
-
     @property
     def gc(self):
         """ Return the GC content of seq as a float
@@ -251,7 +279,9 @@ class Sequence(object):
         return (g + c) / len(self.seq)
 
 
-class IndexRecord(namedtuple('IndexRecord', ['rlen', 'offset', 'lenc', 'lenb', 'bend', 'prev_bend'])):
+class IndexRecord(
+        namedtuple('IndexRecord',
+                   ['rlen', 'offset', 'lenc', 'lenb', 'bend', 'prev_bend'])):
     __slots__ = ()
 
     def __getitem__(self, key):
@@ -260,16 +290,32 @@ class IndexRecord(namedtuple('IndexRecord', ['rlen', 'offset', 'lenc', 'lenb', '
         return tuple.__getitem__(self, key)
 
     def __str__(self):
-        return "{rlen:n}\t{offset:n}\t{lenc:n}\t{lenb:n}\n".format(**self._asdict())
+        return "{rlen:n}\t{offset:n}\t{lenc:n}\t{lenb:n}\n".format(
+            **self._asdict())
+
+    def __len__(self):
+        return self.rlen
 
 
 class Faidx(object):
     """ A python implementation of samtools faidx FASTA indexing """
-    def __init__(self, filename, default_seq=None, key_function=None,
-                 as_raw=False, strict_bounds=False, read_ahead=None,
-                 mutable=False, split_char=None, filt_function=None,
+
+    def __init__(self,
+                 filename,
+                 default_seq=None,
+                 key_function=lambda x: x,
+                 as_raw=False,
+                 strict_bounds=False,
+                 read_ahead=None,
+                 mutable=False,
+                 split_char=None,
+                 duplicate_action="stop",
+                 filt_function=lambda x: True,
                  one_based_attributes=True,
-                 sequence_always_upper=False, rebuild=True):
+                 read_long_names=False,
+                 sequence_always_upper=False,
+                 rebuild=True,
+                 build_index=True):
         """
         filename: name of fasta file
         key_function: optional callback function which should return a unique
@@ -279,41 +325,102 @@ class Faidx(object):
           Default: False (i.e. return a Sequence() object).
         """
         self.filename = filename
-        if mutable:
-            self.file = open(filename, 'r+b')
+
+        if filename.lower().endswith('.bgz') or filename.lower().endswith(
+                '.gz'):
+            # Only try to import Bio if we actually need the bgzf reader.
+            try:
+                from Bio import bgzf
+            except ImportError:
+                raise ImportError(
+                    "BioPython must be installed to read gzipped files.")
+            else:
+                self._fasta_opener = bgzf.open
+                self._bgzf = True
+        elif filename.lower().endswith('.bz2') or filename.lower().endswith(
+                '.zip'):
+            raise UnsupportedCompressionFormat(
+                "Compressed FASTA is only supported in BGZF format. Use "
+                "bgzip to compresss your FASTA.")
         else:
-            self.file = open(filename, 'rb')
+            self._fasta_opener = open
+            self._bgzf = False
+
+        try:
+            self.file = self._fasta_opener(filename, 'r+b'
+                                           if mutable else 'rb')
+        except (ValueError, IOError) as e:
+            if str(e).find('BGZF') > -1:
+                raise UnsupportedCompressionFormat(
+                    "Compressed FASTA is only supported in BGZF format. Use "
+                    "the samtools bgzip utility (instead of gzip) to "
+                    "compress your FASTA.")
+            else:
+                raise FastaNotFoundError(
+                    "Cannot read FASTA file %s" % filename)
+
         self.indexname = filename + '.fai'
-        self.key_function = key_function if key_function else lambda rname: rname
-        self.filt_function = filt_function if filt_function else lambda x: True
+        self.read_long_names = read_long_names
+        self.key_function = key_function
+        try:
+            key_fn_test = self.key_function(
+                "TestingReturnType of_key_function")
+            if not isinstance(key_fn_test, string_types):
+                raise KeyFunctionError(
+                    "key_function argument should return a string, not {0}".
+                    format(type(key_fn_test)))
+        except Exception as e:
+            pass
+        self.filt_function = filt_function
+        assert duplicate_action in ("stop", "first", "last", "longest",
+                                    "shortest", "drop")
+        self.duplicate_action = duplicate_action
         self.as_raw = as_raw
         self.default_seq = default_seq
-        self.strict_bounds = strict_bounds
+        if self._bgzf and self.default_seq is not None:
+            raise FetchError(
+                "The default_seq argument is not supported with using BGZF compression. Please decompress your FASTA file and try again."
+            )
+        if self._bgzf:
+            self.strict_bounds = True
+        else:
+            self.strict_bounds = strict_bounds
+        self.split_char = split_char
         self.one_based_attributes = one_based_attributes
         self.sequence_always_upper = sequence_always_upper
         self.index = OrderedDict()
         self.lock = Lock()
-        self.buffer = dict((('seq', None), ('name', None), ('start', None), ('end', None)))
+        self.buffer = dict((('seq', None), ('name', None), ('start', None),
+                            ('end', None)))
         if not read_ahead or isinstance(read_ahead, integer_types):
             self.read_ahead = read_ahead
         elif not isinstance(read_ahead, integer_types):
-            raise ValueError("read_ahead value must be int, not {0}".format(type(read_ahead)))
+            raise ValueError("read_ahead value must be int, not {0}".format(
+                type(read_ahead)))
 
         self.mutable = mutable
         with self.lock:  # lock around index generation so only one thread calls method
             try:
-                if os.path.exists(self.indexname) and getmtime(self.indexname) >= getmtime(self.filename):
-                    self.read_fai(split_char)
-                elif os.path.exists(self.indexname) and getmtime(self.indexname) < getmtime(self.filename) and not rebuild:
-                    self.read_fai(split_char)
-                    warnings.warn("Index file {0} is older than FASTA file {1}.".format(self.indexname, self.filename), RuntimeWarning)
-                else:
+                if os.path.exists(self.indexname) and getmtime(
+                        self.indexname) >= getmtime(self.filename):
+                    self.read_fai()
+                elif os.path.exists(self.indexname) and getmtime(
+                        self.indexname) < getmtime(
+                            self.filename) and not rebuild:
+                    self.read_fai()
+                    warnings.warn(
+                        "Index file {0} is older than FASTA file {1}.".format(
+                            self.indexname, self.filename), RuntimeWarning)
+                elif build_index:
                     self.build_index()
-                    self.read_fai(split_char)
-            except FastaIndexingError as e:
+                    self.read_fai()
+                else:
+                    self.read_fai()
+
+            except FastaIndexingError:
                 os.remove(self.indexname)
                 self.file.close()
-                raise FastaIndexingError(e)
+                raise
             except Exception:
                 # Handle potential exceptions other than 'FastaIndexingError'
                 self.file.close()
@@ -331,36 +438,64 @@ class Faidx(object):
     def __repr__(self):
         return 'Faidx("%s")' % (self.filename)
 
-    def read_fai(self, split_char):
-        duplicate_ids = []
-        with open(self.indexname) as index:
-            prev_bend = 0
-            for line in index:
-                line = line.rstrip()
-                rname, rlen, offset, lenc, lenb = line.split('\t')
-                rlen, offset, lenc, lenb = map(int, (rlen, offset, lenc, lenb))
-                newlines = int(ceil(rlen / lenc) * (lenb - lenc))
-                bend = offset + newlines + rlen
-                rname = filter(self.filt_function, self.key_function(rname).split(split_char))
-                for i, key in enumerate(rname):  # mdshw5/pyfaidx/issues/64
-                    if key in self.index and split_char is None:
-                        if i == 0:
-                            raise ValueError('Duplicate key "%s"' % key)
-                        else:
-                            continue
-                    # eliminate duplicate keys if they result from split_char
-                    elif key in self.index and split_char:
-                        duplicate_ids.append(key)
-                        continue
+    def _index_as_string(self):
+        """ Returns the string representation of the index as iterable """
+        for k, v in self.index.items():
+            yield '\t'.join([k, str(v)])
+
+    def read_fai(self):
+        try:
+            with open(self.indexname) as index:
+                prev_bend = 0
+                drop_keys = []
+                for line in index:
+                    line = line.rstrip()
+                    rname, rlen, offset, lenc, lenb = line.split('\t')
+                    rlen, offset, lenc, lenb = map(int,
+                                                   (rlen, offset, lenc, lenb))
+                    newlines = int(ceil(rlen / lenc) * (lenb - lenc))
+                    bend = offset + newlines + rlen
+                    rec = IndexRecord(rlen, offset, lenc, lenb, bend,
+                                      prev_bend)
+                    if self.read_long_names:
+                        rname = self._long_name_from_index_record(rec)
+                    if self.split_char:
+                        rname = filter(self.filt_function,
+                                       self.key_function(rname).split(
+                                           self.split_char))
                     else:
-                        self.index[key] = IndexRecord(rlen, offset, lenc, lenb, bend, prev_bend)
-                prev_bend = bend
-            for dup in duplicate_ids:
+                        # filter must act on an iterable
+                        rname = filter(self.filt_function,
+                                       [self.key_function(rname)])
+                    for key in rname:  # mdshw5/pyfaidx/issues/64
+                        if key in self.index:
+                            if self.duplicate_action == "stop":
+                                raise ValueError('Duplicate key "%s"' % key)
+                            elif self.duplicate_action == "first":
+                                continue
+                            elif self.duplicate_action == "last":
+                                self.index[key] = rec
+                            elif self.duplicate_action == "longest":
+                                if len(rec) > len(self.index[key]):
+                                    self.index[key] = rec
+                            elif self.duplicate_action == "shortest":
+                                if len(rec) < len(self.index[key]):
+                                    self.index[key] = rec
+                            elif self.duplicate_action == "drop":
+                                if key not in drop_keys:
+                                    drop_keys.append(key)
+                        else:
+                            self.index[key] = rec
+                    prev_bend = bend
+            for dup in drop_keys:
                 self.index.pop(dup, None)
+        except IOError:
+            raise IndexNotFoundError(
+                "Could not read index file %s" % self.indexname)
 
     def build_index(self):
         try:
-            with open(self.filename, 'r') as fastafile:
+            with self._fasta_opener(self.filename, 'r') as fastafile:
                 with open(self.indexname, 'w') as indexfile:
                     rname = None  # reference sequence name
                     offset = 0  # binary offset of end of current line
@@ -369,30 +504,42 @@ class Faidx(object):
                     clen = None  # character line length
                     bad_lines = []  # lines > || < than blen
                     thisoffset = offset
+
+                    lastline = None
                     for i, line in enumerate(fastafile):
                         line_blen = len(line)
                         line_clen = len(line.rstrip('\n\r'))
+                        lastline = i
                         # write an index line
                         if line[0] == '>':
-                            valid_entry = check_bad_lines(rname, bad_lines, i - 1)
+                            valid_entry = check_bad_lines(
+                                rname, bad_lines, i - 1)
                             if valid_entry and i > 0:
-                                indexfile.write("{0}\t{1:d}\t{2:d}\t{3:d}\t{4:d}\n".format(rname, rlen, thisoffset, clen, blen))
+                                indexfile.write(
+                                    "{0}\t{1:d}\t{2:d}\t{3:d}\t{4:d}\n".format(
+                                        rname, rlen, thisoffset, clen, blen))
                             elif not valid_entry:
-                                raise FastaIndexingError("Line length of fasta"
-                                                         " file is not "
-                                                         "consistent! "
-                                                         "Inconsistent line found in >{0} at "
-                                                         "line {1:n}.".format(rname, bad_lines[0][0] + 1))
+                                raise FastaIndexingError(
+                                    "Line length of fasta"
+                                    " file is not "
+                                    "consistent! "
+                                    "Inconsistent line found in >{0} at "
+                                    "line {1:n}.".format(
+                                        rname, bad_lines[0][0] + 1))
                             blen = None
                             rlen = 0
                             clen = None
                             bad_lines = []
-                            try:  # must catch empty deflines
-                                rname = line.rstrip('\n\r')[1:].split()[0]  # remove comments
+                            try:  # must catch empty deflines (actually these might be okay: https://github.com/samtools/htslib/pull/258)
+                                rname = line.rstrip('\n\r')[1:].split()[
+                                    0]  # duplicates are detected with read_fai
                             except IndexError:
-                                raise FastaIndexingError("Bad sequence name %s at line %s." % (line.rstrip('\n\r'), str(i)))
+                                raise FastaIndexingError(
+                                    "Bad sequence name %s at line %s." %
+                                    (line.rstrip('\n\r'), str(i)))
                             offset += line_blen
-                            thisoffset = offset
+                            thisoffset = fastafile.tell(
+                            ) if self._bgzf else offset
                         else:  # check line and advance offset
                             if not blen:
                                 blen = line_blen
@@ -406,23 +553,36 @@ class Faidx(object):
                             offset += line_blen
                             rlen += line_clen
 
-                    # write the final index line
-                    valid_entry = check_bad_lines(rname, bad_lines, i)  # advance index since we're at the end of the file
-                    if not valid_entry:
-                        raise FastaIndexingError("Line length of fasta"
-                                                 " file is not "
-                                                 "consistent! "
-                                                 "Inconsistent line found in >{0} at "
-                                                 "line {1:n}.".format(rname, bad_lines[0][0] + 1))
-                    indexfile.write("{0:s}\t{1:d}\t{2:d}\t{3:d}\t{4:d}\n".format(rname, rlen, thisoffset, clen, blen))
-        except IOError:
-            raise IOError("%s may not be writable. Please use Fasta(rebuild=False), Faidx(rebuild=False) or faidx --no-rebuild." % self.indexname)
+                    # write the final index line, if there is one.
+                    if lastline is not None:
+                        valid_entry = check_bad_lines(
+                            rname, bad_lines, lastline
+                        )  # advance index since we're at the end of the file
+                        if valid_entry:
+                            indexfile.write(
+                                "{0:s}\t{1:d}\t{2:d}\t{3:d}\t{4:d}\n".format(
+                                    rname, rlen, thisoffset, clen, blen))
+                        else:
+                            raise FastaIndexingError(
+                                "Line length of fasta"
+                                " file is not "
+                                "consistent! "
+                                "Inconsistent line found in >{0} at "
+                                "line {1:n}.".format(rname,
+                                                     bad_lines[0][0] + 1))
+        except (IOError, FastaIndexingError) as e:
+            if isinstance(e, IOError):
+                raise IOError(
+                    "%s may not be writable. Please use Fasta(rebuild=False), Faidx(rebuild=False) or faidx --no-rebuild."
+                    % self.indexname)
+            elif isinstance(e, FastaIndexingError):
+                raise e
 
     def write_fai(self):
         with self.lock:
             with open(self.indexname, 'w') as outfile:
-                for k, v in self.index.items():
-                    outfile.write('\t'.join([k, str(v)]))
+                for line in self._index_as_string:
+                    outfile.write(line)
 
     def from_buffer(self, start, end):
         i_start = start - self.buffer['start']  # want [0, 1) coordinates from [1, 1] coordinates
@@ -458,8 +618,8 @@ class Faidx(object):
         4. Seek to start position, taking newlines into account
         5. Read to end position, return sequence
         """
-        assert isinstance(start, integer_types)
-        assert isinstance(end, integer_types)
+        assert start == int(start)
+        assert end == int(end)
         try:
             i = self.index[rname]
         except KeyError:
@@ -467,32 +627,40 @@ class Faidx(object):
                              "Please check your FASTA file.".format(rname))
         start0 = start - 1  # make coordinates [0,1)
         if start0 < 0:
-            raise FetchError("Requested start coordinate must be greater than 1.")
+            raise FetchError(
+                "Requested start coordinate must be greater than 1.")
         seq_len = end - start0
 
-
         # Calculate offset (https://github.com/samtools/htslib/blob/20238f354894775ed22156cdd077bc0d544fa933/faidx.c#L398)
-        newlines_before = int((start0 - 1) / i.lenc * (i.lenb - i.lenc)) if start0 > 0 else 0
+        newlines_before = int(
+            (start0 - 1) / i.lenc * (i.lenb - i.lenc)) if start0 > 0 else 0
         newlines_to_end = int(end / i.lenc * (i.lenb - i.lenc))
         newlines_inside = newlines_to_end - newlines_before
         seq_blen = newlines_inside + seq_len
         bstart = i.offset + newlines_before + start0
+        if seq_blen < 0 and self.strict_bounds:
+            raise FetchError("Requested coordinates start={0:n} end={1:n} are "
+                             "invalid.\n".format(start, end))
+        elif end > i.rlen and self.strict_bounds:
+            raise FetchError("Requested end coordinate {0:n} outside of {1}. "
+                             "\n".format(end, rname))
 
         with self.lock:
-            self.file.seek(bstart)
-
-            if bstart + seq_blen > i.bend and not self.strict_bounds:
-                seq_blen = i.bend - bstart
-            elif bstart + seq_blen > i.bend and self.strict_bounds:
-                raise FetchError("Requested end coordinate {0:n} outside of {1}. "
-                                 "\n".format(end, rname))
-            if seq_blen > 0:
-                seq = self.file.read(seq_blen).decode()
-            elif seq_blen <= 0 and not self.strict_bounds:
-                seq = ''
-            elif seq_blen <= 0 and self.strict_bounds:
-                raise FetchError("Requested coordinates start={0:n} end={1:n} are "
-                                 "invalid.\n".format(start, end))
+            if self._bgzf:  # We can't add to virtual offsets, so we need to read from the beginning of the record and trim the beginning if needed
+                self.file.seek(i.offset)
+                chunk = start0 + newlines_before + newlines_inside + seq_len
+                chunk_seq = self.file.read(chunk).decode()
+                seq = chunk_seq[start0 + newlines_before:]
+            else:
+                self.file.seek(bstart)
+
+                if bstart + seq_blen > i.bend and not self.strict_bounds:
+                    seq_blen = i.bend - bstart
+
+                if seq_blen > 0:
+                    seq = self.file.read(seq_blen).decode()
+                elif seq_blen <= 0 and not self.strict_bounds:
+                    seq = ''
 
         if not internals:
             return seq.replace('\n', '')
@@ -501,7 +669,9 @@ class Faidx(object):
 
     def format_seq(self, seq, rname, start, end):
         start0 = start - 1
-        if len(seq) < end - start0 and self.default_seq:  # Pad missing positions with default_seq
+        if len(
+                seq
+        ) < end - start0 and self.default_seq:  # Pad missing positions with default_seq
             pad_len = end - start0 - len(seq)
             seq = ''.join([seq, pad_len * self.default_seq])
         else:  # Return less than requested range
@@ -516,19 +686,23 @@ class Faidx(object):
         if self.as_raw:
             return seq
         else:
-            return Sequence(name=rname, start=int(start),
-                            end=int(end), seq=seq)
+            return Sequence(
+                name=rname, start=int(start), end=int(end), seq=seq)
 
     def to_file(self, rname, start, end, seq):
         """ Write sequence in region from start-end, overwriting current
         contents of the FASTA file. """
         if not self.mutable:
-            raise IOError("Write attempted for immutable Faidx instance. Set mutable=True to modify original FASTA.")
+            raise IOError(
+                "Write attempted for immutable Faidx instance. Set mutable=True to modify original FASTA."
+            )
         file_seq, internals = self.from_file(rname, start, end, internals=True)
 
         with self.lock:
             if len(seq) != len(file_seq) - internals['newlines_inside']:
-                raise IOError("Specified replacement sequence needs to have the same length as original.")
+                raise IOError(
+                    "Specified replacement sequence needs to have the same length as original."
+                )
             elif len(seq) == len(file_seq) - internals['newlines_inside']:
                 line_len = internals['i'].lenc
                 self.file.seek(internals['bstart'])
@@ -542,6 +716,39 @@ class Faidx(object):
                         n = m
                         m += line_len
                     self.file.write(seq[n:].encode())
+                    self.file.flush()
+
+    def get_long_name(self, rname):
+        """ Return the full sequence defline and description. External method using the self.index """
+        index_record = self.index[rname]
+        if self._bgzf:
+            return self._long_name_from_bgzf(index_record)
+        else:
+            return self._long_name_from_index_record(index_record)
+
+    def _long_name_from_index_record(self, index_record):
+        """ Return the full sequence defline and description. Internal method passing IndexRecord """
+        prev_bend = index_record.prev_bend
+        defline_end = index_record.offset
+        self.file.seek(prev_bend)
+        return self.file.read(defline_end - prev_bend).decode()[1:-1]
+
+    def _long_name_from_bgzf(self, index_record):
+        """ Return the full sequence defline and description. Internal method passing IndexRecord
+        This method is present for compatibility with BGZF files, since we cannot subtract their offsets.
+        It may be possible to implement a more efficient method. """
+        raise NotImplementedError(
+            "FastaRecord.long_name and Fasta(read_long_names=True) "
+            "are not supported currently for BGZF compressed files.")
+        prev_bend = index_record.prev_bend
+        self.file.seek(prev_bend)
+        defline = []
+        while True:
+            chunk = self.file.read(4096).decode()
+            defline.append(chunk)
+            if '\n' in chunk or '\r' in chunk:
+                break
+        return ''.join(defline)[1:].split('\n\r')[0]
 
     def close(self):
         self.__exit__()
@@ -554,6 +761,7 @@ class Faidx(object):
 
 
 class FastaRecord(object):
+    __slots__ = ['name', '_fa']
     def __init__(self, name, fa):
         self.name = name
         self._fa = fa
@@ -595,12 +803,66 @@ class FastaRecord(object):
                 raise StopIteration
             start += line_len
 
+    def __reversed__(self):
+        """ Reverse line-based iterator """
+        line_len = self._fa.faidx.index[self.name].lenc
+        # have to determine last line length
+        last_line = len(self) % line_len
+        if last_line == 0:
+            last_line = line_len
+        end = len(self)
+        start = end - last_line
+        while True:
+            if start > 0:
+                yield self[start:end][::-1]
+            else:
+                yield self[:end][::-1]
+                raise StopIteration
+            if end == len(self):  # first iteration
+                end -= last_line
+            else:
+                end -= line_len
+            start = end - line_len
+
     def __repr__(self):
         return 'FastaRecord("%s")' % (self.name)
 
     def __len__(self):
         return self._fa.faidx.index[self.name].rlen
 
+    @property
+    def unpadded_len(self):
+        """ Returns the length of the contig without 5' and 3' N padding.
+        Functions the same as contigNonNSize in Fasta.cpp at
+        https://github.com/Illumina/hap.py/blob/master/src/c%2B%2B/lib/tools/Fasta.cpp#L284
+        """
+        length = len(self)
+        stop = False
+        for line in iter(self):
+            if stop:
+                break
+            if isinstance(line, Sequence):
+                line = line.seq
+            for base in line.upper():
+                if base == 'N':
+                    length -= 1
+                else:
+                    stop = True
+                    break
+        stop = False
+        for line in reversed(self):
+            if stop:
+                break
+            if isinstance(line, Sequence):
+                line = line.seq
+            for base in line.upper():
+                if base == 'N':
+                    length -= 1
+                else:
+                    stop = True
+                    break
+        return length
+
     def __str__(self):
         return str(self[:])
 
@@ -612,25 +874,27 @@ class FastaRecord(object):
             for site in var:
                 if site.is_snp:
                     sample = site.genotype(self._fa.sample)
-                    if sample.gt_type in self._fa.gt_type and eval(self._fa.filter):
+                    if sample.gt_type in self._fa.gt_type and eval(
+                            self._fa.filter):
                         pos.append(site.POS)
             return tuple(pos)
         else:
-            raise NotImplementedError("variant_sites() only valid for FastaVariant.")
+            raise NotImplementedError(
+                "variant_sites() only valid for FastaVariant.")
 
     @property
     def long_name(self):
         """ Read the actual defline from self._fa.faidx mdshw5/pyfaidx#54 """
-        index_record = self._fa.faidx.index[self.name]
-        prev_bend = index_record.prev_bend
-        newline_len = index_record.lenb - index_record.lenc
-        defline_end = index_record.offset
-        self._fa.faidx.file.seek(prev_bend)
-        return self._fa.faidx.file.read(defline_end - prev_bend).decode()[1:-1]
+        return self._fa.faidx.get_long_name(self.name)
+
 
 class MutableFastaRecord(FastaRecord):
     def __init__(self, name, fa):
         super(MutableFastaRecord, self).__init__(name, fa)
+        if self._fa.faidx._fasta_opener != open:
+            raise UnsupportedCompressionFormat(
+                "BGZF compressed FASTA is not supported for MutableFastaRecord. "
+                "Please decompress your FASTA file.")
 
     def __setitem__(self, n, value):
         """Mutate sequence in region [start, end)
@@ -660,26 +924,51 @@ class MutableFastaRecord(FastaRecord):
 
 
 class Fasta(object):
-    def __init__(self, filename, default_seq=None, key_function=None, as_raw=False,
-                 strict_bounds=False, read_ahead=None, mutable=False, split_char=None,
-                 filt_function=None, one_based_attributes=True,
-                 sequence_always_upper=False, rebuild=True):
+    def __init__(self,
+                 filename,
+                 default_seq=None,
+                 key_function=lambda x: x,
+                 as_raw=False,
+                 strict_bounds=False,
+                 read_ahead=None,
+                 mutable=False,
+                 split_char=None,
+                 filt_function=lambda x: True,
+                 one_based_attributes=True,
+                 read_long_names=False,
+                 duplicate_action="stop",
+                 sequence_always_upper=False,
+                 rebuild=True,
+                 build_index=True):
         """
         An object that provides a pygr compatible interface.
         filename: name of fasta file
         """
         self.filename = filename
         self.mutable = mutable
-        self.faidx = Faidx(filename, key_function=key_function, as_raw=as_raw,
-                           default_seq=default_seq, strict_bounds=strict_bounds,
-                           read_ahead=read_ahead, mutable=mutable, split_char=split_char,
-                           filt_function=filt_function, one_based_attributes=one_based_attributes,
-                           sequence_always_upper=sequence_always_upper, rebuild=rebuild)
+        self.faidx = Faidx(
+            filename,
+            key_function=key_function,
+            as_raw=as_raw,
+            default_seq=default_seq,
+            strict_bounds=strict_bounds,
+            read_ahead=read_ahead,
+            mutable=mutable,
+            split_char=split_char,
+            filt_function=filt_function,
+            one_based_attributes=one_based_attributes,
+            read_long_names=read_long_names,
+            duplicate_action=duplicate_action,
+            sequence_always_upper=sequence_always_upper,
+            rebuild=rebuild,
+            build_index=build_index)
         self.keys = self.faidx.index.keys
         if not self.mutable:
-            self.records = dict([(rname, FastaRecord(rname, self)) for rname in self.keys()])
+            self.records = dict(
+                [(rname, FastaRecord(rname, self)) for rname in self.keys()])
         elif self.mutable:
-            self.records = dict([(rname, MutableFastaRecord(rname, self)) for rname in self.keys()])
+            self.records = dict([(rname, MutableFastaRecord(rname, self))
+                                 for rname in self.keys()])
 
     def __contains__(self, rname):
         """Return True if genome contains record."""
@@ -701,13 +990,40 @@ class Fasta(object):
         for rname in self.keys():
             yield self[rname]
 
-    def get_seq(self, name, start, end):
+    def get_seq(self, name, start, end, rc=False):
         """Return a sequence by record name and interval [start, end).
 
-        Coordinates are 0-based, end-exclusive.
+        Coordinates are 1-based, end-exclusive.
+        If rc is set, reverse complement will be returned.
         """
         # Get sequence from real genome object and save result.
-        return self.faidx.fetch(name, start, end)
+        seq = self.faidx.fetch(name, start, end)
+        if rc:
+            return -seq
+        else:
+            return seq
+
+    def get_spliced_seq(self, name, intervals, rc=False):
+        """Return a sequence by record name and list of intervals
+
+        Interval list is an iterable of [start, end].
+        Coordinates are 1-based, end-exclusive.
+        If rc is set, reverse complement will be returned.
+        """
+        # Get sequence for all intervals
+        chunks = [self.faidx.fetch(name, s, e) for s, e in intervals]
+        start = chunks[0].start
+        end = chunks[-1].end
+
+        # reverce complement
+        if rc:
+            seq = "".join([(-chunk).seq for chunk in chunks[::-1]])
+        else:
+            seq = "".join([chunk.seq for chunk in chunks])
+
+        # Sequence coordinate validation wont work since
+        # len(Sequence.seq) != end - start
+        return Sequence(name=name, seq=seq, start=None, end=None)
 
     def close(self):
         self.__exit__()
@@ -723,7 +1039,16 @@ class FastaVariant(Fasta):
     """ Return consensus sequence from FASTA and VCF inputs
     """
     expr = set(('>', '<', '=', '!'))
-    def __init__(self, filename, vcf_file, sample=None, het=True, hom=True, call_filter=None, **kwargs):
+
+    def __init__(self,
+                 filename,
+                 vcf_file,
+                 sample=None,
+                 het=True,
+                 hom=True,
+                 call_filter=None,
+                 **kwargs):
+        super(FastaVariant, self).__init__(filename, **kwargs)
         try:
             import pysam
         except ImportError:
@@ -734,9 +1059,10 @@ class FastaVariant(Fasta):
             raise ImportError("PyVCF must be installed for FastaVariant.")
         if call_filter is not None:
             try:
-                key, expr, value = call_filter.split() # 'GQ > 30'
+                key, expr, value = call_filter.split()  # 'GQ > 30'
             except IndexError:
-                raise ValueError("call_filter must be a string in the format 'XX <>!= NN'")
+                raise ValueError(
+                    "call_filter must be a string in the format 'XX <>!= NN'")
             assert all([x in self.expr for x in list(expr)])
             assert all([x in string.ascii_uppercase for x in list(key)])
             assert all([x in string.printable for x in list(value)])
@@ -752,19 +1078,21 @@ class FastaVariant(Fasta):
         else:
             self.sample = self.vcf.samples[0]
             if len(self.vcf.samples) > 1 and sample is None:
-                warnings.warn("Using sample {0} genotypes.".format(self.sample), RuntimeWarning)
+                warnings.warn("Using sample {0} genotypes.".format(
+                    self.sample), RuntimeWarning)
         if het and hom:
             self.gt_type = set((1, 2))
         elif het:
-            self.gt_type = set((1,))
+            self.gt_type = set((1, ))
         elif hom:
-            self.gt_type = set((2,))
+            self.gt_type = set((2, ))
         else:
             self.gt_type = set()
-        super(FastaVariant, self).__init__(filename, **kwargs)
 
     def __repr__(self):
-        return 'FastaVariant("%s", "%s", gt="%s")' % (self.filename, self.vcf.filename, str(self.gt_type))
+        return 'FastaVariant("%s", "%s", gt="%s")' % (self.filename,
+                                                      self.vcf.filename,
+                                                      str(self.gt_type))
 
     def get_seq(self, name, start, end):
         """Return a sequence by record name and interval [start, end).
@@ -797,23 +1125,26 @@ class FastaVariant(Fasta):
 def wrap_sequence(n, sequence, fillvalue=''):
     args = [iter(sequence)] * n
     for line in zip_longest(fillvalue=fillvalue, *args):
-        yield ''.join(line + ("\n",))
+        yield ''.join(line + ("\n", ))
+
 
 # To take a complement, we map each character in the first string in this pair
 # to the corresponding character in the second string.
-complement_map = ('ACTGNactgnYRWSKMDVHBXyrwskmdvhbx', 'TGACNtgacnRYWSMKHBDVXrywsmkhbdvx')
+complement_map = ('ACTGNactgnYRWSKMDVHBXyrwskmdvhbx',
+                  'TGACNtgacnRYWSMKHBDVXrywsmkhbdvx')
 invalid_characters_set = set(
     chr(x) for x in range(256) if chr(x) not in complement_map[0])
 invalid_characters_string = ''.join(invalid_characters_set)
 
 if PY3:
-    complement_table = str.maketrans(
-        complement_map[0], complement_map[1], invalid_characters_string)
-    translate_arguments = (complement_table,)
+    complement_table = str.maketrans(complement_map[0], complement_map[1],
+                                     invalid_characters_string)
+    translate_arguments = (complement_table, )
 elif PY2:
     complement_table = string.maketrans(complement_map[0], complement_map[1])
     translate_arguments = (complement_table, invalid_characters_string)
 
+
 def complement(seq):
     """ Returns the complement of seq.
     >>> seq = 'ATCGTA'
@@ -825,10 +1156,12 @@ def complement(seq):
     if len(result) != len(seq):
         first_invalid_position = next(
             i for i in range(len(seq)) if seq[i] in invalid_characters_set)
-        raise ValueError("Sequence contains non-DNA character '{0}' at position {1:n}\n".format(
-            seq[first_invalid_position], first_invalid_position + 1))
+        raise ValueError(
+            "Sequence contains non-DNA character '{0}' at position {1:n}\n".
+            format(seq[first_invalid_position], first_invalid_position + 1))
     return result
 
+
 def translate_chr_name(from_name, to_name):
     chr_name_map = dict(zip(from_name, to_name))
 
@@ -860,6 +1193,7 @@ def ucsc_split(region):
         start, end = (None, None)
     return (rname, start, end)
 
+
 def check_bad_lines(rname, bad_lines, i):
     """ Find inconsistent line lengths in the middle of an
     entry. Allow blank lines between entries, and short lines


=====================================
pyfaidx/cli.py
=====================================
--- a/pyfaidx/cli.py
+++ b/pyfaidx/cli.py
@@ -4,23 +4,23 @@ import sys
 import os.path
 import re
 from pyfaidx import Fasta, wrap_sequence, FetchError, ucsc_split, bed_split
+from collections import defaultdict
 
 keepcharacters = (' ', '.', '_')
 
-
 def write_sequence(args):
     _, ext = os.path.splitext(args.fasta)
     if ext:
         ext = ext[1:]  # remove the dot from extension
     filt_function = re.compile(args.regex).search
-    fasta = Fasta(args.fasta, default_seq=args.default_seq, strict_bounds=not args.lazy, split_char=args.delimiter, filt_function=filt_function, rebuild=not args.no_rebuild)
+    fasta = Fasta(args.fasta, default_seq=args.default_seq, key_function=eval(args.header_function), strict_bounds=not args.lazy, split_char=args.delimiter, filt_function=filt_function, read_long_names=args.long_names, rebuild=not args.no_rebuild)
 
     regions_to_fetch, split_function = split_regions(args)
     if not regions_to_fetch:
         regions_to_fetch = fasta.keys()
     if args.invert_match:
         sequences_to_exclude = set([split_function(region)[0] for region in regions_to_fetch])
-        fasta = Fasta(args.fasta, default_seq=args.default_seq, strict_bounds=not args.lazy, split_char=args.delimiter, rebuild=not args.no_rebuild)
+        fasta = Fasta(args.fasta, default_seq=args.default_seq, key_function=eval(args.header_function), strict_bounds=not args.lazy, split_char=args.delimiter, rebuild=not args.no_rebuild)
         regions_to_fetch = (key for key in fasta.keys() if key not in sequences_to_exclude)
         split_function = ucsc_split
 
@@ -45,14 +45,14 @@ def write_sequence(args):
         try:
             if args.transform:
                 if not header and args.transform == 'nucleotide':
-                    outfile.write("name\tstart\tend\tA\tT\tC\tG\tN\n")
+                    outfile.write("name\tstart\tend\tA\tT\tC\tG\tN\tothers\n")
                     header = True
                 outfile.write(transform_sequence(args, fasta, name, start, end))
             else:
                 for line in fetch_sequence(args, fasta, name, start, end):
                     outfile.write(line)
         except FetchError as e:
-            raise FetchError(e.msg.rstrip() + "Try setting --lazy.\n")
+            raise FetchError(str(e) + " Try setting --lazy.\n")
         if args.split_files:
             outfile.close()
     fasta.__exit__()
@@ -61,7 +61,12 @@ def write_sequence(args):
 def fetch_sequence(args, fasta, name, start=None, end=None):
     try:
         line_len = fasta.faidx.index[name].lenc
-        sequence = fasta[name][start:end]
+        if args.auto_strand and start > end and start is not None and end is not None:
+            # flip (0, 1] coordinates
+            sequence = fasta[name][end - 1:start + 1]
+            sequence = sequence.reverse.complement
+        else:
+            sequence = fasta[name][start:end]
     except KeyError:
         sys.stderr.write("warning: {name} not found in file\n".format(**locals()))
         return
@@ -69,13 +74,13 @@ def fetch_sequence(args, fasta, name, start=None, end=None):
         sequence = sequence.complement
     if args.reverse:
         sequence = sequence.reverse
+    if args.no_output:
+        return
     if args.no_names:
         pass
-    elif args.full_names:
-        yield ''.join(['>', fasta[name].long_name, '\n'])
     else:
-        if start or end:
-            yield ''.join(['>', sequence.longname, '\n'])
+        if (start or end) and not args.no_coords:
+            yield ''.join(['>', sequence.fancy_name, '\n'])
         else:
             yield ''.join(['>', sequence.name, '\n'])
     for line in wrap_sequence(line_len, sequence.seq):
@@ -114,14 +119,23 @@ def split_regions(args):
 def transform_sequence(args, fasta, name, start=None, end=None):
     line_len = fasta.faidx.index[name].lenc
     s = fasta[name][start:end]
+    if args.no_output:
+        return
     if args.transform == 'bed':
-        return '{name}\t{start}\t{end}\n'.format(name=s.name, start=s.start, end=s.end)
+        return '{name}\t{start}\t{end}\n'.format(name=s.name, start=s.start - 1 , end=s.end)
     elif args.transform == 'chromsizes':
         return '{name}\t{length}\n'.format(name=s.name, length=len(s))
     elif args.transform == 'nucleotide':
-        nucs = Counter(dict([('A', 0), ('T', 0), ('C', 0), ('G', 0), ('N', 0)]))
-        nucs.update(str(s).upper())
-        return '{name}\t{start}\t{end}\t{A}\t{T}\t{C}\t{G}\t{N}\n'.format(name=s.name, start=s.start, end=s.end, **nucs)
+        ss = str(s).upper()
+        nucs = defaultdict(int)
+        nucs.update([(c, str(ss).count(c)) for c in set(str(ss))])
+        A = nucs.pop('A', 0)
+        T = nucs.pop('T', 0)
+        C = nucs.pop('C', 0)
+        G = nucs.pop('G', 0)
+        N = nucs.pop('N', 0)
+        others = '|'.join([':'.join((k, v)) for k, v in nucs.items()])
+        return '{sname}\t{sstart}\t{send}\t{A}\t{T}\t{C}\t{G}\t{N}\t{others}\n'.format(sname=s.name, sstart=s.start, send=s.end, **locals())
     elif args.transform == 'transposed':
         return '{name}\t{start}\t{end}\t{seq}\n'.format(name=s.name, start=s.start, end=s.end, seq=str(s))
 
@@ -133,25 +147,33 @@ def main(ext_args=None):
                                      epilog="Please cite: Shirley MD, Ma Z, Pedersen BS, Wheelan SJ. (2015) Efficient \"pythonic\" access to FASTA files using pyfaidx. PeerJ PrePrints 3:e1196 https://dx.doi.org/10.7287/peerj.preprints.970v1")
     parser.add_argument('fasta', type=str, help='FASTA file')
     parser.add_argument('regions', type=str, nargs='*', help="space separated regions of sequence to fetch e.g. chr1:1-1000")
-    parser.add_argument('-b', '--bed', type=argparse.FileType('r'), help="bed file of regions")
-    parser.add_argument('-o', '--out', type=argparse.FileType('w'), help="output file name (default: stdout)")
-    parser.add_argument('-i', '--transform', type=str, choices=('bed', 'chromsizes', 'nucleotide', 'transposed'), help="transform the requested regions into another format. default: %(default)s")
-    parser.add_argument('-c', '--complement', action="store_true", default=False, help="complement the sequence. default: %(default)s")
-    parser.add_argument('-r', '--reverse', action="store_true", default=False, help="reverse the sequence. default: %(default)s")
-    parser.add_argument('-a', '--size-range', type=parse_size_range, default=None, help='selected sequences are in the size range [low, high]. example: 1,1000 default: %(default)s')
-    names = parser.add_mutually_exclusive_group()
+    _input = parser.add_argument_group('input options')
+    output = parser.add_argument_group('output options')
+    header = parser.add_argument_group('header options')
+    _input.add_argument('-b', '--bed', type=argparse.FileType('r'), help="bed file of regions")
+    output.add_argument('-o', '--out', type=argparse.FileType('w'), help="output file name (default: stdout)")
+    output.add_argument('-i', '--transform', type=str, choices=('bed', 'chromsizes', 'nucleotide', 'transposed'), help="transform the requested regions into another format. default: %(default)s")
+    output.add_argument('-c', '--complement', action="store_true", default=False, help="complement the sequence. default: %(default)s")
+    output.add_argument('-r', '--reverse', action="store_true", default=False, help="reverse the sequence. default: %(default)s")
+    output.add_argument('-y', '--auto-strand', action="store_true", default=False, help="reverse complement the sequence when start > end coordinate. default: %(default)s")
+    output.add_argument('-a', '--size-range', type=parse_size_range, default=None, help='selected sequences are in the size range [low, high]. example: 1,1000 default: %(default)s')
+    names = header.add_mutually_exclusive_group()
     names.add_argument('-n', '--no-names', action="store_true", default=False, help="omit sequence names from output. default: %(default)s")
-    names.add_argument('-f', '--full-names', action="store_true", default=False, help="output full names including description. default: %(default)s")
-    parser.add_argument('-x', '--split-files', action="store_true", default=False, help="write each region to a separate file (names are derived from regions)")
-    parser.add_argument('-l', '--lazy', action="store_true", default=False, help="fill in --default-seq for missing ranges. default: %(default)s")
-    parser.add_argument('-s', '--default-seq', type=check_seq_length, default='N', help='default base for missing positions and masking. default: %(default)s')
-    parser.add_argument('-d', '--delimiter', type=str, default=None, help='delimiter for splitting names to multiple values (duplicate names will be discarded). default: %(default)s')
-    matcher = parser.add_mutually_exclusive_group()
+    names.add_argument('-f', '--long-names', action="store_true", default=False, help="output full (long) names from the input fasta headers. default: headers are truncated after the first whitespace")
+    header.add_argument('-t', '--no-coords', action="store_true", default=False, help="omit coordinates (e.g. chr:start-end) from output headers. default: %(default)s")
+    output.add_argument('-x', '--split-files', action="store_true", default=False, help="write each region to a separate file (names are derived from regions)")
+    output.add_argument('-l', '--lazy', action="store_true", default=False, help="fill in --default-seq for missing ranges. default: %(default)s")
+    output.add_argument('-s', '--default-seq', type=check_seq_length, default='N', help='default base for missing positions and masking. default: %(default)s')
+    header.add_argument('-d', '--delimiter', type=str, default=None, help='delimiter for splitting names to multiple values (duplicate names will be discarded). default: %(default)s')
+    header.add_argument('-e', '--header-function', type=str, default='lambda x: x.split()[0]', help='python function to modify header lines e.g: "lambda x: x.split("|")[0]". default: %(default)s')
+    header.add_argument('-u', '--duplicates-action', type=str, default="stop", choices=("stop", "first", "last", "longest", "shortest"), help='entry to take when duplicate sequence names are encountered. default: %(default)s')
+    matcher = header.add_mutually_exclusive_group()
     matcher.add_argument('-g', '--regex', type=str, default='.*', help='selected sequences are those matching regular expression. default: %(default)s')
     matcher.add_argument('-v', '--invert-match', action="store_true", default=False, help="selected sequences are those not matching 'regions' argument. default: %(default)s")
-    masking = parser.add_mutually_exclusive_group()
+    masking = output.add_mutually_exclusive_group()
     masking.add_argument('-m', '--mask-with-default-seq', action="store_true", default=False, help="mask the FASTA file using --default-seq default: %(default)s")
     masking.add_argument('-M', '--mask-by-case', action="store_true", default=False, help="mask the FASTA file by changing to lowercase. default: %(default)s")
+    output.add_argument('--no-output', action="store_true", default=False, help="do not output any sequence. default: %(default)s")
     parser.add_argument('--no-rebuild', action="store_true", default=False, help="do not rebuild the .fai index even if it is out of date. default: %(default)s")
     parser.add_argument('--version', action="version", version=__version__, help="print pyfaidx version number")
     # print help usage if no arguments are supplied
@@ -162,12 +184,19 @@ def main(ext_args=None):
         args = parser.parse_args(ext_args)
     else:
         args = parser.parse_args()
+        
+    if args.auto_strand:
+        if args.complement:
+            sys.stderr.write("--auto-strand and --complement are both set. Are you sure this is what you want?\n")
+        if args.reverse:
+            sys.stderr.write("--auto-strand and --reverse are both set. Are you sure this is what you want?\n")
 
     if args.mask_with_default_seq or args.mask_by_case:
         mask_sequence(args)
     else:
         write_sequence(args)
 
+    
 
 def check_seq_length(value):
     if len(value) != 1:


=====================================
scripts/benchmark.py
=====================================
--- a/scripts/benchmark.py
+++ b/scripts/benchmark.py
@@ -48,6 +48,10 @@ def make_long_fasta(filename, nrecs=250, seqlen=SEQLEN):
                 f.write(line)
     return headers
 
+def bgzip_compress_fasta(filename):
+    from subprocess import call
+    call(' '.join(['bgzip', '-c', filename, '>', filename + '.gz']), shell=True)
+
 def read_dict(f, headers):
     for k in islice(headers, 0, None, 10):
         for start, end in intervals:
@@ -89,7 +93,10 @@ def read_samtools(f, headers):
 def main():
     fa_file = NamedTemporaryFile()
     index = fa_file.name + '.fai'
+    bgzf_index = fa_file.name + '.gz.fai'
     headers = make_long_fasta(fa_file.name)
+    bgzip_compress_fasta(fa_file.name)
+
     def pyfaidx_fasta(n):
         print('timings for pyfaidx.Fasta')
         ti = []
@@ -113,6 +120,29 @@ def main():
         print(mean(tf)/nreads/10*1000*1000)
         tracemalloc.stop()
 
+    def pyfaidx_bgzf_faidx(n):
+        print('timings for pyfaidx.Faidx with bgzf compression')
+        ti = []
+        tf = []
+        for _ in range(n):
+            t = time.time()
+            f = pyfaidx.Faidx(fa_file.name + '.gz')
+            ti.append(time.time() - t)
+
+            t = time.time()
+            read_faidx(f, headers)
+            tf.append(time.time() - t)
+            os.remove(index)
+        # profile memory usage and report timings
+        tracemalloc.start()
+        f = pyfaidx.Faidx(fa_file.name + '.gz')
+        read_faidx(f, headers)
+        os.remove(index)
+        print(tracemalloc.get_traced_memory())
+        print(mean(ti))
+        print(mean(tf)/nreads/10*1000*1000)
+        tracemalloc.stop()
+
     def pyfaidx_faidx(n):
         print('timings for pyfaidx.Faidx')
         ti = []
@@ -276,6 +306,7 @@ def main():
     n = 3
     pyfaidx_fasta(n)
     pyfaidx_faidx(n)
+    pyfaidx_bgzf_faidx(n)
     pyfasta_fasta(n)
     pyfasta_fseek(n)
     seqio_read(n)


=====================================
tests/data/chr17.hg19.part.fa
=====================================
--- /dev/null
+++ b/tests/data/chr17.hg19.part.fa
@@ -0,0 +1,2 @@
+>chr17
+AAGCTTCTCACCCTGTTCCTGCATAGATAATTGCATGACAATTGCCTTGTCCCTGCTGAATGTGCTCTGGGGTCTCTGGGGTCTCACCCACGACCAACTCCCTGGGCCTGGCACCAGGGAGCTTAACAAACATCTGTCCAGCGAATACCTGCATCCCTAGAAGTGAAGCCACCGCCCAAAGACACGCCCATGTCCAGCTTAACCTGCATCCCTAGAAGTGAAGGCACCGCCCAAAGACACGCCCATGTCCAGCTTATTCTGCCCAGTTCCTCTCCAGAAAGGCTGCATGGTTGACACACAGTGcctgcgacaaagctgaatgctatcatttaaaaactccttgctggtttgagaggcagaaaatgatatctcatagttgctttactttgcatattttAAAATTGTGACTTTCATGGCATAAATAATACTGGTTTATTACAGAAGCACTAGAAAATGCATGTGGACAAAAGTTGGGATTAGGAGagagaaatgaagacatatgtccacacaaaaacctgttcattgcagctttctaccatcaccaaaaattgcaaacaaccacacgcccttcaactggggaactcatcaacaacaaacttgtggtttacccacacaatggaagaccacttagcaacaaaaaggaccaaactcctggtacatgcaactgacagatgaatctcaaacgcattcctccgtgtgaaagaagccggactcacagggcaacacactatctgactgtttcatgggaaagtctggaaacggcaacaccattgagacagaaaacaggtgagtggttgcctggggccagggaactttctggggtcatattctctgtgttgattctggtggtggaaacaagactgtCccagcctgggtgatacagcgagaccccatctctaccaaaaaattaaaaattagctgggcatggtggtgcatgcctgtagtcccagctattcacagtgctgaggtgggaagatgcttgagcccaggagttcaaggctgcaatgagctatgattgcgccactgcactttggcctggacaacagagcaaaaccctgtctctaaaaaaagaaaagaaaagaaaaaCTCACTGGATATGAATGATAcaggttgaggatccattatctgaaatgcttggaccagatgttttgaattttggattttttcatattttgtaatctttgcagtatatttaccagttcagcatccctaactcaaaaattcaaaaatctgaaatcccaaacgcgccaataagcattccctttgagcgtcatgtcggtgcttggaatgtttggggttttggatttacagctttgggacgctcaacctgTACCTCAATAAACCTGATTTTAAAAAAGTTTGGGGGGATTCCCCTAAGCCCGCCACCCGGAGACAGCGGATTTCCTTAGTTACTTACTATGCTCCTTGGCCATTTCTCTAGGTATTGGTATATTGTGTCTGCTGTGAACTGTCCTTGGCCTGTTTGGTGACGGGTGAGGAGCAGGGACAGAAGGGTCCTGCGTGCCCTGCCTTCACAAGCCCCTGGAAGGAAAGTTGTTTTGGGATCTCTGCACCCTCAGCCTGGACAACTTGTGCCCATCTGGTGACCCCTCACTCAGCCACCAGACTTCCACGACAGGCTCCAGCCTCGGCACCTTCAGCCATGGACAGTTCCGCCAGCGTTGCCCTCTGTTCTGCTGTTTTCTCTACCAGAAGTGCCCTTCCCTCCTCACCTGACCACTCTGGGGAAATCCCTCAGCACCCTCCCTGAGCATACCCTACTCTGGCACAAGCCCACCCTGCAAAGCCCCTGAGGCCCGCCCTGTGGCGTCTCTCCCTCCCTTGCTGTCAGGACAGTGGTCCTGGCCACCGGGGCTCACGGAGCCGCCCTGTGCCGTGTACCTCTGAGCCCTCTGCACAGTGCCTTCTGCTTGCCTGTGGCTTTGAGAAGAaaccccttctggttatacataagacagccagagaagggagttgcccagggtggcacagcacgttgctgccagTTACTGCCATTTTCACGGGCATGAAATGGAGATAACAACAGGAGCGACCGCACAGGCTGCTGAGCGCGTCACACGCAGCCATCGCGCAGCTCAGGGATATTACGTGTAACTCGACATGTCAGCGATTGTCACAGGCACTGCTACTCCTGGGGTTTTCCATCAAACCCTCAAGAGCTGGGCCTGGGGTCAACTTCCGGCCTGGGGAAACTGGGGCAAGTATCACCAGAGATGAGCTTTATAAAAATAATGGTGCTAgctgggcatggtggcttgcacctgtaatcccagcactttgggaggccgagctaggaggatcgtttgagtccagcagtttgagaccagcctggccaatacggcaaaacccagtctctacaaaaaatacaaaaaacaactagccaggcgtggtggtgcacacctgtagtcccagctactcaggaggctgagggggaaggactgcttgagcccaggagtttgaggctgctgtgagctgtgatcgcatcactgcattccagcccggtgacagagtgagtcactgtctcaaaaaagaaaggaagaaataaagaaaacaaATAAAAATAATAGTGCAGACAAAAGGCCTTGACCCATCTAGCTTTGGCCCTCAGCATCAACCGCTAGATACGTCCCTCCCTTTCTTCTGGGGCACAGGTCACACTCTCTTCCAGGTCTAGGATGCAGCTGAGGGGTGCCCCTCTTACCATCTAATCTGTGCCCTTATTTCCTCTGCTTTAGTGAGGAAGAGGCCCCTGGTCCATGAAGGGGCCTTTCAGAGACGGGGACCCCTGAGGAGCCCCGAGCAGCAGCCGTCGTGTCTCACCCAGGGTGTCTGAAACAGATGTGGAGGTCTCGGGTGAGGCGTGGCTCAGATACAGGGAGTGGCCCACAGCTCGGCCTGTCTTTGAAAGGCCACGTGACCTGGCCCACGGCTGGCAGGTGGGACCCAGCTGCAGGGGTCCAGCAGCACCCACAGCAGCCACCTGTGGCAGGGAGGAGCTTGTGGTACAGTGGACAGGCCCTGCCCAGATGGCCCCCCGCCTGCCTGTGGAAGTTGACCAGACCATCTGTCACAGCAGGTAAGACTCTGCTTTCTGGGCAACCCAGCAGGTGACCCTGGAATTCCTGTCCATCTGGCAGGTGGGCATTGAAACTGGTTTAAAAATGTCACACCATAggccgggcacagtggctcacgcctgtaatcccagccctttgggaggccagggtgggtggatcacttgaggtcaggagttcaagaccagcctggccaacatggtgaaaccccgtctactaaaaatacaaaaattagcctggcgtggtggcgcatgcctgtaatcccagctacttgggaagctgagggatgagaactgcttgaacctgggaggcagacgttgcagtgagctgagatcacgccactgcactccagcctgggcaacagagtaagactctgtctcaaaaaaaaaaaaaTCACACCATTTTGGCTTCAGATTGCATATCCTCCTGCAAGGATATATACGCGTGAAATTCAAGTCAATGACAAATCAGAAGAAAAAACATATATATACGCAAACCAGTATCCTACTGTGTGTGTCGTTTGTTGTGTTTTCGACAGCTGTCCGTGTTATAATAATTCctctagttcaaatttattcatttttaacttcatagtaccacattctacacactgcccatgtcccctcaagcttcccctggctcctgcaaccacaaatctactctctgcctctgtgggttgacctattctggacacgtcatagaaatagagtcctgcaacacgtggccgtctgtgtctggcttctctcgcttagcatcttgtttccaaggtcctcccacagtgtagcatgcacctgctacactccttcttagggctgatattccaCGCACCTGCTACACTCCTTCTTATGGCTGATATTCCACGCACCTGCTACACTCCTTCTTAGGGCTGATATTCCACACACCCGCTACACTCCTTCTTAGGGCTGATATTCCACGCACCCGCTACACTCCTTCTTAGGGCTGATATTCCACGCACCTGCTACACTCCTTCTTAGGGCTGATATTCCACGCACCTGCTACACTCCTTCTTAGGGCTGATATTCCACGCACCTGCTACACTCCTTCTTAGGGCTGATATTCCACGCACCTGCTACACTCCTTCTTATGACTGATATTCCACGCACCtgctacactccttcttagggctgatattccactgcagggacagacttcatttgtgtatccattcatcagtggatggacacttggggtgtttccacttttggctgttgtggatagtgctgctatgaacattcctgcacaagttttaggatggacatgtttttcttatctcttgggtatataacaaggagtggaattgccagatcaaatggtgattctgtgtttaactttctgaggaactatcagctgcttcccaaagtggccatcccattattctcaTAttatttttatttgtttattatattttgagagtgtctcgctctgtcaccctggctggagtgcagtggtgtgatctcggctcactgcaatctccacctcccaggttgaagtgattttcctgcctcagcctcccaagtagctgcgattacaggcgcccgccaccacacccagctaatttttttatttttagtagagacgggtttccccatattggccaagctggtctcaaactcctgacctcaggtgatccgtgcgcctcggcctcccaaagtactgggattacaagcacgagccactgcactcagccTAATTTATTACTATTTTTAACTGTAGAGACAAGGTCTCAGTATGTTGCCAAGGCTGGTctcaaattcctgggttcaagcaatcctctcaagtagttaggactacagggacatgccactacaccaggctaatttttaatttttttatagagatggaggtctcactatgttgcccaggatggtctcaaactcctagcctcaagcaatcctcctgccttggcctcccaaagtgttcagaatgtaagtgtaactcactgcacctgCCACATACAATTTTTAAAGTGACAGAAATATGATCATGGCCATGGGAATGGTGCCCCTAGAGCTGTGCCACGAGGAGTACTGGCCTCTTCATGGTGCCAAGATGTCCCTGAGGCCTTAGTCACCTGGGTCCTTGGTGTCCCCTAGGTCAGGGCCATCCCTCTGTCATTCCCCCTCCCTGAAGCACCTGCCCCTCCTTTCTGCTGAACTAAATTCCTCCCCAAGTCTCAGTTTTCCAGAGTCTCCCTGTGAGCTCACACTCATACCTACCTAGTTTCTGAAGAGCCCCGAGCACTGACGGTAGTCACTGTGGCACCTGTAGCACCCTCTCCAAAGGGTCGCCAGCTCCTGCCTGGCTCTCAGAGCTACACAGCCTCTCCTGACCAGGCTCACAGCTCCACAGCTCTGCGGCCCAGGCCACCTGGCATGGCCCCCTGAGTTAGTCTCTCCCCAGGCCCACCATCAGCCCTGTTGGTAGAGCTGGGTGGACTCTTATCCGCATCTGTAGCACCATTATAGGGCTGGGCAAATGTGGGCAGTTGCAAAGGCCTGGCAGAGTTCCTCTGCATCCTCCCCCAGCCTCCTGGCTACCCCCGGCACTGGCCCCGCCTTCTGTCCCCTCCTGCAACTCATGGCCCCTCCTGGGCCCCTCAGTCACAGAGAGGCCTGGACATAGCATCAGGTGATAACtaatacctgcatgaccctgggaagccactcagcttctctgtccctcagtttccccatctgtgaaatgggctggccatgcttaacccctggagttgCCAAGGTAGCCCATCAGGGAACACAGCGCCCCTGTACCTCAGGCACTCCCTGGGTGCCTGCCACCTGGGACCACGGAGCCGGCACACGGACCCCCGTCCTTGGAGGTGAAGACGTGGCAGGTGGTCACGCGCACGGCACACTCACGTTTTCACGTAGGGGTCCGAGTAGCCGTTGGCGTCCATGGCGGCCAGGTGGGCGCACCGCACGATGCCTACCAGCAGGCCTTGCTTCTGTGAGCTGTACTTGAGGGAGATGAGGATGCGGCCCCGCTCCTCCAGGGACTTGTCTTCAGTCTTGTCCACCTGTTGGACGGGACGGTCACTCAGTCCTCACCTGCTCCCACCCCTCTCTTGGCCGTCCTTGGCCTCCTCTTCCTGAGCCCGCCCATCCGGCTGCTGCAGCCGGGCCTGGTCACGGTCCCTGGTGAGTGGCCTCACTGTGAACTCACAGCCCTTCTGTCATCTCTTCCCTCCCAAGGGCTCTTCCAGGGCTGGATCTGACCCACACCCTCCCTGTTCTTGGCTGCATGTGGCCTACGGTGGCCTTCACAGCCCAGCAAGGGCCAGCCcaggggttggcaactgcagcccgggggccagatcaggcccgacgcctggctctgtgtgggatgtgagctaagcatggctcttacccttatcttaaacaatttctttttcaaaaaaatagagacaggggtctcactattttgcccaggctgatctcaaactcctgggctcaagggatcttcccaccttggcctcccaaagtgctgggattacagacatgagccaccgtgcccagctggttcttactttttggaatggcagaaagaaaaatgaaatgaaaaatattttgtgacacatgaaaatgacatgaaattcacacttcagcttccattaacaccgtgttgttggaacgcagccttgccagctccgtgatgcttctctatggctgcttttgccctaaggacagagctgcatggtggccacagattgcgaggcacacagagcctaacattagcgctaagcggcccttcacggGTCTGCAGCCCCCGGCCTGGCGCGCTCGGCTTCCGCGGACGGTCTTCCACCAGGTCCTCCTCACAGCCCGCCCGGACTTTCCCTTGGCTTTGAAGCCGGGTCCCACCTCCACACCTTTGCGCACCCTTAATCCACAGCTCCAGCACCCTCCTCTGACCTCCTCAGTGTTCACCGTTGATAAATTCCTGGTCCTTCTGGGCCTGGGGGGCTTTTCTGACCCTCAGGGTGGGTTGGCCGTCCCCTCTGCGCGCCCACTGCCTCGGGTTTATCCTGTCATGGCCTGGTGTGGCCCCCTGCTTCCTCAGGGCTCTGCCCCAGCCTCTAAGTTCCTTGAAGTTGGAACTCTACCTGTAAAttttgattttgacagagtctgactcttgtcacccaggctggagtgcagtggcatgatcttggctcactacaacctctgcctcccgggtttaagcgattctcctgcctcagccttctgagtagctgggactacagtctaatttttctattttttggtagagacagggttttgccatgtctgccaggctggtctcgaattcctgacctcaagtgatccacccacctcagcctcctgaagtgctgggattacaggtatgagccaccgcatccagATAATACCTGCGTCATCTTTAATACCTCAGGGCTGGGCAAGGCCCTGGCACTAAGGGCCCACTGGAGGCTTGTACTAGTGAAGGCAGGAATGGAGGGACAGATGGTCCCCTGTGCGTTCAACCTCATACAAGCAACTCCAGCCTGGAGGTTCAAAGAAAGGCAGAAAGGTCTGCCCATGACAATGGAGCCTGGTGGATGGAAAAGGGTCACCGTGGCCCCAGTTCCCTGAAGGTGCTACCTGGAGCCTCTCAGGGTGCTGGATGTGGCTCCCTCAGGAGAACCCCGAAGACAGAGTTCTGGTACATTCCTCACCCTGGAGAAGCTGGGAGCCAAGTAGGGAGCCCATCCAGTGCCTTCCCTGCCTCAGCAGCTGCCAGTGCCTGCCTCTCAAGGCAGATCTCAGCTCCAGGCCTCCCCATCCCCAGCCAGCCTGATGCTTCTCCATTCCTTGCCCTCCCGAGACTACCCGGCTTCTTCCTTCTGGCTGCTGCACACCCCAAACCCTCTCTCCTAGGCCCTCAGCCCCCCCAGACAAACCCAGCTTGAGCCCACCCACACACGCCCAGCTTGAGCCCTGTGTCCCCCTGGATCCCCTTAGGACTCTACCAGGCTCCTCTCCCAAGCAGCCCCAGCCACCTCCTTTTGCCCACAACCGTGCCCTGCCTAACACCAAAGTGAGTCCCTCTGGCTACATCTAACCCTTTCTGGAGATGAGACTCCTGTGAGGTGAGCAGGGATGTTGAGcagggaagacatacatgccccaccagccaccttcgccacctcccacgcccagagcagacatggctaatcgatcctagcatttgccctaggctccaatgccacctcagaatccttttcaacacagtgctcaggcagccattcctggtggatcaggccaggcctgcgagatCTCGCTATCTGCCGCTCAGGCAGAGCTTTGTGTGTGAGGGCCTTGATGCCCTGTGCTGCTTGTGTCAGTGTGTGTGGTGTGAAAATTAAACAATATAAACTAGCAGGggccaggcacagtgactcatgcctgtaaccccagcactttgggaggccaaggcgggcagatcacctgaggtcagcagctcaggaccagcctggccaacatggtgaaaccccatctctactaaaaatacaaaaattagccgggtgtggtggcgcatgcctgtaatcccagctattgggaggctgaggcagaagaatcacttgaacccgggaggcggaggttgaagtaagccgacattgcgccactgcactccagcctgggcgacatagactccatctcataaaaaaaaaaaaaaaacaaCCAACCAGCAGGCATATTTTTAGCTCTTTTTTCAGGGGTGGGACATTGTACTTCGGGTTGTTTCATATGAAGACCACTGGGTCTTGCTCAGTATTGACTTAAAACAGATAATGTTCGCAGACTGTAAATTCTAAAACCTACCGCCAGAggcctggcacaggggctcatgcctgtaatcccagcactttgggaggccgaggcaggaggatcactggaggtcaggagtttgagacttgcctggccaacatggcgaacccccgtctctactaaaaataccaaaattagccaggcgtggtggcgcacacctgtaatcccagcactttgggaggctgaggcaggtggatcattcaaggtcaggagtttgagacacctggacaatatggtgaaaccccatctctaccaaatatacaaaaattaaccaggcgtggcggcacacgcctgtagtcccagctactcgggaggctgaggcatgagaattgtttgaactcaggaggtagaggttgcagtgaacagagattttgccactgaactccagcctggatgacagagcaagactcagtctcaaaaaataataataataaaaGTACCACCAGAATGTGGCTGTACTGTCAGGGGTGCATCCCCAGCTGCACTCCTGCGGTCACTGTGAGTCCCTGAACGGCACCAATGGGCCGGTAGCGCATCCAGCAACGCCCTGATCATGGCcacgcacagggacgcacatgctttcacgaacgcacaccacacatgtggacacacacactgtcgcacacagacacgtactgacatatgctcttacacacaattcacacacgagcacacacacacacacgctgacaccccacgtacatacccacGTGGTTGTTTGTTTATGCCAGTGATGAAAACTCAGGAACACTAAGGCAGGGCTGGTGTTGCtttttttttttttttttgagacagagtcttgctcttgctcttgtcacccaggctggagtgcaacggtgcaatctcggctcactgcaacctccgcctctcaggttcaagcgattctcctgcctcagcctcctgagtagctgggattacaggcatgcacccccacacccggctaatttttttatttttagcagagacggggtttcgccatgttggccaggctctctcgaactcttgacctcatgacccacctgccttggcctcctaagatgttgCCTTTCTTAAGTGACATAGACCATGTGGAAAAACCGGGTTACCTGTGGTTAGTGACTAACAATAAAACAGGAAAGGTTATATCCATCACACAAATGTCTGAGGGGGAGAGAATGTGACAAGGAATAAAATTGGATCAAATTCTGCAAAAGTAACTGGGATTCTGGGAAGAAGCCGTGGCCTCAGGCTGACTCGCCCCCGGGGCTTGACTTGGGCTAAGCTCGAGGTGAGTCCACGTCCCCGGGCCCCACTGGGGCTGGGTACACTGGGGACAGCCGCCGGGCTCTGTCCTCCCAAACTTGCCCCTTGCCCAGTCCTCTTAGGGGGACAACGTGCCATCGAGGGGACCATGCCTCCGCCTGTGTCCTGAACGCTGGGAGGCTGAGGCCCCAGGATTTCTCTTGACCCCAGTGGCACGGGGGACTCCTGGCTTCACCAGCCCTATGAACCAGGTGAAGGTGAGGCCATAGACAAGGGAGGATGGGGGAGGGAAGAGGGACATAGAGACCAAGACTCAGAGGGCGTAGCTGCTGGAGCAGGCCGAGGGCAAATCTGTTCTGACATAACGTTGAGACAAATGCCATTTCTAGGAAAGGAtactctgctgtctcctctgcgtatctcacaggcactcaggtctaacatgttccaagcgtgctccttgcgcgtcctgtccacccgtggtccctgctgagtcctctgagtgcagcaaacagcccctcaggcttcagtggctcaggccccaaacctcggatctgcccttccctcacccaaggacgtcctatctgctgcctgcacatctggttcagaatcaagccctcctaccgctgccaaggtcaggccagggttgtgcccatctctccccatctccccaggcctcctgccctctcctccctcttcttcaagccatcctgagcccaggccagcgagcctggtaaaatgtcatcccccatgatccctcagcttagcaccctcccgtggccactcagagtgaaagccagggtccttcctcacctccacccccttgactctccatgctcacctccccggtctcccctcccctctcactctgcccctcATGAGTCCCATCACAGGCAGGAAGTTctgccttcccagcacctgccaccgagccaggtacacagcaggtgctcaatcaatCCTCTCACCGGCAGCTGCTTCTCCAGGCAGATGCTGAAGGTCTTGGTGTGGTTGGGTTTCAGCTTCTTCAGGGGCACACGTGTCTCCCCGATGAACTCATTGTGCCGGAATTTGTCCTCGTCACACACAGAGATCCTAGAGGGGGCGGTGGTGAGGGGCACAGCCAGTGCCTCAGACGCACTGGGCATGGTGGAGGTGTGCGCAGGTAGGGCCAGCCCTGGCTTCTCCTGCCCCAAGCCCTGCCCTGGTCTGGGGTGGGAGACGCACAAGATGCCTGGGCCCTGACAGGGGCAGAGTGTGGCACGATATCAGGCACTGTCCTCATGGACAAGTGTCCTCAGGTTGGAAGAGGGGACAGGAGAAGGCAGAACCAGTGCCAGGAGTAGCCAGGAGGCTGGGAGAGCCGGTTCTCTGGAGGGAACCACCTCCAGCACCCTGAGAGGCCCCAGGAAGCACCTTCAGGGGACTGGGGCCAGGGTGACCCTTTTCCACCCACCAGGCTCCATTGGCGTGGGCAGACCTTTCTGCCTTTCTTCAAGCCCCCAAGCTGGAGTTGCTGGCATGAGGTTGAACCCACAGGGGACCATCTGACCCTCCAGTCCTGCCTTCATTAGGAAAAGCCGGGTGGGAGTAGGGGTTGGGGAGGGAGCAGGCGGCCTGGGACCCTCACCCACCGCAGGGTCTTGCGGATCATGTCTTCATCTGTGATCCCGTAGTAAGTGAGGGTCTCGTTCCATGTGGGGTTCAGAGTGTTACGGAGAGTTTTTGTTCTGAGCTTATTTGCCTGGAGAAGAGAAAAATGATCTTATTAGCATCAAAGTGtgtatcaaacagaacaatggcccccagagatggccacgtgctcatcttggagcctgtgaatgtgttatcaacatggccaagtggactgaggctgcaggtggacttagggttggtaatgagctgacattagaatagggagattatcctagattgttgggtggcccaatgtggtcacagggttcttaaaagcagaagaatggacagagaagacagtcagggacgtcaccaaggaagggggccagagagatgcaatggggcccgctgtgaaggtggaggaaggggccacaagcccaggagggccgatggcctctagaagctggaaggagctaggaaactgtgggctccccgggctccagaaggaatgcagccttgccgacaccttgattttagcccagagagagaccactgctggacttctaacctgcagagcagtccgagagtaaacgcgctgctttaagccacgaagtttgtggtcatttgttgcagcagccgtaggaggctcatccaAGGAGGACCCCACCTCCAGCCCGATGCCACGGGGTAGGTTCTGCAGGAGGCCTGTGTGGAACTGGAGTCTCCTTCCCAGGGCATTCCCTGCCTTTGTGGACCGTCCCCTCTGCCTGGAACCCATCCCGTCCTTGGGTCTGCCTCGGGGGTGGCCTCTCCGAGCTGGAATATTGCTGCCACTCCCTCCTCCTGTGCCAGCGGCTCTGAGCTCCTTGTCTGAATCGGTCTGATGCCTGCCACACCCGgcccaggctgggaggtcagacagcctgggcttccaattccagctctgtggcagcatcaggttccccactgtggaaagggcacaggaatccctctctcccattgctttcaggagctgtttgtaaggagactgctctttataaaacactagggaaagtcctggGGACTTCCTACAACTCGGCAGCCATGCACTGCCGGCTCCAGTCCCACAAATGAAGGGTCACTGAGCACACTTCCCTGGTCATACTCGCCCTCGGCCCTCATATCCCTGAGCCCTTCTTGCAGCATAAGGGCATCAAGACCCTGTGTGGGGAGCCCATTTCTTCCCCAGGAGTAGGTGGTGGGGCACTGCCATTTCTCCTACAGTCCTGCCTTCCCTAGAGAAGGGGGAAGGGCCCCTTCTGGGGTGGTCCTCCTGGGCTGCTGTGGGCCCCAGGCCTACCCATTTAGGCCTCCACGTAAACTCAGACTCTGCCCCATGAAATTAAAAATAAAACAGCACTGAAGTGTAGGATGCACAGAAGGAAGTCTAATTAAGTTCTGACTTTCCATGGCTCAAGGTCACCTTGATGCTTTTATTTAACAGCAATTTTTCTCTCTCCTGTCCGAGATGCACCTGTTGTGCTGGTGCCCTGAGCACATGTGTGGTCGGACGCTTGGGTCATCCGGCGTAGGAATGAGGGGCAGATTctgcccacttcatcccccttgcggagagtcagggaacagcccatcaacacgttagatgctctgagaagtcctgtggtacagaaacctgtttaacaatgtttccccatgttatttaaccacagagtcctttgtttGCACCTAATAGTAACTTCCAGCTGAGAATGCTTCTTATAAAATGCTGGCCTGGAAATTCTCCAGGGGTTTCCCCAGATCAGCAGGTAGGTAACGCTGACTCGTGAGGTCCCCAGGAGGCAACAGGGTGTGGGGCAGACTGGGTGCTCTGTGTGCAGTTGAATGGCTGAGGTCCCATCAGCTTGCTTGGACTCTGAGGGGAGTGGCAGCTGTGGGCCTCCCTGGGTGCCCGCAGCCCAGTCCAGGCCCCAAAGCAAAAGGACCAGAGTGCATTGCCTGAGGTGGTGGGCGGGTGCAGCTGGTGGGCAGGCCTGGGTGGAACCCCTCACCTTACTGGCTCCTGGCAGCAGGTGCAGCTTGACGTAGGGGTCTGCCAGCCCATTGTGGTCCATTGGCTTCAGGCCCTGGGCAGAGAAGAGCAAACGGTGTGAACTGGAAATCGGGGACATGGAACTGACAGGGCCTGCAGATGCCCTTGTCCCCTGGGTCTCCTGGCTGGGCCCATTCATGGCCCCTTTCACAGAAGCCCCCGGGCCCCACACCTCCAGATGGAGGGAGTGAGGATGGCTCAACGCTGATGCAATCTCCCCTCCCCAAAACATCCTGGCCCAGGGCAGCTGTCATGGGGCCTCTGTCAGGACAGACTTTCTCTTGTCCTCATCCTGGAAGCCAGGTGGCTCCCCCTTCTCTGAGCCTCCCTCTCCACCCTGCTTTGCTCAGGCCCTGCCATCATCCTGACTCCCGTCTCCCATCTACCAAGCGTGGCCTCCTGAGTGGCCATGCCCCTCACTGACCCTAAATCCACATAACCCCGGGACACAGTCTGAGACTAGGCCCCATTTCTGGCCATATTTGCCACAACTCCCAGTCAGTCCACAGAGATCACACAAGCAGACTGGCCACAGAGGGTCCCTCCACAAGATGGCACCCCTCTCTCACTCTGGTTCTTTCAGAAAACCCAGCCACGCGTGCACAGCCAGAGGCACACAGAAGCATGGACAGAGAGGGGCTTTGCTGACCTCAGCAGGCTCTTGTCTCTGAAAATTGCATTTGTTTCCTTCTGGGATGGTGCACAGGGATGCAGGGAGGCAGCTGTGGCATCTGTGGGGGTGCAGCCGGCAACCCCAGGATAGTATTCAAAGGGCTTCTACCCAGCCGGGTCAGGGAGGCAGCTCGGCGGCTCACACCAGGCCCATCTCCAAGGGGAGGCTGGGGCTCCCTCCCAGGCCCACCCCACCTCGGTTTGACGGAGCTCTGGGGGATGCGGCAGGGGCTTCCAACCACCTACTTCCCTGCcactggcaaagggcagaggtttgggggcctggctggcttgggctccaggctcagctcctgttcaccagctgggaagtggatgagttactttgcacgctaagcctcagctttatcatctgttaaaggggttggccgggcacggtggctcacgcctgtaatcccagcgcttgagcccaggagttcaagaccagcctgggcaacacagtaagaccctgtctcttcaaaaaactaaaaaattagctgggtggggtggcatgtgcctgtggttccagcaactagagaagctgaggtgggaggatcgcttgaacctgggacacggaggctgcagtgaaccatgatcgcaccactgcactccagcctgggtgacagagtgagaccctgtctcaagtttaaaaaaaaaaaaaaaaaaagggcgggggtgtaatactcccaccttcctagggctggagtgagagggagggagactgtgtggacacagggcctggcaggcagGGTTTGCTCCCTTCCCTGCACGCCAGGCCCCCGTGAGACGCCGGAGAAGCGGTTGAGGTCTGCTGGGTGCATCTGAGCTGTGCCCTGCACTGTGGGGCTGCCTCTGGGACAGGGCCTGGCTCAGTGGCCCAACACGTGCTCAAGGTGACAAGGCTGCTCTTCCAAACCCAACTCCCTCACCCTGCCTTGCACCTTTCCCACCAAGTGCCTGAAGTGCAGAGTTCTGGGAATCTGGTGGCCTGGCCTGACCAGGCGGTGCCTGACTGCTGTGTCTTATTCTTACAGTCCTTATACATTTGTTTGTTGTAGCTTAAAATGTTTCCATTTGGGCCACGTTTCCTGACACGTTTCACTTACAAACCTTTGGGCCACCACATGTTCTGGCGTATGTCTTTCCAGTGGCTCTGGTGGTCCCACCTCCGTACCCCGAGACACCCGTGGGAGCTAGGCTGGGTCGAGGGGCGCTGTCATGCTGGGACGGGAGTGATGCCTGCTCTGTCCTTCAGACCCCGGGACTGTGACGAGGCCCCTGCTGCTGCTCTCTACTCCTCCACCCGGCAGTTCCCTGAGCACCCTGGGCAGCCACACATTTCGCCAAAAGCAAAAGGACCAGCAGGCACTGCCTGAAGCCCTGTCACTGTGCTGCCACCAACTTCTGTGCCCAAACAGGCAGCTTCCCTGGTCCTGACCCGGGGTTCCGCCAGTGCCTCCACCTTCTGTGGGGCTGGCCTCCCACGGAGCCTGACCTCTGTCCACGGAAGGGGAAGGTCGGGAGGCTGTTTCCAGGGCAGGGAGCTGCTAGTGGGGCCCTTGGGCACATGCTCCCCAGCTTGGGAGTTGGCAAGAGACATAAACTGATCCCATGGTTTAGGGGTGGGCCTCACGGGAGGtgacgtggtttggcggtgtccccacccaagtctcatctggagttgtactcccataattcctacaggttgtgggaaggacccagcgggagataactgaatcacaaggggaggtttctcccagactgttcttgacgtagtgaatacgtctcacgagatctgatgatctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACACTCTTCTTGACGTAGTGAATACGTCTCACGAGATCTCATGGTCTGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctgatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctgatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctgatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActgttcttgaggtagtgaatacgtctcacgagatctcatggtctGACActcttcttgaggtagtgaatacatctcacgagatctgatggcctgataaggggaaacccgttttccttggctctcattctctctcttgccaccaaacatgtgagaagtgcctttcactttcagccataactgtgaggtcttctcagccacgtgtaacggcaagtccaataaacctctttcctttataaattacacagtctcaggtatgtctttatcggcagcatgaaaatggactaatacaGGGAGGATGGGTGGAAGCGGCCCCGGGGGAGGCCCTGGCTGGTACTGGCACTGAGGGAAGAGATGGGGGGTCTGGCTTTGAGAGGAGAGCTTCTCCCCAAAAACCTAGCCCTGCCCCGCCCTGGGCCTCTCAGAGGCTGTTGCTGGTGAAGTGTTCGGAAGAGGAGCTTTCTAGTCtgaagtatcattcagcctgaaaaggaagttttgacacgtgctgcaatggggatgaagcctgaagacattctgcggagtgaaagaaggcagactcaaaaggacagatcccggggactgcagactcaaaaggacagataccgggagactgcacttacatgaggtccctagaatagtcaaatccatagagaaggaagccaaatggcagcctccccaggggccaggggaggaggaagggggagctgttgtttaatgggtccagtttagttttgcaagctgaagagggctctggagatgggtttcacagcagtgtgaaggtagaaggtacttagcacaactgaactgtacgctaaaaatggttgagCGCTGAGGGAAGAAGTAAAAAAAAGTGGTTGAGGTGGGAAATGTATATCTGTGTATTTTACCACAATAAAAATAAAAAGTCTCCCAGAACTGGTAgtgccaggggccacgtgttaactcatttaatgctcacaacaggcatgtagggcagggacaaccaaccctatttacagatgggcaaactgagactGACCCTTATAAGGGGGGACAAGCAAGGGTGCACCCCAGGGTGTCCAGCCCCCACCCTGGCCCTCCAGAGGCCAGCCCTCCTTCAGCTCACCCACCCTGGGCCCCTCCCCACACCCCAGCCCAGAGCCCCAGCTCTTCCCCAGCCTGCACCACCCCTTCCCTACAGAACTGGATTTACACAGAGAAGGAACTGGGCCTCCCACCCCCACTTCTGATACCTGAGGGATACAGCCCAAAGTGGacacacacttacatgtgtgcacgcacggtaccacacatgtacacacagagacacacatacagccatgtatgtgcatacacacaaacgcacCTGGAGCTGGGAAGGGAAGGCCCTGGTGTCTGGCATGGAGAGAGGAAGGGGTGGGCTTTGGCCAGAGTGGCCTGGCAGCCGGCACCTCTCCAGTCCCCAGGCCTGGACCACCTCTACAAAGTTGGACAGAGGGAAAGGAGGAAGGGTCTAGCTTGGTCTCTACCTTGGCACAGCTGGGATTTGACAAATGCTCAGTTCTGCTCCTAGGGGTGGGCTGGAGCCCCCGCCAGGCAGGGCTGGACATGCCCTGAGTCATAGCATGGGTGGTTCTAGAGAGGGCAGGGGTGGGATGGAGCGTGCAGGCCTCTCAGTGCCCTACCAGGGCCCTGAGGCTTGCGTGGATGGCACTCACACCTACCCATGGCAGTCCACATGTGGCCCAGGCTGGGCTGGGGGACAGCCTGGGGTGGCACGCAGTAGCCTGTCCTGCTGGGTGAGCATGCTGCCAAGGGCAGCCCTTGCTGCCAGGCTGGGAGGAGGGGCAGGGGGCCTGCAGGTTGGGAGGCTGGGTGGGGCCTGGGCCCAGGCAGCTCTGTGGGAAGCCGCTGGATCTGAGCTGGGCTGGCTCAGGCCCTTACATGGCACTACTAGGGAGACTCTACTGGCCATGCAGGCCCTTACCTTGCGAGAGAACATCAATTTTGGCACCTTCCTCCCACAGGGAGCAATGGGGTGAGGGGAAGGGAACAGGACAGTTGAGAACATGGAGCTGACACATGCTTGAGTGGCAGAGCCAGAGGGCAGCACCAGGGACCAGGCCAGGCTGCAGAGGGGAGCACCAGgggccgggccaggctgcagagggcagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgCAGAGGGGAGAGCACCAGGGGCCGGGCCAGGCTGCAGAGGGGAGAGCACCAGgggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgCAGAGGGGAGAGCACCAGGGGCCGGGCCAGGCTGCAGAGGGGAGAGCACCAGGGGCCGGGCCAGGCTGCAGAGGGGAGAGCACCAGGGGCCGGGCCAGGCTGCAGAGGGCGGCACCAGGGGCCGGGCCAGGTTGCAGAGGGGAGCACCAGgggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgcagaggggagagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgCAGAGGGGAGAGCACCAGGGgccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagaggggagagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgCAGAGGGGAGAGCACCAGGGgccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagagggcagcaccaggggccgggccaggctgcagaggggagagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagaggggagcaccaggggccgggccaggctgcagaggggagcaccagcggccgggccaggctgcagaggggTCCACAGGCACCCACAACCCCAGCCCACGTAGTGAGGCTCAGAGGGCCTCTGGGCTCAGGCCGTGGACACCCTGCCTGGAGTGGCATCGGCCTCCTACAGTGGCTCGGCTTCCAGGGTGCAAAGTGGCGTCCCCACTCCTCAGGGCGTCTGGGAGGCCTGGAGGCACCAGTACCCAACCCGCCCTCCCTTGGCCTGCAGGACAGACATCACCCTGCCCCTCTCTTTCCCTCTCAGCAGCCCCTCCCCAGGCTCGAGGGTCCTCGGTCCAGGCCTTCATCTTCCCATTCTCATCTGTTTCTTTGCTCCCCGCAATGCCTGACTGTCCAAGGCATTTCTTGGGGTTGGGTATTCAAGAAGGTTTAAAGAAGAATTCCTTCCTGGCCCCGCACCCCACACAGCGCAGACATCCAAAAGCCTGGACAGGAACCTGGGGAGTGGTGTGGTCTGGCCTCCCTGACCTAGGCCCTCTTGAGGACCCCGGGGCAGGGAATTTGGGGGCAGGCTGGCGGGGCCTACCTTGGCCTTGGTGATGGTGCAGTGGAGGGCGTTGTTCTCCTGGTCATACAGCAGGCTGAAGTCCAGCGTGCCCAGGGCAGCTGCGGACAGAGGAGGGCACAGGTCCCACCCTGGCCGCATCTTGGAGAGGCTTCGCCTGCCCCTGTGAGACCAGATGAGCCTGGCCTGGGCAGGTGCCACCGTTCCATGGCGgctgccaccaaccaagcacctgctctatgccagcccctcacccatcctcccagtcccacccaaccctgggaagtcacaataatctccccactttccagaggaggagctgagacccagagaggtcaggtgggtcactccagttcCCTGTCTGGCCCAGTGTGTGGTCCACCTGAGCTGGGCACCCAGCCAAAGGAGTTCCTGCCCTCCTGGTGCAACTGCCAGTCTGGGccccgtgcctcagtttccctcacctgtgaaatgTCACAAGGATCACACAAGGCGGAGGAGACGAGGCTTTGAGAGGAACAGGTCCTGGCCAGGAAGATCAGCTGATTTGCTCAACAGTCCCCCAGCCAACACACAGGACTGCAGCTCCTCTGTCTCTCTGGCCCGTTGGGAACCCCGAGCAGGCCGTGAGGAGCCAGCTGGGTCCTCATACTGCTGCCCCCAAGTCTCTCAATGGCAGTGGTAACTCCCAAAAGCCGGGGGGAGGGTGCAGCCATAATTGGGGGAGGTTGCAGCCATAATTGGGTGCAGCTGCCTCCCTCCCCGGGGGCAATCACTCATAGCAGCTGTGGCTTTCCATGCAGAAGCGGCTCCCAGCATGGAGGCCAAGGTGATGGTTGGGGCAGAGCTTGGAGATGATGGTGGGGGGCAGAGCTTAGTGGCAGTCCGCAGACAGCAAGATGCACATTCACAGATGGCTTCAGAAGCCCAGAGCCTGCTCCCAGGCTGCAGGGCTGGTCAAATGGTGTCACattccttcattatttaccaagtgtttacaacatgccagactctaggggatggacatctgtgaggcagagtccctatcccaaggaaagcacagcttaaggggaggaacaaatggaaataatttgccgcagtagagtctggtgagccgggagccatggggagcccatgtgaattgggaccaccagggaaggcttcctggaggaggtgatgcttcacctgagccatgagggatgagtaggagttggccaatggcaaagaggggctgggggaggtgacgattccaaccacaggccaaccagcaggtacgggcttagtggctggtgtgatcccggGATAAGGGAAGGCCAAACTTGAAAGACTGTGTTCTTCAGGCTCTGAGCTAGGCTCCCAACCTGGATCTAGATCCAATTACGGCCACACCTGAccaggcccatcatttacatctcttctgcggctgctttcctacttcagaagcagaccctacggcctgcacagctgaaaatatttactatccagcccttctttatagaacaagtttgcagatctctgGGCGAGTGCTGGGCCCAGGCATAACAAAGTTTCTTGGGTCTGCAGGGAAATTGGCTAATGAAAGGCCACGAGTGTGGAGTGTCCTTTATTCTGAGCTGGTGATGGGAGTGGCATTTGTCTAGATCATCCCCTTCTAACACGGGTCAGGCACATGTCTCTGTCTTCGCAgcagtgtggggtgcaagacctgccctctgacatcagtctcctcggttcaaatcccagcgtgcacttccctgctgtgtggccagcttatctaacgtctttatgcctcagtttactcatctgtgaaatggagacaacgatagtatccacctcacagcacagtgtgcctgcggtgtaggaggtgctcagtagatcattatagaaggagcagatCCCTATGGTGGGTTTTCTAGCTAACAGTACTGGCTTCTCTAAGACCCATTCTGGAATGAAACTGTCCTCAATTGGCTCATTTCCCTCCCTGCCTTTGGGAACAGAACATACAAATAGTCACTTAAATTTTTTGAATAACCTGAGTCATTGCTATGGCCACGATGACTCTTCCGTATGTCACGGTCATGTTCCAGAGGCAAAAGGGGACACGAAGCAACCCCTGCACCAGAGGGTCCCGGGTGGCCCTCTTTTCGGTTTTCTTCATTTTTCTGCCTCTTCCTTTGACATCAGCCTGAAACTCCTATTAGATTACTTGTCTGTGTTGGCGCAGACAGCTCTCACCTATTCAATAATTGTTTCTCCCATGAGACCAGGTGATGATACTTGTTGGAAGTGTGTGCAAACTAAGAGCCAACAGGCCTTGAGTGGCTggtgtggggctcatgcctgtcatcccagcagtttgggaggccgaagcagatggatcacttgagctcaggagttcgagaccagcctggccaacatggcaacaaaaactacaaaaatggccgggtgtggtggctcacgcctataatcccagcactttgggaggctgagacgggtggatcacttgagcccaggatttcaagaccggcctggccaacatggaaacacccatttctacaaaaaatacaaaaatagccgggcatggtggcgcatgcctgtggtcccagctactcgggaggctgaggtgggaggaccgcttgagcccaggaagttgaggctacagtgagtgtgattgtgccactgcatccaacctaggcgacagagcgagatcctgtctcaaaaaaataaataaaTACTAGGTCTTGAGTGAAGCCTGAGGCACGTTCTCCTGAGCAGGGTATGTGTGCAGCAGCAATAGCTGTAGCTGGCCTTCAGCTGCCTGCATGCTAGCTGCCATCTCCACACCAGGCAGGTGAGGCTGCTGGAAGGGAGGTGTATCCCACTGCTCCTGCAGCAATGAACACAGCCACAGCCCGAGATGCACCAATCTGTACTTTTGTAACAATATTCTCTGAAGCTGTTGGCTTTACATGATAGTAAATGGATCTGCTGTTCAGGTCAGGCCTACCTGGGgtttgttcccccaacgaattagccccctgagcacagggaccatgccttggccacacctgtgagaccaacaaacagcagatgcagacacacaTGGCTGGTCCATTCAAACCAACTTGCCCTCCAGGTGCTCCCAGGAGCTGGGATCTGGTGTGACACCCAAGTGTAAAAATGCACATTCTGATTTCTGCCTGTTTCCCAACCCCAGAGAAGGACAGCCAGAAGACAGAGCGGCTGCCTGCTCCCCCTGGGACACCCAGCTCCTGGAGGGGAGAAGCCCCTGCACTGGCTCTACAGAAACCCCTGTCCAAGGAGGGCAGCGATCTTCTCCAACTGCCTGGGGGGAAAACTGACACCCGTCACCACCTCCTGGGAAGGAACAGCACACCCAGTGTGTCACCAGGAAGGGACTGATGAATTTCTTTGAACTCCACTGGAAGGTCTGTAAGAAATGATAGTTTATAAATAGAAGCTCGGTCACAGTTTTGAGAGGCTGGTGAAGCTCTcttacggagcacatggcccaaactcttcatttccctttagagaaatgaggctcagagaggggaggggagctgctcagcgtggcccagcagacccaggcacagaatctaggcctcttcacttccatcctgatagtttctcccttaaagcctactggctcCCAAGAAGttttttttttttgagacagggtcttgttctgttacccaggctggagtacagtggtgcgatcacagttcagcctcccaggctcaaacgatcctctcacctcagcctcccaagtagctgggactacaggcgcccaccaccatgcccggctaacttaaaaaatatatttacaaaaaaagagacagtggtgttggtgcggggggtgcgggggggtaggtctcactacgttgctcaggctggtcttgaactcccgggctcaagcaatcctcccgccttcacctcccaaagtgctgggattacaggcacacacgggccactctgcccTGGTCCAAAaagcttttttttggccactttgtttagctgaaatcttaggaggatgcaactaagaaacaaaaagagagaaaagcagagttgctctaggggaaggagagtgggggctcagCCGCTGCCCTGGAGAGCCCTCCGAGTTGGAAAAGCACCTTGGGCTTGTGCTGGGCGTGAGGAGGGCCCTAAGGGCAGGAGAAGGTGCTTCTCTGTGACCCACAGTGCCCTGCCCTTGGAGCGCTCCAATCACATGACCCCACTGAAGTTTACATCTTCGGAAAAATAATGAGGGTTGGTCACTAATGGGCCTTTGAGAGTTGTCACCCGCTTGTCGATAATTACAGAGCTTTAATGGGGCTAATTCAGAGAAAATCAAGCTCAAGCCCCTGTTGGCTCCTCTAAGGCCGAGCACCCCCTGCTTGGCTGATGTAGACCTGACAGCATCATCATAACCTTGGTTTGCACCTGGACTTTCCTGACACCCTTGGGAATAGGAAATCTGCCCTGAttttttttttttttttttttAAGAAAAATGTCAGGCTCTCAAATACTTTAAAAAAAACAAAACAATTGTGGCAAGGACCATAGAAAATCATCAAGCTTGTGTTGAAAGCATTTTATAGTTGGGGAAACAGCCATTCAGCCAGGCCGGCACtgcaggggggacagagagaaatgagccatcatccctgcttcccaggagcttagagacgagtgTTAAGATTTTTATAAACATCTtcattcatctgctttaaattgttgaaaggttctccatcaactcctggactcagttcaacctccttggtgtggcattcaagaccttgtgtgatatgaccactttggcctcctcataggtctcttccccttctttgagccacttggctctactgttttgcctacactgggctgagtttctgacacttcaggacctgtgtacacagctgtccctctgcgaagagcacctccctttcctcttttccctctgtcagcccacctcctccctcctcccagtcttagggctgagccctctctgtgagtgacctccgcctgccagagctttccatctgggttcttgcctccctgtccctcgctgtggggtcacctgtttgtcttggttttgatgaacaccctcaggtcagaggttgcctctcagcattcttgtatctctgtgcctaacacagagcctgccacatagttggtgcCGTAACATCGAGTGCATTACAAATATCACAGGTCCAGGTAGAACATGACTGGGGCCAAAGAGGTTCTGGGCTCAAAGGAAGGGGCACCACAGCCCATGGGGGCTGGAATATGCCCCCTAACACGGAGAACCTTTGTGGTGGACTCAACCCCTTGTCCTTGCTCCAGACCTGCACAGGCATCGGTCCTGACCCGGTCCTGGGTGAACCACACAGGGCAggtttaacacgggcatgtgatccaattctggccaatgagacaaaaggacaggtctcctggggacttctgggagaggtttcctcgctcttaaactgagacatgagaaaaggaatcggtccttaccaattccttacccagtcaaggaccttattgttacatgtgatgcctggacctgcagcagccaccttggacctgagggTCCTGCTTGGCTGATGGAGACTGACAGCGTCATTCTGACTTTGGTCTGCACCTGAACCTTCCTCACCCAGACACCCTTAGGAATAGGAAACGGGCCTTGATTTTGCTCTACAAGGAAAAACGCACAGCCACGAGCAAAGATGGACAGAACCATCACCCCAACAGTGCTGAGGGGCTGCTGCATTGGTGCACCCTGGTGCAGCAAGACCTCAGCTGtcctcattgtttaagccactctgagttggttatctgtcacttgcaacccatggcactctaGTGATTACagctttgaagaatggactgggcttccacaggtgggtgccgggcaggagagcctctcaggtaggggcagcgtgtgGGGTGTGTGGCCCACGGCCCACCTGGAGGGCAGCACGCCACAGAGGCCACCCTAGGAGAACTGGCAGGGCCGACACTGGGTCCTGCTCCTCATTGGGACATTCACTGCCTTTCCTCCCTGTCATCGCCCCTCTTCGAGATCTGCTGGGAACATGCGAGTTATGTGCTTGTGCTTTGACATGACTTAATAAAGAAAGGGCTTTAAAAAGCCAGGAGAAAAGGCTGAGTAAATAAACAGCACTTCACTCTCTGCCTTGGACAACAGAGCAACCGCCCTCATTTCTTTTTGTCAGAACAGAGATAATCCAATAATTAGGGCAGCAAATAGCATAACCCTGGCTAAAGCCGTAATGAACCATCTGGCTTGAATCATTGAGGCTTATTCTAAGGATCTGGTTGTCACAGCCCAGAGGGGGGCACCGCCTGGGGTACTGGGTGGAAGGCAGTCCCACCCCAGAACCTGTTCCCCCACCATGGACAAGAGCCAAGATTTCTGGGCTTCTGGGCCTAGGATGTTTGTCAGCCTCCCATAGGATCCAAGCCTAGGGTGGGGCCCCCCTCGAAGGGGGCGACAGGGGCCCAAGCTGGCCCTAGCTCAGTGGCACACCGTCCTCCTTAGCTCCTCTGAGCCCGTCTTCAGCATTTCTTTGGCATGTCCTGATAATACGGGTGTGGCCAGCCCACAGTTCAAGCTGGCAAAGCTGCATCCAGCTCTGCCCTGCACTGCTGACCTCATGTGAGTCTGGCCCCTGCACCGCATGTATCTCAGGCCAGGGGCCCAGCCCTGCTGCCAGGACATGCTCCCTGAAGGGCTGCTGAGCTGATGGATTGGAGATGGGGCTGGCTGGCCTGAGTTTCTAGCTTCCTGGTCCCCTCCTTGCTTTGTGGGTTCAGAGACAGCAAAAAAAAGAAAAAGAAAAAGCCTCTTGGCCATTCGCCATGCAGAGGGCCCTCTGTGGCACCAGGAGGGTGGCCTGAAGCTCTGCCCTCCCCTCTTACTTCCTTGGTGCAGCAGCAGAGATGCCAGAAATGGGAACAACTTTCCCATCCGTTCTTCTGGGGGAGGCTTTGGATTCAGGGCAGCCAAAGCAGTTACTTGGGTCCCACTCAGGGCCCACCCAGGGGATCACCAAGTCCAGGGTGAGCTTCGCACAGGGTGCCAGGGGCAGAGGGGAGGGGGGGTGGATCTAGGGCACAGCCCTGAGGGCAAAAAACTCTTCCGACCCAATCTCCCCAAGCTGGCAGGAAAGTGGAGGGACAAGACTGCTCCCCAGCCCCCACGCCCCAGGGCAGGGCCTTCATGTGCCAGGCGCTGGCCCGAGGGCCGTGGTCCAGGCCTCTAGAAAGTGCAAAGCAGGCAGTTTCCCCTACAGGGGCCCTGCTCTAACCGGCCACTGCTGATGGGCCTCCCCAGGTGGGGCGATGGGGGGTCTGTGCCCCGGGGGCACTGGTAATCCCTACCTTCAGCTTCTGGTGGCACATTTGATGCTTGGGAAACTCCAGGCCCGCAGCCCACAGGCCCTGGTGAGTGCCCAGGCCAGGCGCAGACATCCCTGCTGCGCAGGGGAGGGGCAGCACCAGCCCTGGAGAAGGGCCACATCCCGGGAAGGGCTGGGGTTTGACGAGACGCTGGTTTTCCAGGTCTCAGTGACAAGTCTGGAGCCACAGCTGAGCTAGGAGGGGGTTCTCACATGCCATCCCCACCCCGCGCAAACCGACTCCTCACTGGACTGCGACCTCTTCCGGCCTCGGTTTCCCAGCCAGTCCCGGCTCGGGCCGGACAGGCACCCTCGGGGACGGGAAAAGGCGCCAGGAGCGCCCACCGGCCGGGCCTCGGTCCCGGGACTCCGGCGCTTGCCTGCTCCCGGGGGCTCAGGGCTCAGTCCGGGAGGAGGGGGAGCGGGCTGGGGGGCCCTCTCTGCCCGGGGGCCGCGGGCGTAACAGGTGGGCGAAGGTGCGCGGCCCTGGCGAGGGCTGCGGCGGGGTCCATGGACACCGGAGGAGGAAACGCCAAGGTTTTTCCAAAGGACAAGCGGCCCCGCGGTCCTCCTGGTCCTCTGCTCGCGCGCCAGCAAAGCAGCTGCGCTCTGCGGGCCGCCGGGACCACGCGGGAGGCCGGGCCGCTCCCAGCCTCGGGCCCCTCCCCAGCTCGCCCCAGCCCCGACCCTCGGCCGCGAGGCCCTCCCGGAGCGGCTGGCGAGCGGGGAGCGACCGCGCGGCCGGCAGCAACTGGTGTCTCCCCGGGACGCAGCTCCGCCCTTCCCGGGAACAAAAGCAGCCGCCCGCGCCGGAGCTCCGGGAGGGCGGGCTGGCAGGGAGGGGGCGCGGCGCCGGCTCCGAGGAACCCGGCCCCGGAAATGGGACACCCCCAGGGGGTGCCCCCGAACTTCCCTCCTGTCCGGCCTGGGTCGGGGGAGGGCTCGAGGCCGGTGGGCAGGGCGCGGAGAGCGCACCGAGTGCGCCAGGGGCCCGCAAGCCCGCGGCGGGGTTGTGAACCGAGGCAGAGCGCGAGCGCGCGAGGGGGACCGGCGGAGGGAAGCCGCGAGGCCGTGGGGGGGCCGAGCCCGAGCCAGGGGAGGGGGCGCGAAGTCGGCGCGTGGGAAACTTACTGCAGTCGTCCGACTCGTAGCCGTCGGCGTCCGGCTCGTCCTCCGGCGGCTTGGCTGGCGGCCGCGCggggctgggacccgggctggggcccgggctggagccgTAGGCTCCGAAGAGCTGGTCCACATCCTCGTCGTCCTcgcgggcgccgtcggaggggctgcggcggccggcaccggccacagccgggcgcgcgggggcgtccgggggtgcagcggctcggggcccggcgtccgggggcaggccccgcgggAAGCGGGGGAAGTAGTCGGAGATCTGCTTGATGGGACGGATGGGGCCGGGGCACACGTCGATGGCCATATGCTCCTGGATGCTGATGGTCGCCTTCTCCCCGCGCCGCCGGAGGGTCATGCAGGCAGCGccgccccgccccgggcgcggcccggcccggcgcgaccccggcccgggggcggctcagcaggcccggcggggcgcggcGGGGGCTGCGGGCATCGCCGGCCGCGCCCCCGGACGGCCCTGACTTGGCCGCTGCCCGCTCCGCTGCGGACGGCGCGAGCGAGTGCCAGAGGCCGGCAGGCAGGGGGCGGGCCCAGCCCGCGTCACCCGGCAGCAACCAAGCAGGGTGAGTGTGCGGGCTGCGCGGGCGGCGCGGAGCGGAGGGAGCCGCGCGGCGCCACACTCACTCGCACTCGCACTCACACCGGCGTGCACGCCGGCCCGGGACCCCGCGCGCGCACACTCGCGGCCAGGCAGGGCCGCCGGGCGCCTTCCGCTCATGCACGCCCGCGGCACAAGCTGGGAGTCAACCCGGCAGGGACCCGCAGGACGCGCACCCACACGTTCCCGCGCGTGCCAGCCCACGCTGGGCTCCGCTCGCTCTCTCGGGACACACGCAGGTGTGCAAGTGCACACATGCGTGTGCAGAGACACGTGGTGGAAGCATCCGCTCGCTCACACCGTAGGTCACACACGCAAACGTCTGCATGCGCACATGGTTGTTTTAGGAAGCTGTGACACAGTACACCCCCAATGCACAGGCGCGCACACCTGATGGAGCACACACACAGGTGATCAAGGGCACCCAGGGCACAGGCTTCCCTACCCCCAAGCACCCCTAACAAGATGCACAAACATGAGCCCATAGAAGTGATCAGGGGACCCTGGGCACAACTCCTTCTCTCCTCCCCCTCTAGCAGGACGTGGAGTCACACTCCTAGCATGTAGGGACAGGTTGCCTACACACGGGCAGGACATGTACACGCTCAACTGCACAAGGACACCGGGGCTCAGTCTGTGCACACATATTCTCAGGTCACATACAACACAGCACTCTAACCTCGGGTCAGTCCCTCACTTGGTCAGTGAGTACAGTCTGAAGGACCCCCCGGGGCAGGGTGGCTGAGAACCTGCACAGGGCCTGGTCAGGAGGGAGTGAGGGCAGGGCCTGGGCGGTGAGTGATGCAATAAGGCTGGGCGGCACGCTCCCCCCACCCCCACTCCTACTCTAGGCCTCCCACACGGTCAGATCACTAAACAAATCCCAGAGGGCCCAGCCCTGGCTGTCCGGCTTTCCGGGACCAGAGCTCTGTTGGGAACTGCTGCTGCTTGGACAGGTGTGTTCCCGGAAAGCCCTGGGCATGGATGGAATCCTGTTTACCCTCTGGTTTCCACTGATGTGTAAGACACTTAGCTTCTTAGTGGGTGCCTTTGGTGCATCTGAATGAGGGGCTCCAAGCCTCTCTTGTTCCCCCACCATAACCCCTGCAGAGTGATGGGGAGCAGAGGAAAGAGAGGcaaagccttggcctgtggcttccagctgcacagttctggcaggctatttgacctctttgagcctcggtttcctcatctatgaagtgaggctatttccaactgcacagccttgtggcaaggccccatccagcacagacaggtaagaggtgctcagCTGGCTTTCCCTTCTCCCTTCCTTTGGAAAGACAAGACTCATGGTGAGAAGTGATGAGAAATTCATGTTTTGTGGTGAGTTCTGAACTGGGTGGTGGCGGGCACTACCGGCCTTTGAAAACACTGGAGAAACACATGGCATATGTTgtactgcgccaagaacattcacagctgtcatctcattgattccttaaaccaccccacgatgtaggcagggcctgctgttcccatttcacaggggaggaaattagcgctcaggcacaaggatgtgtccagggtgactgctggctggcggcagagctgcgatgagagcccagtgtcctgactgttcgatgcttccactttccctcccttctcccttTCCTCCCCTCCACTACAGAGCTCAGGGGCTCAGAGCAGAGTTGGAAACACAGGTAAAACCTCGTTCCCAAAGCTCATCCTGAGGCTTCTTGGACAGGGGAAGCCCAAACTgaggaggaggggaaggagggaaaaaaaggaggaggaagaggaggacgggaggaggCCAAGAGCCTCAGGGGTTACAGTGGGAATGAACCAGCCGGGGTTCCCTAAAGATAGTCTGAGGTCCTGGTGGGAGAAATATTCAGCCTTCCAAGAGCCAAAGGCCAAGGAAAGGACAGAAGAGCCTGGAAGGGCAGCCTGGCACAGAGGGGTTCTCATTTCCAGGAATGCTTAAGGGGATTAATCCTTAAATAGGCACAGGGTGACTTTTTCTGTGCCAGAGGTGGCTGCATCACAAAATGATTACATAATGACGCAGGTATTAAAATACAGACGCTGAACTGTCATTTTATCATTCCTTTACTAAGGAGCTGGAGGGtgtcgtcatcatcttacagatggggaaactgaggctgcgggaggtcaagtgactagcaagaggcagactggagatgagacctggacgtcctgactGCTGAGCCTGCCAAGGCCCTGTACCTGTGTTCACCGAGAGCTGGCCGCTCTCTCCTGGCCTCTTGTCCTATGGGTCTGGTTTTTGGAGGATCGGCTCATGGCTCTTGGCTCTGCAGGAGCTGGCTCTTGGGGAGGGCTTTGAAGAAGTCAGGTGGAGGGCCCAGCCTCCTAAGCATGGAGCCAGGGAACCCAAGGATGCCCACTGGAGAAACACATGGCATGTTGGGGTCTACCCTTTGGCAGGTGGCAGAGTTGAAATCCACTCTAGGTCTGAATTTGCCCAGCTCTGAGCCCAGCTGAAATGGGGTAGGGCCTCCCCGAGGGATAGAAGTGGATAGAAGTGGTGTTATCTGGGCAAGTGTCCACTTTCtagaaagagcataggtcagggagtaggcagacgtagagtcaaaagtcaggtgtgcctcctcctaagcgtaccggactttgggcaagttgtttaacttaggccctgggtctcatctgcaatgggggagaataagaacaacgttgtgtgagttaaaggtgaggatgggtatacagtgcttagcacagtgcccagtgtgcaactggcactcagggaattgtgattctgttGCCCCTGCCTTCCTGGTGCAAACCGTCCCATTGCAAATCTTCCTGGTGCAAACCGTCGTGGTGCAAACCTTCCTGGTGCAAATCGTCGTGGTGCAAACCTTCCTGGTGCAAACCGTCCCAGTGCAAAACTTCCTGGTGCAAATCGTCCTGGTGCAAACCTTCCTGATGCAAACTGTCATTGCGCTCCCAGTGCCTGCCTTGATTTCTCTACCAGTGACATGTGGTTGGCTGCTCCCTGTCTGCTGCCAGGACCAGGTGAGAAATGGATGCACTTGCCAAGGCTGGGCGCTGGCTGGCATGTGTGGGCATCTCTAAGCAGTTGGATATGTCCAAAGGCTCATCAATCATGTTGCCTTCCATTCCCCATGCTGAGGTGGGCAGCTGGGCAGCTGGACCAGCCTCTGGCAAAGTTAAGTGGATGGAGTCTGCCCTTGGTCACAGTCCTCAGAAGCCCTGTGCCCTTGCCTCTCTGCCTTCTGCTTTTCCCAACAAGCTTCTGAGCTTTCTCCCAACTCCCCACAGACCCTCTCAAAGGCTCCTCCTGTCAAGTGAGATAcagtgggttctttaaaaaatggtccaggtatctgtagcactagcatgtctatttactttgtcactgaggctggagtgcagtggtgtgatcatggctcactgcaacctcgacctcccgaggctcaagtgatcttcccaactcagcctctcaagtatctgggaccaaatgcatgcaccaccatccccagctaatttttaaaatttttgtagagacggggtcttgccatgttaccctggctagtctcaaactcctggctttaaatgattttcctgccctggcctctcaaaatgctgagattacaagcgtgagttgccacacccagcccaggatgtctatcaataatgcagattccaggcccccatctcacacccactgactcagaatatgtgtgtgcacactcaggatgcatcttaacaagcgcccctgatgattctggtgcacagtgaaggttgagactcgctgGGTTAGAGGGTGCTAATGGTTTAACGGTGATTTTCAACCTGCTGGGCTGCCTCTATACTGTTCACATGTGTAACATATGCCCATCAGTGACATCTCTCAATATTTAATGATTCTTCACTGGGGAAGTGAGTGACTAACCCAAGGCAGGTGTTGGAATTTCCAGAGAGGTTTGGCAAAGCCACAGTGGGGAGTCTGATCTCCTCCTGTTTGGGTATTCTGACCTCTTTCCCGGTGGAGAAGtgttgggagcaggccccccaaaatctagccataaactggccccaaaactggccataaacaaaacctctgcagcactgtgacatgttcataatggccctaacgcctccgctggaaggttgtgggtttaccggaatgagggcaaggaacacccggcccgcccaggacggaaaaccccttaaaggcgttcttaagccacaaacaataccgtgagtgatctgtgccttaagaacatgctcctgctgcagttaaccagcccaacctattcctttaattcagcccgtcccttcgtttcccataagggatacttttagttgatttaacatctatagaaacaatgccaatgactggcttgctgttagtaaatacgtgggtaaatctctgttccgggctctcagctctgaaggctgtgagacccctgatttcccactccacacctctatatttctgtgtgtgtgtctttaattcctctagcgccgctgggttagggtctccccgaccaagctggtctcggcaGAGAAGGACCAGTTAATGGCTCTTCTTAGCGTAGGTAACGTGTGTTGTTGAGGAAATGCCCTCTGTGCAGCACTTGGCTGGATGTTCCTTGGTTGGCATCTTGGCTGGTGTCCATGTGCCCGGAGAAGAAGGGCCCTCTCTGACCCAGGCCTAATATGTGTCCCCCGTTTCCACCCTTCCCTGATAACCGGAGGAGATCTTGTCCTGCTGTGCTAATAGAACGTTCTCTGTATTAACAGAAAATTTTAACCCAACAGAGGTGCTTGGATGGAGGAAATCCAAACAATGTTGCTGGTTGATGGAGAGGGGCATTCCAAGGTCTCTTGTCCATTCATTTATTCATGAAGCCACATttcaacaaatatttattgtgcacctgccctgtaacaggccccgagctgtgatctgtagggggctcagtgataaatgaagcCCATTTTTCTTTAGATTCCAGATGATGGAGGGGAAGTCAGGAAGGGAGGTAGGAACCACCACTCTAGGGGTCCCTTGCCTCCTTCTTGGCATACGCTATGGCCTCGGCATGGAGAGCATGATGGAGCACTGATGCCAGCAAGTCAGCTCAACATTTGCAATATAGCGTGTGGCTGCTTAGAGGCCAACTTGATATATTAAGTGATATTAAAAATGCAGCCTATTTCTGATAATTTATGCGCATTCTCATAAACACATTACACTCGTCACAGTACATTTACGAGCACCAGGCCTACCAGTGGGGAAAAGTTAACACCTACCTAATGATGTTGTTTTGCAGATGTCAGATGCTCTGGTGCTGCAGTGATGGGCTCAATGGTGCAAAGAAATGTGAGTCTTTCTTCTATCCTGGGACACCAGAAGGCTGCCCAGGGCTCGGCAGAAGAACCAGGCTCTCTCCCTCCACACTGCCAAGGTCTGTGCTGTGGGATTGGAAAGAGGCCGGGTGAGAGGCTCCCCTGCCAGGAGAGACTTAGCCCAGGATCAGGTTGCCTAAGGCAAGAGATTAGATGTCACAGAGGGATAATGCTGAATCCTCTGGGGAAGCTCTAGAAATGACCTCTGGGAGAAAGAGGGGCAGAGCCTGAGGCTTACATCACAGCACAGATCAGCCTCCCTCCCACAGGCCTGGGGACCACTAGAGCCCAGGGTCTTCATGACTGAAGTGAACATCTCAGCATTATACATGAATTAGAATGAGTAGAAACTCTAACTAGCCCAGGGCTGGTGGAAACCCAGGGCCAACAGCCACAAATGTTCCCTCCAATCGCTTTGAATTCATGATGCAATTCAAAGGAATTAACAATGCAAGTGGTAGCTTTCAATGATAACGTGTTCCACtttttgtgtttgagagagtctcactctgttgcccaggctgaagtgcaatggcacgatctcggttcactgccacctccacctcccgggttcaagcaattttcctgcctcagcctcctgagtagctgggactacaggtgactgccaccatgtttggcgggcgtagttctttgtatttttagtagagatggggtttcaccatgttggccacgctggtcttgaattcctgacctcaggtgatctgcccgcctcagtctcccaaagtgctgggattacaggtgtgaaccaccacacctggcACCCCCCGCCttttttttttttttttttttttttttagagactggaccttgctttgtcacccaggctggaatgccatggagtgatcatggctcactgcagtcttgaactcctgggctcacgtgatcctcccatctcagcctcctcaatagccagaaccacagatatgcaccaccacgtccagttaatctttttatgttttattttttaaagatgggtcttgctatgttgcccaggctTACATGAACTGAGttttacatttactacaaatgtgtgtacctataaacaatatatggtattgttttgcaagctttaaaactttgtaaaaatggcatcatactatacataacctcattaaactcatttttggcttattattatttgtgaagttttttcctgtagatgcatctagctatagtttgtttttattgctgtctactgttccactatgtgaatacgccataatttatttctcttttctcctattgttgatcatttaagctggttcctatgcgtggctatcatcaacatgctgcaatggctattcctgtttcctggtctgtatatgggatttctttttttgagacagggtctcactctattgcccaggctaaagttcagtcgtaccatctcggctcaccacagcctcaacctccctgggctcaggcgatcctcccacctcagcctcccgagtagctgggactgcaggcacataccaccacatctggctaatttttataatttttgtagagaaagggtttgccaggttgcccaggctagtctgaaactcctgagctcaagtgatcctcccacctcagcctcgcaaagtgctgggattacaggcgtgtgccaccacacctggccagggagagtttctttaagagagtttcttggataggaaatttgctgggcctagggtctgtatatctttaccttgactagagagtgcctagctgctccccaaactggttagaaccacggccattctcaccaaccatctctaacagtgccagttcctccttgtccttgtcaacacttggtattgccagacttgaatttttgctagtctggtaggttttttttatgtttaattagttctccttgctaactagtgCTACTTCATTTCTGCTGCTAAGGGTGGGCATGTGCTGTCAATAGATAAATGCAACAGATTAAAAATTGAAGAGCTtccatcaataagggattggctaaatacagtatgcctcacctgtacaatagaatactgcacaatcattaacaaagatgAGTGTGCTGATATGGAAGAGATATTGATATTCTGATGTACTAAATATCTTTTCATCTCCCAGATTTATTGTTACAAAGCAAGAGGCATAAAAAGCATATTCCCTTTGTAAATAAATGAAAAGATATGTATACACATGCATATTTGTATGTATATGCGCAGAATACCTCTGAAAGAATGAACAGGAAACTGGTAACCACAGTTCATCTGGGAAGAGCACTAGAGGACAGGGAAACTTTTTTGCTCTGTGAATTCTTACCACGCATGTGTATTAGCCTGTTGGAAAAAATTAGCcctagaataggcaaattcgtagagactgaaagtagaatagaggttgccagaggttttggggtagagaatagggggtttttatttgatagatgcattttctgtttgagatgatgagagagttctgaaatggatagtggtgatggttgtacaacattgtgattgtacttaatgccactcaactgtacacttaaaagcggttgaaatgggctgggcacggtggctcacacctggaatcccagcgcttcgggaagccaaggtgggcagatcacctgaggtcaggagttcacgaccagcctgaccaacatggtgaaaccccgtctctactaaaaatacaaaaattagctgggcgtggtggtggtcgcctataatcccagctactcaggaggctgaggcaggagaattgcttgaacctgggaggtggaggttgcagtgagccaagatcacgccactgtactccagcctgggcaacagaagtgagacctcatctcaaaaaaaaaaaaTGTTGAAATggcctggcacaatggttcacacctgtaatcccagccctcagggatgccaaggcaagaggatcacttgagcccaggagtttgagaccagcctgggaaagatggtgagactctgtctctacaaaatgttttttaaaaattagctgggtgcagtggtgcacaccctgtggtcccag


=====================================
tests/data/download_gene_fasta.py
=====================================
--- a/tests/data/download_gene_fasta.py
+++ b/tests/data/download_gene_fasta.py
@@ -65,6 +65,11 @@ def fake_chr22(filename):
             chr22_len -= 70
         fake_file.write('N' * mod_70 + '\n')
 
+def bgzip_compress_fasta(filename):
+    from subprocess import call
+    call(' '.join(['bgzip', '-c', filename, '>', filename + '.gz']), shell=True)
+
+
 def fetch_chr22_vcf(filename):
     from subprocess import call
     call(['curl', '-s', 'ftp://ftp-trace.ncbi.nih.gov//1000genomes/ftp/pilot_data/release/2010_07/exon/snps/CEU.exon.2010_03.genotypes.vcf.gz',
@@ -83,3 +88,5 @@ if __name__ == "__main__":
         fetch_chr22_vcf("chr22.vcf.gz")
     if not os.path.isfile("chr22.fasta"):
         fetch_chr22("chr22.fasta")
+    bgzip_compress_fasta("genes.fasta")
+    bgzip_compress_fasta("chr22.fasta")


=====================================
tests/data/gene.bed12
=====================================
--- /dev/null
+++ b/tests/data/gene.bed12
@@ -0,0 +1 @@
+chr17	6010	31420	uc010vpx.1	0	-	6012	31270	0	6	158,127,110,75,80,523,	0,5195,5861,7910,16317,24887,


=====================================
tests/data/gene.bed12.fasta
=====================================
--- /dev/null
+++ b/tests/data/gene.bed12.fasta
@@ -0,0 +1,2 @@
+>chr17:6010-31420(-)
+CGGAGCGGGCAGCGGCCAAGTCAGGGCCGTCCGGGGGCGCGGCCGGCGATGCCCGCAGCCCCCgccgcgccccgccgggcctgctgagccgcccccgggccggggtcgcgccgggccgggccgcgcccggggcggggcggCGCTGCCTGCATGACCCTCCGGCGGCGCGGGGAGAAGGCGACCATCAGCATCCAGGAGCATATGGCCATCGACGTGTGCCCCGGCCCCATCCGTCCCATCAAGCAGATCTCCGACTACTTCCCCCGCTTcccgcggggcctgcccccggacgccgggccccgagccgctgcacccccggacgcccccgcgcgcccggctgtggccggtgccggccgccgcagcccctccgacggcgcccgcgAGGACGACGAGGATGTGGACCAGCTCTTCGGAGCCTAcggctccagcccgggccccagcccgggtcccagccccGCGCGGCCGCCAGCCAAGCCGCCGGAGGACGAGCCGGACGCCGACGGCTACGAGTCGGACGACTGCACTGCCCTGGGCACGCTGGACTTCAGCCTGCTGTATGACCAGGAGAACAACGCCCTCCACTGCACCATCACCAAGGCCAAGGGCCTGAAGCCAATGGACCACAATGGGCTGGCAGACCCCTACGTCAAGCTGCACCTGCTGCCAGGAGCCAGTAAGGCAAATAAGCTCAGAACAAAAACTCTCCGTAACACTCTGAACCCCACATGGAACGAGACCCTCACTTACTACGGGATCACAGATGAAGACATGATCCGCAAGACCCTGCGGATCTCTGTGTGTGACGAGGACAAATTCCGGCACAATGAGTTCATCGGGGAGACACGTGTGCCCCTGAAGAAGCTGAAACCCAACCACACCAAGACCTTCAGCATCTGCCTGGAGAAGCAGCTGCCGGTGGACAAGACTGAAGACAAGTCCCTGGAGGAGCGGGGCCGCATCCTCATCTCCCTCAAGTACAGCTCACAGAAGCAAGGCCTGCTGGTAGGCATCGTGCGGTGCGCCCACCTGGCCGCCATGGACGCCAACGGCTACTCGGACCCCTACGTGAAAAC


=====================================
tests/test_FastaRecord.py
=====================================
--- a/tests/test_FastaRecord.py
+++ b/tests/test_FastaRecord.py
@@ -75,6 +75,30 @@ class TestFastaRecord(TestCase):
         sys.stdout.writelines(tuple(Differ().compare(deflines, long_names)))
         assert deflines == long_names
 
+    def test_unpadded_length(self):
+        filename = "data/padded.fasta"
+        with open(filename, 'w') as padded:
+            padded.write(">test_padded\n")
+            for n in range(10):
+                padded.write("N" * 80)
+                padded.write("\n")
+            padded.write("N" * 30)
+            padded.write("A" * 20)
+            padded.write("N" * 30)
+            padded.write("\n")
+            for n in range(10):
+                padded.write("N" * 80)
+                padded.write("\n")
+
+        fasta = Fasta(filename)
+        expect = 20
+        result = fasta["test_padded"].unpadded_len
+        print (expect, result)
+        assert expect == result
+        os.remove('data/padded.fasta')
+        os.remove('data/padded.fasta.fai')
+
+
 class TestMutableFastaRecord(TestCase):
     def setUp(self):
         with open('data/genes_mutable.fasta', 'wb') as mutable:


=====================================
tests/test_FastaRecord_iter.py
=====================================
--- a/tests/test_FastaRecord_iter.py
+++ b/tests/test_FastaRecord_iter.py
@@ -25,3 +25,10 @@ class TestFastaRecordIter(TestCase):
         fasta = Fasta('data/genes.fasta')
         for record in fasta:
             assert len(next(iter(record))) == fasta.faidx.index[record.name].lenc
+
+    def test_reverse_iter(self):
+        expect = list(chain(*([line[::-1] for line in record][::-1] for record in Fasta('data/genes.fasta', as_raw=True))))
+        result = list(chain(*([line for line in reversed(record)] for record in Fasta('data/genes.fasta', as_raw=True))))
+        for a, b in zip(expect, result):
+            print(a, b)
+        assert expect == result


=====================================
tests/test_FastaVariant.py
=====================================
--- a/tests/test_FastaVariant.py
+++ b/tests/test_FastaVariant.py
@@ -8,6 +8,9 @@ path = os.path.dirname(__file__)
 os.chdir(path)
 
 class TestFastaVariant(TestCase):
+    
+    def setUp(self):
+        raise SkipTest
 
     def tearDown(self):
         try:


=====================================
tests/test_Fasta_bgzip.py
=====================================
--- /dev/null
+++ b/tests/test_Fasta_bgzip.py
@@ -0,0 +1,258 @@
+import os
+from pyfaidx import Fasta, Faidx, UnsupportedCompressionFormat, FetchError
+from itertools import chain
+try:
+    from unittest import TestCase, expectedFailure
+except ImportError:
+    from unittest import TestCase
+    from nose.plugins.skip import SkipTest as expectedFailure # python2.6
+from nose.tools import raises
+from nose.plugins.skip import SkipTest
+
+path = os.path.dirname(__file__)
+os.chdir(path)
+
+class TestIndexing(TestCase):
+    def setUp(self):
+        try:
+            from Bio import SeqIO
+        except ImportError:
+            raise SkipTest
+
+    def tearDown(self):
+        try:
+            os.remove('data/genes.fasta.gz.fai')
+        except EnvironmentError:
+            pass  # some tests may delete this file
+
+    @expectedFailure
+    def test_build_issue_126(self):
+        """ Samtools BGZF index should be identical to pyfaidx BGZF index """
+        expect_index = ("gi|563317589|dbj|AB821309.1|	3510	114	70	71\n"
+                        "gi|557361099|gb|KF435150.1|	481	3789	70	71\n"
+                        "gi|557361097|gb|KF435149.1|	642	4368	70	71\n"
+                        "gi|543583796|ref|NR_104216.1|	4573	5141	70	71\n"
+                        "gi|543583795|ref|NR_104215.1|	5317	9901	70	71\n"
+                        "gi|543583794|ref|NR_104212.1|	5374	15415	70	71\n"
+                        "gi|543583788|ref|NM_001282545.1|	4170	20980	70	71\n"
+                        "gi|543583786|ref|NM_001282543.1|	5466	25324	70	71\n"
+                        "gi|543583785|ref|NM_000465.3|	5523	30980	70	71\n"
+                        "gi|543583740|ref|NM_001282549.1|	3984	36696	70	71\n"
+                        "gi|543583738|ref|NM_001282548.1|	4113	40851	70	71\n"
+                        "gi|530384540|ref|XM_005249645.1|	2752	45151	70	71\n"
+                        "gi|530384538|ref|XM_005249644.1|	3004	48071	70	71\n"
+                        "gi|530384536|ref|XM_005249643.1|	3109	51246	70	71\n"
+                        "gi|530384534|ref|XM_005249642.1|	3097	54528	70	71\n"
+                        "gi|530373237|ref|XM_005265508.1|	2794	57830	70	71\n"
+                        "gi|530373235|ref|XM_005265507.1|	2848	60824	70	71\n"
+                        "gi|530364726|ref|XR_241081.1|	1009	63849	70	71\n"
+                        "gi|530364725|ref|XR_241080.1|	4884	65009	70	71\n"
+                        "gi|530364724|ref|XR_241079.1|	2819	70099	70	71\n")
+        index_file = Faidx('data/genes.fasta.gz').indexname
+        result_index = open(index_file).read()
+        assert result_index == expect_index
+
+class TestFastaBGZF(TestCase):
+    def setUp(self):
+        try:
+            from Bio import SeqIO
+        except ImportError:
+            raise SkipTest
+
+    def tearDown(self):
+        try:
+            os.remove('data/genes.fasta.gz.fai')
+        except EnvironmentError:
+            pass  # some tests may delete this file
+
+    def test_integer_slice(self):
+        fasta = Fasta('data/genes.fasta.gz')
+        expect = fasta['gi|563317589|dbj|AB821309.1|'][:100].seq
+        result = fasta[0][:100].seq
+        assert expect == result
+
+    def test_integer_index(self):
+        fasta = Fasta('data/genes.fasta.gz')
+        expect = fasta['gi|563317589|dbj|AB821309.1|'][100].seq
+        result = fasta[0][100].seq
+        assert expect == result
+
+    def test_fetch_whole_fasta(self):
+        expect = [line.rstrip('\n') for line in open('data/genes.fasta') if line[0] != '>']
+        result = list(chain(*([line for line in record] for record in Fasta('data/genes.fasta.gz', as_raw=True))))
+        assert expect == result
+
+    def test_line_len(self):
+        fasta = Fasta('data/genes.fasta.gz')
+        for record in fasta:
+            assert len(next(iter(record))) == fasta.faidx.index[record.name].lenc
+
+    @raises(UnsupportedCompressionFormat)
+    def test_mutable_bgzf(self):
+        fasta = Fasta('data/genes.fasta.gz', mutable=True)
+
+    @raises(NotImplementedError)
+    def test_long_names(self):
+        """ Test that deflines extracted using FastaRecord.long_name are
+        identical to deflines in the actual file.
+        """
+        deflines = []
+        with open('data/genes.fasta') as fasta_file:
+            for line in fasta_file:
+                if line[0] == '>':
+                    deflines.append(line[1:-1])
+        fasta = Fasta('data/genes.fasta.gz')
+        long_names = []
+        for record in fasta:
+            long_names.append(record.long_name)
+        assert deflines == long_names
+
+    def test_fetch_whole_entry(self):
+        faidx = Faidx('data/genes.fasta.gz')
+        expect = ('ATGACATCATTTTCCACCTCTGCTCAGTGTTCAACATCTGA'
+                'CAGTGCTTGCAGGATCTCTCCTGGACAAATCAATCAGGTACGACCA'
+                'AAACTGCCGCTTTTGAAGATTTTGCATGCAGCAGGTGCGCAAGG'
+                'TGAAATGTTCACTGTTAAAGAGGTCATGCACTATTTAGGTCAGTACAT'
+                'AATGGTGAAGCAACTTTATGATCAGCAGGAGCAGCATATGGTATATTG'
+                'TGGTGGAGATCTTTTGGGAGAACTACTGGGACGTCAGAGCTTCTCCGTG'
+                'AAAGACCCAAGCCCTCTCTATGATATGCTAAGAAAGAATCTTGTCACTTT'
+                'AGCCACTGCTACTACAGCAAAGTGCAGAGGAAAGTTCCACTTCCAGAAAAA'
+                'GAACTACAGAAGACGATATCCCCACACTGCCTACCTCAGAGCATAAATGCA'
+                'TACATTCTAGAGAAGGTGATTGAAGTGGGAAAAAATGATGACCTGGAGGACTC')
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                             1, 481)
+        assert str(result) == expect
+
+    def test_fetch_middle(self):
+        faidx = Faidx('data/genes.fasta.gz')
+        expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                             100, 150)
+        assert str(result) == expect
+
+    def test_fetch_end(self):
+        faidx = Faidx('data/genes.fasta.gz')
+        expect = 'TC'
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                             480, 481)
+        assert str(result) == expect
+
+    @raises(FetchError)
+    def test_fetch_border(self):
+        """ Fetch past the end of a gene entry """
+        faidx = Faidx('data/genes.fasta.gz')
+        expect = 'TC'
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                             480, 500)
+        print(result)
+        assert str(result) == expect
+
+    def test_rev(self):
+        faidx = Faidx('data/genes.fasta.gz')
+        expect = 'GA'
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                             480, 481)
+        assert str(-result) == expect, result
+
+    @raises(FetchError)
+    def test_fetch_past_bounds(self):
+        """ Fetch past the end of a gene entry """
+        faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                                         480, 5000)
+
+    @raises(FetchError)
+    def test_fetch_negative(self):
+        """ Fetch starting with a negative coordinate """
+        faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                                         -10, 10)
+
+    @raises(FetchError)
+    def test_fetch_reversed_coordinates(self):
+        """ Fetch starting with a negative coordinate """
+        faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                                         50, 10)
+
+    @raises(FetchError)
+    def test_fetch_keyerror(self):
+        """ Fetch a key that does not exist """
+        faidx = Faidx('data/genes.fasta.gz', strict_bounds=True)
+        result = faidx.fetch('gi|joe|gb|KF435150.1|',
+                                         1, 10)
+
+    def test_blank_string(self):
+        """ seq[0:0] should return a blank string mdshw5/pyfaidx#53 """
+        fasta = Fasta('data/genes.fasta.gz', as_raw=True)
+        assert fasta['gi|557361099|gb|KF435150.1|'][0:0] == ''
+
+    def test_slice_from_beginning(self):
+        fasta = Fasta('data/genes.fasta.gz', as_raw=True)
+        assert fasta['gi|557361099|gb|KF435150.1|'][:4] == 'ATGA'
+
+    def test_slice_from_end(self):
+        fasta = Fasta('data/genes.fasta.gz', as_raw=True)
+        assert fasta['gi|557361099|gb|KF435150.1|'][-4:] == 'ACTC'
+
+    def test_issue_74_start(self):
+        f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+        f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
+        assert f0['gi|557361099|gb|KF435150.1|'][0:90].start == f1['gi|557361099|gb|KF435150.1|'][0:90].start - 1
+
+    def test_issue_74_consistency(self):
+        f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+        f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
+        assert str(f0['gi|557361099|gb|KF435150.1|'][0:90]) == str(f1['gi|557361099|gb|KF435150.1|'][0:90])
+
+    def test_issue_74_end_faidx(self):
+        f0 = Faidx('data/genes.fasta.gz', one_based_attributes=False)
+        f1 = Faidx('data/genes.fasta.gz', one_based_attributes=True)
+        end0 = f0.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
+        end1 = f1.fetch('gi|557361099|gb|KF435150.1|', 1, 90).end
+        assert end0 == end1
+
+    def test_issue_74_end_fasta(self):
+        f0 = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+        f1 = Fasta('data/genes.fasta.gz', one_based_attributes=True)
+        end0 = f0['gi|557361099|gb|KF435150.1|'][1:90].end
+        end1 = f1['gi|557361099|gb|KF435150.1|'][1:90].end
+        print((end0, end1))
+        assert end0 == end1
+
+    def test_issue_79_fix(self):
+        f = Fasta('data/genes.fasta.gz')
+        s = f['gi|557361099|gb|KF435150.1|'][100:105]
+        print((s.start, s.end))
+        assert (101, 105) == (s.start, s.end)
+
+    def test_issue_79_fix_negate(self):
+        f = Fasta('data/genes.fasta.gz')
+        s = f['gi|557361099|gb|KF435150.1|'][100:105]
+        s = -s
+        print((s.start, s.end))
+        assert (105, 101) == (s.start, s.end)
+
+    def test_issue_79_fix_one_based_false(self):
+        f = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+        s = f['gi|557361099|gb|KF435150.1|'][100:105]
+        print((s.start, s.end))
+        assert (100, 105) == (s.start, s.end)
+
+    def test_issue_79_fix_one_based_false_negate(self):
+        f = Fasta('data/genes.fasta.gz', one_based_attributes=False)
+        s = f['gi|557361099|gb|KF435150.1|'][100:105]
+        print(s.__dict__)
+        s = -s
+        print(s.__dict__)
+        assert (105, 100) == (s.start, s.end)
+
+    @raises(FetchError)
+    def test_fetch_border_padded(self):
+        """ Fetch past the end of a gene entry """
+        faidx = Faidx('data/genes.fasta.gz', default_seq='N')
+        expect = 'TCNNNNNNNNNNNNNNNNNNN'
+        result = faidx.fetch('gi|557361099|gb|KF435150.1|',
+                             480, 500)
+        print(result)
+        assert str(result) == expect


=====================================
tests/test_faidx.py
=====================================
--- a/tests/test_faidx.py
+++ b/tests/test_faidx.py
@@ -1,8 +1,10 @@
 import os
+import filecmp
 from pyfaidx import FastaIndexingError, BedError, FetchError
 from pyfaidx.cli import main
 from nose.tools import raises
 from unittest import TestCase
+from tempfile import NamedTemporaryFile
 
 path = os.path.dirname(__file__)
 os.chdir(path)
@@ -36,3 +38,15 @@ class TestCLI(TestCase):
         
     def test_key_warning(self):
         main(['data/genes.fasta', 'foo'])
+        
+    def test_auto_strand(self):
+        """ Test that --auto-strand produces the same output as --reverse --complement"""
+        with NamedTemporaryFile() as auto_strand:
+            with NamedTemporaryFile() as noto_strand:
+                main(['--auto-strand', '-o', auto_strand.name, 'data/genes.fasta', 'gi|557361099|gb|KF435150.1|:100-1'])
+                main(['--reverse', '--complement', '-o', noto_strand.name, 'data/genes.fasta', 'gi|557361099|gb|KF435150.1|:1-100'])
+                print(auto_strand.read())
+                print()
+                print(noto_strand.read())
+                self.assertTrue(filecmp.cmp(auto_strand.name, noto_strand.name))
+        


=====================================
tests/test_feature_bounds_check.py
=====================================
--- a/tests/test_feature_bounds_check.py
+++ b/tests/test_feature_bounds_check.py
@@ -29,7 +29,7 @@ class TestFeatureBoundsCheck:
                 'GAACTACAGAAGACGATATCCCCACACTGCCTACCTCAGAGCATAAATGCA'
                 'TACATTCTAGAGAAGGTGATTGAAGTGGGAAAAAATGATGACCTGGAGGACTC')
         result = faidx.fetch('gi|557361099|gb|KF435150.1|',
-                             1, 482)
+                             1, 481)
         assert str(result) == expect
 
     def test_fetch_middle(self):
@@ -43,7 +43,7 @@ class TestFeatureBoundsCheck:
         faidx = Faidx('data/genes.fasta')
         expect = 'TC'
         result = faidx.fetch('gi|557361099|gb|KF435150.1|',
-                             480, 482)
+                             480, 481)
         assert str(result) == expect
 
     def test_fetch_border(self):
@@ -58,7 +58,7 @@ class TestFeatureBoundsCheck:
         faidx = Faidx('data/genes.fasta')
         expect = 'GA'
         result = faidx.fetch('gi|557361099|gb|KF435150.1|',
-                             480, 482)
+                             480, 481)
         assert str(-result) == expect, result
 
     @raises(FetchError)


=====================================
tests/test_feature_indexing.py
=====================================
--- a/tests/test_feature_indexing.py
+++ b/tests/test_feature_indexing.py
@@ -1,6 +1,6 @@
 import os
 from os.path import getmtime
-from pyfaidx import Faidx, FastaIndexingError
+from pyfaidx import Faidx, FastaIndexingError, IndexNotFoundError, FastaNotFoundError
 from nose.tools import raises
 from nose.plugins.skip import Skip, SkipTest
 from unittest import TestCase
@@ -32,32 +32,59 @@ class TestIndexing(TestCase):
 
     def test_build(self):
         expect_index = ("gi|563317589|dbj|AB821309.1|	3510	114	70	71\n"
-        "gi|557361099|gb|KF435150.1|	481	3789	70	71\n"
-        "gi|557361097|gb|KF435149.1|	642	4368	70	71\n"
-        "gi|543583796|ref|NR_104216.1|	4573	5141	70	71\n"
-        "gi|543583795|ref|NR_104215.1|	5317	9901	70	71\n"
-        "gi|543583794|ref|NR_104212.1|	5374	15415	70	71\n"
-        "gi|543583788|ref|NM_001282545.1|	4170	20980	70	71\n"
-        "gi|543583786|ref|NM_001282543.1|	5466	25324	70	71\n"
-        "gi|543583785|ref|NM_000465.3|	5523	30980	70	71\n"
-        "gi|543583740|ref|NM_001282549.1|	3984	36696	70	71\n"
-        "gi|543583738|ref|NM_001282548.1|	4113	40851	70	71\n"
-        "gi|530384540|ref|XM_005249645.1|	2752	45151	70	71\n"
-        "gi|530384538|ref|XM_005249644.1|	3004	48071	70	71\n"
-        "gi|530384536|ref|XM_005249643.1|	3109	51246	70	71\n"
-        "gi|530384534|ref|XM_005249642.1|	3097	54528	70	71\n"
-        "gi|530373237|ref|XM_005265508.1|	2794	57830	70	71\n"
-        "gi|530373235|ref|XM_005265507.1|	2848	60824	70	71\n"
-        "gi|530364726|ref|XR_241081.1|	1009	63849	70	71\n"
-        "gi|530364725|ref|XR_241080.1|	4884	65009	70	71\n"
-        "gi|530364724|ref|XR_241079.1|	2819	70099	70	71\n")
+                        "gi|557361099|gb|KF435150.1|	481	3789	70	71\n"
+                        "gi|557361097|gb|KF435149.1|	642	4368	70	71\n"
+                        "gi|543583796|ref|NR_104216.1|	4573	5141	70	71\n"
+                        "gi|543583795|ref|NR_104215.1|	5317	9901	70	71\n"
+                        "gi|543583794|ref|NR_104212.1|	5374	15415	70	71\n"
+                        "gi|543583788|ref|NM_001282545.1|	4170	20980	70	71\n"
+                        "gi|543583786|ref|NM_001282543.1|	5466	25324	70	71\n"
+                        "gi|543583785|ref|NM_000465.3|	5523	30980	70	71\n"
+                        "gi|543583740|ref|NM_001282549.1|	3984	36696	70	71\n"
+                        "gi|543583738|ref|NM_001282548.1|	4113	40851	70	71\n"
+                        "gi|530384540|ref|XM_005249645.1|	2752	45151	70	71\n"
+                        "gi|530384538|ref|XM_005249644.1|	3004	48071	70	71\n"
+                        "gi|530384536|ref|XM_005249643.1|	3109	51246	70	71\n"
+                        "gi|530384534|ref|XM_005249642.1|	3097	54528	70	71\n"
+                        "gi|530373237|ref|XM_005265508.1|	2794	57830	70	71\n"
+                        "gi|530373235|ref|XM_005265507.1|	2848	60824	70	71\n"
+                        "gi|530364726|ref|XR_241081.1|	1009	63849	70	71\n"
+                        "gi|530364725|ref|XR_241080.1|	4884	65009	70	71\n"
+                        "gi|530364724|ref|XR_241079.1|	2819	70099	70	71\n")
         index_file = Faidx('data/genes.fasta').indexname
         result_index = open(index_file).read()
         assert result_index == expect_index
 
+    def test_build_issue_111(self):
+        expect_index = ("gi|563317589|dbj|AB821309	3510	114	70	71\n"
+                        "gi|557361099|gb|KF435150	481	3789	70	71\n"
+                        "gi|557361097|gb|KF435149	642	4368	70	71\n"
+                        "gi|543583796|ref|NR_104216	4573	5141	70	71\n"
+                        "gi|543583795|ref|NR_104215	5317	9901	70	71\n"
+                        "gi|543583794|ref|NR_104212	5374	15415	70	71\n"
+                        "gi|543583788|ref|NM_001282545	4170	20980	70	71\n"
+                        "gi|543583786|ref|NM_001282543	5466	25324	70	71\n"
+                        "gi|543583785|ref|NM_000465	5523	30980	70	71\n"
+                        "gi|543583740|ref|NM_001282549	3984	36696	70	71\n"
+                        "gi|543583738|ref|NM_001282548	4113	40851	70	71\n"
+                        "gi|530384540|ref|XM_005249645	2752	45151	70	71\n"
+                        "gi|530384538|ref|XM_005249644	3004	48071	70	71\n"
+                        "gi|530384536|ref|XM_005249643	3109	51246	70	71\n"
+                        "gi|530384534|ref|XM_005249642	3097	54528	70	71\n"
+                        "gi|530373237|ref|XM_005265508	2794	57830	70	71\n"
+                        "gi|530373235|ref|XM_005265507	2848	60824	70	71\n"
+                        "gi|530364726|ref|XR_241081	1009	63849	70	71\n"
+                        "gi|530364725|ref|XR_241080	4884	65009	70	71\n"
+                        "gi|530364724|ref|XR_241079	2819	70099	70	71\n")
+        index = Faidx(
+            'data/genes.fasta',
+            read_long_names=True,
+            key_function=lambda x: x.split('.')[0])
+        result_index = ''.join(index._index_as_string())
+        assert result_index == expect_index
+
     def test_order(self):
-        order = ("gi|563317589|dbj|AB821309.1|",
-                 "gi|557361099|gb|KF435150.1|",
+        order = ("gi|563317589|dbj|AB821309.1|", "gi|557361099|gb|KF435150.1|",
                  "gi|557361097|gb|KF435149.1|",
                  "gi|543583796|ref|NR_104216.1|",
                  "gi|543583795|ref|NR_104215.1|",
@@ -202,11 +229,14 @@ class TestIndexing(TestCase):
             # Write simple fasta file with inconsistent sequence line lengths,
             # so building an index raises a 'FastaIndexingError'
             with open(fasta_path, 'w') as fasta_out:
-                fasta_out.write(">seq1\nCTCCGGGCCCAT\nAACACTTGGGGGTAGCTAAAGTGAA\nATAAAGCCTAAA\n")
+                fasta_out.write(
+                    ">seq1\nCTCCGGGCCCAT\nAACACTTGGGGGTAGCTAAAGTGAA\nATAAAGCCTAAA\n"
+                )
 
             builtins_open = builtins.open
 
-            opened_files=[]
+            opened_files = []
+
             def test_open(*args, **kwargs):
                 f = builtins_open(*args, **kwargs)
                 opened_files.append(f)
@@ -215,7 +245,9 @@ class TestIndexing(TestCase):
             with mock.patch('six.moves.builtins.open', side_effect=test_open):
                 try:
                     Faidx(fasta_path)
-                    self.assertFail("Faidx construction should fail with 'FastaIndexingError'.")
+                    self.assertFail(
+                        "Faidx construction should fail with 'FastaIndexingError'."
+                    )
                 except FastaIndexingError:
                     pass
             self.assertTrue(all(f.closed for f in opened_files))
@@ -240,7 +272,8 @@ class TestIndexing(TestCase):
 
             builtins_open = builtins.open
 
-            opened_files=[]
+            opened_files = []
+
             def test_open(*args, **kwargs):
                 f = builtins_open(*args, **kwargs)
                 opened_files.append(f)
@@ -249,10 +282,16 @@ class TestIndexing(TestCase):
             with mock.patch('six.moves.builtins.open', side_effect=test_open):
                 try:
                     Faidx(fasta_path)
-                    self.assertFail("Faidx construction should fail with 'ValueError'.")
+                    self.assertFail(
+                        "Faidx construction should fail with 'ValueError'.")
                 except ValueError:
                     pass
             self.assertTrue(all(f.closed for f in opened_files))
         finally:
             shutil.rmtree(tmp_dir)
 
+    @raises(IndexNotFoundError)
+    def test_issue_134_no_build_index(self):
+        """ Ensure that index file is not built when build_index=False. See mdshw5/pyfaidx#134.
+        """
+        faidx = Faidx('data/genes.fasta', build_index=False)


=====================================
tests/test_feature_key_function.py
=====================================
--- a/tests/test_feature_key_function.py
+++ b/tests/test_feature_key_function.py
@@ -34,6 +34,8 @@ def get_duplicated_gene_name(accession):
 
 class TestFeatureKeyFunction(TestCase):
     def setUp(self):
+        genes = Fasta('data/genes.fasta')
+        del genes  # Support feature introduced in #111
         pass
 
     def tearDown(self):
@@ -64,3 +66,15 @@ class TestFeatureKeyFunction(TestCase):
     @raises(ValueError)
     def test_duplicated_keys(self):
         genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name)
+
+    def test_duplicated_keys_shortest(self):
+        genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name, duplicate_action="shortest")
+        expect = 4573
+        result = len(genes["BARD1"])
+        assert expect == result
+
+    def test_duplicated_keys_longest(self):
+        genes = Fasta('data/genes.fasta', key_function=get_duplicated_gene_name, duplicate_action="longest")
+        expect = 5317
+        result = len(genes["BARD1"])
+        assert expect == result


=====================================
tests/test_feature_spliced_seq.py
=====================================
--- /dev/null
+++ b/tests/test_feature_spliced_seq.py
@@ -0,0 +1,74 @@
+import os
+from pyfaidx import Fasta
+from unittest import TestCase
+
+path = os.path.dirname(__file__)
+os.chdir(path)
+
+class TestFeatureSplicedSeq(TestCase):
+    def setUp(self):
+        pass
+
+    def tearDown(self):
+        fais = [
+                "data/gene.bed12.fasta.fai",
+                "data/chr17.hg19.part.fa.fai"
+                ]
+        for fai in fais:
+            try:    
+                os.remove(fai)
+            except EnvironmentError:
+                pass  # some tests may delete this file
+
+    def test_split_seq(self):
+        """ Fetch sequence by blocks """
+        fa = Fasta('data/chr17.hg19.part.fa')
+        
+        gene = Fasta("data/gene.bed12.fasta")
+        expect = gene[list(gene.keys())[0]][:].seq
+        
+        bed = "data/gene.bed12"
+        with open(bed) as fi:
+            record = fi.readline().strip().split("\t")
+
+        chrom = record[0]
+        start = int(record[1])
+        strand = record[5]
+
+        # parse bed12 format
+        starts = [int(x) for x in record[11].split(",")[:-1]] 
+        sizes = [int(x) for x in record[10].split(",")[:-1]]
+        starts = [start + x  for x in starts]
+        ends = [start + size  for start,size in zip(starts, sizes)] 
+        
+        # bed half-open
+        if strand == "-":
+            starts = [start + 1 for start in starts]
+        else: 
+            ends = [end - 1 for end in ends]
+        
+        intervals = zip(starts, ends) 
+        result = fa.get_spliced_seq(chrom, intervals, rc=True)
+        print(result.seq)
+        print("====")
+        print(expect)
+
+        assert result.seq == expect
+
+    def test_get_seq_rc(self):
+        """ Check get_seq with rc argument """
+        fa = Fasta('data/chr17.hg19.part.fa')
+        
+        result = fa.get_seq("chr17", 11, 20, rc=False)
+        expect = "CCCTGTTCCT"
+        print("normal")
+        print(result.seq)
+        print(expect)
+        assert result.seq == expect
+        
+        result = fa.get_seq("chr17", 11, 20, rc=True)
+        expect = "AGGAACAGGG"
+        assert result.seq == expect
+        print("rc")
+        print(result.seq)
+        print(expect)


=====================================
tests/test_feature_split_char.py
=====================================
--- a/tests/test_feature_split_char.py
+++ b/tests/test_feature_split_char.py
@@ -18,20 +18,24 @@ class TestFeatureSplitChar(TestCase):
             pass  # some tests may delete this file
 
     def test_keys(self):
-        fasta = Fasta('data/genes.fasta', split_char='|')
+        fasta = Fasta('data/genes.fasta', split_char='|', duplicate_action="drop")
         expect = ['530364724', '530364725', '530364726', '530373235', '530373237', '530384534', '530384536', '530384538', '530384540', '543583738', '543583740', '543583785', '543583786', '543583788', '543583794', '543583795', '543583796', '557361097', '557361099', '563317589', 'AB821309.1', 'KF435149.1', 'KF435150.1', 'NM_000465.3', 'NM_001282543.1', 'NM_001282545.1', 'NM_001282548.1', 'NM_001282549.1', 'NR_104212.1', 'NR_104215.1', 'NR_104216.1', 'XM_005249642.1', 'XM_005249643.1', 'XM_005249644.1', 'XM_005249645.1', 'XM_005265507.1', 'XM_005265508.1', 'XR_241079.1', 'XR_241080.1', 'XR_241081.1', 'dbj']
         result = sorted(fasta.keys())
         assert result == expect
 
     def test_key_function_by_dictionary_get_key(self):
-        fasta = Fasta('data/genes.fasta', split_char='|')
+        fasta = Fasta('data/genes.fasta', split_char='|', duplicate_action="drop")
         expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
         result = fasta['KF435150.1'][100-1:150]
         assert str(result) == expect
 
     def test_key_function_by_fetch(self):
-        faidx = Faidx('data/genes.fasta', split_char='|')
+        faidx = Faidx('data/genes.fasta', split_char='|', duplicate_action="drop")
         expect = 'TTGAAGATTTTGCATGCAGCAGGTGCGCAAGGTGAAATGTTCACTGTTAAA'
         result = faidx.fetch('KF435150.1',
                              100, 150)
         assert str(result) == expect
+
+    @raises(ValueError)
+    def test_stop(self):
+        fasta = Fasta('data/genes.fasta', split_char='|')



View it on GitLab: https://salsa.debian.org/med-team/python-pyfaidx/compare/a09e0d29890acdcc1cbfaaf16b562d5d7227faaa...6ab6bd1ff8665cb44fcbfcb531915cf17349bdda

---
View it on GitLab: https://salsa.debian.org/med-team/python-pyfaidx/compare/a09e0d29890acdcc1cbfaaf16b562d5d7227faaa...6ab6bd1ff8665cb44fcbfcb531915cf17349bdda
You're receiving this email because of your account on salsa.debian.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20180416/83a3d558/attachment-0001.html>


More information about the debian-med-commit mailing list