[med-svn] [Git][med-team/resfinder][upstream] 2 commits: New upstream version 4.2.4
Andreas Tille (@tille)
gitlab at salsa.debian.org
Tue Jan 17 16:45:05 GMT 2023
Andreas Tille pushed to branch upstream at Debian Med / resfinder
Commits:
8c3f5cf4 by Andreas Tille at 2023-01-17T17:12:35+01:00
New upstream version 4.2.4
- - - - -
1eb25192 by Andreas Tille at 2023-01-17T17:15:37+01:00
New upstream version 4.2.4
- - - - -
27 changed files:
- CHANGELOG.md
- + DEV_SETUP.md
- debian-tests-data/db_resfinder/VERSION
- debian-tests-data/db_resfinder/macrolide.fsa
- debian-tests-data/db_resfinder/oxazolidinone.fsa
- debian-tests-data/db_resfinder/quinolone.fsa
- dockerfile
- src/resfinder/__init__.py
- src/resfinder/amr_abbreviations.md
- src/resfinder/cge/config.py
- src/resfinder/cge/output/gene_result.py
- src/resfinder/cge/output/phenotype_result.py
- src/resfinder/cge/output/seq_variation_result.py
- src/resfinder/cge/output/std_results.py
- src/resfinder/cge/phenotype2genotype/feature.py
- src/resfinder/cge/phenotype2genotype/isolate.py
- src/resfinder/cge/phenotype2genotype/res_profile.py
- src/resfinder/cge/pointfinder.py
- src/resfinder/run_resfinder.py
- + tests/data/test_isolate_11.fa
- + tests/data/test_isolate_11_1.fq
- + tests/data/test_isolate_11_2.fq
- tests/resfinder/cge/output/test_phenotype_result.md
- tests/resfinder/cge/output/test_seq_variation_result.md
- tests/resfinder/cge/output/test_std_results.md
- tests/resfinder/cge/phenotype2genotype/test_isolate.md
- tests/resfinder/cge/test_config.md
Changes:
=====================================
CHANGELOG.md
=====================================
@@ -1,36 +1,54 @@
# Changelog
+
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [4.2.4] - 2023-01-17
+
+### Added
+
+- Option (--output_aln) that will output alignment and match sequences in the json output ("seq_regions": "query_string" / "alignment_string" / "ref_string" )
+
+### Deprecated
+
+- The Pointfinder_prediction.txt will no longer be a part of the PointFinder output.
+
+### Fixed
+
+- Issue with RNA mutations, so that the genes in RNA_genes.txt are read correctly and the mutations are found.
+- Issue with indels such that insertions provided both as nucleotide and amino acids will be found.
## [4.2.3] - 2022-10-13
### Added
+
- The ResFinder databases now have a VERSION file that will be used to determine database versions. These are used if the python module cgelib has been updated to at least version 0.7.3
### Fixed
+
- Issue where the application failed when species was specified but did not have a phenotype panel.
- DisinFinder overwriting ResFinder results.
- Issue with phenotype results showing integers or nothing instead of gene names in the 'Genetic background' column.
- Dockerfile
-
## [4.2.2] - 2022-09-19
### Added
+
- ResFinder will now complain and exit if the ResFinder database is not found, as it is necessary, even if only looking for point mutations or disinfectant genes.
### Fixed
+
- Issue where the application failed when run using only the pointfinder option (--point)
- Issue where application would crash if using the --disinfectant and --nanopore flags.
- Changelog version format from d.d.d to [d.d.d]
-
## [4.2.1] - 2022-09-12
### Added
+
- Several environmental variables for ResFinder settings (see README.md).
- Flags "--ignore_indels" and "--ignore_stop_codons" that will make the point mutation algorithm ignore indels and premature stop codons, respectively.
- Feature to search for genes that provide resistance to disinfectants (--disinfectant).
@@ -38,11 +56,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- New method for calling ResFinder if installed via pip "python -m resfinder -h"
### Changed
+
- Recommended installation method for ResFinder. See README.md.
- It is no longer necessary to have cloned ResFinder via git in order to obtain the version number.
- json output file to contain all output results and enable the user to specify a path for the file.
### Deprecated
+
- It is no longer recommended to clone the repository of ResFinder, unless you are a developer. Instead install via pip is recommended.
- Flag "--databases". ResFinder will now always be run against all databases. Option will be removed in the next major update.
- ResFinder will in the next major update not default to database paths within the application directory. Instead the use of the appropriate environment variables is recommended or the appropriate flags.
@@ -58,9 +78,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- The flag "-t_p" will in the next major update not be supported. Instead use "--threshold_point".
### Fixed
+
- Issue in PointFinder where a phenotype depending on several mutations would not be written in the phenotypes results files.
- Output in PointFinder, where some antibiotics would be listed twice.
- Issue in json output where unknown mutations were listed even if option wasn't enabled.
-
## [4.2.0] - 2022-04-21 [YANKED]
=====================================
DEV_SETUP.md
=====================================
@@ -0,0 +1,153 @@
+# Development documentation
+
+This document is intended for developers working on the ResFinder application. It consists of suggestions on how to set up a development environment and guidelines for doing releases of ResFinder.
+
+## Setup development environment
+
+### 1. Install PDM - Python Development Master
+
+In order to test and build ResFinder as described here, you will need PDM.
+
+*"PDM is a modern Python package manager with PEP 582 support. It installs and manages packages in a similar way to npm that doesn't need to create a virtualenv at all!"* - **PDM website**
+
+* Install PDM as described on the [PDM website](https://daobook.github.io/pdm/).
+
+### 2. Install BLAST and KMA
+
+### 3. Clone ResFinder repository
+
+### 4. Clone Databases and index them
+
+For the tests to work, you need to set the environment variables for the database locations.
+
+```bash
+
+export CGE_RESFINDER_RESGENE_DB="/path/to/resfinder_db/"
+export CGE_RESFINDER_RESPOINT_DB="/path/to/pointfinder_db/"
+
+```
+
+### 5. Setup PDM with ResFinder dependencies
+
+```bash
+
+# Go to ResFinder root directory
+cd /path/to/resfinder/
+
+# Install all python dependencies inside the directory (resfinder/__pypackages__)
+pdm install
+
+```
+
+## Running and testing ResFinder
+
+```bash
+
+# Go to ResFinder root directory
+cd /path/to/resfinder/
+
+# Run ResFinder
+pdm run resfinder -h
+
+# Run ResFinder tests
+pdm run tests
+
+```
+
+## Creating / Edit tests
+
+All tests are written as doctests. The doctests are stored in markdown formatted files in the `tests` directory. The `tests` directory mirrors the structure of the `src` directory. Tests are written into files named after a corresponding file in the `src` directory tree, but prefixed with `test_`. A test is written into the test file that corresponds to the file in which the tested code resides.
+
+*Example*
+
+**Code**: `src/resfinder/cge/output/gene_result.py`
+
+**Test file**: `tests/resfinder/cge/output/test_gene_result.md`
+
+If a `*.py` file has no corresponding test file, create one before writing tests for that code. Remember to prefix the name with `test_`, as this will automatically include the tests when running `pdm run tests`. Use existing test files to get an idea of how these should be formatted.
+
+**Note**: The configuration for recognizing tests is set in the file `pytest.ini` in the root directory.
+
+## Deploy
+
+1. Change version number in `src/resfinder/__init__.py`.
+2. Make sure CHANGELOG.md is up to date and add the current date of release.
+3. Push changes to repository.
+4. Tag the commit you just pushed with the correct version number.
+5. Build package:
+
+ ```bash
+
+ pdm build
+
+ ```
+
+6. Upload package to pypi:
+
+ ```bash
+
+ twine upload dist/*
+
+ ```
+
+### Docker image
+
+**Note**: We do not guarantee that all ResFinder releases have a corresponding Docker image release. However, we should strive toward making as many as possible. At a minimum, a Docker image release should be done whenever significant changes have been released.
+
+A docker image contains both the ResFinder software and the three databases:
+[ResFinder](https://bitbucket.org/genomicepidemiology/resfinder_db/)
+[PointFinder](https://bitbucket.org/genomicepidemiology/pointfinder_db/)
+[DisinFinder](https://bitbucket.org/genomicepidemiology/disinfinder_db/)
+
+#### Versioning
+
+Versioning is done so that a specific version of a Docker image can always track exactly which version of the ResFinder software is included and exactly which databases.
+
+* A Docker release version should match the ResFinder version.
+* All databases included in a Docker image release should be tagged with a version number for that database release.
+* The commits that correspond to the database versions used should be tagged with `resfinder-VERSION`, where `VERSION` matches the ResFinder and therefore also the Docker image version. Hence, all database versions/commits included will contain at least two tags: a database version number and `resfinder-VERSION`.
+
+#### Deploy docker image
+
+1. Make sure each database you want to include is tagged with a version number.
+2. Make sure you have a released (versioned) ResFinder repo.
+3. Tag each database with `resfinder-VERSION`, where `VERSION` matches the ResFinder version number (ex.: `resfinder-4.2.3`).
+4. Update the Dockerfile:
+ * (optional) Bump the KMA version (no major version change).
+ * Change RESFINDER_VERSION environment variable.
+5. Build Docker image. Note `<VERSION>` should be replaced with ResFinder version number:
+
+ ```bash
+
+ # Go to ResFinder root directory
+ cd /path/to/resfinder/
+
+ # Build image
+ docker build -t genomicepidemiology/resfinder:<VERSION> .
+
+ ```
+
+6. Push Docker image:
+
+ ```bash
+
+ docker push genomicepidemiology/resfinder:<VERSION>
+
+ ```
+
+7. Update latest tag:
+
+ ```bash
+
+ # Get image id, replace with <ID> later
+ $ docker images
+ REPOSITORY TAG IMAGE ID CREATED SIZE
+ cgetools_front-web latest cf74e74364c9 2 months ago 392MB
+ cgetools_front-celery_worker latest 6524fa33a75a 2 months ago 392MB
+ redis 7-alpine 39267c75a230 2 months ago 28.1MB
+ rf_test latest 6692e77b787f 3 months ago 760MB
+
+ $ docker tag <ID> genomicepidemiology/resfinder:latest
+ $ docker push genomicepidemiology/resfinder:latest
+
+ ```
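
The "Creating / Edit tests" section of DEV_SETUP.md above describes a naming rule: the `tests` tree mirrors `src`, and each test file is named after its source file with a `test_` prefix and a `.md` extension. A minimal sketch of that rule follows. The helper name `doctest_path_for` is hypothetical and not part of ResFinder; it only restates the convention given in the text.

```python
from pathlib import PurePosixPath


def doctest_path_for(source_path: str) -> str:
    """Map a source file under src/ to its doctest file under tests/.

    Hypothetical helper: it only encodes the naming convention described
    in DEV_SETUP.md and is not taken from the ResFinder code base.
    """
    path = PurePosixPath(source_path)
    parts = path.parts
    if not parts or parts[0] != "src" or path.suffix != ".py":
        raise ValueError(f"expected a .py file under src/: {source_path}")
    # Mirror the directory structure below src/, swap the root for tests/,
    # prefix the file name with 'test_' and use the markdown extension.
    mirrored_dir = PurePosixPath("tests", *parts[1:-1])
    return str(mirrored_dir / f"test_{path.stem}.md")


print(doctest_path_for("src/resfinder/cge/output/gene_result.py"))
```

For the example given in the document, this maps `src/resfinder/cge/output/gene_result.py` to `tests/resfinder/cge/output/test_gene_result.md`.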
=====================================
debian-tests-data/db_resfinder/VERSION
=====================================
@@ -1 +1 @@
-2.0.0
+2.0.1
=====================================
debian-tests-data/db_resfinder/macrolide.fsa
=====================================
@@ -2313,44 +2313,6 @@ CGCGAGCGGGTGGACCCGGCGGCCCTGCCCCGCGACCTCAAGGCCGGGCACTGGGCATCC
CTCTACCGGCTCTACCGGGAGGTGGGTACTCGGCCCGCCCCTGCCGGCCGGTCCGTCCGG
GCCCGGCCCGGATCCGTCGGCCCCGACCGCTCGCTCCCTCCGCGCGGCCTGCGATCCGGT
CCGCCGAGGGCTCGACGACGTGGTGGAGGCGCCTGA
->cfr_1_AM408573
-ATGAATTTTAATAATAAAACAAAGTATGGTAAAATACAGGAATTTTTAAGAAGTAATAAT
-GAGCCTGATTATAGAATAAAACAAATAACCAATGCGATTTTTAAACAAAGAATTAGTCGA
-TTTGAGGATATGAAGGTTCTTCCAAAATTACTTAGGGAGGATTTAATAAATAATTTTGGA
-GAAACAGTTTTGAATATCAAGCTCTTAGCAGAGCAAAATTCAGAGCAAGTTACGAAAGTG
-CTTTTTGAAGTATCAAAGAATGAGAGAGTAGAAACGGTAAACATGAAGTATAAAGCAGGT
-TGGGAGTCATTTTGTATATCATCACAATGCGGATGTAATTTTGGGTGTAAATTTTGTGCT
-ACAGGCGACATTGGATTGAAAAAAAACCTAACTGTAGATGAGATAACAGATCAAGTTTTA
-TACTTCCATTTATTAGGTCATCAAATTGATAGCATTTCTTTTATGGGAATGGGTGAAGCT
-CTAGCCAACCGTCAAGTATTTGATGCTCTTGATTCGTTTACGGATCCTAATTTATTTGCA
-TTAAGTCCTCGTAGACTTTCTATATCAACGATTGGTATTATACCTAGTATCAAAAAAATA
-ACCCAGGAATATCCTCAAGTAAATCTTACATTTTCATTACACTCACCTTATAGTGAGGAA
-CGCAGCAAATTGATGCCAATAAATGATAGATACCCAATAGATGAGGTAATGAATATACTC
-GATGAACATATAAGATTAACTTCAAGGAAAGTATATATAGCTTATATCATGTTGCCTGGT
-GTAAATGATTCTCTTGAGCATGCAAACGAAGTTGTTAGCCTTCTTAAAAGTCGCTATAAA
-TCAGGGAAGTTATATCATGTAAATTTGATACGATACAATCCTACAATAAGTGCACCTGAG
-ATGTATGGAGAAGCAAACGAAGGGCAGGTAGAAGCCTTTTACAAAGTTTTGAAGTCTGCT
-GGTATCCATGTCACAATTAGAAGTCAATTTGGGATTGATATTGACGCTGCTTGTGGTCAA
-TTATATGGTAATTATCAAAATAGCCAATAG
->cfr_2_AJ879565
-ATGAATTTTAATAATAAAACAAAGTATGGTAAAATACAGGAATTTTTAAGAAGTAATAAT
-GAGCCTGATTATAGAATAAAACAAATAACCAATGCGATTTTTAAACAAAGAATTAGTCGA
-TTTGAGGATATGAAGGTTCTTCCAAAATTACTTAGGGAGGATTTAATAAATAATTTTGGA
-GAAACAGTTTTGAATATCAAGCTCTTAGCAGAGCAAAATTCAGAGCAAGTTACGAAAGTG
-CTTTTTGAAGTATCAAAGAATGAGAGAGTAGAAACGGTAAACATGAAGTATAAAGCAGGT
-TGGGAGTCATTTTGTATATCATCACAATGCGGATGTAATTTTGGGTGTAAATTTTGTGCT
-ACAGGCGACATTGGATTGAAAAAAAACCTAACTGTAGATGAGATAACAGATCAAGTTTTA
-TACTTCCATTTATTAGGTCATCAAATTGATAGCATTTCTTTTATGGGAATGGGTGAAGCT
-CTAGCCAACCGTCAAGTATTTGATGCTCTTGATTCGTTTACGGATCCTAATTTATTTGCA
-TTAAGTCCTCGTAGACTTTCTATATCAACGATTGGTATTATACCTAGTATCAAAAAAATA
-ACCCAGGAATATCCTCAAGTAAATCTTACATTTTCATTACACTCACCTTATAGTGAGGAA
-CGCAGCAAATTGATGCCAATAAATGATAGATACCCAATAGATGAGGTAATGAATATACTC
-GATGAACATATAAGATTAACTTCAAGGAAAGTATATATAGATTATATCATGTTGCCTGGT
-GTAAATGATTCTCTTGAGCATGCAAACGAAGTTGTTAGCCTTCTTAAAAGTCGCTATAAA
-TCAGGGAAGTTATATCATGTAAATTTGATACGATACAATCCTACAATAAGTGCACCTGAG
-ATGTATGGAGAAGCAAACGAAGGGCAGGTAGAAGCCTTTTACAAAGTTTTGAAGTCTGCT
-GGTATCCATGTCACAATTAGAAGTCAATTTGGGATTGATATTGACGCTGCTTGTGGTCAA
-TTATATGGTAATTATCAAAATAGCCAATAG
>erm(48)_1_LT223129
ATGAATAACAAAAACCCAAAAGATTCACAAAATTTTATAACATCTCAAAAATATATAAAT
GAAATCTTACAAAATACTAATATAGAATCAAATGACAATATCATTGAAATAGGAACAGGC
=====================================
debian-tests-data/db_resfinder/oxazolidinone.fsa
=====================================
@@ -1,41 +1,3 @@
->cfr_1_AM408573
-ATGAATTTTAATAATAAAACAAAGTATGGTAAAATACAGGAATTTTTAAGAAGTAATAAT
-GAGCCTGATTATAGAATAAAACAAATAACCAATGCGATTTTTAAACAAAGAATTAGTCGA
-TTTGAGGATATGAAGGTTCTTCCAAAATTACTTAGGGAGGATTTAATAAATAATTTTGGA
-GAAACAGTTTTGAATATCAAGCTCTTAGCAGAGCAAAATTCAGAGCAAGTTACGAAAGTG
-CTTTTTGAAGTATCAAAGAATGAGAGAGTAGAAACGGTAAACATGAAGTATAAAGCAGGT
-TGGGAGTCATTTTGTATATCATCACAATGCGGATGTAATTTTGGGTGTAAATTTTGTGCT
-ACAGGCGACATTGGATTGAAAAAAAACCTAACTGTAGATGAGATAACAGATCAAGTTTTA
-TACTTCCATTTATTAGGTCATCAAATTGATAGCATTTCTTTTATGGGAATGGGTGAAGCT
-CTAGCCAACCGTCAAGTATTTGATGCTCTTGATTCGTTTACGGATCCTAATTTATTTGCA
-TTAAGTCCTCGTAGACTTTCTATATCAACGATTGGTATTATACCTAGTATCAAAAAAATA
-ACCCAGGAATATCCTCAAGTAAATCTTACATTTTCATTACACTCACCTTATAGTGAGGAA
-CGCAGCAAATTGATGCCAATAAATGATAGATACCCAATAGATGAGGTAATGAATATACTC
-GATGAACATATAAGATTAACTTCAAGGAAAGTATATATAGCTTATATCATGTTGCCTGGT
-GTAAATGATTCTCTTGAGCATGCAAACGAAGTTGTTAGCCTTCTTAAAAGTCGCTATAAA
-TCAGGGAAGTTATATCATGTAAATTTGATACGATACAATCCTACAATAAGTGCACCTGAG
-ATGTATGGAGAAGCAAACGAAGGGCAGGTAGAAGCCTTTTACAAAGTTTTGAAGTCTGCT
-GGTATCCATGTCACAATTAGAAGTCAATTTGGGATTGATATTGACGCTGCTTGTGGTCAA
-TTATATGGTAATTATCAAAATAGCCAATAG
->cfr_2_AJ879565
-ATGAATTTTAATAATAAAACAAAGTATGGTAAAATACAGGAATTTTTAAGAAGTAATAAT
-GAGCCTGATTATAGAATAAAACAAATAACCAATGCGATTTTTAAACAAAGAATTAGTCGA
-TTTGAGGATATGAAGGTTCTTCCAAAATTACTTAGGGAGGATTTAATAAATAATTTTGGA
-GAAACAGTTTTGAATATCAAGCTCTTAGCAGAGCAAAATTCAGAGCAAGTTACGAAAGTG
-CTTTTTGAAGTATCAAAGAATGAGAGAGTAGAAACGGTAAACATGAAGTATAAAGCAGGT
-TGGGAGTCATTTTGTATATCATCACAATGCGGATGTAATTTTGGGTGTAAATTTTGTGCT
-ACAGGCGACATTGGATTGAAAAAAAACCTAACTGTAGATGAGATAACAGATCAAGTTTTA
-TACTTCCATTTATTAGGTCATCAAATTGATAGCATTTCTTTTATGGGAATGGGTGAAGCT
-CTAGCCAACCGTCAAGTATTTGATGCTCTTGATTCGTTTACGGATCCTAATTTATTTGCA
-TTAAGTCCTCGTAGACTTTCTATATCAACGATTGGTATTATACCTAGTATCAAAAAAATA
-ACCCAGGAATATCCTCAAGTAAATCTTACATTTTCATTACACTCACCTTATAGTGAGGAA
-CGCAGCAAATTGATGCCAATAAATGATAGATACCCAATAGATGAGGTAATGAATATACTC
-GATGAACATATAAGATTAACTTCAAGGAAAGTATATATAGATTATATCATGTTGCCTGGT
-GTAAATGATTCTCTTGAGCATGCAAACGAAGTTGTTAGCCTTCTTAAAAGTCGCTATAAA
-TCAGGGAAGTTATATCATGTAAATTTGATACGATACAATCCTACAATAAGTGCACCTGAG
-ATGTATGGAGAAGCAAACGAAGGGCAGGTAGAAGCCTTTTACAAAGTTTTGAAGTCTGCT
-GGTATCCATGTCACAATTAGAAGTCAATTTGGGATTGATATTGACGCTGCTTGTGGTCAA
-TTATATGGTAATTATCAAAATAGCCAATAG
>optrA_1_KP399637
TTGTCCAAAGCCACCTTTGCAATTGCTAGTACTAACGCAAAGGAGGATATGAAAATGCAA
TACAAAATAATTAATGGTGCCGTTTACTATGATGGTAATATGGTGTTGGAAAACATCGGT
@@ -716,25 +678,6 @@ GAGAAGGAGAAGGTTAAGAAGGAGAAACGAATTGAAAAGCTTGAAGTGTTAATAAATCAA
TATGATGAAGAATTAGAAAGATTGAATAAAATCATTTCTGAACCAAACAATTCTTCTGAT
TATATAGTACTGACGGAAATACAAAAATCAATTGATGATGTTAAAAGGTGTCAGGGTAAT
TATTTTAATGAATGGGAACAGTTGATGAGAGAATTGGAAGTTATGTAA
->cfr(B)_3_KR610408
-ATGCAACAAAAAAATAAGTATATAAGAATTCAAGAGTTCTTGAAGCAGAATAAATTTCCT
-AATTATAGAATGAAACAAATTACAAATGCTATATTCCCAGGGAGAATAAATAATTTCAAC
-GAAATAACGGTTCTTCCTAAATCACTAAGAGATATGTTAATTGAGGAGTTTGGAGAATCG
-ATTTTAAATATTGTTCCTTTAAAAGCACAACAATCTACACAAGTTTCAAAAGTCTTATTT
-GGAATTTCAGGAGACGAAAAAATAGAAACGGTAAATATGAAATATAAAGCTGGTTGGGAG
-TCATTTTGTATATCATCGCAGTGCGGTTGTAATTTTGGTTGTAAATTTTGTGCAACTGGA
-GATATAGGTTTAAAACGTAACTTAACTTCAGATGAAATTACTGACCAGATTTTGTACTTT
-CACTTACAAGGGCATTCAATTGACAGTATTTCTTTTATGGGAATGGGAGAAGCATTAGCG
-AATGTACAAGTTTTTGATGCTTTAAATGTACTTACAGATCCTGCGTTGTTTGCTTTAAGT
-CCGCGTAGGTTATCTATATCCACTATAGGAATTATTCCAAACATTAAAAAATTGACTCAA
-AACTATCCGCAGGTCAACCTGACATTTTCATTACATTCTCCTTTTAATGAACAGCGAAGT
-GAGTTAATGCCAATTAATGAACGCTACCCATTATCAGATGTGATGGATACATTAGATGAG
-CATATACGAGTAACCTCAAGAAAAGTTTATATTGCTTATATTATGTTGCACGGAGTTAAT
-GATTCTATTGAACATGCGAAAGAAGTCGTAAACCTTTTAAGAGGTAGATATAGGAGTGGG
-AACTTGTATCATGTGAACATCATTAGATATAACCCGACTGTTAGTTCACGGATGCGGTTT
-GAAGAAGCAAATGAGAAATGTCTTGTCAACTTTTATAAAAAATTAAAGTCAGCAGGAATT
-AAAGTTACCATTAGAAGTCAATTTGGCATTGATATAGACGCTGCTTGCGGTCAATTGTAT
-GGAAACTATCAAAAAACCAATAGCCAGTAA
>poxtA_1_MF095097
ATGAAAGGTAAAAATATGAATTTAGCCTTTGGGTTGGAAGAAATTTATGAGGATGCTGAG
TTTCAAATCGGAGATTTGGATAAGGTCGGTATTGTCGGCGTGAACGGAGCCGGAAAGACC
=====================================
debian-tests-data/db_resfinder/quinolone.fsa
=====================================
@@ -1498,27 +1498,6 @@ GATGCGCTGCAGGCGGCCGGTGCCTCGCTCGGGGGCGCCGTGCACCTGGCCGACACCCTG
CCGGCGTGGCAGGGCGCGGCCTTGCTGGCGGCCGCACGCGCGGGCTTCACCGATGCGCTG
CAGGCCACGGCCTGGGCCGGCGCGGTGCTGGTGCTGGTGGCCGCTGGGCTGGTGGCGCGC
CTGCTGCGCAAGCGCCCAGCGCTCGCATCTGGTTGA
->aac(6')-Ib-cr_1_DQ303918
-ATGAGCAACGCAAAAACAAAGTTAGGCATCACAAAGTACAGCATCGTGACCAACAGCAAC
-GATTCCGTCACACTGCGCCTCATGACTGAGCATGACCTTGCGATGCTCTATGAGTGGCTA
-AATCGATCTCATATCGTCGAGTGGTGGGGCGGAGAAGAAGCACGCCCGACACTTGCTGAC
-GTACAGGAACAGTACTTGCCAAGCGTTTTAGCGCAAGAGTCCGTCACTCCATACATTGCA
-ATGCTGAATGGAGAGCCGATTGGGTATGCCCAGTCGTACGTTGCTCTTGGAAGCGGGGAC
-GGACGGTGGGAAGAAGAAACCGATCCAGGAGTACGCGGAATAGACCAGTTACTGGCGAAT
-GCATCACAACTGGGCAAAGGCTTGGGAACCAAGCTGGTTCGAGCTCTGGTTGAGTTGCTG
-TTCAATGATCCCGAGGTCACCAAGATCCAAACGGACCCGTCGCCGAGCAACTTGCGAGCG
-ATCCGATGCTACGAGAAAGCGGGGTTTGAGAGGCAAGGTACCGTAACCACCCCATATGGT
-CCAGCCGTGTACATGGTTCAAACACGCCAGGCATTCGAGCGAACACGCAGTGATGCCTAA
->aac(6')-Ib-cr_2_EF636461
-ATGACTGAGCATGACCTTGCGATGCTCTATGAGTGGCTAAATCGATCTCATATCGTCGAG
-TGGTGGGGCGGAGAAGAAGCACGCCCGACACTTGCTGACGTACAGGAACAGTACTTGCCA
-AGCGTTTTAGCGCAAGAGTCCGTCACTCCATACATTGCAATGCTGAATGGAGAGCCGATT
-GGGTATGCCCAGTCGTACGTTGCTCTTGGAAGCGGGGACGGAAGGTGGGAAGAAGAAACC
-GATCCAGGAGTACGCGGAATAGACCAGTTACTGGCGAATGCATCACAACTGGGCAAAGGC
-TTGGGAACCAAGCTGGTTCGAGCTCTGGTTGAGTTGCTGTTCAATGATCCCGAGGTCACC
-AAGATCCAAACGGACCCGTCGCCGAGCAACTTGCGAGCGATCCGATGCTACGAGAAAGCG
-GGGTTTGAGAGGCAAGGTACCGTAACCACCCCATATGGTCCAGCCGTGTACATGGTTCAA
-ACACGCCAGGCATTCGAGCGAACACGCAGTGATGCCTAA
>OqxA_1_EU370913
ATGAGCCTGCAAAAAACCTGGGGAAACATTCACCTGACCGCGCTCGGCGCGATGATGCTC
TCCTTTCTGCTCGTCGGCTGCGACGACAGCGTCGCACAGAATGCTGCGCCTCCCGCCCCG
=====================================
dockerfile
=====================================
@@ -14,13 +14,13 @@ RUN apt-get update -qq; \
# Install KMA
RUN cd /usr/src/; \
- git clone --depth 1 -b 1.4.7 https://bitbucket.org/genomicepidemiology/kma.git; \
+ git clone --depth 1 -b 1.4.11 https://bitbucket.org/genomicepidemiology/kma.git; \
cd kma && make; \
mv kma /usr/bin/; \
cd ..; \
rm -rf kma/;
-ENV RESFINDER_VERSION 4.2.3
+ENV RESFINDER_VERSION 4.2.4
# Install ResFinder
RUN pip install --no-cache-dir resfinder==$RESFINDER_VERSION
=====================================
src/resfinder/__init__.py
=====================================
@@ -1 +1 @@
-__version__ = '4.2.3'
+__version__ = '4.2.4'
=====================================
src/resfinder/amr_abbreviations.md
=====================================
@@ -90,6 +90,7 @@ Exception is the ResFinder results "unknown <class>". See bottom of list.
| Lincomycin | LIC |
| Linezolid | LIN |
| Lividomycin | LIV |
+| Maduramicin | MAD |
| Marbofloxacin | MAR |
| Mecillinam | MEC |
| Meropenem | MER |
@@ -99,6 +100,7 @@ Exception is the ResFinder results "unknown <class>". See bottom of list.
| Moxifloxacin | MOX |
| Mupirocin | MUP |
| Nalidixic acid | NAL |
+| Narasin | NAR |
| Neomycin | NEO |
| Netilmicin | NET |
| Nitrofurantoin | NIT |
@@ -132,6 +134,7 @@ Exception is the ResFinder results "unknown <class>". See bottom of list.
| Ribostamycin | RIB |
| Rifampicin | RIF |
| Roxithromycin | ROX |
+| Salinomycin | SAL |
| Sisomicin | SIS |
| Spectinomycin | SPE |
| Spiramycin | SPI |
=====================================
src/resfinder/cge/config.py
=====================================
@@ -40,6 +40,7 @@ class Config():
"threshold_point": 0.8,
"ignore_indels": False,
"ignore_stop_codons": False,
+ "output_aln": False,
"pickle": False
}
@@ -95,7 +96,7 @@ class Config():
.format(self.resfinder_root))
self.amr_abbreviations = LoadersMixin.load_md_table_after_keyword(
amr_abbreviations_file, "## Abbreviations")
-
+ self.output_aln = bool(args.output_aln)
self.pickle = args.pickle
@staticmethod
=====================================
src/resfinder/cge/output/gene_result.py
=====================================
@@ -6,7 +6,7 @@ from ..phenotype2genotype.res_profile import PhenoDB
class GeneResult(dict):
- def __init__(self, res_collection, res, db_name):
+ def __init__(self, res_collection, res, db_name, conf=None):
"""
Input:
res_collection: Result object created by the cgelib package.
@@ -28,7 +28,6 @@ class GeneResult(dict):
self["name"], self.variant, self["ref_acc"] = (
GeneResult._split_sbjct_header(self["ref_id"]))
self["ref_database"] = [res_collection.get_db_key(db_name)[0]]
-
self["identity"] = res["perc_ident"]
self["alignment_length"] = res["HSP_length"]
self["ref_seq_lenght"] = res["sbjct_length"]
@@ -40,6 +39,12 @@ class GeneResult(dict):
self["query_end_pos"] = res["query_end"] # Positional essential
self["pmids"] = []
self["notes"] = []
+
+ if conf and conf.output_aln:
+ self["query_string"] = res["query_string"]
+ self["alignment_string"] = res["homo_string"]
+ self["ref_string"] = res["sbjct_string"]
+
# BLAST coverage formatted results
coverage = res.get("coverage", None)
if(coverage is None):
=====================================
src/resfinder/cge/output/phenotype_result.py
=====================================
@@ -97,7 +97,10 @@ class PhenotypeResult(dict):
feature.unique_id]
elif(isinstance(feature, ResMutation)):
type = "seq_variations"
- ref_id = feature.unique_id
+ if feature.nuc_format == '':
+ ref_id = feature.aa_format
+ else:
+ ref_id = feature.nuc_format
return (ref_id, type)
@staticmethod
=====================================
src/resfinder/cge/output/seq_variation_result.py
=====================================
@@ -26,6 +26,7 @@ class SeqVariationResult(dict):
if(len(self["ref_codon"]) == 3):
self["codon_change"] = ("{}>{}".format(self["ref_codon"],
self["var_codon"]))
+ self["nuc_change"] = mismatch[3].lower()
if(len(mismatch) > 7):
self["ref_aa"] = mismatch[7].lower()
@@ -81,7 +82,7 @@ class SeqVariationResult(dict):
region_name = region_results[0]["ref_id"].replace("_", ";;")
region_name = PhenoDB.if_promoter_rename(region_name)
- if(len(mismatch) > 7):
+ if(len(mismatch) > 7 and mismatch[0] == 'sub'):
self["ref_id"] = ("{id}{deli}{pos}{deli}{var}"
.format(id=region_name,
pos=self["ref_start_pos"],
@@ -93,12 +94,12 @@ class SeqVariationResult(dict):
else:
self["ref_id"] = ("{id}{deli}{pos}{deli}{var}"
.format(id=region_name,
- pos=self["ref_start_pos"],
- var=self["var_codon"], deli="_"))
+ pos=self["ref_end_pos"],
+ var=self["nuc_change"], deli="_"))
minimum_key = ("{id}{deli}{pos}{deli}{var}"
.format(id=region_name,
- pos=self["ref_start_pos"],
- var=self["var_codon"], deli=";;"))
+ pos=self["ref_end_pos"],
+ var=self["nuc_change"], deli=";;"))
gene_key = SeqVariationResult._get_rnd_unique_seqvar_key(
res_collection, minimum_key)
=====================================
src/resfinder/cge/output/std_results.py
=====================================
@@ -15,7 +15,7 @@ from .phenotype_result import PhenotypeResult
class ResFinderResultHandler():
@staticmethod
- def standardize_results(res_collection, res, ref_db_name):
+ def standardize_results(res_collection, res, ref_db_name, conf):
"""
Input:
res_collection: Result object created by the cge core module.
@@ -34,7 +34,8 @@ class ResFinderResultHandler():
for unique_id, hit_db in db.items():
if(unique_id in res["excluded"]):
continue
- gene_result = GeneResult(res_collection, hit_db, ref_db_name)
+ gene_result = GeneResult(res_collection, hit_db, ref_db_name,
+ conf)
if gene_result["key"] is None:
continue
=====================================
src/resfinder/cge/phenotype2genotype/feature.py
=====================================
@@ -94,7 +94,7 @@ class Mutation(Gene):
ref_codon=None, mut_codon=None, ref_aa=None,
ref_aa_right=None, mut_aa=None, isolate=None, insertion=None,
deletion=None, end=None, nuc=False, premature_stop=0,
- frameshift=None, ref_db=None):
+ frameshift=None, ref_db=None, nuc_format=None, aa_format=None):
Gene.__init__(self, unique_id=unique_id, seq_region=seq_region,
start=pos, end=end, hit=hit, isolate=isolate,
ref_db=ref_db)
@@ -111,6 +111,9 @@ class Mutation(Gene):
# Indicate how many percent the region was truncated.
self.premature_stop = Feature.na2none(premature_stop)
self.frameshift = Feature.na2none(frameshift)
+ #key format for indels
+ self.nuc_format = nuc_format
+ self.aa_format = aa_format
# Create mutation description
if(insertion is True and deletion is True):
@@ -223,7 +226,7 @@ class ResMutation(Mutation, Resistance):
ref_aa_right=None, mut_aa=None, isolate=None, insertion=None,
deletion=None, end=None, nuc=False, premature_stop=False,
frameshift=None, ab_class=None, pmids=None, notes=None,
- ref_db=None):
+ ref_db=None, nuc_format=None, aa_format=None):
Mutation.__init__(self, unique_id=unique_id, seq_region=seq_region,
pos=pos, hit=hit, ref_codon=ref_codon,
mut_codon=mut_codon, ref_aa=ref_aa,
@@ -231,5 +234,6 @@ class ResMutation(Mutation, Resistance):
isolate=isolate, insertion=insertion,
deletion=deletion, end=end, nuc=nuc,
premature_stop=premature_stop, frameshift=frameshift,
- ref_db=ref_db)
+ ref_db=ref_db, nuc_format=nuc_format,
+ aa_format=aa_format)
Resistance.__init__(self, ab_class=ab_class, pmids=pmids, notes=notes)
=====================================
src/resfinder/cge/phenotype2genotype/isolate.py
=====================================
@@ -171,7 +171,8 @@ class Isolate(dict):
start_feat = None
end_feat = None
- phenotypes = phenodb.get(unique_id, None)
+ phenotypes, unique_id = self.get_phenotypes(phenodb,
+ unique_id)
ab_class = set()
if(phenotypes):
for p in phenotypes:
@@ -211,21 +212,41 @@ class Isolate(dict):
with the ones from the Result object.
"""
if(type == "seq_variations"):
- ref_id = feat_res_dict["ref_id"]
var_aa = feat_res_dict.get("var_aa", None)
+ var_codon = feat_res_dict.get("var_codon", None)
+ codon_change = feat_res_dict.get("codon_change", None)
+ indels = feat_res_dict['deletion'] or feat_res_dict['insertion']
- # Not Amino acid mutation
- if(var_aa is None):
+ # Not point mutation
+ if(var_aa is None and var_codon is None):
return feat_res_dict["seq_regions"][0]
+ # RNA mutation(single nucleotide)
+ elif(len(var_codon)==1 and codon_change is None):
+ nuc_format = (f"{feat_res_dict['seq_regions'][0]}"
+ f"_{feat_res_dict['ref_start_pos']}"
+ f"_{feat_res_dict['var_codon']}")
+ aa_format = ""
+ return nuc_format, aa_format
+ # indels
+ elif (indels):
+ nuc_format = (f"{feat_res_dict['seq_regions'][0]}"
+ f"_{feat_res_dict['ref_end_pos']}"
+ f"_{feat_res_dict['nuc_change']}")
+ aa_format = (f"{feat_res_dict['seq_regions'][0]}"
+ f"_{feat_res_dict['ref_start_pos']}"
+ f"_{feat_res_dict['var_aa']}")
+ return nuc_format, aa_format
# Amino acid mutation
else:
- return (f"{feat_res_dict['seq_regions'][0]}"
- f"_{feat_res_dict['ref_start_pos']}"
- f"_{feat_res_dict['var_aa']}")
+ nuc_format = ""
+ aa_format = (f"{feat_res_dict['seq_regions'][0]}"
+ f"_{feat_res_dict['ref_start_pos']}"
+ f"_{feat_res_dict['var_aa']}")
+ return nuc_format, aa_format
elif(type == "seq_regions"):
- return "{}_{}".format(feat_res_dict["name"],
- feat_res_dict["ref_acc"])
+ return ("{}_{}".format(feat_res_dict["name"],
+ feat_res_dict["ref_acc"]), "")
def load_finder_results(self, std_table, phenodb, type):
"""
@@ -245,23 +266,60 @@ class Isolate(dict):
for entry in feat_info["ref_database"])):
continue
- unique_id = Isolate.get_phenodb_id(feat_info, type)
+ unique_id_nuc, unique_id_aa = Isolate.get_phenodb_id(feat_info,
+ type)
+ phenotypes, unique_id = self.get_phenotypes(phenodb, unique_id_nuc,
+ unique_id_aa, type)
feat_list = self.get(unique_id, [])
-
- phenotypes = phenodb.get(unique_id, None)
-
if(phenotypes):
for p in phenotypes:
res_feature = self.new_res_feature(type, feat_info,
- unique_id, p)
+ unique_id,
+ unique_id_aa,
+ unique_id_nuc, p)
if(res_feature not in feat_list):
feat_list.append(res_feature)
else:
- res_feature = self.new_res_feature(type, feat_info, unique_id)
+ res_feature = self.new_res_feature(type, feat_info,
+ unique_id,
+ unique_id_aa,
+ unique_id_nuc)
feat_list.append(res_feature)
- self[unique_id] = feat_list
+ if unique_id_nuc == "":
+ self[unique_id_aa] = feat_list
+ else:
+ self[unique_id_nuc] = feat_list
+
+ def get_phenotypes(self, phenodb, unique_id_nuc, unique_id_aa, type):
+ """
+ Input:
+ phenodb: PhenoDB object
+ unique_id: string in key format fitting the phenodb objects
+ Output:
+ returns a phenodb object based on the unique id and the unique
+ id used.
+ Method modifies unique id to account for mutation type (AA/NUC)
+ using the pointfinder database.
+ """
+ if (phenodb.mut_type_is_defined
+ and type == "seq_variations"):
+ unique_id_aa = unique_id_aa + '_AA'
+ unique_id_nuc = unique_id_nuc + '_NUC'
+
+ phenotype_aa = phenodb.get(unique_id_aa, None)
+ phenotype_nuc = phenodb.get(unique_id_nuc, None)
+
+ if phenotype_nuc:
+ phenotypes = phenotype_nuc
+ unique_id_found = unique_id_nuc
+ else:
+ phenotypes = phenotype_aa
+ unique_id_found = unique_id_aa
+
+ return phenotypes, unique_id_found
- def new_res_feature(self, type, feat_info, unique_id, phenotype=None):
+ def new_res_feature(self, type, feat_info, unique_id, aa_format, nuc_format,
+ phenotype=None):
ab_class = set()
if phenotype is None:
@@ -277,13 +335,14 @@ class Isolate(dict):
phenotype)
elif(type == "seq_variations"):
res_feature = self.new_res_mut(feat_info, unique_id, ab_class,
- phenotype)
+ phenotype, nuc_format, aa_format)
ResProfile.update_classes_dict_of_feature_sets(
self.feature_classes, res_feature)
return res_feature
- def new_res_mut(self, feat_info, unique_id, ab_class, phenotype):
+ def new_res_mut(self, feat_info, unique_id, ab_class, phenotype, nuc_format,
+ aa_format):
ref_aa = feat_info.get("ref_aa", None)
if(ref_aa is None or ref_aa.upper() == "NA"):
@@ -294,6 +353,8 @@ class Isolate(dict):
if phenotype:
feat_res = ResMutation(
unique_id=unique_id,
+ nuc_format=nuc_format,
+ aa_format=aa_format,
seq_region=";;".join(feat_info["seq_regions"]),
pos=feat_info["ref_start_pos"],
ref_codon=feat_info["ref_codon"],
@@ -313,6 +374,7 @@ class Isolate(dict):
else:
feat_res = ResMutation(
unique_id=unique_id,
+ nuc_format= nuc_format,
seq_region=";;".join(feat_info["seq_regions"]),
pos=feat_info["ref_start_pos"],
ref_codon=feat_info["ref_codon"],
=====================================
src/resfinder/cge/phenotype2genotype/res_profile.py
=====================================
@@ -51,6 +51,9 @@ class PhenoDB(dict):
self.load_disinfectant_db(disinf_file)
if(point_file):
+ # mut_type_is_defined indicates the new db with AA/NUC
+ # instead of DNA/RNA
+ self.mut_type_is_defined = False
if os.path.basename(point_file) == "resistens-overview.txt":
self.load_point_old_db(point_file)
else:
@@ -210,9 +213,16 @@ class PhenoDB(dict):
line_list = list(map(str.rstrip, line_list))
# ID in DB is Gene-AAMut-Pos and is not unique
gene_id = line_list[0]
+ mut_type = line_list[1]
codon_pos = line_list[3]
res_codon_str = line_list[6].lower()
+ if mut_type == 'AA' or mut_type == 'NUC':
+ self.mut_type_is_defined = True
+
+ if not self.mut_type_is_defined:
+ eprint("Warning: Your PointFinder database is not "
+ "up to date. This may effect the results.")
# Check if the entry is with a promoter
gene_id = PhenoDB.if_promoter_rename(gene_id)
@@ -225,8 +235,13 @@ class PhenoDB(dict):
# TODO: Remove this tuple and its dependencies.
sug_phenotype = ()
- unique_id = "%s_%s_%s" % (gene_name, codon_pos,
- res_codon_str)
+ if self.mut_type_is_defined:
+ unique_id = "%s_%s_%s_%s" % (gene_name, codon_pos,
+ res_codon_str, mut_type)
+ else:
+ unique_id = "%s_%s_%s" % (gene_name, codon_pos,
+ res_codon_str)
+
abs = []
for ab_name in pub_phenotype:
# TODO: Fix database
@@ -312,8 +327,12 @@ class PhenoDB(dict):
res_codon = self.get_csv_tuple(line_list[6].lower())
if(len(res_codon) > 1):
for codon in res_codon:
- unique_id_alt = (gene_name + "_" + codon_pos
- + "_" + codon)
+ if self.mut_type_is_defined:
+ unique_id_alt = (gene_name + "_" + codon_pos
+ + "_" + codon + "_" + mut_type)
+ else:
+ unique_id_alt = (gene_name + "_" + codon_pos
+ + "_" + codon)
self[unique_id_alt] = pheno_lst
except IndexError:
eprint("Error in line " + str(line_counter))
@@ -752,7 +771,6 @@ class Phenotype(object):
self.req_muts = req_muts
self.res_database = res_db
-
class Antibiotics(object):
""" Class is implemented to be key in a dict. The class can be tested
against isinstances of itself and strings.
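The key scheme introduced in the hunks above can be sketched in isolation. The helper name `build_unique_id` is hypothetical and not part of ResFinder; it only mirrors the string formatting visible in the diff. As a simplification, it checks the type per call, whereas the diff uses a database-wide `mut_type_is_defined` flag.

```python
def build_unique_id(gene_name, codon_pos, res_codon, mut_type=None):
    """Hypothetical helper mirroring the PhenoDB key format shown in the diff."""
    parts = [gene_name, str(codon_pos), res_codon.lower()]
    # Assumption: newer PointFinder databases label each row as AA or NUC,
    # and that label is appended as a suffix to the key.
    if mut_type in ("AA", "NUC"):
        parts.append(mut_type)
    # Legacy databases (no AA/NUC label) keep the three-part key.
    return "_".join(parts)
```

With a legacy database the key stays `gene_pos_codon`, so existing lookups keep working; with the newer format the suffix separates amino-acid entries from nucleotide entries at the same position.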
=====================================
src/resfinder/cge/pointfinder.py
=====================================
@@ -63,7 +63,7 @@ class PointFinder(CGEFinder):
# Creat user defined gene_list if given
if(gene_list is not None):
- self.gene_list = get_user_defined_gene_list(gene_list)
+ self.gene_list = self.get_user_defined_gene_list(gene_list)
# Depends on database format, current or legacy
if os.path.exists(f"{self.specie_path}/{self.PHENOTYPE_FILE}"):
@@ -383,8 +383,6 @@ class PointFinder(CGEFinder):
fh.write(result_str[0])
with open(out_path + "/PointFinder_table.txt", "w") as fh:
fh.write(result_str[1])
- with open(out_path + "/PointFinder_prediction.txt", "w") as fh:
- fh.write(result_str[2])
@staticmethod
def discard_unknown_muts(results_pnt, phenodb):
@@ -410,9 +408,26 @@ class PointFinder(CGEFinder):
gene_db_id = gene_ref_id.replace("_", ";;")
known = []
for mis_match in mis_matches:
- mis_match_key = (f"{gene_db_id}_{mis_match[1]}"
- f"_{mis_match[-1].lower()}")
- if mis_match_key in phenodb:
+ if mis_match[0] != 'sub':
+ mis_match_key_nuc = (f"{gene_db_id}_{mis_match[2]}"
+ f"_{mis_match[3].lower()}")
+ mis_match_key_aa = (f"{gene_db_id}_{mis_match[1]}"
+ f"_{mis_match[-1].lower()}")
+ elif mis_match[4].startswith('r.'):#RNA substitution
+ mis_match_key_aa = ""
+ mis_match_key_nuc = (f"{gene_db_id}_{mis_match[2]}"
+ f"_{mis_match[3].lower()}")
+ else:
+ mis_match_key_aa = (f"{gene_db_id}_{mis_match[2]}"
+ f"_{mis_match[-1].lower()}")
+ mis_match_key_nuc = ""
+
+ if phenodb.mut_type_is_defined:
+ mis_match_key_nuc = mis_match_key_nuc + '_NUC'
+ mis_match_key_aa = mis_match_key_aa + '_AA'
+
+ if (mis_match_key_nuc in phenodb
+ or mis_match_key_aa in phenodb):
known.append(mis_match)
return known
@@ -902,9 +917,10 @@ class PointFinder(CGEFinder):
# Initiate the mis_matches list that will store all found mis matcehs
mis_matches = []
+ gene_name = gene.split('_')[0]
# Find mis matches in RNA genes
- if gene in self.RNA_gene_list:
+ if gene_name in self.RNA_gene_list:
mis_matches += PointFinder.find_nucleotid_mismatches(sbjct_start,
sbjct_seq,
qry_seq)
@@ -1037,16 +1053,17 @@ class PointFinder(CGEFinder):
if sbjct_nuc == "-" or qry_nuc == "-":
if sbjct_nuc == "-":
mut = "ins"
- indel_start_pos = (seq_pos - 1) * factor
- indel_end_pos = seq_pos * factor
+ indel_start_pos = (seq_pos - 1) * factor + sbjct_start
+ indel_end_pos = seq_pos * factor + sbjct_start
indel = PointFinder.find_nuc_indel(sbjct_seq[i:],
qry_seq[i:])
else:
mut = "del"
- indel_start_pos = seq_pos * factor
+ indel_start_pos = seq_pos * factor + (sbjct_start - 1)
indel = PointFinder.find_nuc_indel(qry_seq[i:],
sbjct_seq[i:])
- indel_end_pos = (seq_pos + len(indel) - 1) * factor
+ indel_end_pos = (seq_pos + len(indel) - 1) * factor \
+ + (sbjct_start - 1)
seq_pos += len(indel) - 1
# Shift the index to the end of the indel
@@ -1067,16 +1084,17 @@ class PointFinder(CGEFinder):
mut_name += (str(indel_start_pos) + "_"
+ str(indel_end_pos) + mut + indel)
- mis_matches += [[mut, seq_pos * factor, seq_pos * factor,
+ mis_matches += [[mut, indel_start_pos, indel_end_pos,
indel, mut_name, mut, indel]]
# Check for substitutions mutations
else:
mut = "sub"
- mut_name += (str(seq_pos * factor) + sbjct_nuc + ">"
- + qry_nuc)
+ pos = seq_pos * factor + (sbjct_start - 1)
+ mut_name += (str(pos)
+ + sbjct_nuc + ">" + qry_nuc)
- mis_matches += [[mut, seq_pos * factor, seq_pos * factor,
+ mis_matches += [[mut, pos, pos,
qry_nuc, mut_name, sbjct_nuc, qry_nuc]]
# Increment sequence position
@@ -1444,7 +1462,7 @@ class PointFinder(CGEFinder):
mut = indel_data[0]
codon_no_indel = indel_data[1]
- seq_pos = indel_data[2] + sbjct_start - 1
+ seq_pos = indel_data[2]
indel = indel_data[3]
indel_no += 1
@@ -1799,30 +1817,31 @@ class PointFinder(CGEFinder):
# Go through each mutation
for i in range(len(mis_matches)):
m_type = mis_matches[i][0]
- pos = mis_matches[i][1] # sort on pos?
- look_up_pos = mis_matches[i][2]
+ codon_pos = mis_matches[i][1] # sort on codon_pos?
+ nuc_pos = mis_matches[i][2]
look_up_mut = mis_matches[i][3]
mut_name = mis_matches[i][4]
nuc_ref = mis_matches[i][5]
nuc_alt = mis_matches[i][6]
ref = mis_matches[i][-2]
- alt = mis_matches[i][-1]
+ aa_alt = mis_matches[i][-1]
# First index in list indicates if mutation is known
output_mut += [[]]
# Define output vaiables
codon_change = nuc_ref + " -> " + nuc_alt
- aa_change = ref + " -> " + alt
+ aa_change = ref + " -> " + aa_alt
if RNA is True:
aa_change = "RNA mutations"
- elif pos < 0:
+ elif codon_pos < 0:
aa_change = "Promoter mutations"
# Check if mutation is known
gene_mut_name, resistence, pmid = self.look_up_known_muts(
- gene, look_up_pos, look_up_mut, m_type, gene_name)
+ gene, nuc_pos, aa_alt, m_type, gene_name, nuc_alt, codon_pos,
+ look_up_mut)
# Make lists to strings
if resistence != "Unknown":
@@ -1874,7 +1893,7 @@ class PointFinder(CGEFinder):
if "Premature stop codon" in mut_name:
sbjct_len = hit['sbjct_length']
- qry_len = pos * 3
+ qry_len = codon_pos * 3
prec_truckat = round(
((float(sbjct_len) - qry_len)
/ float(sbjct_len))
@@ -1896,6 +1915,7 @@ class PointFinder(CGEFinder):
resistence_lst = []
for mut in output_mut:
for res in mut[3].split(","):
+ res = res.lstrip()
resistence_lst.append(res)
# Save known mutations
@@ -1931,7 +1951,8 @@ class PointFinder(CGEFinder):
return line_lst[0][0]
return line_lst
- def look_up_known_muts(self, gene, pos, found_mut, mut, gene_name):
+ def look_up_known_muts(self, gene, nuc_pos, found_mut, mut, gene_name,
+ found_nuc, codon_pos, found_nuc_del):
"""
This function uses the known_mutations dictionay, a gene a
string with the gene key name, a gene position as integer,
@@ -1948,41 +1969,70 @@ class PointFinder(CGEFinder):
"""
resistence = "Unknown"
pmid = "-"
- found_mut = found_mut.upper()
gene_ID = gene.split("_")[0]
+ found_nuc = found_nuc.upper()
+ found_mut = found_mut.upper()
+
if mut == "del":
- for i, i_pos in enumerate(range(pos, pos + len(found_mut))):
+ found_mut = found_nuc_del.upper()
+ for i, i_pos in enumerate(range(nuc_pos, nuc_pos + len(found_mut))):
- known_indels = self.known_mutations[gene_ID]["del"].get(i_pos, [])
+ known_indels = self.known_mutations[gene_ID]["del"].get(i_pos,
+ [])
for known_indel in known_indels:
partial_mut = found_mut[i:len(known_indel) + i]
# Check if part of found mut is known and check if
# found mut and known mut is in the same reading
# frame
- if(partial_mut == known_indel
- and len(found_mut) % 3 == len(known_indel) % 3):
-
- resistence = (self.known_mutations[gene_ID]["del"][i_pos]
- [known_indel]['drug'])
+ if (partial_mut == known_indel
+ and len(found_mut) % 3 == len(known_indel) % 3):
+ resistence = (
+ self.known_mutations[gene_ID]["del"][i_pos]
+ [known_indel]['drug'])
pmid = (self.known_mutations[gene_ID]["del"][i_pos]
- [known_indel]['pmid'])
+ [known_indel]['pmid'])
gene_name = (self.known_mutations[gene_ID]["del"][i_pos]
- [known_indel]['gene_name'])
+ [known_indel]['gene_name'])
break
+ elif mut == 'sub':
+ if codon_pos in self.known_mutations[gene_ID][mut]:
+ if found_mut in self.known_mutations[gene_ID][mut][codon_pos]:
+ resistence = (self.known_mutations[gene_ID][mut][codon_pos]
+ [found_mut]['drug'])
+
+ pmid = (self.known_mutations[gene_ID][mut][codon_pos]
+ [found_mut]['pmid'])
+
+ gene_name = (self.known_mutations[gene_ID][mut][codon_pos]
+ [found_mut]['gene_name'])
else:
- if pos in self.known_mutations[gene_ID][mut]:
- if found_mut in self.known_mutations[gene_ID][mut][pos]:
+ if (nuc_pos in self.known_mutations[gene_ID][mut]
+ or codon_pos in self.known_mutations[gene_ID][mut]):
+ found_mutation = None
+ pos = None
+ if (codon_pos in self.known_mutations[gene_ID][mut]
+ and found_mut in self.known_mutations[gene_ID][mut][
+ codon_pos]):
+ found_mutation = found_mut
+ pos = codon_pos
+ if (nuc_pos in self.known_mutations[gene_ID][mut]
+ and found_nuc in self.known_mutations[gene_ID][mut][
+ nuc_pos]):
+ found_mutation = found_nuc
+ pos = nuc_pos
+ if found_mutation:
resistence = (self.known_mutations[gene_ID][mut][pos]
- [found_mut]['drug'])
+ [found_mutation]['drug'])
- pmid = (self.known_mutations[gene_ID][mut][pos][found_mut]
- ['pmid'])
+ pmid = (self.known_mutations[gene_ID][mut][pos]
+ [found_mutation]['pmid'])
gene_name = (self.known_mutations[gene_ID][mut][pos]
- [found_mut]['gene_name'])
+ [found_mutation]['gene_name'])
+
# Check if stop codons refer resistance
if "*" in found_mut and gene_ID in self.known_stop_codon:
if resistence == "Unknown":
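Several pointfinder.py hunks above move the `sbjct_start` offset into the place where nucleotide mismatches are first recorded. A minimal sketch of that arithmetic for substitutions follows; the function name `to_reference_pos` is hypothetical, and `factor` is carried over from the diff without assuming its meaning.

```python
def to_reference_pos(seq_pos, sbjct_start, factor=1):
    """Hypothetical helper: map a 1-based position inside the aligned
    region to a position on the full reference sequence."""
    # Same expression as the substitution branch in the diff:
    #   pos = seq_pos * factor + (sbjct_start - 1)
    return seq_pos * factor + (sbjct_start - 1)
```

When the alignment starts at the first reference base (`sbjct_start == 1`) the position is unchanged, which matches the behaviour before this change for full-length hits.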
=====================================
src/resfinder/run_resfinder.py
=====================================
@@ -84,6 +84,10 @@ def main():
"is not found. Point mutations will silently "
"be ignored."),
default=False)
+ parser.add_argument("--output_aln",
+ action="store_true",
+ help="will add the alignments in the json output.",
+ default=False)
# Acquired resistance options
parser.add_argument("-db_res", "--db_path_res",
@@ -253,7 +257,8 @@ def main():
ResFinderResultHandler.standardize_results(std_result,
blast_results.results,
- "ResFinder")
+ "ResFinder",
+ conf)
else:
if(conf.nanopore):
@@ -297,7 +302,8 @@ def main():
ResFinderResultHandler.standardize_results(std_result,
kma_run.results,
- "ResFinder")
+ "ResFinder",
+ conf)
##########################################################################
# DisinFinder
##########################################################################
@@ -332,7 +338,8 @@ def main():
ResFinderResultHandler.standardize_results(std_result,
blast_results.results,
- "DisinFinder")
+ "DisinFinder",
+ conf)
else:
if(conf.nanopore):
@@ -376,7 +383,8 @@ def main():
ResFinderResultHandler.standardize_results(std_result,
kma_run.results,
- "DisinFinder")
+ "DisinFinder",
+ conf)
##########################################################################
# PointFinder
##########################################################################
@@ -454,6 +462,7 @@ def main():
results_pnt=results_pnt, phenodb=res_pheno_db)
# DEPRECATED
+ # mutations in raw reads is only found with write_results.
# TODO: make a write method that depends on the json output
finder.write_results(
out_path=conf.outputPath, result=results, res_type=method,
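The `--output_aln` option added above is a plain boolean switch. A minimal, self-contained sketch of how such a flag behaves with `argparse` follows; `build_parser` is a hypothetical name, and the real parser in run_resfinder.py defines many more options.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the ResFinder argument parser."""
    parser = argparse.ArgumentParser()
    # store_true: the attribute is False unless the flag is given.
    parser.add_argument("--output_aln",
                        action="store_true",
                        help="will add the alignments in the json output.",
                        default=False)
    return parser


args = build_parser().parse_args(["--output_aln"])
```

Here `args.output_aln` is `True`; parsing an empty argument list leaves it `False`.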
=====================================
tests/data/test_isolate_11.fa
=====================================
@@ -0,0 +1,2 @@
+>test_16S_rrsB_A523C
+AAATTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTAGGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTTAATACCTTTGCTCATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGTGCCAGCCGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCTGGTCTTGACATCCACGGAAGTTTTCAGAGATGAGAATGTGCCTTCGGGAACCGTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGGTCCGGCCGGGAACTCAAAGGAGACTGCCAGTGATAAACTGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGACCAGGGCTACACACGTGCTACAATGGCGCATACAAAGAGAAGCGACCTCGCGAGAGCAAGCGGACCTCATAAAGTGCGTCGTAGTCCGGATTGGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAGTAATCGTGGATCAGAATGCCACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCGCTTACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTAGGGGAACCTGCGGTTGGATCACCTCCTTA
\ No newline at end of file
=====================================
tests/data/test_isolate_11_1.fq
=====================================
@@ -0,0 +1,200 @@
+@test_16S_rrsB_A523C-100/1
+GCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGGTA
++
+CCCCGGCGG=GG=JJJJJGCJJJGGGGJGGJJGJJJJJGJJJGJJJGJJGCJCJGJJJJCJGCGGCCCGGGCCGGGGGGGGGCG=GGGGGC8G8G8GG=GJCGCGGGGCG=GGCCCGG=GG8GGC=CGGCCGGGCGCGCCGGGG=CCCGC
+@test_16S_rrsB_A523C-98/1
+GGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGC
++
+CCCG=GGGGGGGGJG=JGJJCJGJJJCGJJJGJJJJGJJJGJJJJJJJCGJGJJJGGGGJ=GCCGGGGGGGGCCCGCGGGGGCGC=GGCCCGG1G=GCGCJGGCCGCGCGGGGGG8GG1GGGGGG=GGGG=CGC=GGCCGCG1GGGGGCG
+@test_16S_rrsB_A523C-96/1
+TTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTTAATACCTTTGCTCATTGACGTTACCCGCAGAAGAACCACCGGCTAACTCCGTGCCAGCCGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAG
++
+=C1GCGGG1GGGG1GJGJJGGJJ=JJJJJJJJGJJJJJJJJGJGJGJGJJGGGGCGGGGGJCJ8GCJGGGC1GG1GC=CGGGGGGCGGG8GGGCGGCGGCJCCGG8GGGGGGGGG8GGGGC=GGCGCGGCGGGGGGGG1=GGGGG=G==C
+@test_16S_rrsB_A523C-94/1
+AAGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTAGGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGA
++
+C=CGGG8GGGGGGJGGG8JJGJJJCJJJGJC88JGJJJGCGGJGJG1JGJGJGGGCGCGCCGGG=CGGGCCGGCCGGCCG1GGGCG8GCGGGGGCGGCGGJGGGGGGGG=GCC(CCGCGGG1G=GCCGGCGG1GCGCGGGG1CGGGGGGG
+@test_16S_rrsB_A523C-92/1
+TCAAGCTTGCCAGTATCAGATGCAGTTCCCAGGTTGAGCCCGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAACGCTTGCACCCTCCGTATTACCGCGGCGGCTGGCACGGAGTTAGC
++
+CCCGGGGGGGGGGJJJJCGGJJJJJGJGJJJGJJGJGGJJJJJGCGJJJJJ=GGJJGJJJCGCJGGJJJ==JGGJ=GC=CCGGCGGCGC=GG=C=CGGCGJGCGCCGGGCCGCGGCGGGCCCGGCCGGGGGGCGGGCCGGGCC8GGGGCC
+@test_16S_rrsB_A523C-90/1
+AACGCGTTAGCTCCGGAAGCCACGCCTCAAGGGCACAACCTCCAAGTCGACATCGTTTACGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCCCCACGCTTTCGCACCTGAGCGTCAGTCTTCGTCCAGGGGGCCGCCTTCGCCAC
++
+=CCGGGGCGGGGGJJ=GGJJJJJJGJJJGJGGJJJJGJJJGGJJGCG=GGCGGCJGJCGGCCGJCGGJCJJGGCCG=G=GGCGGGGGCCGCGGGGCG=CGJCCGGGGCGGCGGGGGGG=GGCG8GGCGCGCGCGGGGGGGGGGGGC8GCC
+@test_16S_rrsB_A523C-88/1
+TTCACCGTGGCATTCTGATCCACGATTACTAGCGATTCCGACTTTATGGAGTCGGGTTGCAGACTCCAATCCGGACTACGACGCACTTTATGAGGTCCGCTTGCTCTCGCGAGGTCGCTTCTCTTTGTATGCGCCATTGTAGCACGTGTG
++
+CCCGGG1GGGGCGCJGJGJJJJJJJCJJCJGJJGJJJJJJJ8JG(GJGJGGGGG(GGGGJ1JGGGJJGCGGGGJ8GGCGCGGGGGGCGGG=GG(1G8GGCCGG8GCCGGGGCGGGCGGGCGGGGGG=GGCGGGGCG=GGGGCG=GG=C1G
+@test_16S_rrsB_A523C-86/1
+TCCGGCCGGGAACTCAAAGGAGACTGCCAGTGATAAACTGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGACCAGGGCTACACACGTGCTACAATGGCGCATACAAAGAGAAGCGACCTCGCGAGAGCAAGCGGACCTC
++
+=CCGCGGGGGGGG8JJJJCJJJJJJGJGJJJJJJGJJJCGJJJJJJJGJJGJGGJJG8JG1GGGJCCGCGGGGGCGGGGGGCC(GGGGG=CG=CGCGGCGJ==CCGGGCCCGGGGGGGGGGGGG=CGGCCCCGCGGGGGCGGGGGCCC=G
+@test_16S_rrsB_A523C-84/1
+AGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTTAATACCTTTGCTCATTGACGTTACCCGCAGAAGAAGCACCGGCAAACTCCGTGCCAGCCGCCGCGGTAATACGGAGGG
++
+C==G=GGGGGGGGCJJGGJJJJJCJGC81GGGJJJJGJGJJGJJJGGJJJJJGJGJJJGJJGGGG=CG==GJG=8GGGGG=GGGGGGCCGGGG8CGG1GGCCCGGG=CCGG==CG(CGCCG8GGCGGGGCGGGCGGGGGGCGG=GGCGGG
+@test_16S_rrsB_A523C-82/1
+TCCAGGGGGCCGCCTTCGCCACCGGTATTCCTCCAGATCTCTACGCATTTCACCGCTACACCTGGAATTCTACCCCCCTCTACGAGACTCAAGCTTGCCAGTATCAGATGCAGTTCCCAGGTTGAGCCCGGGGATTTCACATCTGACTTA
++
+CCCCGGGGGGGGGGJJJGGGJJJJJJJJJGGJJJJJJGJCJJ=GGJJJGGJJGGCCJGGGGCGGGG(GG8GCGG1CCGGCGGGG1GCGGCG11CGGGGGGJCGGGCGGG=G=GCGCGGGGCG=GCGGGCGCGGGGGGCGGGCGGGC=CGC
+@test_16S_rrsB_A523C-80/1
+CCACCGGTATTCCTCCAGATCTCTACGCATTTCACCGCTACACCTGGAATTCTACCCCCCTCTACGAGACTCAAGCTTGCCAGTATCAGATGCAGTTCCCAGGTTGAGCCCGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCG
++
+CCC1GCC=GCGGGCJGJ=JJJJJJGJJJGJGJJJJGJJJJJGJGJJGGJJJJGJJGCJC=GG=GJGGJCGC=8GC==8GGJ8GCGGG==GGGGGCCCGCCJGCCGCG=GGCGCGGGGGCGGGGGCGCGCC8GGCGGGGC=GGCGGCG8CG
+@test_16S_rrsB_A523C-78/1
+CATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCGACTTAACGCGTTAGCTCCGGAAGCCACGCCTCAAGGGCACAACCTCCAAGTCGACATCG
++
+CCCGGGGGGGGGG=JGJJJJGGJJGJJGCGGGGJJJG1JGGJGCJJGGJJJCGCJGCG=G=JGCJGGGGGGCGGGJGGGGC=GC8GCGC=CGGCGGGGGGJGCC=G1CGG=GGGGGCGGGG==8GGGGGGGG=GC=GGGGG==GGGCGGC
+@test_16S_rrsB_A523C-76/1
+ATTCACCGTGGCATTCTGATCCACGATTACTAGCGATTCCGACTTCATGGAGTCGAGTTGCAGACTCCAATCCGGACTACGACGCACTTTATGAGGTCCGCTTGCTCTCGCGAGGTCGCTTCTCTTTGTATGCGCCATTGTAGCACGTGT
++
+CC11CGGGGGCGGJJJJJJJGJGGJJJGJJJGJJGJGJCGJGGGJJGGJJGGGJ(JJGGJGCGGJGGJGGGGGGCCGCGCGGG8CGGGGGGG=GCGG8CGJCCCCGCCGCGCGGGCCGG8GGGGG8CGGGCGGGCGCCCG=GCGGGGGGG
+@test_16S_rrsB_A523C-74/1
+AACAAAGGATAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACACGAGCTGACGACAGCCATGCAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTAAGGTT
++
+CCCCGGGGGGGGGJJGJ1GJJGJJJGJJJGCGJJGGJGGJJJGJCJJGCJJJJGGJGGGGGGGGGGJCGJGGGJCGGGGGGGCGGG8GGGCGCGGGGGGGJCGGGGGGGGGGG=CCCCC=GG=GGC1GGCGGGGCGGGGCGG=GCGGGGG
+@test_16S_rrsB_A523C-72/1
+CCCGGCCGGACCGCTGGCAACAAAGGATAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACACGAGCTGACGACAGCCATGCAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGATG
++
+1C=GCGGGGGGGGGJJJJJCJJ=JJJJ8JJGJCJJJJJJJJJJGJJGGJGGGJCCGJJJGGGGGCCGGJGGJCGGGJC=GCGGCCGGGGGGGGGGGGGGGCCC=GG=GGGGGCGGGC(GG=GCGG=GCGGGGGCGCGGCG=CGC1GCGGC
+@test_16S_rrsB_A523C-70/1
+ATGAGAATGTGCCTTCGGGAACCGTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGGTCCGGCCGGGAACTCAAAGGAGACTGCCAGT
++
+CCCGGGGCGGGGGJ=GGCJJJGGJJJCGGGCGJJJJCGGJGGJCGGG=GGJJJJJJGGGCGJJCGCGGJGCC1CCGCGCGGCGCCGCGGGGGCCGCGGGCJGG1GG=CGGC=CCGGCGCG=GGCGG=GCGCGGCGGGG=GGCGGGGGGGG
+@test_16S_rrsB_A523C-68/1
+TTGAGTTCCCGGCCGGACCGCTGGCAACAAAGGATAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACACGAGCTGACGACAGCCATGCAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCC
++
+CCCGGGGGGGGGGJJJJJJGJJGJJGJCCGGJJGJJJJGJJJJG8JJCGCGJJGJCJJGCJCJGJCGGGGGCGCGGCCGCGCCGGG1GGGGGGGGGGGCGJGCCG==GCCGCCGGGGGGCCC1G=GCGGGCGC1CGCC=GGGGGGGCCCG
+@test_16S_rrsB_A523C-66/1
+CACATGCAAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGG
++
+CCC=CGGCGGGGGGJJJCGJGJJJJG8GJGJJJGGJJJJJJJJJJJ1GJG=8==GJJGCGGGJJJJGGCGGCGCCCGGCCGG8GCGGGGGGGGCGGGC=GJGGGCCGGGGCGGGGGCGGCGGCGG=GGGCGGGC=GG=CGGGGGGGCGCC
+@test_16S_rrsB_A523C-64/1
+TAACTTTACTCCCTTCCTCCCCGCTGAAAGTACTTTACAACCCGAAGGCCTTCTTCATACACGCGGCATGGCTGCATCAGGCTTGCGCCCATTGTGCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGACCGTGTCTCAGTTCCAG
++
+CC81GGGG1CGGGGJJJJGJG=GGJCGGGCJGJJJJJC=JJCGGCGGJCJJJGGC8JJ=C=GCGGGGGGG=GGGC=CCGGGGCC=GGGCGGCCGCGGGGCJGCGG=GGCG1GGGGGGGGCGGGGG=GGGG=CGGC1C1GGGGGCCCGGCC
+@test_16S_rrsB_A523C-62/1
+TATTACCGCGGCGGCTGGCACGGAGTTAGCCGGTGCTTCTTCTGCGGGTAACGTCAATGAGCAAAGGTATTAACTTTACTCCCTTCCTCCCCGCTGAAAGTACTTTACAACCCGAAGGCCTTCTTCATACACGCGGCATGGCTGCATCAG
++
+CCC1GGGGGGGGGJCJJJJJJJJJGJJJGGJJJJJJJGJJ=JJJCGJJCJJJGJJJJJCJGCCG8JJGGGGJJJ1GGJ=CCGGGGGCGGG1GCGCCCGGCJCC1GGCCC=GGCGG1CCG=8GG8GC1CGGG=GGGCGGGGGG8GCGCGGC
+@test_16S_rrsB_A523C-60/1
+ACGCGTTAGCTCCGGAAGCCACGCCTCAAGGGCACAACCTCCAAGTCGACATCGTTTACGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCCCCACGCTTTCGCACCTGAGCGTCAGTCTTCGTCCAGGGGGCCGCCTTCGCCACC
++
+CCCGGGGGGGGGGJCGJGJJGJGJJGGJJJJJJGJJJJJ8GJCJCGGG=JGGGJGGGGGJCGJGGCJGCGCGGGGG1GGGGC1GGGGC1GCG=GGG==CG88C8CCGCGC8GCG1GG8GCGCGGGGGCGGCG=GGGGGCGGGGCGGGCGG
+@test_16S_rrsB_A523C-58/1
+TCTACGAGACTCAAGCTTGCCAGTATCAGATGCAGTTCCCAGGTTGAGCCCGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAACGCTTGCACCCTCCGTATTACCGCGGCGGCTGGCA
++
+CC11GCGGGGG8GJGJJGCJJJCJJGJJ=GGJJGGJGJCJJJGJGJGGJGJGGGGJGCGG=JCGGCGJJGCGGGGG=GCCGGGGGCGG=GGC(CCC8GGGJG8GGCGCCCGGCCGGGGG8GCGGGGGG=GGGGGCGG8CCGGGGCGGGG8
+@test_16S_rrsB_A523C-56/1
+CCGAAGGCCTTCTTCATACACGCGGCATGGCTGCATCAGGCTTGCGCCCATTGTGCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGACCGTGTCTCAGTTCCAGTGTGGCTGGTCATCCTCTCAGACCAGCTAGGGATCGTCGCC
++
+CCCGCGGGGCGGCJGCJJJJGJJJJCGGJJJJJGCJGCJGGGJCGGJJJ=CJGJJGCJGJGGG8JGCGGGGGGCGGCGCGCGGCGCGG=GCCGGGG=1GGJCCGG=G=GGCGGCGGGGG=GGGCCC1G1GCCCGGGG8GGCCCCGGGGGG
+@test_16S_rrsB_A523C-54/1
+GGGGTAGAATTCCAGGTGTAGCGGTGAAGTGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAAC
++
+CCCGGGGGGCGGGJJ=1JJGJGJJJJJJ(JCGCJGJCJJJJJJCGJCJGGJJJGGGG8GJJGCGJGGGGG1GGG=GJG=CGGGGCCGGGCGGG8GCGGGG=CGGCCGGGGCGGGGCGGG8GCG=CGCCGGGGCGGGGGCCCGGGCGGCGC
+@test_16S_rrsB_A523C-52/1
+TCCCCGGGCTCAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCA
++
+CC8GGGGGGGGGGGGJJCJJJJJGJJJCJGGJJJGJ=GGJG=GJJG1=GCGJGJJ=CJJGGGJJJGG=JJGGGGG=JCGGCCGCGGGG===GGGGGCCGGCC88GCCGGGCGCCGGGCGGGGGCCGGC1GGGGGGGGGCGG=CGGGCCGG
+@test_16S_rrsB_A523C-50/1
+TGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGGTCCGGCCGGGAACTCAAAGGAGACTGCCAGTGATAAACTGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGACCAGGGCTACACACGTGCT
++
+=CCGGGGGGCGGGJ=JJJCJGGJJJJGCGJGJJJJJCJGJJJJGCGJGGJG(JCJCG=GGGJJG8=1(JCGC1G=GGGGGG=GGGGC8CCCC8CGCGGGGJGGCGGGCGG=CCGGG=1GGGGC=CGGGGGGGGCGCCCGG=CCCCCCGGC
+@test_16S_rrsB_A523C-48/1
+TTGTTGCCAGCGGTCCGGCCGGGAACTCAAAGGAGACTGCCAGTGATAAACTGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGACCAGGGCTACACACGTGCTACAATGGCGCATACAAAGAGAAGCGACCTCGCGAGA
++
+CCCGGGGGG1GCGJJ(JJJGCJJGJJGJGJJJJJJCCGGJJGCJJ1GGJJCJJJJCJGCGJGGCJGGCGCCGGCCGG8GCGGG=C=CGCG1=CCGCCGGCC=8CCGCGC=C1C=GG=CGCGCCC=GGGCGGGCG8GGCGGGG8GCGCGGG
+@test_16S_rrsB_A523C-46/1
+TAACAGGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGCGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCTCTT
++
+CC=GGGGGGGGGGJGJJJJJJJJJJGJJ8CJJJJJJ=JJJJJCJGGGJJJCJGGJGGGCJJGGG=JJGGGJGJCG(CCGGGCCCG8GCGGCGCGGG=GGGCGG8GGG1GGCCG1CGGG=GG=GGCGCCGCGCGCGCGGGG==CGCCGGGG
+@test_16S_rrsB_A523C-44/1
+GGTTTAATTCGATGCAACGCGAAGAACCTTACCTGGTCTTGACATCCACGGAAGTTTTCAGAGATGAGAATGTGCCTTCGGGAACCGTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAA
++
+CCCGGGGGGGGGGJGJG=JJJJJJJGJJGJJJJJJGJ=JJJJJJGJJCGG=GCJGJJGGGG8GCJGC=GGCGCGCGGCCGCGGG=CGCCGCCCCGCGGGGJCGGGGGGGGGC8CGGGGGGGCGCCGGCGCGGCCGCCGGGGCGG=GGGGG
+@test_16S_rrsB_A523C-42/1
+CACCCCAGTCATGAATCACAAAGTGGTAAGCGCCCTCCCGAAGGTTAAGCTACCTACTTCTTTTGCAACCCACTCCCATGGTGTGACGGGCGGTGTGTACAAGGCCCGGGAACGTATTCACCGTGGCATTCTGATCCACGATTACTAGCG
++
+CCCGGGGGGGGGGGJJGJGJJ=JJGC1JJJJJJCJJJJJG8GGGCGJJJJGGCGGCCJGGGGCCGCGJG=CJCG=GJCGGGGGGGGCGCCGGGGGG(CGGJCGGCGC8GGGGGCGGGCGG=CGGGGGCG=GC=GG=CGGGG=CGCGCCCC
+@test_16S_rrsB_A523C-40/1
+TGCCATCGGATGTGCCCAGATGGGATTAGCTAGTAGGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATT
++
+CCCGGGGGGGGGGJJJJJJGJJGCJJJCJJGGJJGJGJJJJJJGJJJGJGJJJJGJJC=JJCG=JGJ8G=GGCGGGGGGGGGGCGGGGGGCGCGGGCGG=JGGCG1GGCGGGCGGGGGGGCCGCGGGGGCGCGGCGGGGGGGGGGC=GGC
+@test_16S_rrsB_A523C-38/1
+GGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCATGCCGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGC
++
+=C=GG(GGGGGGGJGJJJJJJJJCJJJGJJGGJJJJJJGJCJCGJGJ1GGG=JG=JJGGGJGJJ=CC8CGGJCGGJG=GCGGCGCCCG=GCCCGC(GGG=JCGC8GGGGCC=GGCGGG=GG8GC=GCGGCGCCGCGCGGG=GGCCGGGGC
+@test_16S_rrsB_A523C-36/1
+GCAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTA
++
+CCCGGGGGGG=GGJJJ=GCGGJ(JJGGJJGJGJJJJJCJGJJ8GJJJGJGJCGGJGJGJGGGJCJGJ=CJCGGGGGGGCGGGC=GG=GGCGGGCGGGGG1CC=CGCGCCGGGGGGGCCGGGGGGGGGGGCC=GGGGC=GCCG=GGGGCGG
+@test_16S_rrsB_A523C-34/1
+GTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCGACTTAACGCGTTAGCTCCGGAAGCCACGCC
++
+=CCGGGGGGGGGGJJJJJJ=JJGJGJGCGJCJJGJJ=JGGGJGJJJJJJGGG8=JJG=GGGJJGCCGGJGGGGGCGGGGG(G=GGG=1=CGGGCCCG8CGCGG1GGGGCGGGCGGG=GGGGCGGGGCGGG1GGGGCCGG=CG18GCG1GG
+@test_16S_rrsB_A523C-32/1
+AATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCGACTTAACGCGTTAGCTCCGGAAGCCACGCCTCAAGGGCACAACCTCCAAGTCGACATCGTTTACGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCCCCA
++
+CCCG1GGGGGG=GJCJJJGJJJGGGJJJJGJJJGJJGGJG(JJGJJJGJJGJGCJJGCGCG===GG=C1GGGCCJGJGCCGGGGGGGGGG1GG=GGCGGCCCCGGGCCCGCCGG=GG8GGGCGGGCG8CG=GCCCGGGCGGC8GCCGCCG
+@test_16S_rrsB_A523C-30/1
+CCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGT
++
+CCCGGGGGGGGGGJJGJJJG=GJGGGJJGCJ=JJJJJJJCGJJJJGGJJGGCGGGCGJGJCGJG=G=GGJCGGGGGCCCGGCGGGCG==GGCGCGGGGGGJC1GGGGCGGGGGGC1(GGGGCGGCGGGCGGGG=CGGGGGGGCGGGGGGG
+@test_16S_rrsB_A523C-28/1
+TGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAG
++
+CCCGGGCGGGGGGGJJGCJ=JJGJ=JJJGJGJGJCJ=JJGGJJJGJGGGJGGGJ1CGC=GJGGGJGGGGGJGGGGGGGCG==CG=CGGCGGGG=GGGGCGJGCGCGGCGGCG=GGGGGGGGGGGGGCGGGGGGGGCGGGGCG8GCCGGCG
+@test_16S_rrsB_A523C-26/1
+TGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGGTCCGGCCGGGAACTCAAAGGAGACTGCCAGTGATAAACTGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGACCAGGGCTA
++
+CCCGGGGGGGGGGJJJGJJJJJJJJJJGCJCJG=GGGJJJGGJGGCJGJJJJ=GJCJGGJCG=(GGGGGCJGCGJGGCGGGG8GGCGCC=GGCCGGGGGCJCCGGGCCGGGGGCGC=GGCGGGGGGGCGGCG=GGGGCGGGGGCCCCGGG
+@test_16S_rrsB_A523C-24/1
+ACATCGTTTACGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCCCCACGCTTTCGCACCTGAGCGTCAGTCTTCGTCCAGGGGGCCGCCTTCGCCACCGGTATTCCTCCAGATCTCTACGCATTTCACCGCTACACCTGGAATTCT
++
+CC1GCGGGGGGGGJJJJGGCJGJJJJJJGJJJJJJJJJJGCJJJGG1JGG8JJJGGG=JG1GGJCCCCJGCJ=CGGGGGGCG8GGGGCCG=G81GGGCCCJGGGGGGCGGGGGCGG=GCGGCG8CCG8CGC8GGCGGGGG8GGCGGGGCG
+@test_16S_rrsB_A523C-22/1
+GCATTTCACCGCTACACCTGGAATTCTACCCCCCTCTACGAGACTCAAGCTTGCCAGTATCAGATGCAGTTCCCAGGTTGAGCCCGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAAC
++
+CC1GGGGGGGGGGJJJJJGJGGGJJJJJGJJJJ1GG(JJJJGGJGJGJJJJGJJG=GJGJGGCGJGGGGCCCG=JC18GGCCCGGGG=G=CG=GGGC=8CC8=CGGCGGGGGGGCCCGGGGGGCCG8GGCCCGCGGGCCCCGGGCGGGG8
+@test_16S_rrsB_A523C-20/1
+AGCTTGAGTCTCGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGCGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCT
++
+CCCGGGGGGCGG8JGJGJJJGGGJJJJC=JJJJJGJGJJGJJGJJJJC1GG8=GCGGGGJJGGJJCGGGGGGGGGGCGJG=GCGGCGGGGGGGCGGGCCCJCGCGGCGGGGG==1GCGG=GGGGGGGGCGGGGGGG8GGGCGCCCGGCCG
+@test_16S_rrsB_A523C-18/1
+CCCAGATGGGATTAGCTAGTAGGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCA
++
+=C=GGGGGG=GGGJJJJCJJ=JGGJJJGJJJJGJJGGJJJJG8GCJJJJGCJCJGJJGJG=JGJG=G=CCCGGGCGCGGGCGGCGG=CGCGCGC8GGGGGJGGCCG=GCGGGCGGGGGCCGG8CCGCGGGG=GGCGCGGGGCGGCGGGGG
+@test_16S_rrsB_A523C-16/1
+CGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCTCAACCTGGGTACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGCGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAG
++
+CC1GCGGGGGGGC=GJJGJJJJJJJCGJJJGJJGCJJGGJJJJCJJJGGGGJGJCGJCJGGJGJGGGC=GG=G(1CCGGGGG1GCGCGC==CGGCGGGGCJGCG=GGGGGC(C=GCGGGGGCCGGGCGCC=G8GGGC=GCGGGCGGGCC1
+@test_16S_rrsB_A523C-14/1
+AGACTCCAATCCGGACTACGACGCACTTTATGAGGTCCGCTTGCTCTCGCGAGGTCGCTTCTCTTTGTATGCGCCATTGTAGCACGTGTCTAGCCCTGGTCGTAAGGGCCATGATGACTTGACGTCATCCCCACCTTCCTCCAGTTTATC
++
+CCCGGGGGGGGGG1GJJGJJJGGGJJGJJJJJJCJJJGJGGJJJGGJJCJJCGGGGJCGGGCGJGG8GJGGGJGCCGCCGCGGG=CGCG1GC=GCGGGGGJCGGGCGCGGGGGGCCGG=CGGGCGGCGCCGCGGG=G=CGCGG=GGGGGC
+@test_16S_rrsB_A523C-12/1
+AAGCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCGACTTAACGCGTTAGCTCCGGAAGCCACGCCTCAAGGGCACAACCTCCAAGTCGACATCGTTTACGGCGT
++
+CC1GGGGCGGGGGJJJ=JJJJJJJJ81JJJ8JJJGGGJ8JGJJGJGJJGJJGGGJJGCJGGJJGCJCGGGCGGGCC1CG=GCGGCCGGGCCGGGG8GC=CJ8=GGGGGGGG1GGGGCGGCGGG8CGGG8CCGGGCC=GGGGCGGGGCGG=
+@test_16S_rrsB_A523C-10/1
+AGTCTCGTAGAGTGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGCAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGT
++
+=CCGCGGGGCGG1JJJJJJJJJGJGJCCGJJGJJGGCGGJGJJGGGJJ=GJJCGGGGJGCGGGGGGGGG=GGJ1GGGGGGGCCGGCGCGGCCGGCGGCGGJG8(GGCCGCGG==GGGC=G=CGGGGGCC8GGGGG=CGGGC=GCGGCCGG
+@test_16S_rrsB_A523C-8/1
+GATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGC
++
+CCCGGGGCGGGGGJJG=GJJJJJJJ=JGJJGCJJJJ=JJJGCGJCJGJJJGJJGGGCJG8(JGGCGJJG8GGGGCCGCGGGGGGGGGGGCGGCGGGGGGCCCGGGCGGG(GCGG1GGGCGG1C1GCCG=GGGGGCGCGCGCCG1CGGGGG
+@test_16S_rrsB_A523C-6/1
+CACCTGGAATTCTACCCCCCTCTACGAGACTCAAGCTTGCCAGTATCAGATGCAGTTCCCAGGTTGAGCCCGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAACGCTTGCACCCTCCG
++
+CC8GGGGGGCGGGGJGJJJCJ=GJGJGJJJ=JJJJJJJJJJJGGGGGJJ8JCJGJGCG==CCJJGJJJGGGJGGJCGG(GGGGGCGGCGGGGGCGGGCGGJGCC=GCCGGGGGCGGGCGCCCGGCGCGGCCCGGGGGG=GGGG1GGCCGG
+@test_16S_rrsB_A523C-4/1
+CGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAACGCTTGCACCCTCCGTATTACCGCGGCGGCTGGCACGGAGTTAGCCGGTGCTTCTTCTGCGGGTAACGTCAATGAGCAAAGGTAT
++
+CCCCGGGGGCGGGJGGGGGGJGJGGJJJGJGJJGJCJJJ8=JGGJGC1CGGJJJJGGJGJGJJCGGGCGCGGGJGGGCJ8C8GCGJ1GGCGGGG=GCGGGCGGCGCGGGGGGGGGCG1GCGCGGGC=GCG(CCGGG=GC1GGGCCCG=C8
+@test_16S_rrsB_A523C-2/1
+ACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCG
++
+8CCGGGGGGGGGGJGJGJJGJGJJJGJJJGG1JJGGJCJJCCJJGJGJJJ=JCJGGGGGGJJGGGCJGJCG=GCCG=JCGGGGGGCJCGGGG1CCCGCGGC=CGG=GGGGGG=GGCGGGGGG(=GGGGGGG=8GGGCGCG=GGCGCGG8C
=====================================
tests/data/test_isolate_11_2.fq
=====================================
@@ -0,0 +1,200 @@
+@test_16S_rrsB_A523C-100/2
+TCAATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCGACTTAACGCGTTAGCTCCGGTAGCCACGCCTCAAGGGCACAACCTCCAAGTCGACATCGTTTACGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCCC
++
+CC1GGGGGGGGGGJGGJJGJJJJJJGC=GJGGJJJJJ8G8GJGJGCJGJGGJJCJJJJCJCCJGGGJG8GG=GG=GCGGGG8=CCGGGCGCCGGGGGGGGGCJ=JJJGCGGGGCCGGGCC1CCGGGCCGGCCGCC=CCCGGGGCGGGGGG
+@test_16S_rrsB_A523C-98/2
+CTGGCAACAGAGGATAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACACGAGCTGACGACAGCCATGCAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTA
++
+CCCGGGGGG(GGGGJJGGJGGJGJJCJCGJJJGGJ=JGGG=J8GJJGCGJGCCCGJCGGCJGCGG(GGCGGGCJCGCG(GGCGCG=GCGCGG=GGGGGCGGCJJJJJGGCCGGGGG==GGCCCGCGGGCCCGGCCGGGCGGGCCCGGGGC
+@test_16S_rrsB_A523C-96/2
+GTCGACATCGTTTACGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCCCCACGCTTTCGCACCTGAGCGTCAGTCTTCGTCCAGGGGGCCGCCTTCGCCACCGGTATTCCTCCAGATCTCTACGCATTTCACCGCTACACCTGGAA
++
+C=CGCGGGGGGGGGJJJJGJJJJ1GGJJJJJGGJJ8=JJJGJJJJJJJJCGCC=GGCCG=1=GGGGG(GGGGG8GGGGCGJ1CGCGGGGGCGGGCGG8=CGC=CCJJGGGC8GGG=CG(CGCG(G1C1GGGCGGCGGGGCGGGCGCGGCG
+@test_16S_rrsB_A523C-94/2
+CCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAACGCTTGCACCCTCCGTATTACCGCGGCGGCTGGCACGGAGTTAGCCGGTGCTTCTTCTGAGGGTAACGTCAATGAGCAAAGGTATTAACTTTACTCCCTTCCTCCCCGCTGA
++
+CCCGGGGGCGGGGC1GJGJJGJJJGGJJJJJJJJJJ=GJGGJGGGGJGGCGJJ=GJGGJGGGJCJGGJJCGGGGCGGCGGC8GG8=CG1G=GCG8GG1GGG81JCJJCGGGCG8GCG=CG1GCGCGGGGCGCCGGCGCGGCG=CGGG1=C
+@test_16S_rrsB_A523C-92/2
+TGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCCGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGT
++
+CC=G=GGGG1GGCJJ=GJJJJJJ==JC=GJJJGJ8JJJJJJJGJGJGGGCGCJ(CGGGGJGJC=GCCGCGG=GGGCGGGGGGGGGCGGGCGGCG=GGGCCGCCJCC=GGGG=GCGGG=GGGGCGCGGGGGGCCCCCGCGGCCGGG==G8C
+@test_16S_rrsB_A523C-90/2
+GCTCATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGTGCCAGCCGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCTCAAC
++
+CC=GCGCGGGGGGJJJJJ=GJJCJJJJJCJCCJJJJJ8GG1JCJJGJJGGGJC8JJCCGGCGGGGJCGGJCCC8GGGG1GGCCGCCGCG=CGGG=GGGGCG=C=J=JGGGGGGGGCGG=CGGGCG8=CGG8GCG1CG8GC=CCGCCC=GG
+@test_16S_rrsB_A523C-88/2
+CCTTACCTGGTCTTGACATCCACGGAAGTTTTCAGAGATGAGAATGTGCCTTCGGGAACCGTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGC
++
+C1=GGGGC1GGGGJJJJJJ1J=1G=GJJ1GJJ1(GJJGJGJJGGJJJJ8G8CG(GJGGJGJGCCJGJJJCCCCC8CCGGGCGGGGCGCGCG8CCCGGCGGC=8C1CJGGGGGCCGC8GCCGGCCGGCC=GCG=GGGGGGGCGGGGGCGGC
+@test_16S_rrsB_A523C-86/2
+GATCCAACCGCAGGTTCCCCTACGGTTACCTTGTTACGACTTCACCCCAGTCATGAATCACAAAGTGGTAAGCGCCCTCCCGAAGGTTAAGCTACCTACTTCTTTTGCAACCCACTCCCATGGTGTGACGGGCGGTGTGTACAAGGCCCG
++
+CCCGGGGGGCG=GCJJJJJCJGJG=JJCJJJ1JJC1JJJGGGCJGCJGCGJJJCGGGJGGJJG1GGGCGJG(JGJGC=GGC=GC=GCG=CGGCGGGG(GCGGJCJ=JGGGGG=GGGCGGGCG8GCCGGG1GGCGGGCCCGG=CGGGGCGG
+@test_16S_rrsB_A523C-84/2
+CTAATCCTGTTTGCTCCCCACGCTTTCGCACCTGAGCGTCAGTCTTCGTCCAGGGGGCCGCCTTCGCCACCGGTATTCCTCCACATCTCTACGCATTTCACCGCTACACCTGGAATTCTACCCCCCTCTACGAGACTCAAGCTTGCCAGT
++
+CCCCCGGGGGGGGGCCJJJJJJJJJGGJGJJGJJGGJ1JGGJJGJJGJGCJGJCJJJCJCGGGJJ=JJG1GCGGGCC8CGCGC(=GGG8GCG=GGGGGCGCCJJJCCGGGCCGGGCGGGCGGCGCGG=GCG=GGCGGGGCG1=CGGGGCG
+@test_16S_rrsB_A523C-82/2
+ACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTTAATACCTTTGCTCATTGACGTTACCCGCA
++
+CCCGGCGGCGGGGJGJJCGJJJJGJJJGJJCGGGGCGGJGJJGGGCGGJJGGJGGCGGGCGJGCCGGGJGCGGGCJGGCGGG8CCGGGCCCGCGCCCGGC=CJ=J1JCG8GGGGCGCGCGCC=GGGCGGGGGGGCCGGGCGCGCGCCCGC
+@test_16S_rrsB_A523C-80/2
+ACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTTAATACCTTTGCT
++
+CCCGGGGGG=GGGJJJ(JCJGJJJJJGJJJJ=JJJJJJ=JJG=JGGGGCJ1GGGJGJJCG=GJCJGGGGJGJCGG=GG=G=GCGG=CCCGCGCGGCGCGGGCCCJJJGGCGCC1GGGCCGCG8GCCC1GGCCC=GGGGC=GGGGGGCGGG
+@test_16S_rrsB_A523C-78/2
+CGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCCTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATA
++
+CCCGGGGGGGGGGJJGJGJJCCCJJ1JJGJGJJJGJJJJJJGGJGJCJGG8JJJCJJJJJJG8GJCGCC(CJCGGGGGCGC=1GGGCCGC(G(=1G(GCGGGC1CCJGCGGCGGGGGCGGG8GCCGGGCGGGGGCCGGGG=GGGCGGCGC
+@test_16S_rrsB_A523C-76/2
+ACCTTACCTGGTCTTGACATCCACGGAAGTTTTCAGAGATGAGAATGTGCCTTCGGGAACCGTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTG
++
+CC1GGGGGGGCCGGC1GGGG=JJ1JJ=CJGGCGJGJJJJJJGJJGJJG=GCGJ==JGCCJJGG=J=GCCGGGGJGGG=CG=GC=CGG=CCG8GG=GGGGCC1JCCCJGGG=G1GGGG8CGG(CGGG=GGC8GGCGGCCGCGCGGGGGGGG
+@test_16S_rrsB_A523C-74/2
+GAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTC
++
+CCCGGCGGGCGGGJJGGJGJJJJJJJJJJJCJJJGJJGGJCJJJJJJGGGGCGGGGJGGJGJ(GGGCG=GGJCCGGCCGGCG=G=GGG==GGGC(GGGGCCCJCJCJGGCCG=C8CGCGGCCC=GGCCGCGGGGGG=GCCCGCGGCGGC=
+@test_16S_rrsB_A523C-72/2
+GAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGGTACCCTGGTAGTCCACGCCGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGCCTGGGGAGTACG
++
+CCC1GGGGGGGGGCCJGJGGGJJJJJJ(JJGJJGJGJJJ=GCCGCC8(GGJ=GJGJJJ8JGG=GCJGJGGCGGCGGGGGGG(8GGGGCGCCCG1GCGGGGGCCJC=CGCC(==G=CGGG=GGGGGCGGGGGGGGGCGCGCCGGGC8GCCG
+@test_16S_rrsB_A523C-70/2
+ATGGTGTGACGGGCGGTGTGTACAAGGCCCGGGAACGTATTCACCGTGGCATTCTGATCCACGATTACTAGCGATTCCGACTTCATGGAGTCGAGTTGCAGACTCCAATCCGGACTACGACGCACTTTATGAGGTCCGCTTGCTCTCGCG
++
+CCCGGGGGGGGCGJJJGJJJ=JGJGGJGJJJJCGGCJJJJ8J8JGJGJJJCJGJJGJJCGGCCGGJCC=CJCJ8GG=CGGGGGCGGGGGCCGCGCG=CGGC=CCJJJGC=GGCGGGG=GCGGCCG1CGGGGGCGGCGGGGGGGCGGG=GG
+@test_16S_rrsB_A523C-68/2
+ACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGCCTGGGGAGAACGGCCGCAAG
++
+CCCGGGCCGGGGGJCJGJJJCJJGJJJJJGJJ1JJJGJCG=JJJGJJJJGJJ(=GGGGJ=JGJGG8=CGGCGGCJGGGGGJGGGCG=GGGGGCGCG=CGGGCJCJJJCCCGG=CGCGGGGG81GCGGCGGGCCCG1GG1GCGCCGCGGGC
+@test_16S_rrsB_A523C-66/2
+TCCTCCCCGCTGAAAGTACTTTACAACCCGAAGGCCTTCTTCATACACGCGGCATGGCTGCATCAGGCTTGCGCCCATTGTGCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGACCGTGTCTCAGTTCCAGTGTGGCTGGTCATC
++
+CCCCGGGCGGGGGCGJGCGCJCJJJJGCJJGJJGJJJGGJGJGGJGJJGJJCGGGJ=CC=GCCCGGJC(CGCGGGGGCGGGGCCG=G==GGGCGGG1GGGG=JCJJJGCGGCG1CG=GGCGCGCCGCCGCGGCCG1GGCGGGGGGCCGG=
+@test_16S_rrsB_A523C-64/2
+CGGTAACGGGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCT
++
+C=CGGGG(GGGGCJJJCJGGJGGJJGJJCGJJCJJJJGJ1JJJJGCGJJJGGJGCGGCG=CJ=G=JJJGGCJJGGCGGG=G=CC=CGGGC8GCCG(CGG8GC=CCJCC=CCGCGC8GGGG1GCGCGGGC(CG=GCGGGGGCG8CGC=GGC
+@test_16S_rrsB_A523C-62/2
+TGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTAGGTGGGGTAACGGCTCACCTAGGCGACGATCCC
++
+CCCGCCGGGCGGGGGJJGJJGGJCJJJJGJ8JJGGJJJJGGJCCGGJGCGG=CJCJGGGGGG=GJGCCGCGCGCCG=C=1GG8CGGGG=CCGGGGCGG=CGGJJCJJCGGGCCGCCGGGGGGGGC1GGGGGGCGG8C188GCGCGCCGGC
+@test_16S_rrsB_A523C-60/2
+TTTGCTCATTGACGTTACCCGCAGAAGAAGCACCGGCTAACTCCGTGCCAGCCGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCTC
++
+CCCGGGGGGCGGCJGJGJJJJJJJJGJJJCC=GGJGJJJGGJJJCGJCGGJGCGJJGCGCGJJGGJCGCGGCGGGGGGG=CGGGCGCCGCC==C8CGGGG1CCJJJ1CGC=GGGG1G1GGGC=GCCCGGGG=GCGGG(GCCGGCCGGGC=
+@test_16S_rrsB_A523C-58/2
+CTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACTATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAGAAG
++
+CCCGCG8GGGGG=GJ8CGJCGCJJGGJGJG8JJJJJJJCJJGJG8JGJJJJGCJGCJGGC=JJJG=JJCJGGGCGC=CCCGC=G8CGGCCGGG(GG=GGGCGCCJC1CCGCGG=GGGCCCGGCGGGCGG=GGGGGGGGGGGGGGGGGCCG
+@test_16S_rrsB_A523C-56/2
+AGATTGAACGCTGGCGGCAGGCCTAACACATGCCAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGATTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATAC
++
+CCCGG=CGCGCGGJJJJJJCGGGJJGGJJJJJG(GJ=JJGGCGCJJGGCC=CCGJCGCCGCCGGJGGGJGGGJGGGCGCGGGC1CGG(GGG(C=CCCGGC8CJJJ1JGGG1GCGGGCGGGCG=CCCC8GCGGGCGGCCGG8GCGGCG=G=
+@test_16S_rrsB_A523C-54/2
+GACGACAGCCATGCAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTC
++
+1C1GGGGGGGG=GGGJJJJJCJJJJJJJGGJJJGJGJJJJJGJGGJJGJGCGJGCGGGGCGG(JGGGGCGCGCGJGCGG=GCGGGGCGGGGCGGGGCGGCGCCJJ=JCCCGGCCCCGGG8GCGGGGGGGGGGG=GCCCCGC=GGCGCGGG
+@test_16S_rrsB_A523C-52/2
+AACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCAGTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCGACTTAACGCGTTAGCTC
++
+CCCGGGGGGGGGGJGJJGJGJJGJJJGGJCJJGJGJJGGCJJGJGJGCJGGGGGGCGGGJCJJJGGGCCGCCGJCGGCJCG8GGG=GCCGCCC(CGGGCGGGCJJCJCGCCGCG=GCCCGGGGGGGCCGCC1CCCC=GGCGGCGGGCGCC
+@test_16S_rrsB_A523C-50/2
+AGTCATGAATCACAAAGTGGTAAGCGCCCTCCCGAAGGTTAAGCTACCTACTTCTTTTGCAACCCACTCCCATGGTGTGACGGGCGGTGTGTACAAGGCCCGGGAACGTATTCACCGTGGCATTCTGATACACGATTACTAGCGATTCCG
++
+CCCGGGGCGGGG1JGC1GJJCGJJGGGJJJJJGJCJ1JGJCGC8CJCJ8=JCJG8JGGJ=GGJJGCGGC8GCJGGGG=G8(CCGCGGGGC=CGGGCCGGGGCJCJJCCCCGGGCCGGGGCGGGCCCGGG1GGCCC8GCCG(GCGGGGG==
+@test_16S_rrsB_A523C-48/2
+GGTTCCCCTACGGTTACCTTGTTACGACTTCACCCCAGTCATGAATCACAAAGTGGTAAGCGCCCTCCCGAAGGTTAAGCTACCTACTTCTTTTGCAACCCACTCCCATGGTGTGACGGGCGGTGTGTACAAGGCCCGGGAACGTATTCA
++
+CCCGGGGGGGGGGJJGJCJJCCCGJJJJGJJ1G=JJGJ1JJGJJJGGGJGJGGGCJGJJJGGGGGGGGCGCGGGCGCGGCG=GCGCGG1GCCCGGGCGCCG1CJJJJG8GG=GGGGCCCGCGC=CGCGCC=CGG=GCCGGC=CCGGGGGC
+@test_16S_rrsB_A523C-46/2
+TATTAACTTTACTCCCTTCCTCCCCGCTGAAAGTACTTTACAACCCGAAGGCCTTCTTCATACACGCGGCATGGCTGCATCAGGCTTGCGCCCATTGTGCAATATTCCCCACTGCTGCCTCCCGTAGGAGTCTGGACCGTGTCTCAGTTC
++
+CCCCGGGGGGGGGJJJ1GGJJGJGGGJJJGJJJJJJJJ=JJ=JGJJGJJGJCGJCCJGGCJGGGGGGJJG8CCGG=GGG8GG1GGGGGGGG=GGCGGGGGGGCJCJCCGGCGGCGCCCGGGG=GGGGGGGGCGGGGCGG=GGCCCCGGGG
+@test_16S_rrsB_A523C-44/2
+ATTACTAGCGATTCCGACTTCATGGAGTCGAGTTGCAGACTCCAATCCGGACTACGACGCACTTTATGAGGTCCGCTTGCTCTCGCGAGGTCGCTTCTCTTTGTATGCGCCATTGTAGCACGTGTGTAGCCCTGGTCGTAAGGGCCATGA
++
+C=CGCG=GGGGGGJJJJJJGGJJJGGJCJGJGJGJGJGJJGGJGCCJJJJJJJCGJGGCGGJGCJJCCCJJCGCGGGGGGCGGGCGCGCCGC=GCGCGGGGCCJCJJGCGC8CCGCCCGGGGGG1G=GG=CGGGGGC=CCCC=GGGCGC=
+@test_16S_rrsB_A523C-42/2
+TAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGGTCCGGCCGGGAACTCAAAGGAGACTGCCAGTGATAAACTGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGACCAGGGCTACACACGTGCTACAAT
++
+CCCGGGGGGG=GGCGJGGJGJJJ1GJCGGGGGJGJJJJCCGCC8CGJCJCJJGGCGJ=GCGJCGCGGGGJ=GGCGCGGGG11G=GGGCCGGCGCGGGGGCGCCCJCCCGGGCG=GG=GGCCGCGGCGGCGG=GCGCGGGGGGCGG=G=GG
+@test_16S_rrsB_A523C-40/2
+CCCGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAACGCTTGCACCCTCCGTATTACCGCGGCGGCTGGCACGGAGTTAGCCGGTGCTTCTTCTGCGGGTAACGTCAATGAGCAAAGGT
++
+CCCGGGG=GGGGGJJJJCJJGGJGJCJ(CJGJJJJJJJGJGJCJJJCGGJCCJ1JGGCJJGGCCGGG1GGC=GGGJC1CC8G1GGGGGC=GGG1CGGGGGGGJCJ8JCGGGCGGCGGGCGGC=11GGCGC8CC=GGCGGC=C1CCCGGGC
+@test_16S_rrsB_A523C-38/2
+TTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACACGAGCTGACGACAGCCATGCAGCACCTGTCTCACGGTTCCCGAAAGCACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCG
++
+C1=GGGGGG1GGCJCJJJJJGJJJJGJJJJJGJJGJJJGGJGJCJJGJGC1GGGJJ=JGCGCGGGGGGJGGCGGC=GGCGGGG(GGGGGGGGG8GGGGGCCGJJCJJCCGGGGCCGG1CGCGGGGGGGCC1GG8G(GGGGGGGCGGGGCC
+@test_16S_rrsB_A523C-36/2
+GTCTCGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCTGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTC
++
+CCCGGGGGCGGGGG(JGJGJJGJJG=CJGJG1JGJJJJJJJGJGCJGGJCJJ=JJGJJ(1G8GJG(=JCGJGJCGCG=GG=GCGGGGGGGGCGGCCGCC=GCJJJCJGCCGGGCGCG=8GGGGCGGGGGGGCCGGGCGGGGG=CCCGCCC
+@test_16S_rrsB_A523C-34/2
+CAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAG
++
+CCCGGGG=GGCGGJJJJ=CGGJJGJJJG(GJJCGGJG8GJJ8JJGJJGCCGGCGJ=GGGGG8C8=GGGG=JJGCGGGGGG8GCGCGGGGGCCGGGG=GGGGCCJJJJC=G8GGGC=GGGCGGCGGCGGGGGGGGG=GGCGGCCCCCGCG8
+@test_16S_rrsB_A523C-32/2
+CCGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGG
++
+C1CGGCGGGGGGGGJJ=GJJCJ8GGJGGG8JJJ=GGGGJJJGGJJJJGJJGJJGJJCCJGJ=G=GGCGG=CGGGC=G1GGGCGGGCGCG8G8CCGCGGG81CCJJCJCCGGCGGGGGCGGGGGCCCCG(GCGCCGG=CGGCGGCCCGGCC
+@test_16S_rrsB_A523C-30/2
+TCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCGACTT
++
+CCCGGGGGGGGGGC=JJJ=GJJGJGJGGJCJJJJJJCGJJGJJ=GJJJJGGGGC=JJJGGCJJGGGGCG=GJGGCJGGGGC1GCGCGGGGCCCCGCGGG8GG=8JCJGGGGCCGGCCCGCCGGCGCCCCG=1GGG8GCGG=C1CGG=CGG
+@test_16S_rrsB_A523C-28/2
+ACTTAACCCAACATTTCACAACACGAGCTGACGACAGCCATGCAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCT
++
+CCCGG1GGGCGGGJJJJJJJGJJ=JJJGJCJJJJJJJ8=JGGJJGJ1GGJCJC=JG=C=JGJCCJGGJCJGGCCCGGCCGGCGGGCCCGCGGGCGCGGCGGG=CJJJGGCGGGGGCCGGCGCGGCGCGCCG8GCGCGGG==G=GGCGGGG
+@test_16S_rrsB_A523C-26/2
+CACAAAGTGGTAAGCGCCCTCCCGAAGGTTAAGCTACCTACTTCTTTTGCAACCCACTCCCATGGTGTGACGGGCGTTGTGTACAAGGCCCGGGAACGTATTCACCGTGGCATTCTGATCCACGATTACTAGCGATTCCGACTTCATGGA
++
+CCCGGGGGGGGG=JJJJ=CJJCCJJJJGJJGJGCJJJJJJJJGGGGGGCGJGCGGGJJGGJJGCCJ==GJ1GGGGGCGCGCGGGGGGGCCGG=CCGGGCGGGJJCJ8GGG8GCGCGGGGGGGCGGCGGCCGGGGGG=GGG8GG=CGGGGC
+@test_16S_rrsB_A523C-24/2
+CGGGTTGTAAAGTACTTTCAGCGGGGAGGAAGGGAGTAAAGTTAATACCTTTGCTCATTGACGTGACCCGCAGAAGAAGCACCGGCTAACTCCGTGCCAGCCGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGT
++
+CCC1GGCC1GGGGJGJCJJJCJGJJJGJJJJJJCJGGJGCGGJGJCJGCJGJGGG8CGJGJJGG(GGGGG=GCGGCGCCGGGGGGCGCG=G1CGC(GG1CGG=JJCCGGGCCGGCCGC=GGGCG(GG8CCGGGG1GC8C=GCGGG=G8GC
+@test_16S_rrsB_A523C-22/2
+GGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGG
++
+CCCGGGGGGCCGGJJJJJJJJGJJJJGGJJCJJJJJCGGCGCJCJCJJJGGJJCG1JGGGJGCG8JGJG88JGJ8CCGCGCCCGGGC1GGCGGCGGGGGGGCJJJJJ=GGGCGG8CGGGGCGGCCCG8CGGGC=GGGCGGGCGGG1GCGC
+@test_16S_rrsB_A523C-20/2
+ACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGCTGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTTAATTCATTTGAGTTTTAACCTT
++
+CCCGGGGGGGGG=JGJ==JGGJGGGJJG=JJJJJJGGJCJJCCJGGGJJJGG((JGGG8GGJGCGGGGGCGCGCG8GGG=G=CCGGGCGGGGCCCGG=GGGCCC8CJCG1=GGGG=1GG=GGG=GGG11CCGGGGGC=CGG=GGCGGCG1
+@test_16S_rrsB_A523C-18/2
+CAGTTCCCAGGTTGAGCCCGGGGATTTCACATCTGACTTAACAAACCGCCTGCGTGCGCTTTACGCCCAGTAATTCCGATTAACGCTTGCACCCTCCGTATTACCGCGGCGGCTGGCACGGAGTTAGCCGGTGCTTCTTCTGCGGGTAAC
++
+=CCGGGGGGGGGGC=JJJJJJGGJGJJJCJJGJGJJGGGJ=GJG=GJCGJJGJJ=GCCGCCGGC1CGGCCJGCGGCGGGGGG=CGCCGG1GC=GGGGCCGGGCJCJJCG8=CCC1(=GG8GGGGGGCGGC8GGGGG=GG=CGGGGG(=GG
+@test_16S_rrsB_A523C-16/2
+CCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTAACCTTGCGGCCGTACTCCCCAGGCGGTCGACTTAACGCGTTAGCTCCGGAAGCCACGCCTCAAGGGCACAACCTCCAAGTCGACATCGTTTACGGCGTGG
++
+CCCGGGGGGCGGCJCJGJJJGJJGJJGJJJJCG(JJGGJGJGJ(JGGGGJJJJJGCCGGGGJGGGCGCGGGJCCGGGCCGGCGGCG8CCCCCGGGGGGCGGGJ8JJCGCG88GGGGGCGGGCGGGGGGCGGGGGGC1G(GGGG=CCGGCC
+@test_16S_rrsB_A523C-14/2
+GAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCTGGTCTTGACATCCACGGAAGTTTTCAGAGATGAGAATGTGCCTTCGGGAACCGTGACACAGGTGCTGCATGGCTGTCGTC
++
+CCCGGGGCGG=GCJGJJGGJGJJJJJJGJJG==GGJCGGJGJGJGJJGJJJ(CJGJJGG(GCJJGGJGGGCJGJCJGGCGGGGCGGCGGGCGG=GGGGGCGCCCCJJC=GCGGGG8=CGGGGCGCG(CCCGGGC=GGCGCGCGCCGGGGG
+@test_16S_rrsB_A523C-12/2
+TCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTTGTTAAGTCAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGA
++
+CCCGGGGGGGGGGJJGJJGJCJJJ=JGGJJCJJJJGJJJJ8CJCJGGGGGGJGJGJ8JCCJ=GJGGCCG=JGGGCCG8GGCGCCGCGGGGGCGGGCGGCCCCCJJJCGGGGGGGG1GC(GCC=GCGGG=CGG1GGGGGGGC8GCGGGCCC
+@test_16S_rrsB_A523C-10/2
+CAGCACCTGTCTCACGGTTCCCGAAGGCACATTCTCATCTCTGAAAACTTCCGTGGATGTCAAGACCAGGTAAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGGCCCCCGTCAATTCATTTGAGTTTTAA
++
+=1CGGGGG=CGGGJCJCJ1JGJJJ1JGGJGCCJCJ=GGJJJCGGGGJ=JJGCGCGCGGJJJJCJJGJG==GCGGC=CCC8=GGCGCCGGGCCCCGGGCGGGCJCCJJGGGGGGG8=GG=CG=GCCGGGG1GGCCCGCG8CGC=GCGC=CG
+@test_16S_rrsB_A523C-8/2
+GGTCGTAAGGGCCATGATGACTTGACGTCATCCCCACCTTCCTCCAGTTTATCACTGGCAGTCTCCTGTGAGTTCCCGGCCGGACCGCTGGCAACAAAGGATAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACACGAG
++
+CCCGGGGGGGGGGJJ=GJJJJJJJJJJCJJJJGJJGJJJ(JGJGJGGGGGJJJGGGGGGJJ=CGJGG1GC=GGCGGGC88GCG===CGGGGGC=GGCG8GCC88CCJC8C=GGCCGCGGGGG=C8CGGCGG8GCCGGGGCGGCGGCCGGG
+@test_16S_rrsB_A523C-6/2
+CCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAAGAAGGCCTTCGGGTCGTAAAGT
++
+CCCGGG1GGGGGG1JJJJJ11GJGJCJGJ1GJJJGJG1GJJ=GGGJCJGGGJJCGGCCGGJG=CGGJGGGGGGGGCGGGGGCGCCCG=CGCCG=GGGGGCG=JCJCCGGGGGGG=GCC=GCC=CCGC=GGGG=GGCCGCGGC1GGGCCGC
+@test_16S_rrsB_A523C-4/2
+CTTGCCATCGGATGTGCCCAGATGGGATTAGATAGTAGGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGGCCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATA
++
+CC1GGGGGG=GGGG=8GJCJJJJJGJGJJJJ1GJGGCJGGCJJJGJG8CJCJGGJC=JGGCG8JCGGGGGGJGGGG=GGCCGGGG=CCG(GGGCGGGGGCG=JJJCC8GGGCGCCCCCGGGGC=8GGGGGGGCCC=CGGGG=CGGG=GGC
+@test_16S_rrsB_A523C-2/2
+GGAACTGCATCTGATACTGGCAAGCTTGAGTCTCGTAGAGGGGGGTAGAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACGAAGACTGACGCTCAGGTGCGAAAGCGTGGGG
++
+CCCGGGGGGGGGGJ1JGJJJGJCJJJJJJJJJGGG==JJGJGJCCGCGJGCGJCJC8CCJ8CGJCGGGJJCJ=GG1GGCGG8G1G8CGGGGG=GGGGCG=GGJJJ1CCCGGGGCGGGGC=CGG=GGGGGCGCGG=GGGCGGG1GCGGCGG
=====================================
tests/resfinder/cge/output/test_phenotype_result.md
=====================================
@@ -11,10 +11,11 @@
>>> rg = ResGene(unique_id="blaOXA-384_KF986263", start=1, end=90,
... isolate="isolateA", ab_class=["beta-lactam"],
... ref_db="ResFinder")
->>> rm = ResMutation(unique_id="gyrA_81_d", seq_region="gyrA", pos=81,
+>>> rm = ResMutation(unique_id="gyrA;;1;;CP073768.1_81_d_AA", seq_region="gyrA", pos=81,
... ref_codon="ggt", mut_codon="gat", ref_aa="g",
... mut_aa="d", isolate="isolateB", nuc=False,
-... ab_class=["fluoroquinolone"], ref_db="PointFinder")
+... ab_class=["fluoroquinolone"], ref_db="PointFinder",
+... nuc_format="", aa_format="gyrA;;1;;CP073768.1_81_d")
>>> from src.resfinder.cge.phenotype2genotype.res_profile import Antibiotics
>>> ab_m1 = Antibiotics(name="ciprofloxacin", classes=["fluoroquinolone"],
@@ -106,7 +107,7 @@
... 'substitution': True,
... 'deletion': False,
... 'insertion': False,
-... 'ref_id': 'gyrA_81_d',
+... 'ref_id': 'gyrA;;1;;CP073768.1_81_d',
... 'key': 'gyrA;;81;;d',
... 'ref_database': ['PointFinder-a2b2ce4'],
... 'seq_regions': ['gyrA;;1;;CP073768.1']
@@ -180,7 +181,7 @@ True
>>> PhenotypeResult.get_ref_id_and_type(rg, isolate)
('blaOXA-384_1_KF986263', 'seq_regions')
>>> PhenotypeResult.get_ref_id_and_type(rm, isolate)
-('gyrA_81_d', 'seq_variations')
+('gyrA;;1;;CP073768.1_81_d', 'seq_variations')
```
=====================================
tests/resfinder/cge/output/test_seq_variation_result.md
=====================================
@@ -62,6 +62,7 @@ hit.
'ref_codon': 'ggt',
'var_codon': 'gat',
'codon_change': 'ggt>gat',
+ 'nuc_change': 'd',
'ref_aa': 'g',
'var_aa': 'd',
'ref_start_pos': 81,
@@ -71,7 +72,7 @@ hit.
'insertion': False,
'ref_id': 'gyrA;;1;;CP073768.1_81_d',
'key': 'gyrA;;1;;CP073768.1;;81;;d',
- 'ref_database': 'PointFinder-...',...
+ 'ref_database': 'PointFinder-...',...,
'seq_regions': ['gyrA;;1;;CP073768.1']...}
```
@@ -97,13 +98,14 @@ hit.
'ref_codon': 'ggt',
'var_codon': 'gat',
'codon_change': 'ggt>gat',
+ 'nuc_change': 'd',
'ref_start_pos': 81,
'ref_end_pos': 81,
'substitution': True,
'deletion': False,
'insertion': False,
- 'ref_id': 'gyrA;;1;;CP073768.1_81_gat',
- 'key': 'gyrA;;1;;CP073768.1;;81;;gat',
+ 'ref_id': 'gyrA;;1;;CP073768.1_81_d',
+ 'key': 'gyrA;;1;;CP073768.1;;81;;d',
'ref_database': 'PointFinder-...',...
'seq_regions': ['gyrA;;1;;CP073768.1']...}
=====================================
tests/resfinder/cge/output/test_std_results.md
=====================================
@@ -1,5 +1,50 @@
# std_results tests
+## setup
+
+```python
+
+>>> from src.resfinder.cge.config import Config
+
+>>> class DummyArgs():
+...     def __init__(self):
+...         self.inputfasta = None
+...         self.inputfastq = None
+...         self.outputPath = "./tests/tmp_out/"
+...         self.blastPath = None
+...         self.kmaPath = None
+...         self.species = None
+...         self.ignore_missing_species = None
+...         self.db_path_res = None
+...         self.db_path_res_kma = None
+...         self.databases = None
+...         self.acquired = True
+...         self.acq_overlap = None
+...         self.min_cov = None
+...         self.threshold = None
+...         self.point = True
+...         self.db_path_point = None
+...         self.db_path_point_kma = None
+...         self.specific_gene = None
+...         self.unknown_mut = None
+...         self.min_cov_point = None
+...         self.threshold_point = None
+...         self.ignore_indels = None
+...         self.ignore_stop_codons = None
+...         self.pickle = False
+...         self.nanopore = False
+...         self.out_json = None
+...         self.disinfectant = False
+...         self.db_path_disinf = None
+...         self.db_path_disinf_kma = None
+...         self.output_aln = False
+...         self.species = "ecoli"
+
+>>> args = DummyArgs()
+>>> conf = Config(args)
+
+```
+
## initialize
First part just creates some dummy objects needed for testing the class. A
@@ -151,7 +196,8 @@ Create the phenoDB object.
>>> from src.resfinder.cge.output.std_results import ResFinderResultHandler
>>> ResFinderResultHandler.standardize_results(res,
... rf_custom_kma,
-... "ResFinder")
+... "ResFinder",
+... conf)
>>> for k in res["databases"]:
... print(k)
=====================================
tests/resfinder/cge/phenotype2genotype/test_isolate.md
=====================================
@@ -24,6 +24,45 @@
... acquired_file=acquired_file,
... point_file=point_file)
+>>> from src.resfinder.cge.config import Config
+
+>>> class DummyArgs():
+...     def __init__(self):
+...         self.inputfasta = None
+...         self.inputfastq = None
+...         self.outputPath = "./tests/tmp_out/"
+...         self.blastPath = None
+...         self.kmaPath = None
+...         self.species = None
+...         self.ignore_missing_species = None
+...         self.db_path_res = None
+...         self.db_path_res_kma = None
+...         self.databases = None
+...         self.acquired = True
+...         self.acq_overlap = None
+...         self.min_cov = None
+...         self.threshold = None
+...         self.point = True
+...         self.db_path_point = None
+...         self.db_path_point_kma = None
+...         self.specific_gene = None
+...         self.unknown_mut = None
+...         self.min_cov_point = None
+...         self.threshold_point = None
+...         self.ignore_indels = None
+...         self.ignore_stop_codons = None
+...         self.pickle = False
+...         self.nanopore = False
+...         self.out_json = None
+...         self.disinfectant = False
+...         self.db_path_disinf = None
+...         self.db_path_disinf_kma = None
+...         self.output_aln = False
+...         self.species = "ecoli"
+
+>>> args = DummyArgs()
+>>> conf = Config(args)
+
```
### Result object
@@ -82,7 +121,8 @@ std_results test documentation.
>>> from src.resfinder.cge.output.std_results import ResFinderResultHandler
>>> ResFinderResultHandler.standardize_results(res,
... rf_custom_kma,
-... "ResFinder")
+... "ResFinder",
+... conf)
>>> from src.resfinder.cge.output.std_results import PointFinderResultHandler
>>> PointFinderResultHandler.standardize_results(res,
@@ -173,11 +213,11 @@ False
>>> feat_res_dict = res["seq_regions"]["blaOXA-384;;1;;KF986263"]
>>> Isolate.get_phenodb_id(feat_res_dict, "seq_regions")
-'blaOXA-384_KF986263'
+('blaOXA-384_KF986263', '')
>>> feat_res_dict = res["seq_variations"]["gyrA;;1;;CP073768.1;;81;;d"]
>>> Isolate.get_phenodb_id(feat_res_dict, "seq_variations")
-'gyrA;;1;;CP073768.1_81_d'
+('', 'gyrA;;1;;CP073768.1_81_d')
```
=====================================
tests/resfinder/cge/test_config.md
=====================================
@@ -22,7 +22,7 @@
... self.kmaPath = None
... self.species = None
... self.ignore_missing_species = None
-... self.db_path_res = "resfinder_db"
+... self.db_path_res = None
... self.db_path_res_kma = None
... self.databases = None
... self.acquired = True
@@ -30,7 +30,7 @@
... self.min_cov = None
... self.threshold = None
... self.point = True
-... self.db_path_point = "pointfinder_db"
+... self.db_path_point = None
... self.db_path_point_kma = None
... self.specific_gene = None
... self.unknown_mut = None
@@ -44,6 +44,7 @@
... self.disinfectant = False # Tested?
... self.db_path_disinf = None # Tested?
... self.db_path_disinf_kma = None # Tested?
+... self.output_aln = False
>>> args = DummyArgs()
>>> args1 = DummyArgs()
View it on GitLab: https://salsa.debian.org/med-team/resfinder/-/compare/14dc0adf109f7a5fcbc9ee5152fb07c72f2d0952...1eb251926030c03dde9ebc62074516faa25ba36c