[med-svn] [Git][med-team/kaptive][upstream] New upstream version 0.7.3

Andreas Tille gitlab at salsa.debian.org
Sun Sep 13 07:32:07 BST 2020



Andreas Tille pushed to branch upstream at Debian Med / kaptive


Commits:
356ee2d6 by Andreas Tille at 2020-09-13T07:09:13+02:00
New upstream version 0.7.3
- - - - -


4 changed files:

- README.md
- kaptive.py
- reference_database/Acinetobacter_baumannii_OC_locus_primary_reference.gbk
- reference_database/Klebsiella_k_locus_primary_reference.gbk


Changes:

=====================================
README.md
=====================================
@@ -12,7 +12,7 @@ Given a novel genome and a database of known loci (K, O or OC), Kaptive will hel
 In cases where your input assembly closely matches a known locus, Kaptive should make that obvious. When your assembly has a novel type, that too should be clear. However, Kaptive cannot reliably extract or annotate locus sequences for totally novel types – if it indicates a novel locus is present then extracting and annotating the sequence is up to you! Very poor assemblies can confound the results, so be sure to closely examine any case where the locus sequence in your assembly is broken into multiple pieces.
 If you think you have found a novel locus that should be added to one of the databases distributed with Kaptive please [contact us](mailto:kaptive.typing@gmail.com).
 
-Read more about Kaptive, Kaptive Web and the locus databases in [our papers](#citation).
+For citation info and details about Kaptive, Kaptive Web and the locus databases, see [our papers](#citation) below.
 
 
 ## Table of Contents
@@ -91,7 +91,7 @@ kaptive.py -h
 
 #### Other dependencies
 
-Regardless of how you download/install Kaptive, it requires that [BLAST+](http://www.ncbi.nlm.nih.gov/books/NBK279690/) is available on the command line (specifically the commands `makeblastdb`, `blastn` and `tblastn`). BLAST+ can usually be easily installed using a package manager such as [Homebrew](http://brew.sh/) (on Mac) or [apt-get](https://help.ubuntu.com/community/AptGet/Howto) (on Ubuntu and related Linux distributions).
+Regardless of how you download/install Kaptive, it requires that [BLAST+](http://www.ncbi.nlm.nih.gov/books/NBK279690/) is available on the command line (specifically the commands `makeblastdb`, `blastn` and `tblastn`). BLAST+ can usually be easily installed using a package manager such as [Homebrew](http://brew.sh/) (on Mac) or [apt-get](https://help.ubuntu.com/community/AptGet/Howto) (on Ubuntu and related Linux distributions). Some later versions of BLAST+ have been associated with sporadic crashes when running tblastn with multiple threads; to avoid this problem we recommend running Kaptive with BLAST+ v 2.3.0 or using the "--threads 1" option (see below for full command argument details).
 
 
 ## Input files
@@ -308,7 +308,18 @@ WARNING: If you use the variant database please inspect your results carefully a
 
 Database versions:
 * Kaptive releases v0.5.1 and below include the original _Klebsiella_ K locus databases, as described in [Wyres, K. et al. Microbial Genomics (2016).](http://mgen.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000102)
-* Kaptive v0.6.0 includes four novel primary _Klebsiella_ K locus references defined on the basis of gene content (KL162-KL165) in this [paper.](https://www.biorxiv.org/content/10.1101/557785v1)
+* Kaptive v0.6.0 and above include four novel primary _Klebsiella_ K locus references defined on the basis of gene content (KL162-KL165) in this [paper.](https://www.biorxiv.org/content/10.1101/557785v1)
+* Kaptive v0.7.1 and above contain updated versions of the KL53 and KL126 loci (see table below for details). The updated KL126 locus sequence will be described in McDougall, F. et al. 2020. _Klebsiella pneumoniae_ diversity and detection of _Klebsiella africana_ in Australian Fruit Bats (_Pteropus poliocephalus_). _In prep._
+* Kaptive v0.7.2 and above include a novel primary _Klebsiella_ K locus reference defined on the basis of gene content (KL166), which will be described in Li, M. et al. 2020. Characterization of clinically isolated hypermucoviscous _Klebsiella pneumoniae_ in Japan. _In prep._
+* Kaptive v0.7.3 and above include four novel primary _Klebsiella_ K locus references defined on the basis of gene content (KL167-KL170), which will be described in Gorrie, C. et al. 2020. Opportunity and diversity: A year of _Klebsiella pneumoniae_ infections in hospital. _In prep._
+
+
+Changes to the _Klebsiella_ K locus primary reference database:
+
+| Locus  | Change | Reason | Date of change | Kaptive version no. |
+| ------------- | ------------- | ------------- | ------------- | ------------- |
+| KL53  | Annotation update: _wcaJ_ changed to _wbaP_ | Error in original annotation | 21 July 2020 | v 0.7.1 | 
+| KL126  | Sequence update: new sequence from isolate FF923 includes _rmlBADC_ genes between _gnd_ and _ugd_ | Assembly scaffolding error in original sequence from isolate A-003-I-a-1 | 21 July 2020 | v 0.7.1 |
 
 #### _Klebsiella_ O locus database
 
@@ -330,7 +341,7 @@ The _A. baumannii_ OC (lipooligosaccharide outer core) locus reference database
 WARNING: These databases have been developed and tested specifically for _A. baumannii_ and may not be suitable for screening other _Acinetobacter_ species. You can check that your assembly is a true _A. baumannii_ by screening for the _oxaAB_ gene e.g. using blastn.
 
  Database versions:
-* Kaptive v0.7.0 and above include the original _A. baumannii_ K and OC locus databases, as described in Wyres, KL. et al. _In prep_ 2019.
+* Kaptive v0.7.0 and above include the original _A. baumannii_ K and OC locus databases, as described in [Wyres, KL. et al. Microbial Genomics, 2020.](https://doi.org/10.1099/mgen.0.000339)
 
 
 
@@ -349,6 +360,21 @@ Kaptive uses 'tblastn' to screen for the presence of each locus gene with a cove
 
 A small number of the original _Klebsiella_ K locus references are truncated, containing only a partial <i>ugd</i> sequence. The reference annotations for these loci do not include <i>ugd</i>, so are not identified by the 'tblastn' search. Instead <b>Kaptive</b> reports the closest match to the partial sequence (if it exceeds the 90% coverage threshold). 
 
+#### Why has the best matching locus changed after I reran my analysis with an updated version of the database? ####
+
+The databases are updated as novel loci are discovered and curated. If your previous match had a confidence call of 'Low' or 'None' but your new match has higher confidence, this indicates that your genome contains a locus that was absent in the older version of the database! So nothing to worry about here.
+
+But what if your old match and your new match have 'Good' or better confidence levels?
+
+If your old match had 'Perfect' or 'Very High' confidence, please post an issue to the issues page, as this may indicate a problem with the new database!
+
+If your old match had 'Good' or 'High' confidence please read on...
+
+Polysaccharide loci are subject to frequent recombinations and rearrangements, which generates new variants. As a result, a small number of pairs of loci share large regions of homology e.g. the _Klebsiella_ K-locus KL170 is very similar to KL101, and in fact seems to be a hybrid of KL101 plus a small region from KL106. 
+Kaptive can accurately distinguish the KL101 and KL170 loci when it is working with high quality genome assemblies, but this task is much trickier if the assembly is fragmented. This means that matches to KL101 that were reported using an early version of the K-locus database might be reported as KL170 when using a later version of the database.
+However, this should only occur in instances where the K-locus is fragmented in the genome assembly and in that case Kaptive will have indicated 'problems' with the matches (e.g. '?' indicating fragmented assembly or '-' indicating that an expected gene is missing), and the corresponding confidence level will be at the lower end of the scale (i.e. 'Good' or 'High', but not 'Very High' or 'Perfect').
+You may want to try to figure out the correct locus manually, e.g. using [Bandage](https://rrwick.github.io/Bandage/) to BLAST the corresponding loci in your genome assembly graph. 
+
 
 ## Citation
 
@@ -359,7 +385,7 @@ If you use [Kaptive Web](http://kaptive.holtlab.net/) and/or the _Klebsiella_ O
 [Kaptive Web: user-friendly capsule and lipopolysaccharide serotype prediction for _Klebsiella_ genomes. Journal of Clinical Microbiology (2018).](http://jcm.asm.org/content/56/6/e00197-18)
 
 If you use the _A. baumannii_ K or OC locus database(s) in your research please cite this paper:
-Identification of _Acinetobacter baumannii_ loci for capsular polysaccharide (KL) and lipooligosaccharide outer core (OCL) synthesis in genome assemblies using curated reference databases compatible with Kaptive. Wyres KL, Cahill SM, Holt KE, Hall RM and Kenyon JJ. _In preparation_.  
+[Identification of _Acinetobacter baumannii_ loci for capsular polysaccharide (KL) and lipooligosaccharide outer core (OCL) synthesis in genome assemblies using curated reference databases compatible with Kaptive. Microbial Genomics (2020).](https://doi.org/10.1099/mgen.0.000339)  
 Lists of papers describing each of the individual _A. baumannii_ reference loci can be found [here](https://github.com/katholt/Kaptive/tree/master/extras).
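
The new FAQ entry in this README diff describes a decision path for cases where the best-matching locus changes after a database update. A minimal sketch of that path is given below. The function name, the return codes and the ordering of the confidence levels are assumptions made for illustration; none of this is part of kaptive.py, and cases the FAQ does not cover simply fall through to manual inspection.

```python
# Confidence levels named in the README FAQ, weakest to strongest.
# The exact ordering is an assumption inferred from the README wording.
CONFIDENCE_ORDER = ["None", "Low", "Good", "High", "Very High", "Perfect"]


def triage_changed_match(old_confidence, new_confidence):
    """Suggest a next step when the best match changes between database versions.

    Hypothetical helper, not part of Kaptive. Returns one of:
      "expected" - the locus was probably absent from the older database
      "report"   - possible problem with the new database; post an issue
      "inspect"  - check the locus manually (e.g. fragmented assembly)
    """
    old_rank = CONFIDENCE_ORDER.index(old_confidence)
    new_rank = CONFIDENCE_ORDER.index(new_confidence)
    good_rank = CONFIDENCE_ORDER.index("Good")

    # Old match was 'None' or 'Low' and the new match is more confident.
    if old_rank < good_rank and new_rank > old_rank:
        return "expected"

    # Both matches are 'Good' or better.
    if old_rank >= good_rank and new_rank >= good_rank:
        if old_confidence in ("Very High", "Perfect"):
            return "report"
        return "inspect"  # old match was 'Good' or 'High'

    # Not covered by the FAQ: default to manual inspection.
    return "inspect"
```

For example, `triage_changed_match("Good", "High")` returns `"inspect"`, matching the KL101/KL170 situation the FAQ describes for fragmented assemblies.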
 
 


=====================================
kaptive.py
=====================================
@@ -52,7 +52,7 @@ import random
 from collections import OrderedDict
 from Bio import SeqIO
 
-__version__ = '0.5.1'
+__version__ = '0.7.3'
 
 
 def main():
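
The README in this commit states that Kaptive needs the BLAST+ commands `makeblastdb`, `blastn` and `tblastn` on the command line. The following is a small pre-flight check one might run before invoking kaptive.py. It is a sketch only: the function name and the injectable `which` parameter are hypothetical, and this is not how kaptive.py itself checks its dependencies.

```python
import shutil

# Commands the README lists as required by Kaptive.
REQUIRED_BLAST_COMMANDS = ("makeblastdb", "blastn", "tblastn")


def missing_blast_commands(commands=REQUIRED_BLAST_COMMANDS, which=shutil.which):
    """Return the commands from `commands` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without BLAST+ installed.
    """
    return [command for command in commands if which(command) is None]


if __name__ == "__main__":
    missing = missing_blast_commands()
    if missing:
        print("Missing BLAST+ commands: " + ", ".join(missing))
    else:
        print("All required BLAST+ commands were found on PATH.")
```

This only confirms that the commands exist. It does not check the BLAST+ version, so the README's advice about v 2.3.0 or `--threads 1` still applies.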


=====================================
reference_database/Acinetobacter_baumannii_OC_locus_primary_reference.gbk
=====================================
@@ -1446,7 +1446,7 @@ FEATURES             Location/Qualifiers
                      /note="sequence from NCBI GenBank accession number
                      KF030679 REGION: complement(28675..37977)"
      CDS             1..888
-                     /gene="gtrOC1""
+                     /gene="gtrOC1"
                      /codon_start=1
                      /transl_table=11
                      /product="GtrOC1 glycosyltransferase"
@@ -1749,7 +1749,7 @@ FEATURES             Location/Qualifiers
                      /note="sequence from NCBI WGS accession number
                      AMTB01000038 REGION: 221522..230586"
      CDS             1..888
-                     /gene="gtrOC1""
+                     /gene="gtrOC1"
                      /codon_start=1
                      /transl_table=11
                      /product="GtrOC1 glycosyltransferase"
@@ -2038,7 +2038,7 @@ FEATURES             Location/Qualifiers
                      /note="sequence from NCBI WGS accession number
                      AMFY01000013 REGION: 222496..228777"
      CDS             1..888
-                     /gene="gtrOC1""
+                     /gene="gtrOC1"
                      /codon_start=1
                      /transl_table=11
                      /product="GtrOC1 glycosyltransferase"
@@ -2245,7 +2245,7 @@ FEATURES             Location/Qualifiers
                      /note="sequence from NCBI WGS accession number
                      AMFI01000027 REGION: complement(34336..40843)"
      CDS             1..888
-                     /gene="gtrOC1""
+                     /gene="gtrOC1"
                      /codon_start=1
                      /transl_table=11
                      /product="GtrOC1 glycosyltransferase"
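
All four hunks above fix the same defect: a doubled closing quote in `/gene="gtrOC1""`. A short scanner like the one below could flag similar lines in other GenBank files. It is a heuristic sketch with a hypothetical function name. It only looks at single-line qualifiers, and because a doubled quote is also how a literal quote is escaped inside a qualifier value, any hit should be reviewed by eye rather than fixed automatically.

```python
import re

# A one-line qualifier whose value is followed by an extra closing quote,
# for example:   /gene="gtrOC1""
STRAY_QUOTE = re.compile(r'^\s*/(\w+)="([^"]*)""\s*$')


def find_stray_quotes(lines):
    """Yield (line_number, qualifier, value) for lines with a doubled closing quote."""
    for number, line in enumerate(lines, start=1):
        match = STRAY_QUOTE.match(line)
        if match:
            yield number, match.group(1), match.group(2)
```

Passing an open file handle works as well as a list of strings, for example `list(find_stray_quotes(open(path)))`, where `path` points at a `.gbk` file.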


=====================================
reference_database/Klebsiella_k_locus_primary_reference.gbk
=====================================
The diff for this file was not included because it is too large.


View it on GitLab: https://salsa.debian.org/med-team/kaptive/-/commit/356ee2d6a6615f6d98011f8c1164404464c427ca


