[med-svn] [metaphlan2] 01/04: Imported Upstream version 2.6.0+ds
Andreas Tille
tille at debian.org
Tue Sep 13 07:30:35 UTC 2016
This is an automated email from the git hooks/post-receive script.
tille pushed a commit to branch master
in repository metaphlan2.
commit 9c3291e16a90dd1b2fa6048505995fd3ea435da1
Author: Andreas Tille <tille at debian.org>
Date: Tue Sep 13 08:42:14 2016 +0200
Imported Upstream version 2.6.0+ds
---
.hg_archival.txt | 4 +
.hgsub | 2 +
.hgsubstate | 2 +
.hgtags | 9 +
README.md | 764 ++
changeset.txt | 13 +
db_v20/mpa_v20_m200.pkl | Bin 0 -> 39902847 bytes
license.txt | 7 +
metaphlan2.py | 1282 ++++
strainphlan.py | 1538 ++++
strainphlan_src/add_metadata_tree.py | 109 +
strainphlan_src/build_tree_single_strain.py | 146 +
strainphlan_src/compute_distance.py | 195 +
strainphlan_src/dump_file.py | 77 +
strainphlan_src/extract_markers.py | 45 +
strainphlan_src/fastx_len_filter.py | 17 +
strainphlan_src/fix_AF1.py | 36 +
strainphlan_src/logging.ini | 22 +
strainphlan_src/mixed_utils.py | 99 +
strainphlan_src/ooSubprocess.py | 300 +
strainphlan_src/plot_tree_graphlan.py | 177 +
strainphlan_src/sam_filter.py | 59 +
strainphlan_src/sample2markers.py | 421 ++
strainphlan_src/which.py | 25 +
strainphlan_tutorial/step1_download.sh | 3 +
strainphlan_tutorial/step2_fastq2sam.sh | 8 +
strainphlan_tutorial/step3_sam2marker.sh | 5 +
strainphlan_tutorial/step4_extract_db_marker.sh | 4 +
strainphlan_tutorial/step5_build_tree.sh | 4 +
.../step6_build_tree_single_strain.sh | 3 +
utils/extract_markers.py | 49 +
utils/markers_info.txt.bz2 | Bin 0 -> 33258288 bytes
utils/merge_metaphlan_tables.py | 103 +
utils/metaphlan2krona.py | 49 +
utils/metaphlan_hclust_heatmap.py | 483 ++
utils/plot_bug.py | 254 +
utils/species2genomes.txt | 7678 ++++++++++++++++++++
37 files changed, 13992 insertions(+)
diff --git a/.hg_archival.txt b/.hg_archival.txt
new file mode 100644
index 0000000..adfcf40
--- /dev/null
+++ b/.hg_archival.txt
@@ -0,0 +1,4 @@
+repo: b4e7c5505112b08d33dd30f4788429ba023e67f0
+node: c43e40a443edbd3c4cac7349d2679540578096f5
+branch: default
+tag: 2.6.0
diff --git a/.hgsub b/.hgsub
new file mode 100644
index 0000000..c9a57df
--- /dev/null
+++ b/.hgsub
@@ -0,0 +1,2 @@
+utils/export2graphlan = https://bitbucket.org/cibiocm/export2graphlan
+utils/hclust2 = https://bitbucket.org/nsegata/hclust2
diff --git a/.hgsubstate b/.hgsubstate
new file mode 100644
index 0000000..ace1a92
--- /dev/null
+++ b/.hgsubstate
@@ -0,0 +1,2 @@
+f8823b8162ddea6533866afd27d5ed1ce6ff22e0 utils/export2graphlan
+0d8cb18ce9996e7ce4043a00294aeb2ed9bfa5f2 utils/hclust2
diff --git a/.hgtags b/.hgtags
new file mode 100644
index 0000000..8b490a3
--- /dev/null
+++ b/.hgtags
@@ -0,0 +1,9 @@
+b4e7c5505112b08d33dd30f4788429ba023e67f0 2.0_alpha1
+60d254d499e2dd1a8b1cfe344236efa47f823ec6 2.0_beta1
+1b6df65b5a3e9feed0179f855c11fd197fe9a64f 2.0_beta2
+12cceaad3493085c4497898aaeff691913ddb633 2.0_beta3
+616a7debe7937672940130e6c5b26a9ef9e76fcd 2.0.0
+3959b668bbed6150698b594cbbc30a924e5d30e1 2.1.0
+0ef29ae841f52b53176ca264fb9f52f98713eb3c 2.2.0
+5424bb911dfcdb7212ea0949d4faeb6e69cfa61f 2.3.0
+6f2a1673af8565e93fb8e69238141889b7c87361 2.5.0
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..47f9b40
--- /dev/null
+++ b/README.md
@@ -0,0 +1,764 @@
+[TOC]
+
+#**MetaPhlAn 2: Metagenomic Phylogenetic Analysis**#
+
+AUTHORS: Duy Tin Truong (duytin.truong at unitn.it), Nicola Segata (nicola.segata at unitn.it)
+
+##**Description**##
+MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data with species level resolution. From version 2.0, MetaPhlAn is also able to identify specific strains (in the not-so-frequent cases in which the sample contains a previously sequenced strain) and to track strains across samples for all species.
+
+MetaPhlAn 2 relies on ~1M unique clade-specific marker genes ([the marker information file can be found at src/utils/markers_info.txt.bz2 or here](https://bitbucket.org/biobakery/metaphlan2/src/473a41eba501df5f750da032d4f04b38db98dde1/utils/markers_info.txt.bz2?at=default)) identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:
+
+* unambiguous taxonomic assignments;
+* accurate estimation of organismal relative abundance;
+* species-level resolution for bacteria, archaea, eukaryotes and viruses;
+* strain identification and tracking;
+* orders of magnitude speedups compared to existing methods;
+* metagenomic strain-level population genomics.
+
+If you use this software, please cite:
+
+[**MetaPhlAn2 for enhanced metagenomic taxonomic profiling.**](http://www.nature.com/nmeth/journal/v12/n10/pdf/nmeth.3589.pdf)
+ *Duy Tin Truong, Eric A Franzosa, Timothy L Tickle, Matthias Scholz, George Weingart, Edoardo Pasolli, Adrian Tett, Curtis Huttenhower & Nicola Segata*.
+Nature Methods 12, 902–903 (2015)
+
+-------------
+
+##**Pre-requisites**##
+
+MetaPhlAn requires *python 2.7* or higher with argparse, tempfile and [*numpy*](http://www.numpy.org/) libraries installed
+ (apart from numpy they are usually installed together with the python distribution).
+ Python3 is also now supported.
+
+**If you provide the SAM output of [BowTie2](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) as input, there are no additional prerequisites.**
+
+* If you would like to use the BowTie2 integrated in MetaPhlAn, you need to have BowTie2 version 2.0.0 or higher and perl installed (bowtie2 needs to be in the system path with execute _and_ read permission)
+
+* If you use the "utils/metaphlan_hclust_heatmap.py" script to plot and hierarchically cluster multiple MetaPhlAn-profiled samples you will also need the following python libraries: [matplotlib](http://matplotlib.org/index.html), [scipy](http://www.scipy.org/), [pylab](http://wiki.scipy.org/PyLab) (if not installed together with MatPlotLib).
+
+* If you want to produce the output as a "biom" file you also need [biom](http://biom-format.org/) installed.
+
+* MetaPhlAn is not tightly integrated with advanced heatmap plotting with [hclust2](https://bitbucket.org/nsegata/hclust2) and cladogram visualization with [GraPhlAn](https://bitbucket.org/nsegata/graphlan/wiki/Home). If you use these visualization tools, please refer to their prerequisites.
+
+----------------------
+
+##**Installation**##
+
+MetaPhlAn 2.0 can be obtained by either
+
+[Downloading MetaPhlAn v2.0](https://bitbucket.org/biobakery/metaphlan2/get/default.zip)
+
+**OR**
+
+Cloning the repository via the following commands
+``$ hg clone https://bitbucket.org/biobakery/metaphlan2``
+
+--------------------------
+
+
+##**Basic Usage**##
+
+This section presents some basic usages of MetaPhlAn2; for more advanced usages, please see [its wiki](https://bitbucket.org/biobakery/biobakery/wiki/metaphlan2).
+
+We assume here that ``metaphlan2.py`` is in the system path and that the ``mpa_dir`` bash variable contains the main MetaPhlAn folder. You can set these two variables by moving to your local MetaPhlAn2 folder and typing:
+```
+#!cmd
+$ export PATH=`pwd`:$PATH
+$ export mpa_dir=`pwd`
+```
+
+Here is the basic example to profile a metagenome from raw reads (requires BowTie2 in the system path with execution and read permissions, and Perl installed).
+
+```
+#!cmd
+$ metaphlan2.py metagenome.fastq --input_type fastq > profiled_metagenome.txt
+```
+
+It is highly recommended to save the intermediate BowTie2 output for re-running MetaPhlAn extremely quickly (--bowtie2out), and use multiple CPUs (--nproc) if available:
+
+```
+#!cmd
+$ metaphlan2.py metagenome.fastq --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --input_type fastq > profiled_metagenome.txt
+```
+
+If you already mapped your metagenome against the marker DB (using a previous MetaPhlAn run), you can obtain the results in a few seconds by using the previously saved --bowtie2out file and specifying the input (--input_type bowtie2out):
+
+```
+#!cmd
+$ metaphlan2.py metagenome.bowtie2.bz2 --nproc 5 --input_type bowtie2out > profiled_metagenome.txt
+```
+
+You can also provide an externally BowTie2-mapped SAM if you specify this format with --input_type. Two steps here: first map your metagenome with BowTie2 and then feed MetaPhlAn2 with the obtained sam:
+
+```
+#!cmd
+$ bowtie2 --sam-no-hd --sam-no-sq --no-unal --very-sensitive -S metagenome.sam -x ${mpa_dir}/db_v20/mpa_v20_m200 -U metagenome.fastq
+$ metaphlan2.py metagenome.sam --input_type sam > profiled_metagenome.txt
+```
+
+In order to make MetaPhlAn 2 easily compatible with complex metagenomic pipelines, there are now multiple alternative ways to pass the input:
+
+```
+#!cmd
+$ cat metagenome.fastq | metaphlan2.py --input_type fastq > profiled_metagenome.txt
+```
+
+```
+#!cmd
+$ tar xjf metagenome.tar.bz2 --to-stdout | metaphlan2.py --input_type fastq --bowtie2db ${mpa_dir}/db_v20/mpa_v20_m200 > profiled_metagenome.txt
+```
+
+```
+#!cmd
+$ metaphlan2.py --input_type fastq < metagenome.fastq > profiled_metagenome.txt
+```
+
+```
+#!cmd
+$ metaphlan2.py --input_type fastq <(bzcat metagenome.fastq.bz2) > profiled_metagenome.txt
+```
+
+```
+#!cmd
+$ metaphlan2.py --input_type fastq <(zcat metagenome_1.fastq.gz metagenome_2.fastq.gz) > profiled_metagenome.txt
+```
+
+MetaPhlAn 2 can also natively **handle paired-end metagenomes** (but does not use the paired-end information), and, more generally, metagenomes stored in multiple files (but you need to specify the --bowtie2out parameter):
+
+```
+#!cmd
+$ metaphlan2.py metagenome_1.fastq,metagenome_2.fastq --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --input_type fastq > profiled_metagenome.txt
+```
+
+For advanced options and other analysis types (such as strain tracking) please refer to the full command-line options.
+
+##**Full command-line options**##
+
+
+```
+usage: metaphlan2.py --input_type
+ {fastq,fasta,multifasta,multifastq,bowtie2out,sam}
+ [--mpa_pkl MPA_PKL] [--bowtie2db METAPHLAN_BOWTIE2_DB]
+ [--bt2_ps BowTie2 presets] [--bowtie2_exe BOWTIE2_EXE]
+ [--bowtie2out FILE_NAME] [--no_map] [--tmp_dir]
+ [--tax_lev TAXONOMIC_LEVEL] [--min_cu_len]
+ [--min_alignment_len] [--ignore_viruses]
+ [--ignore_eukaryotes] [--ignore_bacteria]
+ [--ignore_archaea] [--stat_q]
+ [--ignore_markers IGNORE_MARKERS] [--avoid_disqm]
+ [--stat] [-t ANALYSIS TYPE] [--nreads NUMBER_OF_READS]
+ [--pres_th PRESENCE_THRESHOLD] [--clade] [--min_ab] [-h]
+ [-o output file] [--sample_id_key name]
+ [--sample_id value] [-s sam_output_file]
+ [--biom biom_output] [--mdelim mdelim] [--nproc N] [-v]
+ [INPUT_FILE] [OUTPUT_FILE]
+
+DESCRIPTION
+ MetaPhlAn version 2.1.0 (28 April 2015):
+ METAgenomic PHyLogenetic ANalysis for metagenomic taxonomic profiling.
+
+AUTHORS: Nicola Segata (nicola.segata at unitn.it), Duy Tin Truong (duytin.truong at unitn.it)
+
+COMMON COMMANDS
+
+ We assume here that metaphlan2.py is in the system path and that mpa_dir bash variable contains the
+ main MetaPhlAn folder. Also BowTie2 should be in the system path with execution and read
+ permissions, and Perl should be installed.
+
+========== MetaPhlAn 2 clade-abundance estimation =================
+
+The basic usage of MetaPhlAn 2 consists in the identification of the clades (from phyla to species and
+strains in particular cases) present in the metagenome obtained from a microbiome sample and their
+relative abundance. This corresponds to the default analysis type (--analysis_type rel_ab).
+
+* Profiling a metagenome from raw reads:
+$ metaphlan2.py metagenome.fastq --input_type fastq
+
+* You can take advantage of multiple CPUs and save the intermediate BowTie2 output for re-running
+ MetaPhlAn extremely quickly:
+$ metaphlan2.py metagenome.fastq --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --input_type fastq
+
+* If you already mapped your metagenome against the marker DB (using a previous MetaPhlAn run), you
+ can obtain the results in few seconds by using the previously saved --bowtie2out file and
+ specifying the input (--input_type bowtie2out):
+$ metaphlan2.py metagenome.bowtie2.bz2 --nproc 5 --input_type bowtie2out
+
+* You can also provide an externally BowTie2-mapped SAM if you specify this format with
+ --input_type. Two steps: first apply BowTie2 and then feed MetaPhlAn2 with the obtained sam:
+$ bowtie2 --sam-no-hd --sam-no-sq --no-unal --very-sensitive -S metagenome.sam -x ${mpa_dir}/db_v20/mpa_v20_m200 -U metagenome.fastq
+$ metaphlan2.py metagenome.sam --input_type sam > profiled_metagenome.txt
+
+* Multiple alternative ways to pass the input are also available:
+$ cat metagenome.fastq | metaphlan2.py --input_type fastq
+$ tar xjf metagenome.tar.bz2 --to-stdout | metaphlan2.py --input_type fastq
+$ metaphlan2.py --input_type fastq < metagenome.fastq
+$ metaphlan2.py --input_type fastq <(bzcat metagenome.fastq.bz2)
+$ metaphlan2.py --input_type fastq <(zcat metagenome_1.fastq.gz metagenome_2.fastq.gz)
+
+* We can also natively handle paired-end metagenomes, and, more generally, metagenomes stored in
+ multiple files (but you need to specify the --bowtie2out parameter):
+$ metaphlan2.py metagenome_1.fastq,metagenome_2.fastq --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --input_type fastq
+
+-------------------------------------------------------------------
+
+
+========== MetaPhlAn 2 strain tracking ============================
+
+MetaPhlAn 2 introduces the capability of characterizing organisms at the strain level using non
+aggregated marker information. Such capability comes with several slightly different flavours and
+is a way to perform strain tracking and comparison across multiple samples.
+Usually, MetaPhlAn 2 is first run with the default --analysis_type to profile the species present in
+the community, and then a strain-level profiling can be performed to zoom-in into specific species
+of interest. This operation can be performed quickly as it exploits the --bowtie2out intermediate
+file saved during the execution of the default analysis type.
+
+* The following command will output the abundance of each marker with a RPK (reads per kilobase)
+ higher than 0.0 (we are assuming that metagenome_outfmt.bz2 has been generated before as
+ shown above).
+$ metaphlan2.py -t marker_ab_table metagenome_outfmt.bz2 --input_type bowtie2out > marker_abundance_table.txt
+ The obtained RPK can be optionally normalized by the total number of reads in the metagenome
+ to guarantee fair comparisons of abundances across samples. The number of reads in the metagenome
+ needs to be passed with the '--nreads' argument
+
+* The list of markers present in the sample can be obtained with '-t marker_pres_table'
+$ metaphlan2.py -t marker_pres_table metagenome_outfmt.bz2 --input_type bowtie2out > marker_abundance_table.txt
+ The --pres_th argument (default 1.0) sets the minimum RPK value to consider a marker present
+
+* The '-t clade_profiles' analysis type reports the same information as '-t marker_ab_table'
+ but the markers are reported on a clade-by-clade basis.
+$ metaphlan2.py -t clade_profiles metagenome_outfmt.bz2 --input_type bowtie2out > marker_abundance_table.txt
+
+* Finally, to obtain all markers present for a specific clade and all its subclades, the
+ '-t clade_specific_strain_tracker' should be used. For example, the following command
+ is reporting the presence/absence of the markers for the B. fragilis species and its strains
+$ metaphlan2.py -t clade_specific_strain_tracker --clade s__Bacteroides_fragilis metagenome_outfmt.bz2 db_v20/mpa_v20_m200.pkl --input_type bowtie2out > marker_abundance_table.txt
+ the optional argument --min_ab specifies the minimum clade abundance for reporting the markers
+
+-------------------------------------------------------------------
+
+positional arguments:
+ INPUT_FILE the input file can be:
+ * a fastq file containing metagenomic reads
+ OR
+ * a BowTie2 produced SAM file.
+ OR
+ * an intermediary mapping file of the metagenome generated by a previous MetaPhlAn run
+ If the input file is missing, the script assumes that the input is provided using the standard
+ input, or named pipes.
+ IMPORTANT: the type of input needs to be specified with --input_type
+ OUTPUT_FILE the tab-separated output file of the predicted taxon relative abundances
+ [stdout if not present]
+
+Required arguments:
+ --mpa_pkl MPA_PKL the metadata pickled MetaPhlAn file
+ --input_type {fastq,fasta,multifasta,multifastq,bowtie2out,sam}
+ set whether the input is the multifasta file of metagenomic reads or
+ the SAM file of the mapping of the reads against the MetaPhlAn db.
+ [default 'automatic', i.e. the script will try to guess the input format]
+
+Mapping arguments:
+ --bowtie2db METAPHLAN_BOWTIE2_DB
+ The BowTie2 database file of the MetaPhlAn database.
+ Used if --input_type is fastq, fasta, multifasta, or multifastq
+ --bt2_ps BowTie2 presets
+ presets options for BowTie2 (applied only when a multifasta file is provided)
+ The choices enabled in MetaPhlAn are:
+ * sensitive
+ * very-sensitive
+ * sensitive-local
+ * very-sensitive-local
+ [default very-sensitive]
+ --bowtie2_exe BOWTIE2_EXE
+ Full path and name of the BowTie2 executable. This option allows
+ MetaPhlAn to reach the executable even when it is not in the system
+ PATH or the system PATH is unreachable
+ --bowtie2out FILE_NAME
+ The file for saving the output of BowTie2
+ --no_map Avoid storing the --bowtie2out map file
+ --tmp_dir the folder used to store temporary files
+ [default is the OS dependent tmp dir]
+
+Post-mapping arguments:
+ --tax_lev TAXONOMIC_LEVEL
+ The taxonomic level for the relative abundance output:
+ 'a' : all taxonomic levels
+ 'k' : kingdoms (Bacteria and Archaea) only
+ 'p' : phyla only
+ 'c' : classes only
+ 'o' : orders only
+ 'f' : families only
+ 'g' : genera only
+ 's' : species only
+ [default 'a']
+ --min_cu_len minimum total nucleotide length for the markers in a clade for
+ estimating the abundance without considering sub-clade abundances
+ [default 2000]
+ --min_alignment_len The sam records for aligned reads with the longest subalignment
+ length smaller than this threshold will be discarded.
+ [default None]
+ --ignore_viruses Do not profile viral organisms
+ --ignore_eukaryotes Do not profile eukaryotic organisms
+ --ignore_bacteria Do not profile bacterial organisms
+ --ignore_archaea Do not profile archaeal organisms
+ --stat_q Quantile value for the robust average
+ [default 0.1]
+ --ignore_markers IGNORE_MARKERS
+ File containing a list of markers to ignore.
+ --avoid_disqm Deactivate the procedure of disambiguating the quasi-markers based on the
+ marker abundance pattern found in the sample. It is generally recommended
+ to keep the disambiguation procedure in order to minimize false positives
+ --stat EXPERIMENTAL! Statistical approach for converting marker abundances into clade abundances
+ 'avg_g' : clade global (i.e. normalizing all markers together) average
+ 'avg_l' : average of length-normalized marker counts
+ 'tavg_g' : truncated clade global average at --stat_q quantile
+ 'tavg_l' : truncated average of length-normalized marker counts (at --stat_q)
+ 'wavg_g' : winsorized clade global average (at --stat_q)
+ 'wavg_l' : winsorized average of length-normalized marker counts (at --stat_q)
+ 'med' : median of length-normalized marker counts
+ [default tavg_g]
+
+Additional analysis types and arguments:
+ -t ANALYSIS TYPE Type of analysis to perform:
+ * rel_ab: profiling a metagenome in terms of relative abundances
+ * rel_ab_w_read_stats: profiling a metagenome in terms of relative abundances and estimating the number of reads coming from each clade.
+ * reads_map: mapping from reads to clades (only reads hitting a marker)
+ * clade_profiles: normalized marker counts for clades with at least a non-null marker
+ * marker_ab_table: normalized marker counts (only when > 0.0 and normalized by metagenome size if --nreads is specified)
+ * marker_pres_table: list of markers present in the sample (threshold at 1.0 if not differently specified with --pres_th)
+ [default 'rel_ab']
+ --nreads NUMBER_OF_READS
+ The total number of reads in the original metagenome. It is used only when
+ -t marker_table is specified for normalizing the length-normalized counts
+ with the metagenome size as well. No normalization applied if --nreads is not
+ specified
+ --pres_th PRESENCE_THRESHOLD
+ Threshold for calling a marker present by the -t marker_pres_table option
+ --clade The clade for clade_specific_strain_tracker analysis
+ --min_ab The minimum percentage abundance for the clade in the clade_specific_strain_tracker analysis
+ -h, --help show this help message and exit
+
+Output arguments:
+ -o output file, --output_file output file
+ The output file (if not specified as positional argument)
+ --sample_id_key name Specify the sample ID key for this analysis. Defaults to '#SampleID'.
+ --sample_id value Specify the sample ID for this analysis. Defaults to 'Metaphlan2_Analysis'.
+ -s sam_output_file, --samout sam_output_file
+ The sam output file
+ --biom biom_output, --biom_output_file biom_output
+ If requesting biom file output: The name of the output file in biom format
+ --mdelim mdelim, --metadata_delimiter_char mdelim
+ Delimiter for bug metadata: - defaults to pipe. e.g. the pipe in k__Bacteria|p__Proteobacteria
+
+Other arguments:
+ --nproc N The number of CPUs to use for parallelizing the mapping
+ [default 1, i.e. no parallelism]
+ -v, --version Prints the current MetaPhlAn version and exit
+
+
+```
+
+##**Utility Scripts**##
+
+MetaPhlAn's repository features a few utility scripts to aid in manipulation of sample output and its visualization. These scripts can be found under the ``utils`` folder in the metaphlan2 directory.
+
+###**Merging Tables**###
+
+The script **merge_metaphlan_tables.py** merges MetaPhlAn output from several samples into one table of bugs (rows) vs. samples (columns), listing the relative normalized abundance of each bug in each sample.
+
+To merge multiple output files, run the script as below
+
+```
+#!cmd
+$ python utils/merge_metaphlan_tables.py metaphlan_output1.txt metaphlan_output2.txt metaphlan_output3.txt > output/merged_abundance_table.txt
+```
+
+Wildcards can be used as needed:
+```
+#!cmd
+$ python utils/merge_metaphlan_tables.py metaphlan_output*.txt > output/merged_abundance_table.txt
+```
+
+**There is no limit to how many files you can merge.**
+
+###**Heatmap Visualization**###
+
+The script **metaphlan_hclust_heatmap.py** visualizes the MetaPhlAn results as a hierarchically-clustered heatmap. To generate the heatmap for a merged MetaPhlAn output table (as described above), run the script as below.
+
+```
+#!cmd
+$ python utils/metaphlan_hclust_heatmap.py -c bbcry --top 25 --minv 0.1 -s log --in output/merged_abundance_table.txt --out output_images/abundance_heatmap.png
+```
+
+For detailed command-line options, see the help output below:
+
+
+```
+#!
+
+$ utils/metaphlan_hclust_heatmap.py -h
+usage: metaphlan_hclust_heatmap.py [-h] --in INPUT_FILE --out OUTPUT_FILE
+ [-m {single,complete,average,weighted,centroid,median,ward}]
+ [-d {euclidean,minkowski,cityblock,seuclidean,sqeuclidean,cosine,correlation,hamming,jaccard,chebyshev,canberra,braycurtis,mahalanobis,yule,matching,dice,kulsinski,rogerstanimoto,russellrao,sokalmichener,sokalsneath,wminkowski,ward}]
+ [-f {euclidean,minkowski,cityblock,seuclidean,sqeuclidean,cosine,correlation,hamming,jaccard,chebyshev,canberra,braycurtis,mahalanobis,yule,matching,dice,kulsinski,rogerstanimoto,russellrao,sokalmichener,sokalsneath,wminkowski,ward}]
+ [-s scale norm] [-x X] [-y Y] [--minv MINV]
+ [--maxv max value]
+ [--tax_lev TAXONOMIC_LEVEL] [--perc PERC]
+ [--top TOP] [--sdend_h SDEND_H]
+ [--fdend_w FDEND_W] [--cm_h CM_H]
+ [--cm_ticks label for ticks of the colormap]
+ [--font_size FONT_SIZE]
+ [--clust_line_w CLUST_LINE_W]
+ [-c {Accent,Blues,BrBG,BuGn,BuPu,Dark2,GnBu,Greens,Greys,OrRd,Oranges,PRGn,Paired,Pastel1,Pastel2,PiYG,PuBu,PuBuGn,PuOr,PuRd,Purples,RdBu,RdGy,RdPu,RdYlBu,RdYlGn,Reds,Set1,Set2,Set3,Spectral,YlGn,YlGnBu,YlOrBr,YlOrRd,afmhot,autumn,binary,bone,brg,bwr,cool,copper,flag,gist_earth,gist_gray,gist_heat,gist_ncar,gist_rainbow,gist_stern,gist_yarg,gnuplot,gnuplot2,gray,hot,hsv,jet,ocean,pink,prism,rainbow,seismic,spectral,spring,summer,terrain,winter,bbcyr,bbcry}]
+
+This script generates heatmaps with hierarchical clustering of both samples
+and microbial clades. The script can also subsample the number of clades to
+display based on their nth percentile abundance value in each sample
+
+optional arguments:
+ -h, --help show this help message and exit
+ --in INPUT_FILE The input file of microbial relative abundances. This
+ file is typically obtained with the
+ "utils/merge_metaphlan_tables.py"
+ --out OUTPUT_FILE The output image. The extension of the file determines
+ the image format. png, pdf, and svg are the preferred
+ format
+ -m {single,complete,average,weighted,centroid,median,ward}
+ The hierarchical clustering method, default is
+ "average"
+ -d {euclidean,minkowski,cityblock,seuclidean,sqeuclidean,cosine,correlation,hamming,jaccard,chebyshev,canberra,braycurtis,mahalanobis,yule,matching,dice,kulsinski,rogerstanimoto,russellrao,sokalmichener,sokalsneath,wminkowski,ward}
+ The distance function for samples. Default is
+ "braycurtis"
+ -f {euclidean,minkowski,cityblock,seuclidean,sqeuclidean,cosine,correlation,hamming,jaccard,chebyshev,canberra,braycurtis,mahalanobis,yule,matching,dice,kulsinski,rogerstanimoto,russellrao,sokalmichener,sokalsneath,wminkowski,ward}
+ The distance function for microbes. Default is
+ "correlation"
+ -s scale norm
+ -x X Width of heatmap cells. Automatically set, this option
+ should not be necessary unless for very large heatmaps
+ -y Y Height of heatmap cells. Automatically set, this
+ option should not be necessary unless for very large
+ heatmaps
+ --minv MINV Minimum value to display. Default is 0.0, values
+ around 0.001 are also reasonable
+ --maxv max value Maximum value to display. Default is maximum value
+ present, can be set e.g. to 100 to display the full
+ scale
+ --tax_lev TAXONOMIC_LEVEL
+ The taxonomic level to display: 'a' : all taxonomic
+ levels 'k' : kingdoms (Bacteria and Archaea) only 'p'
+ : phyla only 'c' : classes only 'o' : orders only 'f'
+ : families only 'g' : genera only 's' : species only
+ [default 's']
+ --perc PERC Percentile to be used for ordering the microbes in
+ order to select with --top the most abundant microbes
+ only. Default is 90
+ --top TOP Display the --top most abundant microbes only
+ (ordering based on --perc)
+ --sdend_h SDEND_H Set the height of the sample dendrogram. Default is
+ 0.1
+ --fdend_w FDEND_W Set the width of the microbes dendrogram. Default is
+ 0.1
+ --cm_h CM_H Set the height of the colormap. Default = 0.03
+ --cm_ticks label for ticks of the colormap
+ --font_size FONT_SIZE
+ Set label font sizes. Default is 7
+ --clust_line_w CLUST_LINE_W
+ Set the line width for the dendrograms
+ -c {Accent,Blues,BrBG,BuGn,BuPu,Dark2,GnBu,Greens,Greys,OrRd,Oranges,PRGn,Paired,Pastel1,Pastel2,PiYG,PuBu,PuBuGn,PuOr,PuRd,Purples,RdBu,RdGy,RdPu,RdYlBu,RdYlGn,Reds,Set1,Set2,Set3,Spectral,YlGn,YlGnBu,YlOrBr,YlOrRd,afmhot,autumn,binary,bone,brg,bwr,cool,copper,flag,gist_earth,gist_gray,gist_heat,gist_ncar,gist_rainbow,gist_stern,gist_yarg,gnuplot,gnuplot2,gray,hot,hsv,jet,ocean,pink,prism,rainbow,seismic,spectral,spring,summer,terrain,winter,bbcyr,bbcry}
+ Set the colormap. Default is "jet".
+```
+
+###**GraPhlAn Visualization**###
+
+A tutorial on using GraPhlAn can be found in [the MetaPhlAn2 wiki](https://bitbucket.org/biobakery/biobakery/wiki/metaphlan2).
+
+
+##**Customizing the database**##
+To add a marker to the database, follow these steps:
+
+* Reconstruct the marker sequences (in fasta format) from the MetaPhlAn2 bowtie2 database by:
+
+```
+#!bash
+
+bowtie2-inspect metaphlan2/db_v20/mpa_v20_m200 > metaphlan2/markers.fasta
+
+```
+
+
+* Add the marker sequence stored in a file new_marker.fasta to the marker set:
+
+```
+#!bash
+
+cat new_marker.fasta >> metaphlan2/markers.fasta
+
+```
+
+* Rebuild the bowtie2 database:
+
+```
+#!bash
+
+mkdir -p metaphlan2/db_v21
+bowtie2-build metaphlan2/markers.fasta metaphlan2/db_v21/mpa_v21_m200
+
+```
+
+* Assume that the new marker was extracted from genome1, genome2. Update the taxonomy file from python console as follows:
+
+```
+#!python
+
+import cPickle as pickle
+import bz2
+
+db = pickle.load(bz2.BZ2File('metaphlan2/db_v20/mpa_v20_m200.pkl', 'r'))
+
+# Add the taxonomy of the new genomes
+db['taxonomy']['taxonomy of genome1'] = length of genome1
+db['taxonomy']['taxonomy of genome2'] = length of genome2
+
+# Add the information of the new marker as the other markers
+db['markers'][new_marker_name] = {
+ 'clade': the clade that the marker belongs to,
+ 'ext': {the name of the first external genome where the marker appears,
+ the name of the second external genome where the marker appears,
+ },
+ 'len': length of the marker,
+ 'score': score of the marker,
+ 'taxon': the taxon of the marker}
+# To see an example, try to print the first marker information:
+# print db['markers'].items()[0]
+
+# Save the new mpa_pkl file
+ofile = bz2.BZ2File('metaphlan2/db_v21/mpa_v21_m200.pkl', 'w')
+pickle.dump(db, ofile, pickle.HIGHEST_PROTOCOL)
+ofile.close()
+```
+
+* To use the new database, pass the files in metaphlan2/db_v21 instead of those in metaphlan2/db_v20 when running metaphlan2.py, using the options "--mpa_pkl" (the new pkl file) and "--bowtie2db" (the rebuilt bowtie2 database).
+
+
+##**Metagenomic strain-level population genomics**##
+
+StrainPhlAn is a computational tool for tracking individual strains across a large set of samples. **The input** of StrainPhlAn is a set of metagenomic samples and for each species, **the output** is a multiple sequence alignment (MSA) file of all species strains reconstructed directly from the samples. From this MSA, StrainPhlAn calls RAxML (or other phylogenetic tree builders) to build the phylogenetic tree showing the strain evolution of the sample strains.
+For each sample, StrainPhlAn extracts the strain of a specific species by merging and concatenating all reads mapped against that species' markers in the MetaPhlAn2 database.
+
+In detail, let us start from a toy example with 6 HMP gut metagenomic samples (SRS055982-subjectID_638754422, SRS022137-subjectID_638754422, SRS019161-subjectID_763496533, SRS013951-subjectID_763496533, SRS014613-subjectID_763840445, SRS064276-subjectID_763840445) from three subjects (each sampled at two time points) and one *Bacteroides caccae* genome G000273725.
+**We would like to**:
+
+* extract the *Bacteroides caccae* strains from these samples and compare them with the reference genome in a phylogenetic tree.
+* know how many SNPs there are between those strains and the reference genome.
+
+Running StrainPhlAn on these samples, we will obtain the *Bacteroides caccae* phylogenetic tree and its multiple sequence alignment in the following figure (produced with [ete2](http://etetoolkit.org/) and [Jalview](http://www.jalview.org/)):
+
+![tree_alignment.png](https://bitbucket.org/repo/rM969K/images/476974413-tree_alignment.png)
+
+We can see that the strains from the same subject are grouped together. The tree also highlights that the strains from subject "763840445" (red color) do not change between the two sampling time points whereas the strains from the other subjects have slightly evolved. From the tree, we also know that the strains of subject "763496533" are closer to the reference genome than those of the others.
+In addition, the table below shows the number of SNPs between the sample strains and the reference genome based on the strain alignment returned by StrainPhlAn.
+
+![snp_distance.png](https://bitbucket.org/repo/rM969K/images/1771497600-snp_distance.png)
+
+In the next sections, we will illustrate step by step how to run StrainPhlAn on this toy example to reproduce the above figures.
+
+### Pre-requisites ###
+StrainPhlAn requires *python 2.7* and the libraries [pysam](http://pysam.readthedocs.org/en/latest/) (tested on **version 0.8.3**), [biopython](http://biopython.org/wiki/Main_Page), [msgpack](https://pypi.python.org/pypi/msgpack-python), [numpy](http://www.numpy.org/) and [dendropy](https://pythonhosted.org/DendroPy/) (tested on version **3.12.0**). In addition, StrainPhlAn needs the following programs in the executable path:
+
+* [bowtie2](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) for mapping reads against the marker database.
+
+* [MUSCLE](http://www.drive5.com/muscle/) for the alignment step.
+
+* [samtools, bcftools and vcfutils.pl](http://samtools.sourceforge.net/), which can be downloaded from [here](https://github.com/samtools), for building consensus markers. Note that vcfutils.pl is included in bcftools and that **StrainPhlAn only works with samtools version 0.1.19**, because the samtools output format changed after this version.
+
+* [blastn](ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/) for adding reference genomes to the phylogenetic tree.
+
+* [raxmlHPC and raxmlHPC-PTHREADS-SSE3](http://sco.h-its.org/exelixis/web/software/raxml/index.html) for building the phylogenetic trees.
+
+All dependency binaries for 64-bit Linux can be downloaded from the folder "bin" at [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0).
+
+The script files in the folder "strainphlan_src" should be made executable with:
+
+
+```
+#!python
+
+chmod +x strainphlan_src/*.py
+```
+
+and added to the executable path:
+
+```
+#!python
+
+export PATH=$PATH:$(pwd -P)/strainphlan_src
+```
+
+### Usage ###
+
+Let's reproduce the toy example result from the introduction section. Note that all the commands to run the steps below are in the "strainphlan_tutorial/step?*.sh" files (? corresponds to the step number). All the steps below are executed under the "strainphlan_tutorial" folder.
+The steps are as follows:
+
+Step 1. Download the 6 HMP gut metagenomic samples, the metadata.txt file and one reference genome from the folders "fastqs" and "reference_genomes" at [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0) and put these folders under the "strainphlan_tutorial" folder.
+
+Step 2. Obtain the sam files from these samples by mapping them against the MetaPhlAn2 database:
+
+This step will run MetaPhlAn2 to map all metagenomic samples against the MetaPhlAn2 marker database and produce the sam files (\*.sam.bz2).
+Each sam file (in SAM format) corresponds to one sample and contains the reads mapped against the MetaPhlAn2 marker database.
+The commands to run are:
+
+```
+#!python
+
+mkdir -p sams
+for f in $(ls fastqs/*.bz2)
+do
+ echo "Running metaphlan2 on ${f}"
+ bn=$(basename ${f} | cut -d . -f 1)
+ tar xjfO ${f} | ../metaphlan2.py --bowtie2db ../db_v20/mpa_v20_m200 --mpa_pkl ../db_v20/mpa_v20_m200.pkl --input_type multifastq --nproc 10 -s sams/${bn}.sam.bz2 --bowtie2out sams/${bn}.bowtie2_out.bz2 -o sams/${bn}.profile
+done
+```
+
+After this step, you will have a folder "sams" containing the sam files (\*.sam.bz2) and other MetaPhlAn2 output files.
+This step will take around 270 minutes. If you want to skip this step, you can download the sam files from the folder "sams" in [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0).
+
+Step 3. Produce the consensus-marker files which are the input for StrainPhlAn:
+
+For each sample, this step will reconstruct all species strains found in it and store them in a marker file (\*.markers). Those strains are referred to as *sample-reconstructed strains*. Additional details on generating consensus sequences can be found [here](http://samtools.sourceforge.net/mpileup.shtml).
+The commands to run are:
+
+
+```
+#!python
+
+mkdir -p consensus_markers
+cwd=$(pwd -P)
+export PATH=${cwd}/../strainphlan_src:${PATH}
+python ../strainphlan_src/sample2markers.py --ifn_samples sams/*.sam.bz2 --input_type sam --output_dir consensus_markers --nprocs 10 &> consensus_markers/log.txt
+```
+
+The result is the same if you run several sample2markers.py instances in parallel, one per sample (this may be useful for some cluster-system settings).
+After this step, you will have a folder "consensus_markers" containing all sample-marker files (\*.markers).
+This step will take around 44 minutes. If you want to skip this step, you can download the consensus marker files from the folder "consensus_markers" in [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0).
+
+Step 4. Extract the markers of *Bacteroides_caccae* from the MetaPhlAn2 database (to add its reference genome later):
+
+This step will extract the markers of *Bacteroides_caccae* from the database; in the next step, StrainPhlAn will use blast to identify the sequences in the reference genomes that are closest to them. Those sequences will be concatenated and referred to as *reference-genome-reconstructed strains*.
+The commands to run are:
+
+```
+#!python
+
+mkdir -p db_markers
+bowtie2-inspect ../db_v20/mpa_v20_m200 > db_markers/all_markers.fasta
+python ../strainphlan_src/extract_markers.py --mpa_pkl ../db_v20/mpa_v20_m200.pkl --ifn_markers db_markers/all_markers.fasta --clade s__Bacteroides_caccae --ofn_markers db_markers/s__Bacteroides_caccae.markers.fasta
+```
+
+Note that the "all_markers.fasta" file can be reused for extracting other reference genomes.
+After this step, you should have two files in folder "db_markers": "all_markers.fasta" containing all marker sequences, and "s__Bacteroides_caccae.markers.fasta" containing the markers of *Bacteroides caccae*.
+This step will take around 1 minute and can be skipped if you do not need to add the reference genomes to the phylogenetic tree. Those markers can be found in the folder "db_markers" in [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0).
+
+Before building the trees, we should get the list of all clades detected from the samples and save them in the "output/clades.txt" file by the following command:
+```
+#!python
+
+mkdir -p output && python ../strainphlan.py --mpa_pkl ../db_v20/mpa_v20_m200.pkl --ifn_samples consensus_markers/*.markers --output_dir output --nprocs_main 10 --print_clades_only > output/clades.txt
+```
+
+The clade names in the output file "clades.txt" will be used for the next step.
+
+Step 5. Build the multiple sequence alignment and phylogenetic tree:
+
+This step will align and clean the *sample-reconstructed strains* (stored in the marker files produced in step 3) and *reference-genome-reconstructed strains* (extracted based on the database markers in step 4) to produce a multiple sequence alignment (MSA) and store it in the file "clade_name.fasta". From this MSA file, StrainPhlAn will call RAxML to build the phylogenetic tree.
+Note that all marker files (\*.markers) **must be used together** as the input for the strainphlan.py script because StrainPhlAn needs to align all of the strains at once.
+
+The commands to run are:
+
+```
+#!python
+
+mkdir -p output
+python ../strainphlan.py --mpa_pkl ../db_v20/mpa_v20_m200.pkl --ifn_samples consensus_markers/*.markers --ifn_markers db_markers/s__Bacteroides_caccae.markers.fasta --ifn_ref_genomes reference_genomes/G000273725.fna.bz2 --output_dir output --nprocs_main 10 --clades s__Bacteroides_caccae &> output/log_full.txt
+```
+
+This step will take around 2 minutes. After this step, you will find the tree "output/RAxML_bestTree.s__Bacteroides_caccae.tree". All the output files can be found in the folder "output" in [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0).
+You can view it with [Archaeopteryx](https://sites.google.com/site/cmzmasek/home/software/archaeopteryx) or any other tree viewer.
+
+By default, if you specify neither reference genomes (with --ifn_ref_genomes) nor a specific clade (with --clades), strainphlan.py will build the phylogenetic trees for all species that it can detect.
+
+In order to add the metadata, we also provide a script called "add_metadata_tree.py" which can be used as follows:
+
+```
+#!python
+
+python ../strainphlan_src/add_metadata_tree.py --ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.tree --ifn_metadatas fastqs/metadata.txt --metadatas subjectID
+```
+
+The script "add_metadata_tree.py" can accept multiple metadata files (space separated, wildcards can also be used) and multiple trees. A metadata file is a tab-separated file where the first row contains the meta-headers and the following rows contain the metadata for each sample. Multiple metadata files are useful when your samples come from more than one dataset and you do not want to merge the metadata files.
+For more details on using "add_metadata_tree.py", please see its help (with option "-h").
+An example of a metadata file is the "fastqs/metadata.txt" file with the below content:
+
+```
+#!python
+
+sampleID subjectID
+SRS055982 638754422
+SRS022137 638754422
+SRS019161 763496533
+SRS013951 763496533
+SRS014613 763840445
+SRS064276 763840445
+G000273725 ReferenceGenomes
+```
+
+Note that "sampleID" is a compulsory field.
+
+After adding the metadata, you will obtain the tree files "*.tree.metadata", which include the metadata, and you can view them with [Archaeopteryx](https://sites.google.com/site/cmzmasek/home/software/archaeopteryx) as in the previous step.
+
+If you have installed [graphlan](https://bitbucket.org/nsegata/graphlan/wiki/Home), you can plot the tree with the command:
+
+
+```
+#!python
+
+python ../strainphlan_src/plot_tree_graphlan.py --ifn_tree output/RAxML_bestTree.s__Bacteroides_caccae.tree.metadata --colorized_metadata subjectID
+```
+
+and obtain the following figure (output/RAxML_bestTree.s__Bacteroides_caccae.tree.metadata.png):
+
+![RAxML_bestTree.s__Bacteroides_caccae.tree.metadata.png](https://bitbucket.org/repo/rM969K/images/1574126761-RAxML_bestTree.s__Bacteroides_caccae.tree.metadata.png)
+
+Step 6. If you want to remove the samples with a high probability of containing multiple strains, you can rebuild the tree without those samples:
+
+```
+#!python
+
+python ../strainphlan_src/build_tree_single_strain.py --ifn_alignments output/s__Bacteroides_caccae.fasta --nprocs 10 --log_ofn output/build_tree_single_strain.log
+python ../strainphlan_src/add_metadata_tree.py --ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.remove_multiple_strains.tree --ifn_metadatas fastqs/metadata.txt --metadatas subjectID
+```
+
+You will obtain the refined tree "output/RAxML_bestTree.s__Bacteroides_caccae.remove_multiple_strains.tree.metadata". This tree can be found in the folder "output" in [this link](https://www.dropbox.com/sh/m4na8wefp53j8ej/AABA3yVsG26TbB0t1cnBS9-Ra?dl=0).
+
+### Some useful options ###
+All options are described in the strainphlan.py help:
+```
+#!python
+
+python ../strainphlan.py -h
+```
+
+The default settings can be too stringent in some cases, leaving very few samples in the phylogenetic tree. You can relax some parameters to add more samples back:
+
+1. *marker_in_clade*: In each sample, the clades with the percentage of present markers less than this threshold are removed. Default "0.8". You can set this parameter to "0.5" to add some more samples.
+2. *sample_in_marker*: If the percentage of samples that a marker is present in is less than this threshold, that marker is removed. Default "0.8". You can set this parameter to "0.5" to add some more samples.
+3. *N_in_marker*: The consensus markers with the percentage of N nucleotides greater than this threshold are removed. Default "0.2". You can set this parameter to "0.5" to add some more samples.
+4. *gap_in_sample*: The samples with full sequences concatenated from all markers and having the percentage of gaps greater than this threshold will be removed. Default "0.2". You can set this parameter to "0.5" to add some more samples.
+5. *relaxed_parameters*: use this option to automatically set the above parameters to add some more samples by accepting some more gaps, Ns, etc. This option is equivalent to setting marker_in_clade=0.5, sample_in_marker=0.5, N_in_marker=0.5, gap_in_sample=0.5. Default "False".
+6. *relaxed_parameters2*: use this option to add more samples by accepting some noise. This is equivalent to setting marker_in_clade=0.2, sample_in_marker=0.2, N_in_marker=0.8, gap_in_sample=0.8. Default "False".
+
+### Some other useful output files ###
+In the output folder, you can find the following files:
+
+1. clade_name.fasta: the alignment file of all metagenomic strains.
+2. *.marker_pos: this file shows the starting position of each marker in the strains.
+3. *.info: this file shows the general information like the total length of the concatenated markers (full sequence length), number of used markers, etc.
+4. *.polymorphic: this file shows statistics on the polymorphic sites, where "sample" is the sample name, "percentage_of_polymorphic_sites" is the percentage of sites that are suspected to be polymorphic, "avg_freq" is the average frequency of the dominant alleles over all polymorphic sites, and "avg_coverage" is the average coverage at all polymorphic sites.
\ No newline at end of file
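The filtering options listed under "Some useful options" in the README above can be read as simple keep-or-drop rules. The following is a minimal sketch of that reading, not StrainPhlAn's actual implementation: the helper names `keep_marker` and `keep_sample` and the preset dictionaries are hypothetical, the gap character is assumed to be "-", and only the numeric values are taken from the README.

```python
# Hypothetical presets; the numbers are copied from the README option list.
DEFAULT = {"marker_in_clade": 0.8, "sample_in_marker": 0.8,
           "N_in_marker": 0.2, "gap_in_sample": 0.2}
RELAXED = {"marker_in_clade": 0.5, "sample_in_marker": 0.5,
           "N_in_marker": 0.5, "gap_in_sample": 0.5}
RELAXED2 = {"marker_in_clade": 0.2, "sample_in_marker": 0.2,
            "N_in_marker": 0.8, "gap_in_sample": 0.8}


def keep_marker(present_in_sample, sample_in_marker):
    """Keep a marker unless the fraction of samples containing it
    is less than the sample_in_marker threshold."""
    fraction = sum(present_in_sample) / float(len(present_in_sample))
    return fraction >= sample_in_marker


def keep_sample(concatenated_seq, gap_in_sample):
    """Keep a sample unless the fraction of gap characters ('-', an
    assumption) in its concatenated sequence exceeds gap_in_sample."""
    fraction = concatenated_seq.count("-") / float(len(concatenated_seq))
    return fraction <= gap_in_sample
```

For example, a marker found in 3 of 5 samples would be dropped under `DEFAULT` but kept under `RELAXED`, which is the sense in which relaxing the thresholds "adds samples back".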
diff --git a/changeset.txt b/changeset.txt
new file mode 100644
index 0000000..02d1341
--- /dev/null
+++ b/changeset.txt
@@ -0,0 +1,13 @@
+=== Version 2.2.0
+- added option "marker_counts" (by Nicola)
+
+=== Version 2.1.0
+- added min_alignment_len option to filter out short alignments in local mode. For long reads (>150) it is now recommended to use local mapping together with "--min_alignment_len 100" to filter out very short alignments. (by Tin)
+- added "--samout" option to store the mapping file in SAM format (the SAM will be compressed if the extension of the specified output file ends with ".bz2") (by Tin)
+- fix: MetaPhlAn2 now ignores ~300 markers that were non-specific (thanks to Eric)
+
+=== Version 2.0.0
+- fix: Biom >= 2.0.0 has the clade IDs second and the sample IDs third
+- added extract_markers.py
+- fix: #5; revamp biom generation; set clade IDs as enumeration
+- added utils/metaphlan2krona.py
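The "--samout" entry for version 2.1.0 above says the SAM output is compressed when the output file name ends with ".bz2". A minimal Python 3 sketch of that kind of extension-based switch follows; `open_sam_output` is a hypothetical helper written for illustration and is not taken from metaphlan2.py.

```python
import bz2


def open_sam_output(path):
    """Open `path` for text writing; use bz2 compression when the
    file name ends with '.bz2', plain text otherwise."""
    if path.endswith(".bz2"):
        return bz2.open(path, "wt")
    return open(path, "w")
```

Both branches return a file-like object with the same `write` interface, so the calling code does not need to know whether compression is in use.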
diff --git a/db_v20/mpa_v20_m200.pkl b/db_v20/mpa_v20_m200.pkl
new file mode 100644
index 0000000..b409019
Binary files /dev/null and b/db_v20/mpa_v20_m200.pkl differ
diff --git a/license.txt b/license.txt
new file mode 100644
index 0000000..1596b63
--- /dev/null
+++ b/license.txt
@@ -0,0 +1,7 @@
+Copyright (c) 2015, Duy Tin Truong, Nicola Segata and Curtis Huttenhower
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/metaphlan2.py b/metaphlan2.py
new file mode 100755
index 0000000..cae0ced
--- /dev/null
+++ b/metaphlan2.py
@@ -0,0 +1,1282 @@
+#!/usr/bin/env python
+
+from __future__ import with_statement
+
+# ==============================================================================
+# MetaPhlAn v2.x: METAgenomic PHyLogenetic ANalysis for taxonomic classification
+# of metagenomic data
+#
+# Authors: Nicola Segata (nicola.segata at unitn.it),
+# Duy Tin Truong (duytin.truong at unitn.it)
+#
+# Please type "./metaphlan2.py -h" for usage help
+#
+# ==============================================================================
+
+__author__ = 'Nicola Segata (nicola.segata at unitn.it), Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '2.6.0'
+__date__ = '19 August 2016'
+
+
+import sys
+import os
+import stat
+import re
+from binascii import b2a_uu
+
+try:
+ import numpy as np
+except ImportError:
+ sys.stderr.write("Error! numpy python library not detected!!\n")
+ sys.exit(1)
+import tempfile as tf
+import argparse as ap
+import subprocess as subp
+import multiprocessing as mp
+from collections import defaultdict as defdict
+import bz2
+import itertools
+from distutils.version import LooseVersion
+try:
+ import cPickle as pickle
+except:
+ import pickle
+
+
+#*************************************************************
+#* Imports related to biom file generation *
+#*************************************************************
+try:
+ import biom
+ import biom.table
+ import numpy as np
+except ImportError:
+ sys.stderr.write("Warning! Biom python library not detected!"
+ "\n Exporting to biom format will not work!\n")
+try:
+ import json
+except ImportError:
+ sys.stderr.write("Warning! json python library not detected!"
+ "\n Exporting to biom format will not work!\n")
+
+# This set contains the markers that after careful validation are found to have low precision or recall
+# We exclude the markers here to avoid generating a new marker DB when changing just a few markers
+markers_to_exclude = \
+ set([
+ 'NC_001782.1','GeneID:17099689','gi|419819595|ref|NZ_AJRE01000517.1|:1-118',
+ 'GeneID:10498696', 'GeneID:10498710', 'GeneID:10498726', 'GeneID:10498735',
+ 'GeneID:10498757', 'GeneID:10498760', 'GeneID:10498761', 'GeneID:10498763',
+ 'GeneID:11294465', 'GeneID:14181982', 'GeneID:14182132', 'GeneID:14182146',
+ 'GeneID:14182148', 'GeneID:14182328', 'GeneID:14182639', 'GeneID:14182647',
+ 'GeneID:14182650', 'GeneID:14182663', 'GeneID:14182683', 'GeneID:14182684',
+ 'GeneID:14182691', 'GeneID:14182803', 'GeneID:14296322', 'GeneID:1489077',
+ 'GeneID:1489080', 'GeneID:1489081', 'GeneID:1489084', 'GeneID:1489085',
+ 'GeneID:1489088', 'GeneID:1489089', 'GeneID:1489090', 'GeneID:1489528',
+ 'GeneID:1489530', 'GeneID:1489531', 'GeneID:1489735', 'GeneID:1491873',
+ 'GeneID:1491889', 'GeneID:1491962', 'GeneID:1491963', 'GeneID:1491964',
+ 'GeneID:1491965', 'GeneID:17099689', 'GeneID:1724732', 'GeneID:17494231',
+ 'GeneID:2546403', 'GeneID:2703374', 'GeneID:2703375', 'GeneID:2703498',
+ 'GeneID:2703531', 'GeneID:2772983', 'GeneID:2772989', 'GeneID:2772991',
+ 'GeneID:2772993', 'GeneID:2772995', 'GeneID:2773037', 'GeneID:2777387',
+ 'GeneID:2777399', 'GeneID:2777400', 'GeneID:2777439', 'GeneID:2777493',
+ 'GeneID:2777494', 'GeneID:3077424', 'GeneID:3160801', 'GeneID:3197323',
+ 'GeneID:3197355', 'GeneID:3197400', 'GeneID:3197428', 'GeneID:3783722',
+ 'GeneID:3783750', 'GeneID:3953004', 'GeneID:3959334', 'GeneID:3964368',
+ 'GeneID:3964370', 'GeneID:4961452', 'GeneID:5075645', 'GeneID:5075646',
+ 'GeneID:5075647', 'GeneID:5075648', 'GeneID:5075649', 'GeneID:5075650',
+ 'GeneID:5075651', 'GeneID:5075652', 'GeneID:5075653', 'GeneID:5075654',
+ 'GeneID:5075655', 'GeneID:5075656', 'GeneID:5075657', 'GeneID:5075658',
+ 'GeneID:5075659', 'GeneID:5075660', 'GeneID:5075661', 'GeneID:5075662',
+ 'GeneID:5075663', 'GeneID:5075664', 'GeneID:5075665', 'GeneID:5075667',
+ 'GeneID:5075668', 'GeneID:5075669', 'GeneID:5075670', 'GeneID:5075671',
+ 'GeneID:5075672', 'GeneID:5075673', 'GeneID:5075674', 'GeneID:5075675',
+ 'GeneID:5075676', 'GeneID:5075677', 'GeneID:5075678', 'GeneID:5075679',
+ 'GeneID:5075680', 'GeneID:5075681', 'GeneID:5075682', 'GeneID:5075683',
+ 'GeneID:5075684', 'GeneID:5075685', 'GeneID:5075686', 'GeneID:5075687',
+ 'GeneID:5075688', 'GeneID:5075689', 'GeneID:5075690', 'GeneID:5075691',
+ 'GeneID:5075692', 'GeneID:5075693', 'GeneID:5075694', 'GeneID:5075695',
+ 'GeneID:5075696', 'GeneID:5075697', 'GeneID:5075698', 'GeneID:5075700',
+ 'GeneID:5075701', 'GeneID:5075702', 'GeneID:5075703', 'GeneID:5075704',
+ 'GeneID:5075705', 'GeneID:5075707', 'GeneID:5075708', 'GeneID:5075709',
+ 'GeneID:5075710', 'GeneID:5075711', 'GeneID:5075712', 'GeneID:5075713',
+ 'GeneID:5075714', 'GeneID:5075715', 'GeneID:5075716', 'GeneID:5176189',
+ 'GeneID:6803896', 'GeneID:6803915', 'GeneID:7944151', 'GeneID:927334',
+ 'GeneID:927335', 'GeneID:927337', 'GeneID:940263', 'GeneID:9538324',
+ 'NC_003977.1', 'gi|103485498|ref|NC_008048.1|:1941166-1942314',
+ 'gi|108802856|ref|NC_008148.1|:1230231-1230875',
+ 'gi|124806686|ref|XM_001350760.1|',
+ 'gi|126661648|ref|NZ_AAXW01000149.1|:c1513-1341',
+ 'gi|149172845|ref|NZ_ABBW01000029.1|:970-1270',
+ 'gi|153883242|ref|NZ_ABDQ01000074.1|:79-541',
+ 'gi|167031021|ref|NC_010322.1|:1834668-1835168',
+ 'gi|171344510|ref|NZ_ABJO01001391.1|:1-116',
+ 'gi|171346813|ref|NZ_ABJO01001728.1|:c109-1',
+ 'gi|190640924|ref|NZ_ABRC01000948.1|:c226-44',
+ 'gi|223045343|ref|NZ_ACEN01000042.1|:1-336',
+ 'gi|224580998|ref|NZ_GG657387.1|:c114607-114002',
+ 'gi|224993759|ref|NZ_ACFY01000068.1|:c357-1',
+ 'gi|237784637|ref|NC_012704.1|:141000-142970',
+ 'gi|237784637|ref|NC_012704.1|:c2048315-2047083',
+ 'gi|240136783|ref|NC_012808.1|:1928224-1928961',
+ 'gi|255319020|ref|NZ_ACVR01000025.1|:28698-29132',
+ 'gi|260590341|ref|NZ_ACEO02000062.1|:c387-151',
+ 'gi|262368201|ref|NZ_GG704964.1|:733100-733978',
+ 'gi|262369811|ref|NZ_GG704966.1|:c264858-264520',
+ 'gi|288559258|ref|NC_013790.1|:448046-451354',
+ 'gi|288559258|ref|NC_013790.1|:532047-533942',
+ 'gi|294794157|ref|NZ_GG770200.1|:245344-245619',
+ 'gi|304372805|ref|NC_014448.1|:444677-445120',
+ 'gi|304372805|ref|NC_014448.1|:707516-708268',
+ 'gi|304372805|ref|NC_014448.1|:790263-792257',
+ 'gi|304372805|ref|NC_014448.1|:c367313-364470',
+ 'gi|304372805|ref|NC_014448.1|:c659144-658272',
+ 'gi|304372805|ref|NC_014448.1|:c772578-770410',
+ 'gi|304372805|ref|NC_014448.1|:c777901-777470',
+ 'gi|306477407|ref|NZ_GG770409.1|:c1643877-1643338',
+ 'gi|317120849|ref|NC_014831.1|:c891121-890144',
+ 'gi|323356441|ref|NZ_GL698442.1|:560-682',
+ 'gi|324996766|ref|NZ_BABV01000451.1|:10656-11579',
+ 'gi|326579405|ref|NZ_AEGQ01000006.1|:2997-3791',
+ 'gi|326579407|ref|NZ_AEGQ01000008.1|:c45210-44497',
+ 'gi|326579433|ref|NZ_AEGQ01000034.1|:346-3699',
+ 'gi|329889017|ref|NZ_GL883086.1|:586124-586804',
+ 'gi|330822653|ref|NC_015422.1|:2024431-2025018',
+ 'gi|335053104|ref|NZ_AFIL01000010.1|:c33862-32210',
+ 'gi|339304121|ref|NZ_AEOR01000258.1|:c294-1',
+ 'gi|339304277|ref|NZ_AEOR01000414.1|:1-812',
+ 'gi|342211239|ref|NZ_AFUK01000001.1|:790086-790835',
+ 'gi|342211239|ref|NZ_AFUK01000001.1|:c1579497-1578787',
+ 'gi|342213707|ref|NZ_AFUJ01000005.1|:48315-48908',
+ 'gi|355707189|ref|NZ_JH376566.1|:326756-326986',
+ 'gi|355707384|ref|NZ_JH376567.1|:90374-91453',
+ 'gi|355707384|ref|NZ_JH376567.1|:c388018-387605',
+ 'gi|355708440|ref|NZ_JH376569.1|:c80380-79448',
+ 'gi|358051729|ref|NZ_AEUN01000100.1|:c120-1',
+ 'gi|365983217|ref|XM_003668394.1|',
+ 'gi|377571722|ref|NZ_BAFD01000110.1|:c1267-29',
+ 'gi|377684864|ref|NZ_CM001194.1|:c1159954-1159619',
+ 'gi|377684864|ref|NZ_CM001194.1|:c4966-4196',
+ 'gi|378759497|ref|NZ_AFXE01000152.1|:1628-2215',
+ 'gi|378835506|ref|NC_016829.1|:112560-113342',
+ 'gi|378835506|ref|NC_016829.1|:114945-115193',
+ 'gi|378835506|ref|NC_016829.1|:126414-127151',
+ 'gi|378835506|ref|NC_016829.1|:272056-272403',
+ 'gi|378835506|ref|NC_016829.1|:272493-272786',
+ 'gi|378835506|ref|NC_016829.1|:358647-360863',
+ 'gi|378835506|ref|NC_016829.1|:37637-38185',
+ 'gi|378835506|ref|NC_016829.1|:60012-60497',
+ 'gi|378835506|ref|NC_016829.1|:606819-607427',
+ 'gi|378835506|ref|NC_016829.1|:607458-607760',
+ 'gi|378835506|ref|NC_016829.1|:826192-826821',
+ 'gi|378835506|ref|NC_016829.1|:c451932-451336',
+ 'gi|378835506|ref|NC_016829.1|:c460520-459951',
+ 'gi|378835506|ref|NC_016829.1|:c483843-482842',
+ 'gi|378835506|ref|NC_016829.1|:c544660-543638',
+ 'gi|378835506|ref|NC_016829.1|:c556383-555496',
+ 'gi|378835506|ref|NC_016829.1|:c632166-631228',
+ 'gi|378835506|ref|NC_016829.1|:c805066-802691',
+ 'gi|384124469|ref|NC_017160.1|:c2157447-2156863',
+ 'gi|385263288|ref|NZ_AJST01000001.1|:594143-594940',
+ 'gi|385858114|ref|NC_017519.1|:10252-10746',
+ 'gi|385858114|ref|NC_017519.1|:104630-104902',
+ 'gi|385858114|ref|NC_017519.1|:154292-156016',
+ 'gi|385858114|ref|NC_017519.1|:205158-206462',
+ 'gi|385858114|ref|NC_017519.1|:507239-507703',
+ 'gi|385858114|ref|NC_017519.1|:518924-519772',
+ 'gi|385858114|ref|NC_017519.1|:524712-525545',
+ 'gi|385858114|ref|NC_017519.1|:528387-528785',
+ 'gi|385858114|ref|NC_017519.1|:532275-533429',
+ 'gi|385858114|ref|NC_017519.1|:586402-586824',
+ 'gi|385858114|ref|NC_017519.1|:621696-622226',
+ 'gi|385858114|ref|NC_017519.1|:673673-676105',
+ 'gi|385858114|ref|NC_017519.1|:706602-708218',
+ 'gi|385858114|ref|NC_017519.1|:710627-711997',
+ 'gi|385858114|ref|NC_017519.1|:744974-745456',
+ 'gi|385858114|ref|NC_017519.1|:791055-791801',
+ 'gi|385858114|ref|NC_017519.1|:805643-807430',
+ 'gi|385858114|ref|NC_017519.1|:c172050-170809',
+ 'gi|385858114|ref|NC_017519.1|:c334545-333268',
+ 'gi|385858114|ref|NC_017519.1|:c383474-383202',
+ 'gi|385858114|ref|NC_017519.1|:c450880-450389',
+ 'gi|385858114|ref|NC_017519.1|:c451975-451001',
+ 'gi|385858114|ref|NC_017519.1|:c470488-470036',
+ 'gi|385858114|ref|NC_017519.1|:c485596-484598',
+ 'gi|385858114|ref|NC_017519.1|:c58658-58065',
+ 'gi|385858114|ref|NC_017519.1|:c592754-591081',
+ 'gi|385858114|ref|NC_017519.1|:c59590-58820',
+ 'gi|385858114|ref|NC_017519.1|:c601339-600575',
+ 'gi|385858114|ref|NC_017519.1|:c76080-75160',
+ 'gi|385858114|ref|NC_017519.1|:c97777-96302',
+ 'gi|391227518|ref|NZ_CM001514.1|:c1442504-1440237',
+ 'gi|391227518|ref|NZ_CM001514.1|:c3053472-3053023',
+ 'gi|394749766|ref|NZ_AHHC01000069.1|:3978-6176',
+ 'gi|398899615|ref|NZ_AKJK01000021.1|:28532-29209',
+ 'gi|406580057|ref|NZ_AJRD01000017.1|:c17130-15766',
+ 'gi|406584668|ref|NZ_AJQZ01000017.1|:c1397-771',
+ 'gi|408543458|ref|NZ_AJLO01000024.1|:67702-68304',
+ 'gi|410936685|ref|NZ_AJRF02000012.1|:21785-22696',
+ 'gi|41406098|ref|NC_002944.2|:c4468304-4467864',
+ 'gi|416998679|ref|NZ_AEXI01000003.1|:c562937-562176',
+ 'gi|417017738|ref|NZ_AEYL01000489.1|:c111-1',
+ 'gi|417018375|ref|NZ_AEYL01000508.1|:100-238',
+ 'gi|418576506|ref|NZ_AHKB01000025.1|:c7989-7669',
+ 'gi|419819595|ref|NZ_AJRE01000517.1|:1-118',
+ 'gi|421806549|ref|NZ_AMTB01000006.1|:c181247-180489',
+ 'gi|422320815|ref|NZ_GL636045.1|:28704-29048',
+ 'gi|422320874|ref|NZ_GL636046.1|:4984-5742',
+ 'gi|422323244|ref|NZ_GL636061.1|:479975-480520',
+ 'gi|422443048|ref|NZ_GL383112.1|:663738-664823',
+ 'gi|422552858|ref|NZ_GL383469.1|:c216727-215501',
+ 'gi|422859491|ref|NZ_GL878548.1|:c271832-271695',
+ 'gi|423012810|ref|NZ_GL982453.1|:3888672-3888935',
+ 'gi|423012810|ref|NZ_GL982453.1|:4541873-4542328',
+ 'gi|423012810|ref|NZ_GL982453.1|:c2189976-2188582',
+ 'gi|423012810|ref|NZ_GL982453.1|:c5471232-5470300',
+ 'gi|423262555|ref|NC_019552.1|:24703-25212',
+ 'gi|423262555|ref|NC_019552.1|:28306-30696',
+ 'gi|423262555|ref|NC_019552.1|:284252-284581',
+ 'gi|423262555|ref|NC_019552.1|:311161-311373',
+ 'gi|423262555|ref|NC_019552.1|:32707-34497',
+ 'gi|423262555|ref|NC_019552.1|:34497-35237',
+ 'gi|423262555|ref|NC_019552.1|:53691-56813',
+ 'gi|423262555|ref|NC_019552.1|:c388986-386611',
+ 'gi|423262555|ref|NC_019552.1|:c523106-522528',
+ 'gi|423689090|ref|NZ_CM001513.1|:c1700632-1699448',
+ 'gi|423689090|ref|NZ_CM001513.1|:c1701670-1700651',
+ 'gi|423689090|ref|NZ_CM001513.1|:c5739118-5738390',
+ 'gi|427395956|ref|NZ_JH992914.1|:c592682-591900',
+ 'gi|427407324|ref|NZ_JH992904.1|:c2681223-2679463',
+ 'gi|451952303|ref|NZ_AJRB03000021.1|:1041-1574',
+ 'gi|452231579|ref|NZ_AEKA01000123.1|:c18076-16676',
+ 'gi|459791914|ref|NZ_CM001824.1|:c899379-899239',
+ 'gi|471265562|ref|NC_020815.1|:3155799-3156695',
+ 'gi|472279780|ref|NZ_ALPV02000001.1|:33911-36751',
+ 'gi|482733945|ref|NZ_AHGZ01000071.1|:10408-11154',
+ 'gi|483051300|ref|NZ_ALYK02000034.1|:c37582-36650',
+ 'gi|483051300|ref|NZ_ALYK02000034.1|:c38037-37582',
+ 'gi|483993347|ref|NZ_AMXG01000045.1|:251724-253082',
+ 'gi|484100856|ref|NZ_JH670250.1|:600643-602949',
+ 'gi|484115941|ref|NZ_AJXG01000093.1|:567-947',
+ 'gi|484228609|ref|NZ_JH730929.1|:c103784-99021',
+ 'gi|484228797|ref|NZ_JH730960.1|:c16193-12429',
+ 'gi|484228814|ref|NZ_JH730962.1|:c29706-29260',
+ 'gi|484228929|ref|NZ_JH730981.1|:18645-22060',
+ 'gi|484228939|ref|NZ_JH730983.1|:42943-43860',
+ 'gi|484266598|ref|NZ_AKGC01000024.1|:118869-119636',
+ 'gi|484327375|ref|NZ_AKVP01000093.1|:1-1281',
+ 'gi|484328234|ref|NZ_AKVP01000127.1|:c325-110',
+ 'gi|487376144|ref|NZ_KB911257.1|:600445-601482',
+ 'gi|487376194|ref|NZ_KB911260.1|:146228-146533',
+ 'gi|487381776|ref|NZ_KB911485.1|:101242-103083',
+ 'gi|487381776|ref|NZ_KB911485.1|:c32472-31627',
+ 'gi|487381800|ref|NZ_KB911486.1|:39414-39872',
+ 'gi|487381828|ref|NZ_KB911487.1|:15689-17026',
+ 'gi|487381846|ref|NZ_KB911488.1|:13678-13821',
+ 'gi|487382089|ref|NZ_KB911497.1|:23810-26641',
+ 'gi|487382176|ref|NZ_KB911501.1|:c497-381',
+ 'gi|487382213|ref|NZ_KB911502.1|:12706-13119',
+ 'gi|487382247|ref|NZ_KB911505.1|:c7595-6663',
+ 'gi|490551798|ref|NZ_AORG01000011.1|:40110-41390',
+ 'gi|491099398|ref|NZ_KB849654.1|:c720460-719912',
+ 'gi|491124812|ref|NZ_KB849705.1|:1946500-1946937',
+ 'gi|491155563|ref|NZ_KB849732.1|:46469-46843',
+ 'gi|491155563|ref|NZ_KB849732.1|:46840-47181',
+ 'gi|491155563|ref|NZ_KB849732.1|:47165-48616',
+ 'gi|491155563|ref|NZ_KB849732.1|:55055-56662',
+ 'gi|491155563|ref|NZ_KB849732.1|:56662-57351',
+ 'gi|491155563|ref|NZ_KB849732.1|:6101-7588',
+ 'gi|491155563|ref|NZ_KB849732.1|:7657-8073',
+ 'gi|491349766|ref|NZ_KB850082.1|:441-941',
+ 'gi|491395079|ref|NZ_KB850142.1|:1461751-1462554',
+ 'gi|512608407|ref|NZ_KE150401.1|:c156891-156016',
+ 'gi|518653462|ref|NZ_ATLM01000004.1|:c89669-89247',
+ 'gi|520818261|ref|NZ_ATLQ01000015.1|:480744-481463',
+ 'gi|520822538|ref|NZ_ATLQ01000063.1|:103173-103283',
+ 'gi|520826510|ref|NZ_ATLQ01000092.1|:c13892-13563',
+ 'gi|544644736|ref|NZ_KE747865.1|:68388-69722',
+ 'gi|545347918|ref|NZ_KE952096.1|:c83873-81831',
+ 'gi|550735774|gb|AXMM01000002.1|:c743886-743575',
+ 'gi|552875787|ref|NZ_KI515684.1|:c584270-583890',
+ 'gi|552876418|ref|NZ_KI515685.1|:36713-37258',
+ 'gi|552876418|ref|NZ_KI515685.1|:432422-433465',
+ 'gi|552876418|ref|NZ_KI515685.1|:c1014617-1014117',
+ 'gi|552876418|ref|NZ_KI515685.1|:c931935-931327',
+ 'gi|552876815|ref|NZ_KI515686.1|:613740-614315',
+ 'gi|552879811|ref|NZ_AXME01000001.1|:1146402-1146932',
+ 'gi|552879811|ref|NZ_AXME01000001.1|:40840-41742',
+ 'gi|552879811|ref|NZ_AXME01000001.1|:49241-49654',
+ 'gi|552891898|ref|NZ_AXMG01000001.1|:99114-99290',
+ 'gi|552891898|ref|NZ_AXMG01000001.1|:c1460921-1460529',
+ 'gi|552895565|ref|NZ_AXMI01000001.1|:619555-620031',
+ 'gi|552895565|ref|NZ_AXMI01000001.1|:c14352-13837',
+ 'gi|552896371|ref|NZ_AXMI01000002.1|:c148595-146280',
+ 'gi|552897201|ref|NZ_AXMI01000004.1|:c231437-230883',
+ 'gi|552902020|ref|NZ_AXMK01000001.1|:c1625038-1624022',
+ 'gi|556346902|ref|NZ_KI535485.1|:c828278-827901',
+ 'gi|556478613|ref|NZ_KI535633.1|:3529392-3530162',
+ 'gi|560534311|ref|NZ_AYSF01000111.1|:26758-29049',
+ 'gi|564165687|gb|AYLX01000355.1|:10906-11166',
+ 'gi|564169776|gb|AYLX01000156.1|:1-185',
+ 'gi|564938696|gb|AWYH01000018.1|:c75674-75039', 'gi|67993724|ref|XM_664440.1|',
+ 'gi|68059117|ref|XM_666447.1|', 'gi|68062389|ref|XM_668109.1|',
+ 'gi|71730848|gb|AAAM03000019.1|:c14289-12877', 'gi|82753723|ref|XM_722699.1|',
+ 'gi|82775382|ref|NC_007606.1|:2249487-2250014', 'gi|82793634|ref|XM_723027.1|'
+ ])
+
+tax_units = "kpcofgst"
+
+if float(sys.version_info[0]) < 3.0:
+ def read_and_split( ofn ):
+ return (l.strip().split('\t') for l in ofn)
+ def read_and_split_line( line ):
+ return line.strip().split('\t')
+else:
+ def read_and_split( ofn ):
+ return (str(l,encoding='utf-8').strip().split('\t') for l in ofn)
+ def read_and_split_line( line ):
+ return str(line,encoding='utf-8').strip().split('\t')
+
+
+def plain_read_and_split( ofn ):
+ return (l.strip().split('\t') for l in ofn)
+
+def plain_read_and_split_line( l ):
+ return l.strip().split('\t')
+
+
+
+if float(sys.version_info[0]) < 3.0:
+ def mybytes( val ):
+ return val
+else:
+ def mybytes( val ):
+ return bytes(val,encoding='utf-8')
+
+# get the directory that contains this script
+metaphlan2_script_install_folder=os.path.dirname(os.path.abspath(__file__))
+
+def read_params(args):
+ p = ap.ArgumentParser( description=
+ "DESCRIPTION\n"
+ " MetaPhlAn version "+__version__+" ("+__date__+"): \n"
+ " METAgenomic PHyLogenetic ANalysis for metagenomic taxonomic profiling.\n\n"
+ "AUTHORS: "+__author__+"\n\n"
+ "COMMON COMMANDS\n\n"
+ " We assume here that metaphlan2.py is in the system path and that mpa_dir bash variable contains the\n"
+ " main MetaPhlAn folder. Also BowTie2 should be in the system path with execution and read\n"
+ " permissions, and Perl should be installed.\n\n"
+
+ "\n========== MetaPhlAn 2 clade-abundance estimation ================= \n\n"
+ "The basic usage of MetaPhlAn 2 consists of the identification of the clades (from phyla to species and \n"
+ "strains in particular cases) present in the metagenome obtained from a microbiome sample and their \n"
+ "relative abundance. This corresponds to the default analysis type (-t rel_ab).\n\n"
+
+ "* Profiling a metagenome from raw reads:\n"
+ "$ metaphlan2.py metagenome.fastq --input_type fastq\n\n"
+
+ "* You can take advantage of multiple CPUs and save the intermediate BowTie2 output for re-running\n"
+ " MetaPhlAn extremely quickly:\n"
+ "$ metaphlan2.py metagenome.fastq --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --input_type fastq\n\n"
+
+ "* If you already mapped your metagenome against the marker DB (using a previous MetaPhlAn run), you\n"
+ " can obtain the results in a few seconds by using the previously saved --bowtie2out file and \n"
+ " specifying the input (--input_type bowtie2out):\n"
+ "$ metaphlan2.py metagenome.bowtie2.bz2 --nproc 5 --input_type bowtie2out\n\n"
+
+ "* You can also provide an externally BowTie2-mapped SAM if you specify this format with \n"
+ " --input_type. Two steps: first apply BowTie2 and then feed MetaPhlAn2 with the obtained sam:\n"
+ "$ bowtie2 --sam-no-hd --sam-no-sq --no-unal --very-sensitive -S metagenome.sam -x ${mpa_dir}/db_v20/mpa_v20_m200 -U metagenome.fastq\n"
+ "$ metaphlan2.py metagenome.sam --input_type sam > profiled_metagenome.txt\n\n"
+
+ "* Multiple alternative ways to pass the input are also available:\n"
+ "$ cat metagenome.fastq | metaphlan2.py --input_type fastq \n"
+ "$ tar xjf metagenome.tar.bz2 --to-stdout | metaphlan2.py --input_type fastq \n"
+ "$ metaphlan2.py --input_type fastq < metagenome.fastq\n"
+ "$ metaphlan2.py --input_type fastq <(bzcat metagenome.fastq.bz2)\n"
+ "$ metaphlan2.py --input_type fastq <(zcat metagenome_1.fastq.gz metagenome_2.fastq.gz)\n\n"
+
+ "* We can also natively handle paired-end metagenomes, and, more generally, metagenomes stored in \n"
+ " multiple files (but you need to specify the --bowtie2out parameter):\n"
+ "$ metaphlan2.py metagenome_1.fastq,metagenome_2.fastq --bowtie2out metagenome.bowtie2.bz2 --nproc 5 --input_type fastq\n\n"
+ "\n------------------------------------------------------------------- \n \n\n"
+
+
+ "\n========== MetaPhlAn 2 strain tracking ============================ \n\n"
+ "MetaPhlAn 2 introduces the capability of characterizing organisms at the strain level using non\n"
+ "aggregated marker information. Such capability comes in several slightly different flavours and \n"
+ "is a way to perform strain tracking and comparison across multiple samples.\n"
+ "Usually, MetaPhlAn 2 is first run with the default analysis type (-t rel_ab) to profile the species present in\n"
+ "the community, and then a strain-level profiling can be performed to zoom in on specific species\n"
+ "of interest. This operation can be performed quickly as it exploits the --bowtie2out intermediate \n"
+ "file saved during the execution of the default analysis type.\n\n"
+
+ "* The following command will output the abundance of each marker with an RPK (reads per kilobase) \n"
+ " higher than 0.0 (we are assuming that metagenome_outfmt.bz2 has been generated before as \n"
+ " shown above).\n"
+ "$ metaphlan2.py -t marker_ab_table metagenome_outfmt.bz2 --input_type bowtie2out > marker_abundance_table.txt\n"
+ " The obtained RPK can be optionally normalized by the total number of reads in the metagenome \n"
+ " to guarantee fair comparisons of abundances across samples. The number of reads in the metagenome\n"
+ " needs to be passed with the '--nreads' argument\n\n"
+
+ "* The list of markers present in the sample can be obtained with '-t marker_pres_table'\n"
+ "$ metaphlan2.py -t marker_pres_table metagenome_outfmt.bz2 --input_type bowtie2out > marker_abundance_table.txt\n"
+ " The --pres_th argument (default 1.0) sets the minimum RPK value to consider a marker present\n\n"
+
+ "* The '-t clade_profiles' analysis type reports the same information as '-t marker_ab_table'\n"
+ " but the markers are reported on a clade-by-clade basis.\n"
+ "$ metaphlan2.py -t clade_profiles metagenome_outfmt.bz2 --input_type bowtie2out > marker_abundance_table.txt\n\n"
+
+ "* Finally, to obtain all markers present for a specific clade and all its subclades, the \n"
+ " '-t clade_specific_strain_tracker' should be used. For example, the following command\n"
+ " reports the presence/absence of the markers for the B. fragilis species and its strains;\n"
+ " the optional argument --min_ab specifies the minimum clade abundance for reporting the markers\n\n"
+ "$ metaphlan2.py -t clade_specific_strain_tracker --clade s__Bacteroides_fragilis metagenome_outfmt.bz2 --input_type bowtie2out > marker_abundance_table.txt\n"
+
+ "\n------------------------------------------------------------------- \n\n"
+ "",
+ formatter_class=ap.RawTextHelpFormatter,
+ add_help=False )
+ arg = p.add_argument
+
+ arg( 'inp', metavar='INPUT_FILE', type=str, nargs='?', default=None, help=
+ "the input file can be:\n"
+ "* a fastq file containing metagenomic reads\n"
+ "OR\n"
+ "* a BowTie2 produced SAM file. \n"
+ "OR\n"
+ "* an intermediary mapping file of the metagenome generated by a previous MetaPhlAn run \n"
+ "If the input file is missing, the script assumes that the input is provided using the standard \n"
+ "input, or named pipes.\n"
+ "IMPORTANT: the type of input needs to be specified with --input_type" )
+
+ arg( 'output', metavar='OUTPUT_FILE', type=str, nargs='?', default=None,
+ help= "the tab-separated output file of the predicted taxon relative abundances \n"
+ "[stdout if not present]")
+
+
+ g = p.add_argument_group('Required arguments')
+ arg = g.add_argument
+ input_type_choices = ['fastq','fasta','multifasta','multifastq','bowtie2out','sam'] # !!!!
+ arg( '--input_type', choices=input_type_choices, required=True, help =
+ "set whether the input is the fasta/fastq file of metagenomic reads, \n"
+ "the SAM file of the mapping of the reads against the MetaPhlAn db, or the \n"
+ "bowtie2out file saved by a previous run [required; the format is not guessed]\n" )
+
+ g = p.add_argument_group('Mapping arguments')
+ arg = g.add_argument
+ arg( '--mpa_pkl', type=str,
+ default=os.path.join(metaphlan2_script_install_folder,"db_v20","mpa_v20_m200.pkl"),
+ help = "the metadata pickled MetaPhlAn file")
+ arg( '--bowtie2db', metavar="METAPHLAN_BOWTIE2_DB", type=str,
+ default = os.path.join(metaphlan2_script_install_folder,"db_v20","mpa_v20_m200"),
+ help = "The BowTie2 database file of the MetaPhlAn database. \n"
+ "Used if --input_type is fastq, fasta, multifasta, or multifastq")
+ bt2ps = ['sensitive','very-sensitive','sensitive-local','very-sensitive-local']
+ arg( '--bt2_ps', metavar="BowTie2 presets", default='very-sensitive', choices=bt2ps,
+ help = "preset options for BowTie2 (applied only when a multifasta file is provided)\n"
+ "The choices enabled in MetaPhlAn are:\n"
+ " * sensitive\n"
+ " * very-sensitive\n"
+ " * sensitive-local\n"
+ " * very-sensitive-local\n"
+ "[default very-sensitive]\n" )
+ arg( '--bowtie2_exe', type=str, default = None, help =
+ 'Full path and name of the BowTie2 executable. This option allows \n'
+ 'MetaPhlAn to reach the executable even when it is not in the system \n'
+ 'PATH or the system PATH is unreachable\n' )
+ arg( '--bowtie2out', metavar="FILE_NAME", type=str, default = None, help =
+ "The file for saving the output of BowTie2\n" )
+ arg( '--no_map', action='store_true', help=
+ "Avoid storing the --bowtie2out map file\n" )
+ arg( '--tmp_dir', metavar="", default=None, type=str, help =
+ "the folder used to store temporary files \n"
+ "[default is the OS dependent tmp dir]\n" )
+
+
+ g = p.add_argument_group('Post-mapping arguments')
+ arg = g.add_argument
+ stat_choices = ['avg_g','avg_l','tavg_g','tavg_l','wavg_g','wavg_l','med']
+ arg( '--tax_lev', metavar='TAXONOMIC_LEVEL', type=str,
+ choices='a'+tax_units, default='a', help =
+ "The taxonomic level for the relative abundance output:\n"
+ "'a' : all taxonomic levels\n"
+ "'k' : kingdoms\n"
+ "'p' : phyla only\n"
+ "'c' : classes only\n"
+ "'o' : orders only\n"
+ "'f' : families only\n"
+ "'g' : genera only\n"
+ "'s' : species only\n"
+ "[default 'a']" )
+ arg( '--min_cu_len', metavar="", default=2000, type=int, help =
+ "minimum total nucleotide length for the markers in a clade for\n"
+ "estimating the abundance without considering sub-clade abundances\n"
+ "[default 2000]\n" )
+ arg( '--min_alignment_len', metavar="", default=None, type=int, help =
+ "The sam records for aligned reads with the longest subalignment\n"
+ "length smaller than this threshold will be discarded.\n"
+ "[default None]\n" )
+ arg( '--ignore_viruses', action='store_true', help=
+ "Do not profile viral organisms" )
+ arg( '--ignore_eukaryotes', action='store_true', help=
+ "Do not profile eukaryotic organisms" )
+ arg( '--ignore_bacteria', action='store_true', help=
+ "Do not profile bacterial organisms" )
+ arg( '--ignore_archaea', action='store_true', help=
+ "Do not profile archaeal organisms" )
+ arg( '--stat_q', metavar="", type = float, default=0.1, help =
+ "Quantile value for the robust average\n"
+ "[default 0.1]" )
+ arg( '--ignore_markers', type=str, default = None, help =
+ "File containing a list of markers to ignore. \n")
+ arg( '--avoid_disqm', action="store_true", help =
+ "Deactivate the procedure of disambiguating the quasi-markers based on the \n"
+ "marker abundance pattern found in the sample. It is generally recommended \n"
+ "to keep the disambiguation procedure in order to minimize false positives\n")
+ arg( '--stat', metavar="", choices=stat_choices, default="tavg_g", type=str, help =
+ "EXPERIMENTAL! Statistical approach for converting marker abundances into clade abundances\n"
+ "'avg_g' : clade global (i.e. normalizing all markers together) average\n"
+ "'avg_l' : average of length-normalized marker counts\n"
+ "'tavg_g' : truncated clade global average at --stat_q quantile\n"
+ "'tavg_l' : truncated average of length-normalized marker counts (at --stat_q)\n"
+ "'wavg_g' : winsorized clade global average (at --stat_q)\n"
+ "'wavg_l' : winsorized average of length-normalized marker counts (at --stat_q)\n"
+ "'med' : median of length-normalized marker counts\n"
+ "[default tavg_g]" )
+
+ arg = p.add_argument
+
+
+
+ g = p.add_argument_group('Additional analysis types and arguments')
+ arg = g.add_argument
+ analysis_types = ['rel_ab', 'rel_ab_w_read_stats', 'reads_map', 'clade_profiles', 'marker_ab_table', 'marker_counts', 'marker_pres_table', 'clade_specific_strain_tracker']
+ arg( '-t', metavar='ANALYSIS TYPE', type=str, choices = analysis_types,
+ default='rel_ab', help =
+ "Type of analysis to perform: \n"
+ " * rel_ab: profiling a metagenome in terms of relative abundances\n"
+ " * rel_ab_w_read_stats: profiling a metagenome in terms of relative abundances and estimating the number of reads coming from each clade.\n"
+ " * reads_map: mapping from reads to clades (only reads hitting a marker)\n"
+ " * clade_profiles: normalized marker counts for clades with at least one non-null marker\n"
+ " * marker_ab_table: normalized marker counts (only when > 0.0 and normalized by metagenome size if --nreads is specified)\n"
+ " * marker_counts: non-normalized marker counts [use with extreme caution]\n"
+ " * marker_pres_table: list of markers present in the sample (threshold at 1.0 if not differently specified with --pres_th)\n"
+ "[default 'rel_ab']" )
+ arg( '--nreads', metavar="NUMBER_OF_READS", type=int, default = None, help =
+ "The total number of reads in the original metagenome. It is used only when \n"
+ "-t marker_ab_table is specified for normalizing the length-normalized counts \n"
+ "with the metagenome size as well. No normalization applied if --nreads is not \n"
+ "specified" )
+ arg( '--pres_th', metavar="PRESENCE_THRESHOLD", type=float, default = 1.0, help =
+ 'Threshold for calling a marker present by the -t marker_pres_table option' )
+ arg( '--clade', metavar="", default=None, type=str, help =
+ "The clade for clade_specific_strain_tracker analysis\n" )
+ arg( '--min_ab', metavar="", default=0.1, type=float, help =
+ "The minimum percentage abundance for the clade in the clade_specific_strain_tracker analysis\n" )
+ arg( "-h", "--help", action="help", help="show this help message and exit")
+
+ g = p.add_argument_group('Output arguments')
+ arg = g.add_argument
+ arg( '-o', '--output_file', metavar="output file", type=str, default=None, help =
+ "The output file (if not specified as positional argument)\n")
+ arg('--sample_id_key', metavar="name", type=str, default="#SampleID",
+ help =("Specify the sample ID key for this analysis."
+ " Defaults to '#SampleID'."))
+ arg('--sample_id', metavar="value", type=str,
+ default="Metaphlan2_Analysis",
+ help =("Specify the sample ID for this analysis."
+ " Defaults to 'Metaphlan2_Analysis'."))
+ arg( '-s', '--samout', metavar="sam_output_file",
+ type=str, default=None, help="The sam output file\n")
+ #*************************************************************
+ #* Parameters related to biom file generation *
+ #*************************************************************
+ arg( '--biom', '--biom_output_file', metavar="biom_output", type=str, default=None, help =
+ "If requesting biom file output: The name of the output file in biom format \n")
+
+ arg( '--mdelim', '--metadata_delimiter_char', metavar="mdelim", type=str, default="|", help =
+ "Delimiter for bug metadata: - defaults to pipe. e.g. the pipe in k__Bacteria|p__Proteobacteria \n")
+ #*************************************************************
+ #* End parameters related to biom file generation *
+ #*************************************************************
+
+ g = p.add_argument_group('Other arguments')
+ arg = g.add_argument
+ arg( '--nproc', metavar="N", type=int, default=1, help =
+ "The number of CPUs to use for parallelizing the mapping\n"
+ "[default 1, i.e. no parallelism]\n" )
+ arg( '-v','--version', action='version', version="MetaPhlAn version "+__version__+"\t("+__date__+")",
+ help="Prints the current MetaPhlAn version and exits\n" )
+
+
+ return vars(p.parse_args())
+
+def run_bowtie2( fna_in, outfmt6_out, bowtie2_db, preset, nproc,
+ file_format = "multifasta", exe = None,
+ samout = None,
+ min_alignment_len = None,
+ ):
+ try:
+ if not fna_in: # or stat.S_ISFIFO(os.stat(fna_in).st_mode):
+ fna_in = "-"
+ bowtie2_cmd = [ exe if exe else 'bowtie2',
+ "--quiet", "--no-unal",
+ "--"+preset,
+ "-S","-",
+ "-x", bowtie2_db,
+ ] + ([] if int(nproc) < 2 else ["-p",str(nproc)])
+ bowtie2_cmd += ["-U", fna_in] # if not stat.S_ISFIFO(os.stat(fna_in).st_mode) else []
+ bowtie2_cmd += (["-f"] if file_format == "multifasta" else [])
+ p = subp.Popen( bowtie2_cmd, stdout=subp.PIPE )
+ lmybytes, outf = (mybytes,bz2.BZ2File(outfmt6_out, "w")) if outfmt6_out.endswith(".bz2") else (str,open( outfmt6_out, "w" ))
+
+ try:
+ if samout:
+ if samout[-4:] == '.bz2':
+ sam_file = bz2.BZ2File(samout, 'w')
+ else:
+ sam_file = open(samout, 'w')
+ except IOError:
+ sys.stderr.write( "IOError: Unable to open sam output file.\n" )
+ sys.exit(1)
+
+ for line in p.stdout:
+ if samout:
+ sam_file.write(line)
+ if not line.startswith(mybytes('@')): # bytes-safe header check: works for str (Python 2) and bytes (Python 3)
+ o = read_and_split_line(line)
+ if o[2][-1] != '*':
+ if min_alignment_len == None\
+ or max([int(x.strip('M')) for x in\
+ re.findall(r'(\d*M)', o[5])]) >= min_alignment_len:
+ outf.write( lmybytes("\t".join([o[0],o[2]]) +"\n") )
+ #if float(sys.version_info[0]) >= 3:
+ # for o in read_and_split(p.stdout):
+ # if o[2][-1] != '*':
+ # outf.write( bytes("\t".join([o[0],o[2]]) +"\n",encoding='utf-8') )
+ #else:
+ # for o in read_and_split(p.stdout):
+ # if o[2][-1] != '*':
+ # outf.write( "\t".join([o[0],o[2]]) +"\n" )
+ outf.close()
+ if samout:
+ sam_file.close()
+ p.wait()
+
+
+ except OSError:
+ sys.stderr.write( "OSError: fatal error running BowTie2. Is BowTie2 in the system path?\n" )
+ sys.exit(1)
+ except ValueError:
+ sys.stderr.write( "ValueError: fatal error running BowTie2.\n" )
+ sys.exit(1)
+ except IOError:
+ sys.stderr.write( "IOError: fatal error running BowTie2.\n" )
+ sys.exit(1)
+ if p.returncode == 13:
+ sys.stderr.write( "Permission Denied Error: fatal error running BowTie2. "
+ "Is the BowTie2 file in the path with execution and read permissions?\n" )
+ sys.exit(1)
+ elif p.returncode != 0:
+ sys.stderr.write("Error while running bowtie2.\n")
+ sys.exit(1)
+
+#def guess_input_format( inp_file ):
+# if "," in inp_file:
+# sys.stderr.write( "Sorry, I cannot guess the format of the input, when "
+# "more than one file is specified. Please set the --input_type parameter \n" )
+# sys.exit(1)
+#
+# with open( inp_file ) as inpf:
+# for i,l in enumerate(inpf):
+# line = l.strip()
+# if line[0] == '#': continue
+# if line[0] == '>': return 'multifasta'
+# if line[0] == '@': return 'multifastq'
+# if len(l.split('\t')) == 2: return 'bowtie2out'
+# if i > 20: break
+# return None
+
+class TaxClade:
+ min_cu_len = -1
+ markers2lens = None
+ stat = None
+ quantile = None
+ avoid_disqm = False
+
+ def __init__( self, name, uncl = False, id_int = 0 ):
+ self.children, self.markers2nreads = {}, {}
+ self.name, self.father = name, None
+ self.uncl, self.subcl_uncl = uncl, False
+ self.abundance, self.uncl_abundance = None, 0
+ self.id = id_int
+
+ def add_child( self, name, id_int ):
+ new_clade = TaxClade( name, id_int=id_int )
+ self.children[name] = new_clade
+ new_clade.father = self
+ return new_clade
+
+
+ def get_terminals( self ):
+ terms = []
+ if not self.children:
+ return [self]
+ for c in self.children.values():
+ terms += c.get_terminals()
+ return terms
+
+
+ def get_full_name( self ):
+ fullname = [self.name]
+ cl = self.father
+ while cl:
+ fullname = [cl.name] + fullname
+ cl = cl.father
+ return "|".join(fullname[1:])
+
+ def get_normalized_counts( self ):
+ return [(m,float(n)*1000.0/self.markers2lens[m])
+ for m,n in self.markers2nreads.items()]
+
+ def compute_abundance( self ):
+ if self.abundance is not None: return self.abundance
+ sum_ab = sum([c.compute_abundance() for c in self.children.values()])
+ rat_nreads = sorted([(self.markers2lens[m],n)
+ for m,n in self.markers2nreads.items()],
+ key = lambda x: x[1])
+
+ rat_nreads, removed = [], []
+ for m,n in self.markers2nreads.items():
+ misidentified = False
+
+ if not self.avoid_disqm:
+ for e in self.markers2exts[m]:
+ toclade = self.taxa2clades[e]
+ m2nr = toclade.markers2nreads
+ tocladetmp = toclade
+ while len(tocladetmp.children) == 1:
+ tocladetmp = list(tocladetmp.children.values())[0]
+ m2nr = tocladetmp.markers2nreads
+
+ nonzeros = sum([v>0 for v in m2nr.values()])
+ if len(m2nr):
+ if float(nonzeros) / len(m2nr) > 0.33:
+ misidentified = True
+ removed.append( (self.markers2lens[m],n) )
+ break
+ if not misidentified:
+ rat_nreads.append( (self.markers2lens[m],n) )
+
+ if not self.avoid_disqm and len(removed):
+ n_rat_nreads = float(len(rat_nreads))
+ n_removed = float(len(removed))
+ n_tot = n_rat_nreads + n_removed
+ n_ripr = 10
+
+ if len(self.get_terminals()) < 2:
+ n_ripr = 0
+
+ if "k__Viruses" in self.get_full_name():
+ n_ripr = 0
+
+ if n_rat_nreads < n_ripr and n_tot > n_rat_nreads:
+ rat_nreads += removed[:n_ripr-int(n_rat_nreads)]
+
+
+ rat_nreads = sorted(rat_nreads, key = lambda x: x[1])
+
+ rat_v,nreads_v = zip(*rat_nreads) if rat_nreads else ([],[])
+ rat, nrawreads, loc_ab = float(sum(rat_v)) or -1.0, sum(nreads_v), 0.0
+ quant = int(self.quantile*len(rat_nreads))
+ ql,qr,qn = (quant,-quant,quant) if quant else (None,None,0)
+
+ if self.name[0] == 't' and (len(self.father.children) > 1 or "_sp" in self.father.name or "k__Viruses" in self.get_full_name()):
+ non_zeros = float(len([n for r,n in rat_nreads if n > 0]))
+ nreads = float(len(rat_nreads))
+ if nreads == 0.0 or non_zeros / nreads < 0.7:
+ self.abundance = 0.0
+ return 0.0
+
+ if rat < 0.0:
+ pass
+ elif self.stat == 'avg_g' or (not qn and self.stat in ['wavg_g','tavg_g']):
+ loc_ab = nrawreads / rat if rat >= 0 else 0.0
+ elif self.stat == 'avg_l' or (not qn and self.stat in ['wavg_l','tavg_l']):
+ loc_ab = np.mean([float(n)/r for r,n in rat_nreads])
+ elif self.stat == 'tavg_g':
+ wnreads = sorted([(float(n)/r,r,n) for r,n in rat_nreads], key=lambda x:x[0])
+ den,num = zip(*[v[1:] for v in wnreads[ql:qr]])
+ loc_ab = float(sum(num))/float(sum(den)) if any(den) else 0.0
+ elif self.stat == 'tavg_l':
+ loc_ab = np.mean(sorted([float(n)/r for r,n in rat_nreads])[ql:qr])
+ elif self.stat == 'wavg_g':
+ vmin, vmax = nreads_v[ql], nreads_v[qr]
+ wnreads = [vmin]*qn+list(nreads_v[ql:qr])+[vmax]*qn
+ loc_ab = float(sum(wnreads)) / rat
+ elif self.stat == 'wavg_l':
+ wnreads = sorted([float(n)/r for r,n in rat_nreads])
+ vmin, vmax = wnreads[ql], wnreads[qr]
+ wnreads = [vmin]*qn+list(wnreads[ql:qr])+[vmax]*qn
+ loc_ab = np.mean(wnreads)
+ elif self.stat == 'med':
+ loc_ab = np.median(sorted([float(n)/r for r,n in rat_nreads])[ql:qr])
+
+ self.abundance = loc_ab
+ if rat < self.min_cu_len and self.children:
+ self.abundance = sum_ab
+ elif loc_ab < sum_ab:
+ self.abundance = sum_ab
+
+ if self.abundance > sum_ab and self.children: # *1.1??
+ self.uncl_abundance = self.abundance - sum_ab
+ self.subcl_uncl = not self.children and self.name[0] not in tax_units[-2:]
+
+ return self.abundance
+
+ def get_all_abundances( self ):
+ ret = [(self.name,self.abundance)]
+ if self.uncl_abundance > 0.0:
+ lchild = list(self.children.values())[0].name[:3]
+ ret += [(lchild+self.name[3:]+"_unclassified",self.uncl_abundance)]
+ if self.subcl_uncl and self.name[0] != tax_units[-2]:
+ cind = tax_units.index( self.name[0] )
+ ret += [( tax_units[cind+1]+self.name[1:]+"_unclassified",
+ self.abundance)]
+ for c in self.children.values():
+ ret += c.get_all_abundances()
+ return ret
+
+
+class TaxTree:
+ def __init__( self, mpa, markers_to_ignore = None ): #, min_cu_len ):
+ self.root = TaxClade( "root" )
+ self.all_clades, self.markers2lens, self.markers2clades, self.taxa2clades, self.markers2exts = {}, {}, {}, {}, {}
+ TaxClade.markers2lens = self.markers2lens
+ TaxClade.markers2exts = self.markers2exts
+ TaxClade.taxa2clades = self.taxa2clades
+ self.id_gen = itertools.count(1)
+
+ clades_txt = ((l.strip().split("|"),n) for l,n in mpa['taxonomy'].items())
+ for clade,lenc in clades_txt:
+ father = self.root
+ for clade_lev in clade: # !!!!! [:-1]:
+ if not clade_lev in father.children:
+ father.add_child( clade_lev, id_int=next(self.id_gen) )
+ self.all_clades[clade_lev] = father.children[clade_lev]
+ if clade_lev[0] == "t":
+ self.taxa2clades[clade_lev[3:]] = father
+
+ father = father.children[clade_lev]
+ if clade_lev[0] == "t":
+ father.glen = lenc
+
+ def add_lens( node ):
+ if not node.children:
+ return node.glen
+ lens = []
+ for c in node.children.values():
+ lens.append( add_lens( c ) )
+ node.glen = sum(lens) / len(lens)
+ return node.glen
+ add_lens( self.root )
+
+ for k,p in mpa['markers'].items():
+ if k in markers_to_exclude:
+ continue
+ if markers_to_ignore and k in markers_to_ignore:
+ continue
+ self.markers2lens[k] = p['len']
+ self.markers2clades[k] = p['clade']
+ self.add_reads( k, 0 )
+ self.markers2exts[k] = p['ext']
+
+ def set_min_cu_len( self, min_cu_len ):
+ TaxClade.min_cu_len = min_cu_len
+
+ def set_stat( self, stat, quantile, avoid_disqm = False ):
+ TaxClade.stat = stat
+ TaxClade.quantile = quantile
+ TaxClade.avoid_disqm = avoid_disqm
+
+ def add_reads( self, marker, n,
+ ignore_viruses = False, ignore_eukaryotes = False,
+ ignore_bacteria = False, ignore_archaea = False ):
+ clade = self.markers2clades[marker]
+ cl = self.all_clades[clade]
+ if ignore_viruses or ignore_eukaryotes or ignore_bacteria or ignore_archaea:
+ cn = cl.get_full_name()
+ if ignore_viruses and cn.startswith("k__Viruses"):
+ return ""
+ if ignore_eukaryotes and cn.startswith("k__Eukaryota"):
+ return ""
+ if ignore_archaea and cn.startswith("k__Archaea"):
+ return ""
+ if ignore_bacteria and cn.startswith("k__Bacteria"):
+ return ""
+ while len(cl.children) == 1:
+ cl = list(cl.children.values())[0]
+ cl.markers2nreads[marker] = n
+ return cl.get_full_name()
+
+
+ def markers2counts( self ):
+ m2c = {}
+ for k,v in self.all_clades.items():
+ for m,c in v.markers2nreads.items():
+ m2c[m] = c
+ return m2c
+
+ def clade_profiles( self, tax_lev, get_all = False ):
+ cl2pr = {}
+ for k,v in self.all_clades.items():
+ if tax_lev and not k.startswith(tax_lev):
+ continue
+ prof = v.get_normalized_counts()
+ if not get_all and ( len(prof) < 1 or not sum([p[1] for p in prof]) > 0.0 ):
+ continue
+ cl2pr[v.get_full_name()] = prof
+ return cl2pr
+
+ def relative_abundances( self, tax_lev ):
+ cl2ab_n = dict([(k,v) for k,v in self.all_clades.items()
+ if k.startswith("k__") and not v.uncl])
+
+ cl2ab, cl2glen, tot_ab = {}, {}, 0.0
+ for k,v in cl2ab_n.items():
+ tot_ab += v.compute_abundance()
+
+ for k,v in cl2ab_n.items():
+ for cl,ab in v.get_all_abundances():
+ if not tax_lev:
+ if cl not in self.all_clades:
+ to = tax_units.index(cl[0])
+ t = tax_units[to-1]
+ cl = t + cl.split("_unclassified")[0][1:]
+ cl = self.all_clades[cl].get_full_name()
+ spl = cl.split("|")
+ cl = "|".join(spl+[tax_units[to]+spl[-1][1:]+"_unclassified"])
+ glen = self.all_clades[spl[-1]].glen
+ else:
+ glen = self.all_clades[cl].glen
+ cl = self.all_clades[cl].get_full_name()
+ elif not cl.startswith(tax_lev):
+ if cl in self.all_clades:
+ glen = self.all_clades[cl].glen
+ else:
+ glen = 1.0
+ continue
+ cl2ab[cl] = ab
+ cl2glen[cl] = glen
+
+ ret_d = dict([( k, float(v) / tot_ab if tot_ab else 0.0) for k,v in cl2ab.items()])
+ ret_r = dict([( k, (v,cl2glen[k],float(v)*cl2glen[k])) for k,v in cl2ab.items()])
+ #ret_r = dict([( k, float(v) / tot_ab if tot_ab else 0.0) for k,v in cl2ab.items()])
+ if tax_lev:
+ ret_d[tax_lev+"unclassified"] = 1.0 - sum(ret_d.values())
+ return ret_d, ret_r
+
+def map2bbh( mapping_f, input_type = 'bowtie2out', min_alignment_len = None):
+ if not mapping_f:
+ ras, ras_line, inpf = plain_read_and_split, plain_read_and_split_line, sys.stdin
+ else:
+ if mapping_f.endswith(".bz2"):
+ ras, ras_line, inpf = read_and_split, read_and_split_line, bz2.BZ2File( mapping_f, "r" )
+ else:
+ ras, ras_line, inpf = plain_read_and_split,\
+ plain_read_and_split_line,\
+ open( mapping_f )
+
+ reads2markers, reads2maxb = {}, {}
+ if input_type == 'bowtie2out':
+ for r,c in ras(inpf):
+ reads2markers[r] = c
+ elif input_type == 'sam':
+ for line in inpf:
+ o = ras_line(line)
+ if o[0][0] != '@' and o[2][-1] != '*':
+ if min_alignment_len == None\
+ or max([int(x.strip('M')) for x in\
+ re.findall(r'(\d*M)', o[5])]) >= min_alignment_len:
+ reads2markers[o[0]] = o[2]
+ inpf.close()
+
+ markers2reads = defdict( set )
+ for r,m in reads2markers.items():
+ markers2reads[m].add( r )
+
+ return markers2reads
+
+
+def maybe_generate_biom_file(pars, abundance_predictions):
+ if not pars['biom']:
+ return None
+ if not abundance_predictions:
+ return open(pars['biom'], 'w').close()
+
+ delimiter = "|" if len(pars['mdelim']) > 1 else pars['mdelim']
+ def istip(clade_name):
+ end_name = clade_name.split(delimiter)[-1]
+ return end_name.startswith("t__") or end_name.endswith("_unclassified")
+
+ def findclade(clade_name):
+ if clade_name.endswith('_unclassified'):
+ name = clade_name.split(delimiter)[-2]
+ else:
+ name = clade_name.split(delimiter)[-1]
+ return tree.all_clades[name]
+
+ def to_biomformat(clade_name):
+ return { 'taxonomy': clade_name.split(delimiter) }
+
+ clades = iter( (abundance, findclade(name))
+ for (name, abundance) in abundance_predictions
+ if istip(name) )
+ packed = iter( ([abundance], clade.get_full_name(), clade.id)
+ for (abundance, clade) in clades )
+
+ #unpack that tuple here to stay under 80 chars on a line
+ data, clade_names, clade_ids = zip(*packed)
+ # biom likes column vectors, so we give it an array like this:
+ # np.array([a],[b],[c])
+ data = np.array(data)
+ sample_ids = [pars['sample_id']]
+ table_id='MetaPhlAn2_Analysis'
+ json_key = "MetaPhlAn2"
+
+ if LooseVersion(biom.__version__) < LooseVersion("2.0.0"):
+ biom_table = biom.table.table_factory(
+ data, sample_ids, clade_ids,
+ sample_metadata = None,
+ observation_metadata = map(to_biomformat, clade_names),
+ table_id = table_id,
+ constructor = biom.table.DenseOTUTable
+ )
+ with open(pars['biom'], 'w') as outfile:
+ json.dump( biom_table.getBiomFormatObject(json_key),
+ outfile )
+ else: # Below is the biom2 compatible code
+ biom_table = biom.table.Table(
+ data, clade_ids, sample_ids,
+ sample_metadata = None,
+ observation_metadata = list(map(to_biomformat, clade_names)),
+ table_id = table_id,
+ input_is_dense = True
+ )
+
+ with open(pars['biom'], 'w') as outfile:
+ biom_table.to_json( json_key,
+ direct_io = outfile )
+
+ return True
+
+
+if __name__ == '__main__':
+ pars = read_params( sys.argv )
+ #if pars['inp'] is None and ( pars['input_type'] is None or pars['input_type'] == 'automatic'):
+ # sys.stderr.write( "The --input_type parameter needs to be specified when the "
+ # "input is provided from the standard input.\n"
+ # "Type metaphlan.py -h for more info\n")
+ # sys.exit(0)
+
+ if pars['bt2_ps'] in [
+ "sensitive-local",
+ "very-sensitive-local"
+ ]\
+ and pars['min_alignment_len'] == None:
+ pars['min_alignment_len'] = 100
+ sys.stderr.write('Warning! bt2_ps is set to local mode, '\
+ 'and min_alignment_len is None, '
+ 'I automatically set min_alignment_len to 100! '\
+ 'If you do not like it, rerun the command and set '\
+ 'min_alignment_len to a specific value.\n'
+ )
+
+ if pars['input_type'] == 'fastq':
+ pars['input_type'] = 'multifastq'
+ if pars['input_type'] == 'fasta':
+ pars['input_type'] = 'multifasta'
+
+ #if pars['input_type'] == 'automatic':
+ # pars['input_type'] = guess_input_format( pars['inp'] )
+ # if not pars['input_type']:
+ # sys.stderr.write( "Sorry, I cannot guess the format of the input file, please "
+ # "specify the --input_type parameter \n" )
+ # sys.exit(1)
+
+ # check for the mpa_pkl file
+ if not os.path.isfile(pars['mpa_pkl']):
+ sys.stderr.write("Error: Unable to find the mpa_pkl file at: " + pars['mpa_pkl'] +
+ "\nExpecting location ${mpa_dir}/db_v20/mpa_v20_m200.pkl "
+ "\nSelect the file location with the option --mpa_pkl.\n"
+ "Exiting...\n\n")
+ sys.exit(1)
+
+ if pars['ignore_markers']:
+ with open(pars['ignore_markers']) as ignv:
+ ignore_markers = set([l.strip() for l in ignv])
+ else:
+ ignore_markers = set()
+
+ no_map = False
+ if pars['input_type'] == 'multifasta' or pars['input_type'] == 'multifastq':
+ bow = pars['bowtie2db'] is not None
+ if not bow:
+ sys.stderr.write( "No MetaPhlAn BowTie2 database provided\n "
+ "[--bowtie2db options]!\n"
+ "Exiting...\n\n" )
+ sys.exit(1)
+ if pars['no_map']:
+ pars['bowtie2out'] = tf.NamedTemporaryFile(dir=pars['tmp_dir']).name
+ no_map = True
+ else:
+ if bow and not pars['bowtie2out']:
+ if pars['inp'] and "," in pars['inp']:
+ sys.stderr.write( "Error! --bowtie2out needs to be specified when multiple "
+ "fastq or fasta files (comma separated) are provided\n" )
+ sys.exit(1)
+ fname = pars['inp']
+ if fname is None:
+ fname = "stdin_map"
+ elif stat.S_ISFIFO(os.stat(fname).st_mode):
+ fname = "fifo_map"
+ pars['bowtie2out'] = fname + ".bowtie2out.txt"
+
+ if os.path.exists( pars['bowtie2out'] ):
+ sys.stderr.write(
+ "BowTie2 output file detected: " + pars['bowtie2out'] + "\n"
+ "Please use it as input or remove it if you want to "
+ "re-perform the BowTie2 run.\n"
+ "Exiting...\n\n" )
+ sys.exit(1)
+
+ if bow and not all([os.path.exists(".".join([str(pars['bowtie2db']),p]))
+ for p in ["1.bt2", "2.bt2", "3.bt2","4.bt2","rev.1.bt2","rev.2.bt2"]]):
+ sys.stderr.write( "No MetaPhlAn BowTie2 database found "
+ "[--bowtie2db option]! "
+ "(or wrong path provided)."
+ "\nExpecting location ${mpa_dir}/db_v20/mpa_v20_m200 "
+ "\nExiting... " )
+ sys.exit(1)
+
+ if bow:
+ run_bowtie2( pars['inp'], pars['bowtie2out'], pars['bowtie2db'],
+ pars['bt2_ps'], pars['nproc'], file_format = pars['input_type'],
+ exe = pars['bowtie2_exe'],
+ samout = pars['samout'],
+ min_alignment_len = pars['min_alignment_len'])
+ pars['input_type'] = 'bowtie2out'
+
+ pars['inp'] = pars['bowtie2out'] # !!!
+
+ with open( pars['mpa_pkl'], 'rb' ) as a:
+ mpa_pkl = pickle.loads( bz2.decompress( a.read() ) )
+
+ tree = TaxTree( mpa_pkl, ignore_markers )
+ tree.set_min_cu_len( pars['min_cu_len'] )
+ tree.set_stat( pars['stat'], pars['stat_q'], pars['avoid_disqm'] )
+
+ markers2reads = map2bbh(
+ pars['inp'],
+ pars['input_type'],
+ pars['min_alignment_len']
+ )
+ if no_map:
+ os.remove( pars['inp'] )
+
+ map_out = []
+ for marker,reads in markers2reads.items():
+ if marker not in tree.markers2lens:
+ continue
+ tax_seq = tree.add_reads( marker, len(reads),
+ ignore_viruses = pars['ignore_viruses'],
+ ignore_eukaryotes = pars['ignore_eukaryotes'],
+ ignore_bacteria = pars['ignore_bacteria'],
+ ignore_archaea = pars['ignore_archaea'],
+ )
+ if tax_seq:
+ map_out +=["\t".join([r,tax_seq]) for r in reads]
+
+ if pars['output'] is None and pars['output_file'] is not None:
+ pars['output'] = pars['output_file']
+
+ with (open(pars['output'],"w") if pars['output'] else sys.stdout) as outf:
+ outf.write('\t'.join((pars["sample_id_key"], pars["sample_id"])) + '\n')
+ if pars['t'] == 'reads_map':
+ outf.write( "\n".join( map_out ) + "\n" )
+ elif pars['t'] == 'rel_ab':
+ cl2ab, _ = tree.relative_abundances(
+ pars['tax_lev']+"__" if pars['tax_lev'] != 'a' else None )
+ outpred = [(k,round(v*100.0,5)) for k,v in cl2ab.items() if v > 0.0]
+ if outpred:
+ for k,v in sorted( outpred, reverse=True,
+ key=lambda x:x[1]+(100.0*(8-x[0].count("|"))) ):
+ outf.write( "\t".join( [k,str(v)] ) + "\n" )
+ else:
+ outf.write( "unclassified\t100.0\n" )
+ maybe_generate_biom_file(pars, outpred)
+ elif pars['t'] == 'rel_ab_w_read_stats':
+ cl2ab, rr = tree.relative_abundances(
+ pars['tax_lev']+"__" if pars['tax_lev'] != 'a' else None )
+ outpred = [(k,round(v*100.0,5)) for k,v in cl2ab.items() if v > 0.0]
+ totl = 0
+ if outpred:
+ outf.write( "\t".join( [ "#clade_name",
+ "relative_abundance",
+ "coverage",
+ "average_genome_length_in_the_clade",
+ "estimated_number_of_reads_from_the_clade" ]) +"\n" )
+
+ for k,v in sorted( outpred, reverse=True,
+ key=lambda x:x[1]+(100.0*(8-x[0].count("|"))) ):
+ outf.write( "\t".join( [ k,
+ str(v),
+ str(rr[k][0]) if k in rr else "-",
+ str(rr[k][1]) if k in rr else "-",
+ str(int(round(rr[k][2],0)) if k in rr else "-")
+ ] ) + "\n" )
+ if "|" not in k:
+ totl += (int(round(rr[k][2],0)) if k in rr else 0)
+
+ outf.write( "#estimated total number of reads from known clades: " + str(totl)+"\n")
+ else:
+ outf.write( "unclassified\t100.0\n" )
+ maybe_generate_biom_file(pars, outpred)
+
+ elif pars['t'] == 'clade_profiles':
+ cl2pr = tree.clade_profiles( pars['tax_lev']+"__" if pars['tax_lev'] != 'a' else None )
+ for c,p in cl2pr.items():
+ mn,n = zip(*p)
+ outf.write( "\t".join( [""]+[str(s) for s in mn] ) + "\n" )
+ outf.write( "\t".join( [c]+[str(s) for s in n] ) + "\n" )
+ elif pars['t'] == 'marker_ab_table':
+ cl2pr = tree.clade_profiles( pars['tax_lev']+"__" if pars['tax_lev'] != 'a' else None )
+ for v in cl2pr.values():
+ outf.write( "\n".join(["\t".join([str(a),str(b/float(pars['nreads'])) if pars['nreads'] else str(b)])
+ for a,b in v if b > 0.0]) + "\n" )
+ elif pars['t'] == 'marker_pres_table':
+ cl2pr = tree.clade_profiles( pars['tax_lev']+"__" if pars['tax_lev'] != 'a' else None )
+ for v in cl2pr.values():
+ strout = ["\t".join([str(a),"1"]) for a,b in v if b > pars['pres_th']]
+ if strout:
+ outf.write( "\n".join(strout) + "\n" )
+
+ elif pars['t'] == 'marker_counts':
+ outf.write( "\n".join( ["\t".join([m,str(c)]) for m,c in tree.markers2counts().items() ]) +"\n" )
+
+ elif pars['t'] == 'clade_specific_strain_tracker':
+ cl2pr = tree.clade_profiles( None, get_all = True )
+ cl2ab, _ = tree.relative_abundances( None )
+ strout = []
+ for cl,v in cl2pr.items():
+ if cl.endswith(pars['clade']) and cl2ab[cl]*100.0 < pars['min_ab']:
+ strout = []
+ break
+ if pars['clade'] in cl:
+ strout += ["\t".join([str(a),str(int(b > pars['pres_th']))]) for a,b in v]
+ if strout:
+ strout = sorted(strout,key=lambda x:x[0])
+ outf.write( "\n".join(strout) + "\n" )
+ else:
+ sys.stderr.write("Clade "+pars['clade']+" not present at an abundance >"+str(round(pars['min_ab'],2))+"%, "
+ "so no clade specific markers are reported\n")
diff --git a/strainphlan.py b/strainphlan.py
new file mode 100755
index 0000000..f0023a0
--- /dev/null
+++ b/strainphlan.py
@@ -0,0 +1,1538 @@
+#!/usr/bin/env python
+# Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+__author__ = 'Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '1.0.0'
+__date__ = '2nd August 2016'
+
+import sys
+import os
+import shutil
+ABS_PATH = os.path.abspath(sys.argv[0])
+MAIN_DIR = os.path.dirname(ABS_PATH)
+os.environ['PATH'] += ':' + MAIN_DIR
+os.environ['PATH'] += ':' + os.path.join(MAIN_DIR, 'strainphlan_src')
+sys.path.append(os.path.join(MAIN_DIR, 'strainphlan_src'))
+
+import which
+import argparse as ap
+import cPickle as pickle
+import msgpack
+import glob
+from mixed_utils import statistics
+import ooSubprocess
+from ooSubprocess import trace_unhandled_exceptions
+import bz2
+import gzip
+from collections import defaultdict
+from tempfile import SpooledTemporaryFile, NamedTemporaryFile
+from Bio import SeqIO, Seq, SeqRecord
+from Bio.Alphabet import IUPAC
+import pandas
+import logging
+import logging.config
+import sample2markers
+import copy
+import threading
+import numpy
+import random
+import gc
+#import ipdb
+
+shared_variables = type('shared_variables', (object,), {})
+
+logging.basicConfig(level=logging.DEBUG, stream=sys.stderr,
+ disable_existing_loggers=False,
+ format='%(asctime)s | %(levelname)s | %(name)s | %(funcName)s | %(lineno)d | %(message)s')
+logger = logging.getLogger(__name__)
+
+# get the directory that contains this script
+metaphlan2_script_install_folder=os.path.dirname(os.path.abspath(__file__))
+
+# functions
+def read_params():
+ p = ap.ArgumentParser()
+ p.add_argument(
+ '--ifn_samples',
+ nargs='+',
+ required=False,
+ default=[],
+ type=str,
+ help='The list of sample files (space separated). '\
+ 'The wildcard can also be used.')
+
+ p.add_argument(
+ '--ifn_second_samples',
+ nargs='+',
+ required=False,
+ default=[],
+ type=str,
+ help='The list of second sample files (space separated). '\
+ 'The wildcard can also be used. '\
+ 'Note that only the markers found in the samples or '\
+ 'reference genomes '
+ 'specified by --ifn_samples or --ifn_representative_sample '\
+ 'or --ifn_ref_genomes with '\
+ 'add_reference_genomes_as_second_samples=False '\
+ 'will be used to build the phylogenetic trees. '
+ )
+
+ p.add_argument(
+ '--ifn_representative_sample',
+ required=False,
+ default=None,
+ type=str,
+ help='The representative sample. The marker list of each species '\
+ 'extracted from this sample will be used for all other samples.')
+
+ p.add_argument(
+ '--mpa_pkl',
+ required=False,
+ default=os.path.join(metaphlan2_script_install_folder,"db_v20","mpa_v20_m200.pkl"),
+ type=str,
+ help='The database of metaphlan2.py.')
+
+ p.add_argument(
+ '--output_dir',
+ required=True,
+ default='strainer_output',
+ type=str,
+ help='The output directory.')
+
+ p.add_argument(
+ '--ifn_markers',
+ required=False,
+ default=None,
+ type=str,
+ help='The marker file in fasta format.')
+
+ p.add_argument(
+ '--nprocs_main',
+ required=False,
+ default=1,
+ type=int,
+ help='The number of processors used for the main threads. '\
+ 'Default 1.')
+
+ p.add_argument(
+ '--nprocs_load_samples',
+ required=False,
+ default=None,
+ type=int,
+ help='The number of processors used for loading samples. '\
+ 'Default nprocs_main.')
+
+ p.add_argument(
+ '--nprocs_align_clean',
+ required=False,
+ default=None,
+ type=int,
+ help='The number of processors used for aligning and cleaning markers. '\
+ 'Default nprocs_main.')
+
+ p.add_argument(
+ '--nprocs_raxml',
+ required=False,
+ default=None,
+ type=int,
+ help='The number of processors used for running raxml. '\
+ 'Default nprocs_main.')
+
+ p.add_argument(
+ '--bootstrap_raxml',
+ required=False,
+ default=0,
+ type=int,
+ help='The number of bootstrap runs when building the tree. '\
+ 'Default 0.')
+
+ p.add_argument(
+ '--ifn_ref_genomes',
+ nargs='+',
+ required=False,
+ default=None,
+ type=str,
+ help='The reference genome file names. They are separated by spaces.')
+
+ p.add_argument(
+ '--add_reference_genomes_as_second_samples',
+ required=False,
+ dest='add_reference_genomes_as_second_samples',
+ action='store_true',
+ help='Add reference genomes as second samples. '\
+ 'Default "False". ' \
+ 'Note that only the markers found in the samples or '\
+ 'reference genomes '
+ 'specified by --ifn_samples or --ifn_representative_sample '\
+ 'or --ifn_ref_genomes with '\
+ 'add_reference_genomes_as_second_samples=False '\
+ 'will be used to build the phylogenetic trees. '
+ )
+ p.set_defaults(add_reference_genomes_as_second_samples=False)
+
+ p.add_argument(
+ '--N_in_marker',
+ required=False,
+ default=0.2,
+ type=float,
+ help='The consensus markers with the rate of N nucleotides greater than '\
+ 'this threshold are removed. Default 0.2.')
+
+ p.add_argument(
+ '--marker_strip_length',
+ required=False,
+ default=50,
+ type=int,
+ help='The number of nucleotides to delete from each of the two ends '\
+ 'of a marker. Default 50.')
+
+ p.add_argument(
+ '--marker_in_clade',
+ required=False,
+ default=0.8,
+ type=float,
+ help='In each sample, the clades with the rate of present markers less than '\
+ 'this threshold are removed. Default 0.8.')
+
+ p.add_argument(
+ '--second_marker_in_clade',
+ required=False,
+ default=0.8,
+ type=float,
+ help='In each sample/reference genomes specified by --ifn_second_samples, '\
+ 'or --add_reference_genomes_as_second_samples, '\
+ 'the clades with the rate of present markers less than '\
+ 'this threshold are removed. Default 0.8.')
+
+ p.add_argument(
+ '--sample_in_clade',
+ required=False,
+ default=2,
+ type=int,
+ help='Only clades present in at least sample_in_clade samples '\
+ 'are kept. Default 2.')
+
+ p.add_argument(
+ '--sample_in_marker',
+ required=False,
+ default=0.8,
+ type=float,
+ help='If the percentage of samples in which a marker is present is '\
+ 'less than this threshold, that marker is removed. Default 0.8.')
+
+ p.add_argument(
+ '--gap_in_trailing_col',
+ required=False,
+ default=0.2,
+ type=float,
+ help='If the number of the trailing nucleotide columns in aligned '\
+ 'markers with the percentage of gaps greater than '\
+ 'gap_in_trailing_col is less than gap_trailing_col_limit, '\
+ 'these columns will be removed. '\
+ 'Default 0.2.')
+
+ p.add_argument(
+ '--gap_trailing_col_limit',
+ required=False,
+ default=101,
+ type=float,
+ help='If the number of the trailing nucleotide columns in aligned '\
+ 'markers with the percentage of gaps greater than '\
+ 'gap_in_trailing_col is less than gap_trailing_col_limit, '\
+ 'these columns will be removed. '\
+ 'Default 101.')
+
+ p.add_argument(
+ '--gap_in_internal_col',
+ required=False,
+ default=0.3,
+ type=float,
+ help='The internal nucleotide columns in aligned '\
+ 'markers with the percentage of gaps greater than '\
+ 'gap_in_internal_col will be removed. '\
+ 'Default 0.3.')
+
+ p.add_argument(
+ '--gap_in_sample',
+ required=False,
+ default=0.2,
+ type=float,
+ help='The samples with full sequences from all markers '\
+ 'and having the percentage of gaps greater than this threshold '\
+ 'will be removed. Default 0.2.')
+
+ p.add_argument(
+ '--second_gap_in_sample',
+ required=False,
+ default=0.2,
+ type=float,
+ help='The samples specified by --ifn_second_samples with full sequences from all markers '\
+ 'and having the percentage of gaps greater than this threshold '\
+ 'will be removed. Default 0.2.')
+
+ p.add_argument(
+ '--N_col',
+ required=False,
+ default=0.8,
+ type=float,
+ help='In aligned markers, if the percentage of nucleotide columns '\
+ 'containing more than N_count Ns '\
+ 'is less than this threshold, these columns will be removed. '\
+ 'Default 0.8.')
+
+ p.add_argument(
+ '--N_count',
+ required=False,
+ default=0,
+ type=int,
+ help='In aligned markers, if the percentage of nucleotide columns '\
+ 'containing more than N_count Ns '\
+ 'is less than the N_col threshold, these columns will be removed. '\
+ 'Default 0.')
+
+ p.add_argument(
+ '--long_gap_length',
+ required=False,
+ default=2,
+ type=int,
+ help='In each concatenated sequence of a sample, consecutive '\
+ 'gap positions form a gap group. '\
+ 'A gap group with length greater than this '\
+ 'threshold is considered as '\
+ 'a long gap group. If the ratio between the number of unique '\
+ 'positions in all long gap groups and the concatenated sequence '\
+ 'length is less than long_gap_percentage, these positions '\
+ 'will be removed from all concatenated sequences. '\
+ 'Default 2.')
+
+ p.add_argument(
+ '--long_gap_percentage',
+ required=False,
+ default=0.8,
+ type=float,
+ help='This threshold is combined with long_gap_length to remove long '\
+ 'gaps. Default 0.8.')
+
+ p.add_argument(
+ '--p_value',
+ required=False,
+ default=0.05,
+ type=float,
+ help='The p_value to reject a non-polymorphic site. '\
+ 'Default 0.05.')
+
+ p.add_argument(
+ '--clades',
+ nargs='+',
+ required=False,
+ default=['all'],
+ type=str,
+ help='The clades (space separated) for which the script will compute '\
+ 'the marker alignments in fasta format and the phylogenetic '\
+ 'trees. If a file name is specified, the clade list in that '\
+ 'file, with one clade name per line, will be read. '\
+ 'Default "automatically identify all clades".')
+
+ p.add_argument(
+ '--marker_list_fn',
+ required=False,
+ default=None,
+ type=str,
+ help='The file name containing the list of considered markers. '\
+ 'The other markers will be discarded. '\
+ 'Default "None".')
+ p.add_argument(
+ '--print_clades_only',
+ required=False,
+ dest='print_clades_only',
+ action='store_true',
+ help='Only print the potential clades and stop without building any '\
+ 'tree. This option is useful when you want to check quickly '\
+ 'all possible clades and rerun only for some specific ones. '\
+ 'Default "False".')
+ p.set_defaults(print_clades_only=False)
+
+ p.add_argument(
+ '--alignment_program',
+ required=False,
+ default='muscle',
+ choices=['muscle', 'mafft'],
+ type=str,
+ help='The alignment program. Default "muscle".')
+
+ p.add_argument(
+ '--relaxed_parameters',
+ required=False,
+ dest='relaxed_parameters',
+ action='store_true',
+ help='Set marker_in_clade=0.5, sample_in_marker=0.5, '\
+ 'N_in_marker=0.5, gap_in_sample=0.5. '\
+ 'Default "False".')
+ p.set_defaults(relaxed_parameters=False)
+
+ p.add_argument(
+ '--relaxed_parameters2',
+ required=False,
+ dest='relaxed_parameters2',
+ action='store_true',
+ help='Set marker_in_clade=0.2, sample_in_marker=0.2, '\
+ 'N_in_marker=0.8, gap_in_sample=0.8. '\
+ 'Default "False".')
+ p.set_defaults(relaxed_parameters2=False)
+
+ p.add_argument(
+ '--relaxed_parameters3',
+ required=False,
+ dest='relaxed_parameters3',
+ action='store_true',
+ help='Set gap_in_trailing_col=0.9, gap_in_internal_col=0.9, '\
+ 'gap_in_sample=0.9, second_gap_in_sample=0.5, '\
+ 'sample_in_marker=0.1, marker_in_clade=0.1, '\
+ 'second_marker_in_clade=0.1. '\
+ 'Default "False".')
+ p.set_defaults(relaxed_parameters3=False)
+
+ p.add_argument(
+ '--keep_alignment_files',
+ required=False,
+ dest='keep_alignment_files',
+ action='store_true',
+ help='Keep the alignment files of all markers before the cleaning step.')
+ p.set_defaults(keep_alignment_files=False)
+
+ p.add_argument(
+ '--keep_full_alignment_files',
+ required=False,
+ dest='keep_full_alignment_files',
+ action='store_true',
+ help='Keep the alignment files of all markers before '\
+ 'truncating the starting and ending parts, and before the cleaning step. '\
+ 'This is equivalent to '\
+ '--keep_alignment_files --marker_strip_length 0')
+ p.set_defaults(keep_full_alignment_files=False)
+
+ p.add_argument(
+ '--save_sample2fullfreq',
+ required=False,
+ dest='save_sample2fullfreq',
+ action='store_true',
+ help='Save sample2fullfreq to a msgpack file sample2fullfreq.msgpack.')
+ p.set_defaults(save_sample2fullfreq=False)
+
+ p.add_argument(
+ '--use_threads',
+ required=False,
+ action='store_true',
+ dest='use_threads',
+ help='Use multithreading. Default "Use multiprocessing".')
+ p.set_defaults(use_threads=False)
+
+ return vars(p.parse_args())
+
+
+
+def filter_sequence(sample, marker2seq, marker_strip_length, N_in_marker):
+ '''
+ Filter markers with percentage of N-bases greater than a threshold.
+
+ :param marker2seq: a dictionary containing sequences of a sample.
+ marker2seq[marker]['seq'] should return the sequence of the marker.
+ :returns: a dictionary containing filtered sequences of samples.
+ '''
+ remove_markers = [marker for marker in marker2seq if
+ float(marker2seq[marker]['seq'].count('N')) /
+ len(marker2seq[marker]['seq']) > N_in_marker]
+ for marker in remove_markers:
+ del marker2seq[marker]
+ log_line = 'sample %s, number of markers after N_in_marker: %d\n'\
+ %(sample, len(marker2seq))
+
+ remove_markers = []
+ for marker in marker2seq:
+ if marker_strip_length > 0:
+ marker2seq[marker]['seq'] = \
+ marker2seq[marker]['seq'][marker_strip_length:-marker_strip_length]
+ marker2seq[marker]['freq'] = \
+ marker2seq[marker]['freq'][marker_strip_length:-marker_strip_length]
+ #marker2seq[marker]['seq'] = marker2seq[marker]['seq'].strip('N')
+ if len(marker2seq[marker]['seq']) == 0:
+ remove_markers.append(marker)
+ for marker in remove_markers:
+ del marker2seq[marker]
+ logger.debug(log_line + \
+ 'sample %s, number of markers after marker_strip_length: %d'\
+ %(sample, len(marker2seq)))
+
+ return marker2seq
+
+
+
+
+def get_db_clades(db):
+ # find singleton clades
+ sing_clades = []
+ clade2subclades = defaultdict(set)
+ for tax in db['taxonomy']:
+ tax_clades = tax.split('|')
+ for i, clade in enumerate(tax_clades):
+ if 't__' not in clade and 's__' not in clade:
+ if i < len(tax_clades)-1:
+ if 't__' in tax_clades[-1]:
+ clade2subclades[clade].add('|'.join(tax_clades[i+1:-1]))
+ else:
+ clade2subclades[clade].add('|'.join(tax_clades[i+1:]))
+ sing_clades = [clade for clade in clade2subclades if
+ len(clade2subclades[clade]) == 1]
+
+ # extract species
+ clade2num_markers = defaultdict(int)
+ level = 's__'
+ for marker in db['markers']:
+ clade = db['markers'][marker]['taxon'].split('|')[-1]
+ if level in clade or clade in sing_clades:
+ clade2num_markers[clade] = clade2num_markers[clade] + 1
+ clade2num_markers = dict(clade2num_markers)
+
+ return sing_clades, clade2num_markers, clade2subclades
+
+
+
+
+def align(marker_fn, alignment_program):
+ oosp = ooSubprocess.ooSubprocess()
+ if alignment_program == 'muscle':
+ ifile = open(marker_fn, 'r')
+ alignment_file = oosp.ex(
+ 'muscle',
+ args=['-quiet', '-in', '-', '-out', '-'],
+ in_pipe=ifile,
+ get_out_pipe=True,
+ verbose=False)
+ ifile.close()
+ elif alignment_program == 'mafft':
+ alignment_file = oosp.ex(
+ 'mafft',
+ args=['--auto', marker_fn],
+ get_out_pipe=True,
+ verbose=False)
+ else:
+ raise Exception('Unknown alignment_program %s!'%alignment_program)
+ return alignment_file
+
+
+
+
+def clean_alignment(
+ samples,
+ sample2seq,
+ sample2freq,
+ gap_in_trailing_col,
+ gap_trailing_col_limit,
+ gap_in_internal_col,
+ N_count,
+ N_col):
+
+ length = len(sample2seq[sample2seq.keys()[0]])
+ logger.debug('marker length: %d', length)
+ aligned_samples = sample2seq.keys()
+ for sample in samples:
+ if sample not in aligned_samples:
+ sample2seq[sample] = ['-' for i in range(length)]
+ sample2freq[sample] = [(0.0, 0.0, 0.0) for i in range(length)]
+
+ df_seq = pandas.DataFrame.from_dict(sample2seq, orient='index')
+ df_freq = pandas.DataFrame.from_dict(sample2freq, orient='index')
+
+ # remove trailing gap columns
+ del_cols = []
+ for i in range(len(df_seq.columns)):
+ if float(list(df_seq[df_seq.columns[i]]).count('-')) / len(samples) <= gap_in_trailing_col:
+ break
+ else:
+ del_cols.append(df_seq.columns[i])
+ for i in reversed(range(len(df_seq.columns))):
+ if float(list(df_seq[df_seq.columns[i]]).count('-')) / len(samples) <= gap_in_trailing_col:
+ break
+ else:
+ del_cols.append(df_seq.columns[i])
+ if len(del_cols) < gap_trailing_col_limit:
+ df_seq.drop(del_cols, axis=1, inplace=True)
+ df_freq.drop(del_cols, axis=1, inplace=True)
+ logger.debug('length after gap_in_trailing_col: %d', len(df_seq.columns))
+ else:
+ logger.debug('do not use gap_in_trailing_col as the number of del_cols is %d'%len(del_cols))
+
+ # remove internal gap columns
+ del_cols = []
+ for i in range(len(df_seq.columns)):
+ if float(list(df_seq[df_seq.columns[i]]).count('-')) / len(samples) > gap_in_internal_col:
+ del_cols.append(df_seq.columns[i])
+ df_seq.drop(del_cols, axis=1, inplace=True)
+ df_freq.drop(del_cols, axis=1, inplace=True)
+ logger.debug('length after gap_in_internal_col: %d', len(df_seq.columns))
+
+ # remove N columns
+ if len(df_seq.columns) > 0:
+ del_cols = []
+ remove_N_col = False
+ for i in range(len(df_seq.columns)):
+ if list(df_seq[df_seq.columns[i]]).count('N') > N_count:
+ del_cols.append(df_seq.columns[i])
+ if float(len(del_cols)) / len(df_seq.columns) < N_col:
+ remove_N_col = True
+ df_seq.drop(del_cols, axis=1, inplace=True)
+ df_freq.drop(del_cols, axis=1, inplace=True)
+ logger.debug('length after N_col: %d', len(df_seq.columns))
+
+ if N_count > 0 or not remove_N_col:
+ logger.debug('replace Ns by gaps for all samples')
+ for sample in samples:
+ seq = ''.join(df_seq.loc[sample])
+ logger.debug('sample %s, number of Ns: %d'\
+ %(sample, seq.count('N')))
+ sample2seq[sample] = list(seq.replace('N', '-'))
+ else:
+ for sample in samples:
+ sample2seq[sample] = df_seq.loc[sample].tolist()
+ for sample in samples:
+ sample2freq[sample] = df_freq.loc[sample].tolist()
+ else:
+ sample2seq = {}
+ sample2freq = {}
+
+ return sample2seq, sample2freq
+
+
+
+
+def add_ref_genomes(genome2marker, marker_records, ifn_ref_genomes, tmp_dir):
+ ifn_ref_genomes = sorted(list(set(ifn_ref_genomes)))
+ logger.debug('add %d reference genomes'%len(ifn_ref_genomes))
+ logger.debug('Number of samples: %d'%len(genome2marker))
+
+ # marker list
+ if len(genome2marker) == 0:
+ unique_markers = set(marker_records.keys())
+ else:
+ unique_markers = set([])
+ for sample in genome2marker:
+ for marker in genome2marker[sample]:
+ if marker not in unique_markers:
+ unique_markers.add(marker)
+ logger.debug('Number of unique markers: %d'%len(unique_markers))
+
+ # add ifn_ref_genomes
+ oosp = ooSubprocess.ooSubprocess(tmp_dir=tmp_dir)
+ logger.debug('load genome contigs')
+ p1 = SpooledTemporaryFile(dir=tmp_dir)
+ contigs = defaultdict(dict)
+ for ifn_genome in ifn_ref_genomes:
+ genome = ooSubprocess.splitext(ifn_genome)[0]
+ if ifn_genome[-4:] == '.bz2':
+ ifile_genome = bz2.BZ2File(ifn_genome, 'r')
+ elif ifn_genome[-3:] == '.gz':
+ ifile_genome = gzip.GzipFile(ifn_genome, 'r')
+ elif ifn_genome[-4:] == '.fna':
+ ifile_genome = open(ifn_genome, 'r')
+ else:
+ logger.error('Unknown file type of %s. '%ifn_genome +\
+ 'It should be .fna.bz2, .fna.gz, or .fna!')
+ exit(1)
+
+ # extract genome contigs
+ for rec in SeqIO.parse(ifile_genome, 'fasta'):
+ #rec.name = genome + '___' + rec.name
+ if rec.name in contigs:
+ logger.error(
+ 'Error: Contig %s in genome %s'\
+ %(rec.name.split('___')[-1], genome)\
+ + ' is not unique!')
+ exit(1)
+ contigs[rec.name]['seq'] = str(rec.seq)
+ contigs[rec.name]['genome'] = genome
+ SeqIO.write(rec, p1, 'fasta')
+
+ ifile_genome.close()
+ p1.seek(0)
+
+ # build blastdb
+ logger.debug('build blastdb')
+ blastdb_prefix = oosp.ftmp('genome_blastn_db_%s'%(random.random()))
+ if len(glob.glob('%s*'%blastdb_prefix)):
+ logger.error('blastdb exists! Please remove it or rerun!')
+ exit(1)
+ oosp.ex('makeblastdb',
+ args=[
+ '-dbtype', 'nucl',
+ '-title', 'genome_db',
+ '-out', blastdb_prefix],
+ in_pipe=p1,
+ verbose=True)
+
+ # blast markers against contigs
+ logger.debug('blast markers against contigs')
+ p1 = SpooledTemporaryFile(dir=tmp_dir)
+ for marker in unique_markers:
+ SeqIO.write(marker_records[marker], p1, 'fasta')
+ p1.seek(0)
+ blastn_args = [
+ '-db', blastdb_prefix,
+ '-outfmt', '6',
+ '-evalue', '1e-10',
+ '-max_target_seqs', '1000000000']
+ if args['nprocs_main'] > 1:
+ blastn_args += ['-num_threads', str(args['nprocs_main'])]
+ output = oosp.ex(
+ 'blastn',
+ args=blastn_args,
+ in_pipe=p1,
+ get_out_pipe=True,
+ verbose=True)
+
+ #output = output.split('\n')
+ for line in output:
+ if line.strip() == '':
+ break
+ line = line.strip().split()
+ query = line[0]
+ target = line[1]
+ pstart = int(line[8])-1
+ pend = int(line[9])-1
+ genome = contigs[target]['genome']
+ if query not in genome2marker[genome]:
+ genome2marker[genome][query] = {}
+ if pstart < pend:
+ genome2marker[genome][query]['seq'] = contigs[target]['seq'][pstart:pend+1]
+ else:
+ genome2marker[genome][query]['seq'] = \
+ str(Seq.Seq(
+ contigs[target]['seq'][pend:pstart+1],
+ IUPAC.unambiguous_dna).reverse_complement())
+ genome2marker[genome][query]['freq'] = [(0.0, 0.0, 0.0) for i in \
+ range(len(genome2marker[genome][query]['seq']))]
+ genome2marker[genome][query]['seq'] = genome2marker[genome][query]['seq'].upper()
+
+ # remove database
+ for fn in glob.glob('%s*'%blastdb_prefix):
+ os.remove(fn)
+
+ logger.debug('Number of samples and genomes: %d'%len(genome2marker))
+ return genome2marker
+
+
+
+
+@trace_unhandled_exceptions
+def align_clean(args):
+ marker = args['marker']
+ sample2marker = shared_variables.sample2marker #args['sample2marker']
+ clade = args['clade']
+ gap_in_trailing_col = args['gap_in_trailing_col']
+ gap_trailing_col_limit = args['gap_trailing_col_limit']
+ gap_in_internal_col = args['gap_in_internal_col']
+ N_col = args['N_col']
+ N_count = args['N_count']
+ sample_in_marker = args['sample_in_marker']
+ tmp_dir = args['tmp_dir']
+ alignment_program = args['alignment_program']
+ alignment_fn = args['alignment_fn']
+
+ logger.debug('align and clean for marker: %s'%marker)
+ marker_file = NamedTemporaryFile(dir=tmp_dir, delete=False)
+ marker_fn = marker_file.name
+ sample_count = 0
+ for sample in iter(sample2marker.keys()):
+ if marker in iter(sample2marker[sample].keys()):
+ sample_count += 1
+ SeqIO.write(
+ SeqRecord.SeqRecord(
+ id=sample,
+ description='',
+ seq=Seq.Seq(sample2marker[sample][marker]['seq'])),
+ marker_file,
+ 'fasta')
+ marker_file.close()
+ ratio = float(sample_count) / len(sample2marker)
+ if ratio < sample_in_marker:
+ os.remove(marker_fn)
+ logger.debug('skip this marker because the percentage of samples '\
+ 'in which it is present is %f < sample_in_marker'%ratio)
+ return {}, {}
+
+ alignment_file = align(marker_fn, alignment_program)
+ os.remove(marker_fn)
+
+ sample2seq = {}
+ sample2freq = {}
+ for rec in SeqIO.parse(alignment_file, 'fasta'):
+ sample = rec.name
+ sample2seq[sample] = list(str(rec.seq))
+ sample2freq[sample] = list(sample2marker[sample][marker]['freq'])
+ for i, c in enumerate(sample2seq[sample]):
+ if c == '-':
+ sample2freq[sample].insert(i, (0.0, 0.0, 0.0))
+ logger.debug('alignment length of sample %s is %d, %d'%(
+ sample,
+ len(sample2seq[sample]),
+ len(sample2freq[sample])))
+ if alignment_fn:
+ shutil.copyfile(alignment_file.name, alignment_fn)
+
+ alignment_file.close()
+ logger.debug('alignment for marker %s is done'%marker)
+
+
+ if len(sample2seq) == 0:
+ logger.error('Fatal error in alignment step!')
+ exit(1)
+
+ sample2seq, sample2freq = clean_alignment(
+ sample2marker.keys(),
+ sample2seq,
+ sample2freq,
+ gap_in_trailing_col,
+ gap_trailing_col_limit,
+ gap_in_internal_col,
+ N_count,
+ N_col)
+ logger.debug('cleaning for marker %s is done'%marker)
+
+ return sample2seq, sample2freq
+
+
+
+
+def build_tree(
+ clade,
+ sample2marker,
+ sample2order,
+ clade2num_markers,
+ sample_in_clade,
+ sample_in_marker,
+ gap_in_trailing_col,
+ gap_trailing_col_limit,
+ gap_in_internal_col,
+ N_count,
+ N_col,
+ gap_in_sample,
+ second_gap_in_sample,
+ long_gap_length,
+ long_gap_percentage,
+ p_value,
+ output_dir,
+ nprocs_align_clean,
+ alignment_program,
+ nprocs_raxml,
+ keep_alignment_files,
+ bootstrap_raxml,
+ save_sample2fullfreq,
+ use_threads):
+
+ # build the tree for each clade
+ if len(sample2marker) < sample_in_clade:
+ logger.debug(
+ 'skip clade %s because number of present samples '
+ 'is %d'%(clade, len(sample2marker)))
+ return
+
+ ofn_cladeinfo = os.path.join(output_dir, '%s.info'%clade)
+ ofile_cladeinfo = open(ofn_cladeinfo, 'w')
+
+ logger.debug('clade: %s', clade)
+ ofile_cladeinfo.write('clade: %s\n'%clade)
+ logger.debug('number of samples: %d', len(sample2marker))
+ ofile_cladeinfo.write('number of samples: %d\n'\
+ %len(sample2marker))
+ if clade in clade2num_markers:
+ logger.debug('number of markers of the clade in db: %d'\
+ %clade2num_markers[clade])
+ ofile_cladeinfo.write('number of markers of the clade in db: %d\n'\
+ %clade2num_markers[clade])
+
+ # align sequences in each marker
+ markers = set([])
+ for sample in sample2marker:
+ if sample2order[sample] == 'first':
+ for marker in sample2marker[sample]:
+ if marker not in markers:
+ markers.add(marker)
+ markers = sorted(list(markers))
+
+ logger.debug('number of used markers: %d'%len(markers))
+ ofile_cladeinfo.write('number of used markers: %d\n'%len(markers))
+ if clade in clade2num_markers:
+ logger.debug('fraction of used markers: %f'\
+ %(float(len(markers)) / clade2num_markers[clade]))
+ ofile_cladeinfo.write('fraction of used markers: %f\n'\
+ %(float(len(markers)) / clade2num_markers[clade]))
+
+ logger.debug('align and clean')
+ args_list = []
+
+ # parallelize
+ for i in range(len(markers)):
+ args_list.append({})
+ args_list[i]['marker'] = markers[i]
+ args_list[i]['clade'] = clade
+ args_list[i]['gap_in_trailing_col'] = gap_in_trailing_col
+ args_list[i]['gap_trailing_col_limit'] = gap_trailing_col_limit
+ args_list[i]['gap_in_internal_col'] = gap_in_internal_col
+ args_list[i]['N_count'] = N_count
+ args_list[i]['N_col'] = N_col
+ args_list[i]['sample_in_marker'] = sample_in_marker
+ args_list[i]['tmp_dir'] = output_dir
+ args_list[i]['alignment_program'] = alignment_program
+ if keep_alignment_files:
+ args_list[i]['alignment_fn'] = os.path.join(output_dir, markers[i] + '.marker_aligned')
+ else:
+ args_list[i]['alignment_fn'] = None
+
+ logger.debug('start to align_clean for all markers')
+ results = ooSubprocess.parallelize(
+ align_clean,
+ args_list,
+ nprocs_align_clean,
+ use_threads=use_threads)
+
+ sample2seqs, sample2freqs = zip(*results)
+ sample2fullseq = defaultdict(list)
+ sample2fullfreq = defaultdict(list)
+ empty_markers = []
+ pos = 0
+ marker_pos = []
+ for i in range(len(sample2seqs)):
+ #logger.debug('marker_name: %s, seq: %s'%(markers[i], sample2seqs[i]))
+ if len(sample2seqs[i]):
+ for sample in sample2seqs[i]:
+ sample2fullseq[sample] += sample2seqs[i][sample]
+ sample2fullfreq[sample] += sample2freqs[i][sample]
+ marker_pos.append([markers[i], pos])
+ pos += len(sample2seqs[i][sample])
+ else:
+ empty_markers.append(markers[i])
+
+ logger.debug(
+ 'number of markers after deleting empty markers: %d',
+ len(markers) - len(empty_markers))
+ ofile_cladeinfo.write('number of markers after deleting '\
+ 'empty markers: %d\n'%
+ (len(markers) - len(empty_markers)))
+
+ if clade in clade2num_markers:
+ logger.debug('fraction of used markers after deleting empty markers: '\
+ '%f'%(float(len(markers) - len(empty_markers)) / clade2num_markers[clade]))
+ ofile_cladeinfo.write('fraction of used markers after deleting empty '\
+ 'markers: %f\n'\
+ %(float(len(markers) - len(empty_markers)) / clade2num_markers[clade]))
+
+
+ if len(sample2fullseq) == 0:
+ logger.debug('all markers were removed, skip this clade!')
+ ofile_cladeinfo.write('all markers were removed, skip this clade!\n')
+ return
+
+ # remove long gaps
+ logger.debug('full sequence length before long_gap_length: %d'\
+ %(len(sample2fullseq[sample2fullseq.keys()[0]])))
+ ofile_cladeinfo.write(
+ 'full sequence length before long_gap_length: %d\n'\
+ %(len(sample2fullseq[sample2fullseq.keys()[0]])))
+
+ df_seq = pandas.DataFrame.from_dict(sample2fullseq, orient='index')
+ df_freq = pandas.DataFrame.from_dict(sample2fullfreq, orient='index')
+ del_cols = []
+ del_pos = []
+ for sample in sample2fullseq:
+ row = df_seq.loc[sample]
+ gap_in_cols = []
+ gap_in_pos = []
+ for i in range(len(row)):
+ if row[df_seq.columns[i]] == '-':
+ gap_in_cols.append(df_seq.columns[i])
+ gap_in_pos.append(i)
+ else:
+ if len(gap_in_cols) > long_gap_length:
+ del_cols += gap_in_cols
+ del_pos += gap_in_pos
+ gap_in_cols = []
+ gap_in_pos = []
+
+ if len(gap_in_cols) > long_gap_length:
+ del_cols += gap_in_cols
+ del_pos += gap_in_pos
+
+ del_cols = list(set(del_cols))
+ del_pos = sorted(list(set(del_pos)))
+ del_ratio = float(len(del_cols)) / len(sample2fullseq[sample])
+ if del_ratio < long_gap_percentage:
+ df_seq.drop(del_cols, axis=1, inplace=True)
+ df_freq.drop(del_cols, axis=1, inplace=True)
+ for sample in sample2fullseq:
+ sample2fullseq[sample] = df_seq.loc[sample].tolist()
+ sample2fullfreq[sample] = df_freq.loc[sample].tolist()
+ logger.debug('full sequence length after long_gap_length: %d'\
+ %(len(sample2fullseq[sample2fullseq.keys()[0]])))
+ ofile_cladeinfo.write(
+ 'full sequence length after long_gap_length: %d\n'\
+ %(len(sample2fullseq[sample2fullseq.keys()[0]])))
+
+ for i in range(len(marker_pos)):
+ num_del = 0
+ for p in del_pos:
+ if marker_pos[i][1] > p:
+ num_del += 1
+ marker_pos[i][1] -= num_del
+ else:
+ logger.debug('do not apply long_gap_length because '\
+ 'long_gap_percentage is not satisfied. '\
+ 'del_ratio: %f'%del_ratio)
+ ofile_cladeinfo.write('do not apply long_gap_length because '\
+ 'long_gap_percentage is not satisfied. '\
+ 'del_ratio: %f\n'%del_ratio)
+
+ ofn_clademarker = os.path.join(output_dir, '%s.marker_pos'%clade)
+ with open(ofn_clademarker, 'w') as ofile_clademarker:
+ for m, p in marker_pos:
+ ofile_clademarker.write('%s\t%d\n'%(m, p))
+
+ # remove samples with too many gaps
+ logger.debug(
+ 'number of samples before gap_in_sample: %d'\
+ %len(sample2fullseq))
+ ofile_cladeinfo.write(
+ 'number of samples before gap_in_sample: %d\n'\
+ %len(sample2fullseq))
+ for sample in sample2marker:
+ ratio = float(sample2fullseq[sample].count('-')) / len(sample2fullseq[sample])
+ gap_ratio = gap_in_sample if (sample2order[sample] == 'first') else second_gap_in_sample
+ if ratio > gap_ratio:
+ del sample2fullseq[sample]
+ del sample2fullfreq[sample]
+ if sample2order[sample] == 'first':
+ logger.debug('remove sample %s by gap_in_sample %f'%(sample, ratio))
+ else:
+ logger.debug('remove sample %s by second_gap_in_sample %f'%(sample, ratio))
+ logger.debug(
+ 'number of samples after gap_in_sample: %d'\
+ %len(sample2fullseq))
+ ofile_cladeinfo.write(
+ 'number of samples after gap_in_sample: %d\n'\
+ %len(sample2fullseq))
+
+ # log gaps
+ sequential_gaps = []
+ all_gaps = []
+ for sample in sample2fullseq:
+ agap = 0
+ sgap = 0
+ row2 = sample2fullseq[sample]
+ for i in range(len(row2)):
+ if row2[i] == '-':
+ sgap += 1
+ agap += 1
+ elif sgap > 0:
+ sequential_gaps.append(sgap)
+ sgap = 0
+ all_gaps.append(agap)
+
+ ofile_cladeinfo.write('all_gaps:\n' + statistics(all_gaps)[1])
+ if sequential_gaps == []:
+ sequential_gaps = [0]
+ ofile_cladeinfo.write('sequential_gaps:\n' + \
+ statistics(sequential_gaps)[1])
+ ofile_cladeinfo.close()
+
+ # compute percentage of polymorphic sites
+ if save_sample2fullfreq:
+ with open(os.path.join(output_dir, 'sample2fullfreq.msgpack'), 'wb') as ofile:
+ msgpack.dump(sample2fullfreq, ofile)
+
+ ofn_pol = os.path.join(output_dir, '%s.polymorphic'%clade)
+ logger.debug('polymorphic file: %s'%ofn_pol)
+ with open(ofn_pol, 'w') as ofile:
+ ofile.write('#sample\tpercentage_of_polymorphic_sites\tavg_freq\tmedian_freq\tstd_freq\tmin_freq\tmax_freq\tq90_freq\tq10_freq\tavg_coverage\tmedian_coverage\tstd_coverage\tmin_coverage\tmax_coverage\tq90_coverage\tq10_coverage\n')
+ for sample in sample2fullfreq:
+ freqs = [x[0] for x in sample2fullfreq[sample] if x[0] > 0 and x[0] < 1 and x[2] < p_value]
+ coverages = [x[1] for x in sample2fullfreq[sample] if x[0] > 0 and x[0] < 1 and x[2] < p_value]
+ ofile.write('%s\t%f'%(sample, float(len(freqs)) * 100 / len(sample2fullfreq[sample])))
+ for vals in [freqs, coverages]:
+ if len(vals):
+ ofile.write('\t%f\t%f\t%f\t%f\t%f\t%f\t%f'%(\
+ numpy.average(vals),
+ numpy.percentile(vals,50),
+ numpy.std(vals),
+ numpy.min(vals),
+ numpy.max(vals),
+ numpy.percentile(vals,90),
+ numpy.percentile(vals,10),
+ ))
+ else:
+ ofile.write('\t0\t0\t0\t0\t0\t0\t0')
+ ofile.write('\n')
+
+
+ # save merged alignment
+ ofn_align = os.path.join(output_dir, '%s.fasta'%clade)
+ logger.debug('alignment file: %s'%ofn_align)
+ with open(ofn_align, 'w') as ofile:
+ for sample in sample2fullseq:
+ SeqIO.write(
+ SeqRecord.SeqRecord(
+ id=sample,
+ description='',
+ seq=Seq.Seq(''.join(sample2fullseq[sample]))),
+ ofile,
+ 'fasta')
+
+ # produce tree
+ oosp = ooSubprocess.ooSubprocess()
+ #ofn_tree = os.path.join(output_dir, '%s.tree'%clade)
+ #oosp.ex('FastTree', args=['-quiet', '-nt', ofn_align], out_fn=ofn_tree)
+ ofn_tree = clade + '.tree'
+ logger.debug('tree file: %s'%ofn_tree)
+ try:
+ for fn in glob.glob('%s/RAxML_*%s'
+ %(os.path.abspath(output_dir), ofn_tree)):
+ os.remove(fn)
+ raxml_args = [
+ '-s', os.path.abspath(ofn_align),
+ '-w', os.path.abspath(output_dir),
+ '-n', ofn_tree,
+ '-p', '1234'
+ ]
+ if bootstrap_raxml:
+ raxml_args += ['-f', 'a',
+ '-m', 'GTRGAMMA',
+ '-x', '1234',
+ '-N', str(bootstrap_raxml)]
+ else:
+ raxml_args += ['-m', 'GTRCAT']
+
+ if nprocs_raxml > 1:
+ raxml_args += ['-T', str(nprocs_raxml)]
+ raxml_prog = 'raxmlHPC-PTHREADS-SSE3'
+ else:
+ raxml_prog = 'raxmlHPC'
+ oosp.ex(
+ raxml_prog,
+ args=raxml_args
+ )
+ except:
+ logger.info('Cannot build the tree! The number of samples is too few '\
+ 'or there is some error with raxmlHPC')
+ pass
+
+
+
+
+@trace_unhandled_exceptions
+def load_sample(args):
+ ifn_sample = args['ifn_sample']
+ logger.debug('load %s'%ifn_sample)
+ output_dir = args['output_dir']
+ ifn_markers = args['ifn_markers']
+ clades = args['clades']
+ kept_clade = args['kept_clade']
+ db = shared_variables.db
+ sing_clades = shared_variables.sing_clades
+ clade2num_markers = shared_variables.clade2num_markers
+ marker_in_clade = args['marker_in_clade']
+ kept_markers = args['kept_markers']
+ sample = ooSubprocess.splitext(ifn_sample)[0]
+ with open(ifn_sample, 'rb') as ifile:
+ marker2seq = msgpack.load(ifile, use_list=False)
+
+ if kept_clade:
+ if kept_clade == 'singleton':
+ nmarkers = len(marker2seq)
+ else:
+ # remove redundant clades and markers
+ nmarkers = 0
+ for marker in marker2seq.keys():
+ clade = db['markers'][marker]['taxon'].split('|')[-1]
+ if kept_markers:
+ if marker in kept_markers and clade == kept_clade:
+ nmarkers += 1
+ else:
+ del marker2seq[marker]
+ elif clade == kept_clade:
+ nmarkers += 1
+ else:
+ del marker2seq[marker]
+ total_num_markers = clade2num_markers[kept_clade] if not kept_markers else len(kept_markers)
+ if float(nmarkers) / total_num_markers < marker_in_clade:
+ marker2seq = {}
+
+ # reformat 'pileup'
+ for m in marker2seq:
+ freq = marker2seq[m]['freq']
+ marker2seq[m]['freq'] = [(0.0, 0.0, 0.0) for i in \
+ range(len(marker2seq[m]['seq']))]
+ for p in freq:
+ marker2seq[m]['freq'][p] = freq[p]
+ marker2seq[m]['seq'] = marker2seq[m]['seq'].replace('-', 'N') # make sure we have no gaps in the sequence
+
+ return marker2seq
+ else:
+ # remove redundant clades and markers
+ clade2n_markers = defaultdict(int)
+ remove_clade = []
+ for marker in marker2seq:
+ clade = db['markers'][marker]['taxon'].split('|')[-1]
+ if 's__' in clade or clade in sing_clades:
+ clade2n_markers[clade] = clade2n_markers[clade] + 1
+ else:
+ remove_clade.append(clade)
+ remove_clade += [clade for clade in clade2n_markers if
+ float(clade2n_markers[clade]) \
+ / float(clade2num_markers[clade]) < marker_in_clade]
+ remove_marker = [marker for marker in marker2seq if
+ db['markers'][marker]['taxon'].split('|')[-1] in
+ remove_clade]
+ for marker in remove_marker:
+ del marker2seq[marker]
+
+ sample_clades = set([])
+ for marker in marker2seq:
+ clade = db['markers'][marker]['taxon'].split('|')[-1]
+ sample_clades.add(clade)
+ return sample_clades
+
+
+
+
+def load_all_samples(args, sample2order, kept_clade, kept_markers):
+ ifn_samples = args['ifn_samples'] + args['ifn_second_samples']
+ if args['ifn_representative_sample']:
+ ifn_samples.append(args['ifn_representative_sample'])
+ ifn_samples = sorted(list(set(ifn_samples)))
+ if not ifn_samples:
+ return None
+ else:
+ args_list = []
+ for ifn_sample in ifn_samples:
+ func_args = {}
+ func_args['ifn_sample'] = ifn_sample
+ func_args['kept_clade'] = kept_clade
+ func_args['kept_markers'] = kept_markers
+ for k in [
+ 'output_dir',
+ 'ifn_markers', 'nprocs_load_samples',
+ 'clades',
+ 'mpa_pkl',
+ ]:
+ func_args[k] = args[k]
+ sample = ooSubprocess.splitext(ifn_sample)[0]
+ if sample2order[sample] == 'first':
+ func_args['marker_in_clade'] = args['marker_in_clade']
+ else:
+ func_args['marker_in_clade'] = args['second_marker_in_clade']
+ args_list.append(func_args)
+
+ results = ooSubprocess.parallelize(
+ load_sample,
+ args_list,
+ args['nprocs_load_samples'],
+ use_threads=args['use_threads'])
+ if kept_clade:
+ sample2marker = {}
+ for i in range(len(ifn_samples)):
+ sample = ooSubprocess.splitext(ifn_samples[i])[0]
+ if len(results[i]): # skip samples with no markers
+ sample2marker[sample] = results[i]
+ return sample2marker
+ else:
+ all_clades = set([])
+ for r in results:
+ for c in r:
+ all_clades.add(c)
+ all_clades = sorted(list(all_clades))
+ return all_clades
+
+
+
+
+def strainer(args):
+ # auto-set some params
+ if args['relaxed_parameters']:
+ args['sample_in_marker'] = 0.5
+ args['N_in_marker'] = 0.5
+ args['gap_in_sample'] = 0.5
+ elif args['relaxed_parameters2']:
+ args['sample_in_marker'] = 0.2
+ args['N_in_marker'] = 0.8
+ args['gap_in_sample'] = 0.8
+ elif args['relaxed_parameters3']:
+ args['gap_in_trailing_col'] = 0.9
+ args['gap_in_internal_col'] = 0.9
+ args['gap_in_sample'] = 0.9
+ args['second_gap_in_sample'] = 0.5
+ args['sample_in_marker'] = 0.1
+ args['marker_in_clade'] = 0.1
+ args['second_marker_in_clade'] = 0.1
+
+ if args['keep_full_alignment_files']:
+ args['keep_alignment_files'] = True
+ args['marker_strip_length'] = 0
+
+ if os.path.isfile(args['clades'][0]):
+ with open(args['clades'][0], 'r') as ifile:
+ args['clades'] = [line.strip() for line in ifile]
+
+
+ # check conditions
+ ooSubprocess.mkdir(args['output_dir'])
+ with open(os.path.join(args['output_dir'], 'arguments.txt'), 'w') as ofile:
+ #for para in args:
+ # ofile.write('%s\n'%para)
+ # ofile.write('%s\n'%args[para])
+ ofile.write('%s\n'%' '.join(sys.argv))
+ ofile.write('%s'%args)
+
+ if args['ifn_markers'] == None and args['ifn_ref_genomes'] != None:
+ logger.error('ifn_ref_genomes is set but ifn_markers is not set!')
+ exit(1)
+
+ if args['ifn_markers'] != None and args['ifn_ref_genomes'] != None:
+ if len(args['clades']) != 1 or args['clades'] == ['all']:
+ logger.error('Only one clade can be specified when adding '\
+ 'reference genomes')
+ exit(1)
+
+ if args['ifn_markers'] == None and args['clades'] == ['singleton']:
+ logger.error('clades is set to singleton but ifn_markers is not set!')
+ exit(1)
+
+ if args['ifn_samples'] == []:
+ args['clades'] = ['singleton']
+
+ if args['nprocs_load_samples'] == None:
+ args['nprocs_load_samples'] = args['nprocs_main']
+
+ if args['nprocs_align_clean'] == None:
+ args['nprocs_align_clean'] = args['nprocs_main']
+
+ if args['nprocs_raxml'] == None:
+ args['nprocs_raxml'] = args['nprocs_main']
+
+ if args['clades'] == ['singleton']:
+ shared_variables.db = None
+ shared_variables.sing_clades = []
+ nmarkers = 0
+ for rec in SeqIO.parse(args['ifn_markers'], 'fasta'):
+ nmarkers += 1
+ clade2num_markers = {'singleton': nmarkers}
+ shared_variables.clade2num_markers = clade2num_markers
+ else:
+ # load mpa_pkl
+ logger.info('Load mpa_pkl')
+ db = pickle.load(bz2.BZ2File(args['mpa_pkl']))
+ shared_variables.db = db
+
+ # reduce and convert to shared memory
+ #logger.debug('converting db')
+ db['taxonomy'] = db['taxonomy'].keys()
+ for m in db['markers']:
+ del db['markers'][m]['clade']
+ del db['markers'][m]['ext']
+ del db['markers'][m]['len']
+ del db['markers'][m]['score']
+ gc.collect()
+ #logger.debug('converted db')
+
+ # get clades from db
+ logger.info('Get clades from db')
+ sing_clades, clade2num_markers, clade2subclades = get_db_clades(db)
+ shared_variables.sing_clades = sing_clades
+ shared_variables.clade2num_markers = clade2num_markers
+
+ # set order
+ sample2order = {}
+ if args['ifn_representative_sample']:
+ sample = ooSubprocess.splitext(args['ifn_representative_sample'])[0]
+ sample2order[sample] = 'first'
+
+ for ifn in args['ifn_samples']:
+ sample = ooSubprocess.splitext(ifn)[0]
+ sample2order[sample] = 'first'
+
+ for ifn in args['ifn_second_samples']:
+ sample = ooSubprocess.splitext(ifn)[0]
+ if sample not in sample2order:
+ sample2order[sample] = 'second'
+
+ kept_markers = set([])
+ if args['marker_list_fn']:
+ with open(args['marker_list_fn'], 'r') as ifile:
+ for line in ifile:
+ kept_markers.add(line.strip())
+ if not kept_markers:
+ raise Exception('Number of markers in %s is 0!'%args['marker_list_fn'])
+ elif args['ifn_representative_sample']:
+ with open(args['ifn_representative_sample'], 'rb') as ifile:
+ repr_marker2seq = msgpack.load(ifile, use_list=False)
+ if args['clades'] != ['all'] and args['clades'] != ['singleton']:
+ for marker in repr_marker2seq:
+ clade = db['markers'][marker]['taxon'].split('|')[-1]
+ if clade in args['clades']:
+ kept_markers.add(marker)
+ else:
+ kept_markers = set(repr_marker2seq.keys())
+ logger.debug('Number of markers in the representative '\
+ 'sample: %d'%len(kept_markers))
+ if not kept_markers:
+ raise Exception('Number of markers in the representative sample is 0!')
+
+ # get clades from samples
+ if args['clades'] == ['all']:
+ logger.info('Get clades from samples')
+ args['clades'] = load_all_samples(args,
+ sample2order,
+ kept_clade=None,
+ kept_markers=kept_markers)
+
+ if args['print_clades_only']:
+ for c in args['clades']:
+ if c.startswith('s__'):
+ print c
+ else:
+ print c, '(%s)'%(','.join(list(clade2subclades[c])))
+ return
+
+ # add reference genomes
+ ref2marker = defaultdict(dict)
+ if args['ifn_markers'] != None and args['ifn_ref_genomes'] != None:
+ logger.info('Add reference genomes')
+ marker_records = {}
+ for rec in SeqIO.parse(open(args['ifn_markers'], 'r'), 'fasta'):
+ if rec.id in kept_markers or (not kept_markers):
+ marker_records[rec.id] = rec
+ add_ref_genomes(
+ ref2marker,
+ marker_records,
+ args['ifn_ref_genomes'],
+ args['output_dir'])
+
+ # remove bad reference genomes
+ if not kept_markers:
+ nmarkers = shared_variables.clade2num_markers[args['clades'][0]]
+ else:
+ nmarkers = len(kept_markers)
+ remove_ref = []
+ mic = args['second_marker_in_clade'] if args['add_reference_genomes_as_second_samples'] else args['marker_in_clade']
+ for ref in ref2marker:
+ if float(len(ref2marker[ref])) / nmarkers < mic:
+ remove_ref.append(ref)
+ for ref in remove_ref:
+ del ref2marker[ref]
+ ref2marker = dict(ref2marker)
+ for ref in ref2marker:
+ if args['add_reference_genomes_as_second_samples']:
+ sample2order[ref] = 'second'
+ else:
+ sample2order[ref] = 'first'
+
+ # build tree for each clade
+ for clade in args['clades']:
+ logger.info('Build the tree for %s'%clade)
+
+ # load samples and reference genomes
+ sample2marker = load_all_samples(args,
+ sample2order,
+ kept_clade=clade,
+ kept_markers=kept_markers)
+
+ for r in ref2marker:
+ sample2marker[r] = ref2marker[r]
+ logger.debug('number of samples and reference genomes: %d'%len(sample2marker))
+
+ for sample in sample2marker:
+ logger.debug('number of markers in sample %s: %d'%(
+ sample,
+ len(sample2marker[sample])))
+
+ # Filter sequences
+ logger.debug('Filter consensus marker sequences')
+ for sample in sample2marker:
+ sample2marker[sample] = filter_sequence(
+ sample,
+ sample2marker[sample],
+ args['marker_strip_length'],
+ args['N_in_marker'])
+
+ # remove samples with percentage of markers less than marker_in_clade
+ logger.debug('remove samples with percentage of markers '\
+ 'less than marker_in_clade')
+ for sample in sample2marker.keys():
+ if len(sample2marker[sample]):
+ if clade == 'singleton':
+ c = 'singleton'
+ else:
+ marker = sample2marker[sample].keys()[0]
+ c = db['markers'][marker]['taxon'].split('|')[-1]
+ if len(sample2marker[sample]) / \
+ float(clade2num_markers[c]) < args['marker_in_clade']:
+ del sample2marker[sample]
+ else:
+ del sample2marker[sample]
+
+ # build trees
+ shared_variables.sample2marker = sample2marker
+ build_tree(
+ clade=clade,
+ sample2marker=sample2marker,
+ sample2order=sample2order,
+ clade2num_markers=clade2num_markers,
+ sample_in_clade=args['sample_in_clade'],
+ sample_in_marker=args['sample_in_marker'],
+ gap_in_trailing_col=args['gap_in_trailing_col'],
+ gap_trailing_col_limit=args['gap_trailing_col_limit'],
+ gap_in_internal_col=args['gap_in_internal_col'],
+ N_count=args['N_count'],
+ N_col=args['N_col'],
+ gap_in_sample=args['gap_in_sample'],
+ second_gap_in_sample=args['second_gap_in_sample'],
+ long_gap_length=args['long_gap_length'],
+ long_gap_percentage=args['long_gap_percentage'],
+ p_value=args['p_value'],
+ output_dir=args['output_dir'],
+ nprocs_align_clean=args['nprocs_align_clean'],
+ alignment_program=args['alignment_program'],
+ nprocs_raxml=args['nprocs_raxml'],
+ keep_alignment_files=args['keep_alignment_files'],
+ bootstrap_raxml=args['bootstrap_raxml'],
+ save_sample2fullfreq=args['save_sample2fullfreq'],
+ use_threads=args['use_threads'])
+ del shared_variables.sample2marker
+ del sample2marker
+ #gc.collect()
+
+ logger.info('Finished!')
+
+
+
+
+def check_dependencies(args):
+ programs = ['muscle']
+
+ if args['ifn_markers'] != None or args['ifn_ref_genomes'] != None:
+ programs += ['blastn', 'makeblastdb']
+
+ if args['nprocs_main'] > 1:
+ programs += ['raxmlHPC-PTHREADS-SSE3']
+ else:
+ programs += ['raxmlHPC']
+
+ for prog in programs:
+ if not which.is_exe(prog):
+ logger.error('Cannot find %s in the executable path!'%prog)
+ exit(1)
+
+
+
+
+if __name__ == "__main__":
+ args = read_params()
+ check_dependencies(args)
+ strainer(args)
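The long-gap filter in build_tree above can be illustrated without pandas. What follows is a minimal sketch, not the upstream implementation: the helper names long_gap_columns and drop_long_gaps are hypothetical, sequences are plain lists of characters, and it is written for Python 3 although the imported code targets Python 2. It keeps the same rule as the code above: a column is a deletion candidate when it falls inside a gap run longer than long_gap_length in any sample, the candidates are dropped only when their share of all columns is below long_gap_percentage, and each marker start position is shifted left by the number of dropped columns before it.

```python
def long_gap_columns(sample2seq, long_gap_length):
    """Columns lying in a gap run longer than long_gap_length in any sample."""
    del_pos = set()
    for seq in sample2seq.values():
        run = []
        for i, ch in enumerate(seq):
            if ch == '-':
                run.append(i)
            else:
                if len(run) > long_gap_length:
                    del_pos.update(run)
                run = []
        # a run may extend to the end of the sequence
        if len(run) > long_gap_length:
            del_pos.update(run)
    return sorted(del_pos)


def drop_long_gaps(sample2seq, marker_pos, long_gap_length, long_gap_percentage):
    """Return (sequences, marker positions, del_ratio) after the filter."""
    del_pos = long_gap_columns(sample2seq, long_gap_length)
    ncols = len(next(iter(sample2seq.values())))
    del_ratio = float(len(del_pos)) / ncols
    if del_ratio >= long_gap_percentage:
        # too many columns would be lost: leave everything unchanged
        return sample2seq, marker_pos, del_ratio
    drop = set(del_pos)
    trimmed = dict(
        (s, [c for i, c in enumerate(seq) if i not in drop])
        for s, seq in sample2seq.items())
    shifted = [[m, p - sum(1 for d in del_pos if d < p)]
               for m, p in marker_pos]
    return trimmed, shifted, del_ratio


seqs = {'s1': list('AC---GT-A'), 's2': list('ACGTTGTCA')}
trimmed, shifted, ratio = drop_long_gaps(seqs, [['m1', 0], ['m2', 5]], 2, 0.5)
```

In this toy input only the three-column run in s1 exceeds the length threshold, so the isolated gap at the end of s1 stays in the alignment.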
diff --git a/strainphlan_src/add_metadata_tree.py b/strainphlan_src/add_metadata_tree.py
new file mode 100755
index 0000000..3aec59a
--- /dev/null
+++ b/strainphlan_src/add_metadata_tree.py
@@ -0,0 +1,109 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+
+import sys
+import os
+import argparse as ap
+import pandas
+import copy
+import ConfigParser
+import dendropy
+import numpy
+
+
+def read_params():
+ p = ap.ArgumentParser()
+ p.add_argument('--ifn_trees', nargs='+', required=True, default=None, type=str)
+ p.add_argument('--ifn_metadatas', nargs='+', required=True, default=None, type=str)
+ p.add_argument('--string_to_remove',
+ required=False, default='', type=str,
+ help='string to be removed in the tree node names')
+ p.add_argument(
+ '--metadatas',
+ nargs='+',
+ required=False,
+ default=['all'],
+ type=str,
+ help='The metadata fields that you want to add. '\
+ 'Default: add all metadata from the first line.')
+ return vars(p.parse_args())
+
+
+def get_index_col(ifn):
+ with open(ifn, 'r') as ifile:
+ line = ifile.readline()
+ line = line.strip().split()
+ for i in range(len(line)):
+ if line[i].upper() == 'SAMPLEID':
+ return i
+ return -1
+
+
+def main(args):
+ add_fields = args['metadatas']
+ for ifn_tree in args['ifn_trees']:
+ print 'Input:', ifn_tree
+ df_list = []
+ samples = []
+ for ifn in args['ifn_metadatas']:
+ index_col = get_index_col(ifn)
+ df = pandas.read_csv(
+ ifn,
+ sep='\t',
+ dtype=unicode,
+ header=0,
+ index_col=index_col)
+ df = df.transpose()
+ df_list.append(df)
+ samples += df.columns.values.tolist()
+ if add_fields == ['all']:
+ with open(ifn, 'r') as ifile:
+ add_fields = [f for f in ifile.readline().strip().split('\t') \
+ if f.upper() != 'SAMPLEID']
+ print 'number of samples in metadata: %d'%len(samples)
+ count = 0
+ with open(ifn_tree, 'r') as ifile:
+ line = ifile.readline()
+ line = line.replace(args['string_to_remove'], '')
+ tree = dendropy.Tree(stream=open(ifn_tree, 'r'), schema='newick')
+ for node in tree.leaf_nodes():
+ sample = node.get_node_str().strip("'")
+ sample = sample.replace(args['string_to_remove'], '')
+ prefixes = [prefix for prefix in
+ ['k__', 'p__', 'c__', 'o__',
+ 'f__', 'g__', 's__'] \
+ if prefix in sample]
+
+ metadata = sample
+ if len(prefixes) == 0:
+ count += 1
+ for meta in add_fields:
+ old_meta = meta
+ for i in range(len(df_list)):
+ if sample in df_list[i].columns.values.tolist():
+ df = df_list[i]
+ if meta.lower() in df[sample]:
+ meta = meta.lower()
+ elif meta.upper() in df[sample]:
+ meta = meta.upper()
+ elif meta.title() in df[sample]:
+ meta = meta.title()
+ if meta in df[sample]:
+ metadata += '|%s-%s'%(
+ old_meta,
+ str(df[sample][meta]).replace(':','_'))
+ break # take the first metadata
+
+ line = line.replace(sample + ':', metadata + ':')
+
+ ofn_tree = ifn_tree + '.metadata'
+ print 'Number of samples in tree: %d'%count
+ print 'Output:', ofn_tree
+ with open(ofn_tree, 'w') as ofile:
+ ofile.write(line)
+
+if __name__ == "__main__":
+ args = read_params()
+ main(args)
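The label rewriting done by add_metadata_tree.py above comes down to a string substitution on the Newick text. The following is a minimal sketch under stated assumptions: the function name annotate_newick and the dict-of-dicts metadata layout are hypothetical stand-ins for the pandas frames and dendropy traversal used in the script, and the case-insensitive field lookup is left out. The label format, sample|field-value with ':' in values replaced by '_', follows the code above.

```python
def annotate_newick(newick, sample2meta, fields):
    """Append '|field-value' to each known leaf label in a Newick string."""
    for sample, meta in sample2meta.items():
        label = sample
        for field in fields:
            if field in meta:
                # ':' separates label and branch length in Newick
                value = str(meta[field]).replace(':', '_')
                label += '|%s-%s' % (field, value)
        newick = newick.replace(sample + ':', label + ':')
    return newick


tree = '(A:0.1,B:0.2);'
meta = {'A': {'country': 'IT'}, 'B': {'country': 'US', 'note': 'a:b'}}
annotated = annotate_newick(tree, meta, ['country', 'note'])
```

Because this relies on plain substring replacement, as the script does, a sample name that is a suffix of another label could be rewritten in the wrong place; that limitation is inherited rather than solved here.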
diff --git a/strainphlan_src/build_tree_single_strain.py b/strainphlan_src/build_tree_single_strain.py
new file mode 100755
index 0000000..c1ede91
--- /dev/null
+++ b/strainphlan_src/build_tree_single_strain.py
@@ -0,0 +1,146 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+__author__ = 'Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '0.1'
+__date__ = '17 Sep 2015'
+
+import sys
+import os
+import argparse
+import numpy
+from Bio import SeqIO
+import glob
+
+
+def read_params():
+ p = argparse.ArgumentParser()
+ p.add_argument(
+ '--ifn_alignments',
+ nargs='+',
+ required=True,
+ default=None,
+ type=str,
+ help='The alignment file.')
+ p.add_argument(
+ '--log_ofn',
+ required=True,
+ default=None,
+ type=str,
+ help='The log file.')
+ p.add_argument(
+ '--nprocs',
+ required=True,
+ default=None,
+ type=int,
+ help='Number of processors.')
+ p.add_argument(
+ '--bootstrap_raxml',
+ required=False,
+ default=0,
+ type=int,
+ help='The number of runs for bootstrapping when building the tree. '\
+ 'Default 0.')
+ p.add_argument(
+ '--verbose',
+ required=False,
+ dest='quiet',
+ action='store_false',
+ help='Show all information. Default "not set".')
+ p.set_defaults(quiet=True)
+
+ return p.parse_args()
+
+
+def run(cmd):
+ print cmd
+ os.system(cmd)
+
+
+def main(args):
+ lfile = open(args.log_ofn, 'w')
+ for ifn_alignment in args.ifn_alignments:
+ if 'remove_' in ifn_alignment:
+ continue
+ sample2polrate = {}
+ ifn_polymorphic = ifn_alignment.replace('.fasta', '.polymorphic')
+ singles = set([])
+ with open(ifn_polymorphic, 'r') as ifile:
+ for line in ifile:
+ if line[0] == '#':
+ continue
+ line = line.strip().split()
+ val = float(line[1])
+ if line[0][:3] in ['k__', 'p__', 'c__', 'o__', 'f__', 'g__', 's__', 't__']:
+ singles.add(line[0])
+ continue
+ sample2polrate[line[0]] = val
+ median = numpy.median(sample2polrate.values())
+ std = numpy.std(sample2polrate.values())
+ for s in sample2polrate:
+ if sample2polrate[s] <= median + std:
+ singles.add(s)
+
+ if len(sample2polrate):
+ log_line = '%s\t%d\t%d\t%f\n'%\
+ (os.path.basename(ifn_polymorphic).replace('.polymorphic', ''),
+ len(singles),
+ len(sample2polrate),
+ float(len(singles)) / len(sample2polrate))
+ else:
+ log_line = '%s\t%d\t%d\t%f\n'%\
+ (os.path.basename(ifn_polymorphic).replace('.polymorphic', ''),
+ len(singles),
+ len(sample2polrate),
+ 0)
+ lfile.write(log_line)
+
+ ifn_alignment2 = ifn_alignment.replace('.fasta', '') + '.remove_multiple_strains.fasta'
+ with open(ifn_alignment2, 'w') as ofile:
+ for rec in SeqIO.parse(open(ifn_alignment, 'r'), 'fasta'):
+ if rec.name in singles:
+ SeqIO.write(rec, ofile, 'fasta')
+
+ with open(ifn_alignment2 + '.log', 'w') as ofile:
+ ofile.write(log_line)
+
+ output_suffix = os.path.basename(ifn_alignment2).replace('.polymorphic', '').replace('.fasta', '')
+ output_suffix += '.tree'
+ if args.bootstrap_raxml:
+ cmd = 'raxmlHPC-PTHREADS-SSE3 '
+ cmd += '-f a '
+ cmd += '-m GTRGAMMA '
+ #cmd += '-b 1234 '
+ cmd += '-x 1234 '
+ cmd += '-N %d '%(args.bootstrap_raxml)
+ cmd += '-s %s '%os.path.abspath(ifn_alignment2)
+ cmd += '-w %s '%os.path.abspath(os.path.dirname(ifn_alignment2))
+ cmd += '-n %s '%output_suffix
+ cmd += '-p 1234 '
+ else:
+ cmd = 'raxmlHPC-PTHREADS-SSE3 '
+ cmd += '-m GTRCAT '
+ cmd += '-s %s '%os.path.abspath(ifn_alignment2)
+ cmd += '-w %s '%os.path.abspath(os.path.dirname(ifn_alignment2))
+ cmd += '-n %s '%output_suffix
+ cmd += '-p 1234 '
+ if args.nprocs:
+ cmd += '-T %d '%(args.nprocs)
+ raxfns = glob.glob('%s/RAxML_*%s*'%(os.path.dirname(ifn_alignment2), output_suffix))
+ for fn in raxfns:
+ os.remove(fn)
+ '''
+ if len(raxfns) == 0:
+ run(cmd)
+ '''
+ run(cmd)
+ lfile.close()
+
+
+
+
+
+if __name__ == "__main__":
+ args = read_params()
+ main(args)
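The selection rule in build_tree_single_strain.py above keeps a sample when its percentage of polymorphic sites is at most the median plus one standard deviation across samples. A minimal sketch of that rule follows, with some assumptions: single_strain_samples is a hypothetical name, the standard-library statistics module (Python 3) replaces numpy, and pstdev is used because numpy.std defaults to the population standard deviation. Unlike the script, it returns an empty set for empty input and does not handle the taxonomy-prefixed reference entries, which the script keeps unconditionally.

```python
import statistics


def single_strain_samples(sample2polrate):
    """Samples whose polymorphic rate is <= median + population std."""
    rates = list(sample2polrate.values())
    if not rates:
        return set()
    cutoff = statistics.median(rates) + statistics.pstdev(rates)
    return set(s for s, r in sample2polrate.items() if r <= cutoff)


rates = {'a': 0.1, 'b': 0.2, 'c': 0.3, 'd': 5.0}
kept = single_strain_samples(rates)
```

Here the outlier d raises the standard deviation, but the median keeps the cutoff low enough that d is still excluded.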
diff --git a/strainphlan_src/compute_distance.py b/strainphlan_src/compute_distance.py
new file mode 100755
index 0000000..f72b38d
--- /dev/null
+++ b/strainphlan_src/compute_distance.py
@@ -0,0 +1,195 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+__author__ = 'Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '0.1'
+__date__ = '1 Sep 2014'
+
+import sys
+import os
+ABS_PATH = os.path.abspath(sys.argv[0])
+MAIN_DIR = os.path.dirname(ABS_PATH)
+os.environ['PATH'] += ':%s'%MAIN_DIR
+sys.path.append(MAIN_DIR)
+
+from mixed_utils import dist2file, statistics
+import argparse as ap
+from Bio import SeqIO, Seq, SeqRecord
+from collections import defaultdict
+import numpy
+from ooSubprocess import ooSubprocess
+
+
+'''
+SUBST = {
+ 'A':{'A':0.0, 'C':1.0, 'G':1.0, 'T':1.0, '-':1.0},
+ 'C':{'A':1.0, 'C':0.0, 'G':1.0, 'T':1.0, '-':1.0},
+ 'G':{'A':1.0, 'C':1.0, 'G':0.0, 'T':1.0, '-':1.0},
+ 'T':{'A':1.0, 'C':1.0, 'G':1.0, 'T':0.0, '-':1.0},
+ '-':{'A':1.0, 'C':1.0, 'G':1.0, 'T':1.0, '-':0.0}}
+'''
+
+
+def read_params():
+ p = ap.ArgumentParser()
+ p.add_argument('--ifn_alignment', required=True, default=None, type=str)
+ p.add_argument('--ofn_prefix', required=True, default=None, type=str)
+ p.add_argument('--count_gaps',
+ required=False,
+ dest='ignore_gaps',
+ action='store_false')
+ p.set_defaults(ignore_gaps=True)
+ p.add_argument('--overwrite',
+ required=False,
+ dest='overwrite',
+ action='store_true')
+ p.set_defaults(overwrite=False)
+
+ return vars(p.parse_args())
+
+
+def get_dist(seq1, seq2, ignore_gaps):
+ if len(seq1) != len(seq2):
+ print >> sys.stderr, 'Error: Two sequences have different lengths!'
+ print >> sys.stderr, 'Cannot compute the distance!'
+ exit(1)
+
+ abs_dist = 0.0
+ for i in range(len(seq1)):
+ if seq1[i] != seq2[i]:
+ if ignore_gaps:
+ if seq1[i] != '-' and seq2[i] != '-':
+ abs_dist += 1.0
+ else:
+ abs_dist += 1.0
+
+ abs_sim = len(seq1) - abs_dist
+ rel_dist = abs_dist / float(len(seq1))
+ rel_sim = 1.0 - rel_dist
+ return abs_dist, rel_dist, abs_sim, rel_sim
+
+
+def compute_dist_matrix(ifn_alignment, ofn_prefix, ignore_gaps):
+ ofn_abs_dist = ofn_prefix + '.abs_dist'
+ if os.path.isfile(ofn_abs_dist):
+ print 'File %s exists, skip!'%ofn_abs_dist
+ return
+ else:
+ print 'Compute dist_matrix for %s'%ofn_abs_dist
+ #print 'Compute dist_matrix for %s'%ofn_abs_dist
+
+ recs = [rec for rec in SeqIO.parse(open((ifn_alignment), 'r'), 'fasta')]
+ abs_dist = numpy.zeros((len(recs), len(recs)))
+ abs_dist_flat = []
+ rel_dist = numpy.zeros((len(recs), len(recs)))
+ rel_dist_flat = []
+ abs_sim = numpy.zeros((len(recs), len(recs)))
+ abs_sim_flat = []
+ rel_sim = numpy.zeros((len(recs), len(recs)))
+ rel_sim_flat = []
+
+ for i in range(len(recs)):
+ for j in range(i+1, len(recs)):
+ abs_d, rel_d, abs_s, rel_s = get_dist(recs[i].seq,
+ recs[j].seq,
+ ignore_gaps)
+
+ abs_dist[i][j] = abs_d
+ abs_dist[j][i] = abs_d
+ abs_dist_flat.append(abs_d)
+
+ rel_dist[i][j] = rel_d
+ rel_dist[j][i] = rel_d
+ rel_dist_flat.append(rel_d)
+
+ abs_sim[i][j] = abs_s
+ abs_sim[j][i] = abs_s
+ abs_sim_flat.append(abs_s)
+
+ rel_sim[i][j] = rel_s
+ rel_sim[j][i] = rel_s
+ rel_sim_flat.append(rel_s)
+
+
+ labels = [rec.name for rec in recs]
+ oosp = ooSubprocess()
+
+ ofn_abs_dist = ofn_prefix + '.abs_dist'
+ dist2file(abs_dist, labels, ofn_abs_dist)
+ with open(ofn_abs_dist + '.info', 'w') as ofile:
+ ofile.write(statistics(abs_dist_flat)[1])
+ '''
+ if len(abs_dist_flat) > 0:
+ oosp.ex('hclust2.py',
+ args=['-i', ofn_abs_dist,
+ '-o', ofn_abs_dist + '.png',
+ '--f_dist_f', 'euclidean',
+ '--s_dist_f', 'euclidean',
+ '-l', '--dpi', '300',
+ '--flabel_size', '5',
+ '--slabel_size', '5',
+ '--max_flabel_len', '200'])
+ '''
+
+ ofn_rel_dist = ofn_prefix + '.rel_dist'
+ dist2file(rel_dist, labels, ofn_rel_dist)
+ with open(ofn_rel_dist + '.info', 'w') as ofile:
+ ofile.write(statistics(rel_dist_flat)[1])
+ '''
+ if len(rel_dist_flat) > 0:
+ oosp.ex('hclust2.py',
+ args=['-i', ofn_rel_dist,
+ '-o', ofn_rel_dist + '.png',
+ '--f_dist_f', 'euclidean',
+ '--s_dist_f', 'euclidean',
+ '-l', '--dpi', '300',
+ '--flabel_size', '5',
+ '--slabel_size', '5',
+ '--max_flabel_len', '200'])
+ '''
+
+ ofn_abs_sim = ofn_prefix + '.abs_sim'
+ dist2file(abs_sim, labels, ofn_abs_sim)
+ with open(ofn_abs_sim + '.info', 'w') as ofile:
+ ofile.write(statistics(abs_sim_flat)[1])
+ '''
+ if len(abs_sim_flat) > 0:
+ oosp.ex('hclust2.py',
+ args=['-i', ofn_abs_sim,
+ '-o', ofn_abs_sim + '.png',
+ '--f_dist_f', 'euclidean',
+ '--s_dist_f', 'euclidean',
+ '-l', '--dpi', '300',
+ '--flabel_size', '5',
+ '--slabel_size', '5',
+ '--max_flabel_len', '200'])
+ '''
+
+ ofn_rel_sim = ofn_prefix + '.rel_sim'
+ dist2file(rel_sim, labels, ofn_rel_sim)
+ with open(ofn_rel_sim + '.info', 'w') as ofile:
+ ofile.write(statistics(rel_sim_flat)[1])
+ '''
+ if len(rel_sim_flat) > 0:
+ oosp.ex('hclust2.py',
+ args=['-i', ofn_rel_sim,
+ '-o', ofn_rel_sim + '.png',
+ '--f_dist_f', 'euclidean',
+ '--s_dist_f', 'euclidean',
+ '-l', '--dpi', '300',
+ '--flabel_size', '5',
+ '--slabel_size', '5',
+ '--max_flabel_len', '200'])
+ '''
+
+
+def main(args):
+ compute_dist_matrix(
+ args['ifn_alignment'],
+ args['ofn_prefix'],
+ args['ignore_gaps'])
+
+if __name__ == "__main__":
+ args = read_params()
+ main(args)
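The distance used by compute_distance.py above is a per-column mismatch count over two aligned sequences of equal length. A minimal sketch follows; pairwise_distance is a hypothetical name, it raises ValueError instead of exiting, and it returns only the absolute and relative distance (the script also derives the two similarity values from them). As in get_dist, the relative distance is divided by the full alignment length, so columns skipped because of gaps still count in the denominator.

```python
def pairwise_distance(seq1, seq2, ignore_gaps=True):
    """Return (absolute, relative) mismatch distance of two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError('sequences have different lengths')
    mismatches = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        if ignore_gaps and (a == '-' or b == '-'):
            # a gap against a base is not counted as a difference
            continue
        mismatches += 1
    return mismatches, mismatches / float(len(seq1))


abs_d, rel_d = pairwise_distance('ACGT-A', 'ACCTTA')
abs_g, rel_g = pairwise_distance('ACGT-A', 'ACCTTA', ignore_gaps=False)
```

The second call corresponds to passing --count_gaps on the command line, which switches ignore_gaps off.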
diff --git a/strainphlan_src/dump_file.py b/strainphlan_src/dump_file.py
new file mode 100755
index 0000000..9133804
--- /dev/null
+++ b/strainphlan_src/dump_file.py
@@ -0,0 +1,77 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+
+import sys
+import argparse as ap
+import bz2
+import gzip
+import tarfile
+#import logging.config
+#sys.path.append('../pyphlan')
+#sys.path.append('pyphlan')
+import ooSubprocess
+
+#logging.config.fileConfig('logging.ini', disable_existing_loggers=False)
+#logger = logging.getLogger(__name__)
+
+
+def read_params():
+    p = ap.ArgumentParser()
+    p.add_argument('--input_file', required=True, default=None, type=str)
+
+    return vars(p.parse_args())
+
+
+def dump_file(ifn):
+    file_ext = ''
+    if ifn.endswith('.tar.bz2'):
+        ifile = tarfile.open(ifn, 'r:bz2')
+        file_ext = '.tar.bz2'
+    elif ifn.endswith('.tar.gz'):
+        ifile = tarfile.open(ifn, 'r:gz')
+        file_ext = '.tar.gz'
+    elif ifn.endswith('.bz2'):
+        ifile = bz2.BZ2File(ifn, 'r')
+        file_ext = '.bz2'
+    elif ifn.endswith('.gz'):
+        ifile = gzip.GzipFile(ifn, 'r')
+        file_ext = '.gz'
+    elif ifn.endswith('.fastq'):
+        ifile = open(ifn, 'r')
+        file_ext = '.fastq'
+    elif ifn.endswith('.sam'):
+        ifile = open(ifn, 'r')
+        file_ext = '.sam'
+    elif ifn.endswith('.sra'):
+        oosp = ooSubprocess.ooSubprocess()
+        ifile = oosp.ex(
+            'fastq-dump',
+            args=[
+                '-Z', ifn,
+                '--split-spot'],
+            get_out_pipe=True)
+        file_ext = '.sra'
+    else:
+        raise Exception('Unrecognized format! The format should be .bz2, .gz, '\
+                        '.tar.bz2, .tar.gz, .sra, .sam.bz2, .sam, or .fastq\n')
+
+    try:
+        if file_ext in ['.tar.bz2', '.tar.gz']:
+            for tar_info in ifile:
+                ifile2 = ifile.extractfile(tar_info)
+                if ifile2 is not None:
+                    for line in ifile2:
+                        sys.stdout.write(line)
+        else:
+            for line in ifile:
+                sys.stdout.write(line)
+    except:
+        sys.stderr.write('Error while dumping file %s\n'%ifn)
+        raise
+
+
+if __name__ == "__main__":
+    args = read_params()
+    dump_file(args['input_file'])
diff --git a/strainphlan_src/extract_markers.py b/strainphlan_src/extract_markers.py
new file mode 100755
index 0000000..2c8ca9c
--- /dev/null
+++ b/strainphlan_src/extract_markers.py
@@ -0,0 +1,45 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+__author__ = 'Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '0.1'
+__date__ = '1 Sep 2014'
+
+import sys
+import os
+import argparse as ap
+import pickle
+import bz2
+from Bio import SeqIO, Seq, SeqRecord
+
+def read_params():
+    p = ap.ArgumentParser()
+    p.add_argument('--mpa_pkl', required=True, default=None, type=str)
+    p.add_argument('--ifn_markers', required=True, default=None, type=str)
+    p.add_argument('--clade', required=True, default=None, type=str)
+    p.add_argument('--ofn_markers', required=True, default=None, type=str)
+    return vars(p.parse_args())
+
+
+def extract_markers(mpa_pkl, ifn_markers, clade, ofn_markers):
+    with open(mpa_pkl, 'rb') as ifile:
+        db = pickle.loads(bz2.decompress(ifile.read()))
+    markers = set([])
+    for marker in db['markers']:
+        if clade == db['markers'][marker]['taxon'].split('|')[-1]:
+            markers.add(marker)
+    print 'number of markers', len(markers)
+    with open(ofn_markers, 'w') as ofile:
+        for rec in SeqIO.parse(open(ifn_markers, 'r'), 'fasta'):
+            if rec.name in markers:
+                SeqIO.write(rec, ofile, 'fasta')
+
+
+if __name__ == "__main__":
+    args = read_params()
+    extract_markers(
+        mpa_pkl=args['mpa_pkl'],
+        ifn_markers=args['ifn_markers'],
+        clade=args['clade'],
+        ofn_markers=args['ofn_markers'])
diff --git a/strainphlan_src/fastx_len_filter.py b/strainphlan_src/fastx_len_filter.py
new file mode 100755
index 0000000..84af5ab
--- /dev/null
+++ b/strainphlan_src/fastx_len_filter.py
@@ -0,0 +1,17 @@
+#!/usr/bin/env python
+from Bio import SeqIO
+import argparse as ap
+import sys
+
+def read_params(args):
+    p = ap.ArgumentParser(description = 'fastx_len_filter.py Parameters\n')
+    p.add_argument('--min_len', required = True, default = None, type = int)
+    return vars(p.parse_args())
+
+if __name__ == '__main__':
+    args = read_params(sys.argv)
+    min_len = args['min_len']
+    with sys.stdout as outf:
+        for r in SeqIO.parse(sys.stdin, "fastq"):
+            if len(r) >= min_len:
+                SeqIO.write(r, outf, "fastq")
diff --git a/strainphlan_src/fix_AF1.py b/strainphlan_src/fix_AF1.py
new file mode 100755
index 0000000..b24edd5
--- /dev/null
+++ b/strainphlan_src/fix_AF1.py
@@ -0,0 +1,36 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+
+import sys
+import argparse
+
+def read_params():
+    p = argparse.ArgumentParser()
+    p.add_argument('--input_file', required=True, default='-', type=str)
+
+    return vars(p.parse_args())
+
+
+def fix_AF1(ifn):
+    if ifn == '-':
+        ifile = sys.stdin
+    else:
+        ifile = open(ifn, 'r')
+
+    for line in ifile:
+        if line[0] != '#':
+            if 'AF1=0' in line:
+                spline = line.split()
+                if spline[3] != spline[4] and spline[4].upper() in ['A', 'T', 'C', 'G']:
+                    line = line.replace('AF1=0', 'AF1=1')
+        sys.stdout.write(line)
+
+    if ifn != '-':
+        ifile.close()
+
+
+if __name__ == "__main__":
+    args = read_params()
+    fix_AF1(args['input_file'])
diff --git a/strainphlan_src/logging.ini b/strainphlan_src/logging.ini
new file mode 100755
index 0000000..528b555
--- /dev/null
+++ b/strainphlan_src/logging.ini
@@ -0,0 +1,22 @@
+[loggers]
+keys=root
+
+[handlers]
+keys=consoleHandler
+
+[formatters]
+keys=simpleFormatter
+
+[logger_root]
+level=DEBUG
+handlers=consoleHandler
+
+[handler_consoleHandler]
+class=StreamHandler
+level=DEBUG
+formatter=simpleFormatter
+args=(sys.stdout,)
+
+[formatter_simpleFormatter]
+format=%(asctime)s | %(name)s | %(levelname)s | %(funcName)s | %(lineno)d | %(message)s
+datefmt=
diff --git a/strainphlan_src/mixed_utils.py b/strainphlan_src/mixed_utils.py
new file mode 100755
index 0000000..1b46ab8
--- /dev/null
+++ b/strainphlan_src/mixed_utils.py
@@ -0,0 +1,99 @@
+#!/usr/bin/env python
+# Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+__author__ = 'Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '0.1'
+__date__ = '1st Sep 2014'
+import bz2
+import numpy
+import sys
+
+def dist2file(dist, labels, ofn):
+    with open(ofn, 'w') as ofile:
+        ofile.write('ID')
+        for label in labels:
+            ofile.write('\t%s'%label)
+        ofile.write('\n')
+        for i in range(len(labels)):
+            ofile.write('%s\t'%labels[i])
+            for j in range(len(labels)):
+                if j == len(labels) - 1:
+                    ofile.write('%f\n'%dist[i][j])
+                else:
+                    ofile.write('%f\t'%dist[i][j])
+
+
+
+def statistics(vals):
+    vals = numpy.array(vals)
+    result = {}
+    if len(vals.shape) == 1:
+        num_elems = len(vals)
+        nrows = num_elems
+        ncols = 1
+    else:
+        nrows, ncols = vals.shape
+        num_elems = nrows * ncols
+    if num_elems > 0:
+        result['nrows'] = nrows
+        result['ncols'] = ncols
+        result['size'] = num_elems
+        result['average'] = numpy.average(vals)
+        result['min'] = numpy.min(vals)
+        result['max'] = numpy.max(vals)
+        result['median'] = numpy.percentile(vals, 50)
+        result['percentile_25'] = numpy.percentile(vals, 25)
+        result['percentile_75'] = numpy.percentile(vals, 75)
+    else:
+        result['nrows'] = nrows
+        result['ncols'] = ncols
+        result['size'] = num_elems
+        result['average'] = 0
+        result['min'] = 0
+        result['max'] = 0
+        result['median'] = 0
+        result['percentile_25'] = 0
+        result['percentile_75'] = 0
+
+    str_result = ''
+    for key in ['nrows',
+                'ncols',
+                'size',
+                'average',
+                'min',
+                'max',
+                'median',
+                'percentile_25',
+                'percentile_75']:
+        str_result += '%s: %s\n'%(key, result[key])
+
+    return result, str_result
+
+
+
+def dict2str(dict_var):
+    result = ''
+    for key in dict_var:
+        result += '%s: %s\n'%(key, dict_var[key])
+    return result
+
+
+def openr( fn, mode = "r" ):
+    if fn is None:
+        return sys.stdin
+    return bz2.BZ2File(fn) if fn.endswith(".bz2") else open(fn,mode)
+
+
+def openw( fn ):
+    if fn is None:
+        return sys.stdout
+    return bz2.BZ2File(fn,"w") if fn.endswith(".bz2") else open(fn,"w")
+
+
+def is_number(s):
+    try:
+        int(s)
+        return True
+    except ValueError:
+        return False
diff --git a/strainphlan_src/ooSubprocess.py b/strainphlan_src/ooSubprocess.py
new file mode 100755
index 0000000..66d435e
--- /dev/null
+++ b/strainphlan_src/ooSubprocess.py
@@ -0,0 +1,300 @@
+#!/usr/bin/env python
+# Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+
+import subprocess
+import os
+import multiprocessing
+from multiprocessing.pool import ThreadPool
+import sys
+import cStringIO
+from tempfile import NamedTemporaryFile
+import which
+import functools
+import traceback
+import numpy
+
+
+class ooSubprocessException(Exception):
+ pass
+
+class ooSubprocess:
+
+ def __init__(self, tmp_dir='tmp/'):
+ self.chain_cmds = []
+ self.tmp_dir = tmp_dir
+ mkdir(tmp_dir)
+
+ def ex(
+ self,
+ prog,
+ args=[],
+ get_output=False,
+ get_out_pipe=False,
+ out_fn=None,
+ in_pipe=None,
+ verbose=True,
+ **kwargs):
+
+ if not which.is_exe(prog):
+ raise ooSubprocessException('Error: cannot find the program %s '\
+ 'in the executable path!'%prog)
+
+ if isinstance(args, str):
+ args = args.split()
+
+ if not isinstance(args, list):
+ args = [args]
+
+ cmd = [prog] + args
+ print_cmd = 'ooSubprocess: ' + ' '.join(cmd)
+ if verbose and out_fn and (not get_output):
+ print_stderr(print_cmd + ' > ' + out_fn)
+ elif verbose:
+ print_stderr(print_cmd)
+
+ if get_output:
+ result = subprocess.check_output(
+ cmd,
+ stdin=in_pipe,
+ **kwargs)
+ elif get_out_pipe:
+ tmp_file = NamedTemporaryFile(dir=self.tmp_dir)
+ p = subprocess.Popen(
+ cmd,
+ stdin=in_pipe,
+ stdout=tmp_file,
+ **kwargs)
+ p.wait()
+ if in_pipe != None:
+ in_pipe.close()
+ tmp_file.seek(0)
+ result = tmp_file
+ elif out_fn:
+ ofile = open(out_fn, 'w') if out_fn else None
+ result = subprocess.check_call(
+ cmd,
+ stdin=in_pipe,
+ stdout=ofile,
+ **kwargs)
+ ofile.close()
+ else:
+ result = subprocess.check_call(
+ cmd,
+ stdin=in_pipe,
+ **kwargs)
+ return result
+
+ def chain(
+ self,
+ prog,
+ args=[],
+ stop=False,
+ in_pipe=None,
+ get_output=False,
+ get_out_pipe=False,
+ out_fn=None,
+ verbose=True,
+ **kwargs):
+
+ if not which.is_exe(prog):
+ raise ooSubprocessException('Error: cannot find the program %s '\
+ 'in the executable path!'%prog)
+
+ if in_pipe is None and self.chain_cmds != []:
+ raise ooSubprocessException(
+ 'The pipeline was not stopped before creating a new one! '\
+ 'In cache: %s' % (' | '.join(self.chain_cmds)))
+ if out_fn and stop == False:
+ raise ooSubprocessException(
+ 'out_fn (output_file_name) is only specified when stop = True!')
+
+ if isinstance(args, str):
+ args = args.split()
+
+ if not isinstance(args, list):
+ args = [args]
+ cmd = [prog] + args
+
+ print_cmd = ' '.join(cmd)
+ if out_fn and (not get_output):
+ print_cmd += ' > ' + out_fn
+ self.chain_cmds.append(print_cmd)
+
+ if stop:
+ if in_pipe is None:
+ raise ooSubprocessException('No input process to create a pipeline!')
+
+ if verbose:
+ print_stderr('ooSubprocess: ' + ' | '.join(self.chain_cmds))
+
+ self.chain_cmds = []
+ if get_output:
+ result = subprocess.check_output(
+ cmd,
+ stdin=in_pipe,
+ **kwargs)
+ elif get_out_pipe:
+ tmp_file = NamedTemporaryFile(dir=self.tmp_dir)
+ p = subprocess.Popen(
+ cmd,
+ stdin=in_pipe,
+ stdout=tmp_file,
+ **kwargs)
+ return_code = p.wait()
+ if return_code != 0:
+ raise ooSubprocessException(
+ 'Failed when executing the command: %s\n'\
+ 'return code: %s'\
+ %(' | '.join(self.chain_cmds), return_code))
+ tmp_file.seek(0)
+ if in_pipe != None:
+ in_pipe.close()
+
+ result = tmp_file
+ elif out_fn:
+ ofile = open(out_fn, 'w')
+ result = subprocess.check_call(
+ cmd,
+ stdin=in_pipe,
+ stdout=ofile,
+ **kwargs)
+ ofile.close()
+ else:
+ result = subprocess.check_call(
+ cmd,
+ stdin=in_pipe,
+ **kwargs)
+ else:
+ tmp_file = NamedTemporaryFile(dir=self.tmp_dir)
+ p = subprocess.Popen(
+ cmd,
+ stdin=in_pipe,
+ stdout=tmp_file,
+ **kwargs)
+ return_code = p.wait()
+ if return_code != 0:
+ raise ooSubprocessException(
+ 'Failed when executing the command: %s\n'\
+ 'return code: %s'\
+ %(' | '.join(self.chain_cmds), return_code))
+ tmp_file.seek(0)
+ if in_pipe != None:
+ in_pipe.close()
+
+ result = tmp_file
+ return result
+
+ def ftmp(self, ifn):
+ return os.path.join(self.tmp_dir, os.path.basename(ifn))
+
+
+def fdir(dir, ifn):
+ return os.path.join(dir, os.path.basename(ifn))
+
+
+def mkdir(dir):
+ if not os.path.exists(dir):
+ try:
+ os.makedirs(dir)
+ except OSError as e:
+ if e.errno != 17:
+ raise
+ pass
+ elif not os.path.isdir(dir):
+ raise ooSubprocessException('Error: %s is not a directory!' % dir)
+
+
+def replace_ext(ifn, old_ext, new_ext):
+ # if not os.path.isfile(ifn):
+ # print_stderr('Error: file %s does not exist!'%(ifn))
+ # exit(1)
+ if ifn[len(ifn) - len(old_ext):] != old_ext:
+ # print_stderr('Error: the old file extension %s does not match!'%old_ext)
+ # exit(1)
+ new_ifn = ifn + new_ext
+ else:
+ new_ifn = ifn[:len(ifn) - len(old_ext)] + new_ext
+ return new_ifn
+
+
+def splitext(ifn):
+ basename = os.path.basename(ifn)
+ if ifn.endswith('.tar.bz2'):
+ ext = '.tar.bz2'
+ elif ifn.endswith('.tar.gz'):
+ ext = '.tar.gz'
+ else:
+ ext = basename.split('.')[-1]
+ if ext != basename:
+ ext = '.' + ext
+ base = basename[:-len(ext)]
+ for t in ['.sam', '.fastq', '.fasta', '.fna']:
+ if base.endswith(t):
+ ext = t + ext
+ base = base[:-len(t)]
+ return base, ext
+
+
+def trace_unhandled_exceptions(f):
+ @functools.wraps(f)
+ def wrapper(*args, **kwargs):
+ try:
+ return f(*args, **kwargs)
+ except:
+ #traceback.print_exc()
+ #raise Exception(''.join(traceback.format_exception(*sys.exc_info())))
+ raise Exception(traceback.format_exc())
+ return wrapper
+
+
+def parallelize(func, args, nprocs=1, use_threads=False):
+ if nprocs > 1:
+ if use_threads:
+ pool = ThreadPool(nprocs)
+ else:
+ pool = multiprocessing.Pool(nprocs)
+ results = pool.map(func, args)
+ pool.close()
+ pool.join()
+ else:
+ results = serialize(func, args)
+ return results
+
+
+def parallelize_async(func, args, nprocs=1, use_threads=False):
+ if nprocs > 1:
+ if use_threads:
+ pool = ThreadPool(nprocs)
+ else:
+ pool = multiprocessing.Pool(nprocs)
+ app_results = []
+ for a in args:
+ app_results.append(pool.apply_async(func, [a]))
+ pool.close()
+ pool.join()
+ results = [r.get() for r in app_results]
+ else:
+ results = serialize(func, args)
+ return results
+
+
+def serialize(func, args):
+ results = []
+ for arg in args:
+ results.append(func(arg))
+ return results
+
+
+def print_stderr(*args):
+ sys.stderr.write(' '.join(map(str, args)) + '\n')
+ sys.stderr.flush()
+
+
+def print_stdout(*args):
+ sys.stdout.write(' '.join(map(str, args)) + '\n')
+ sys.stdout.flush()
+
+
+
diff --git a/strainphlan_src/plot_tree_graphlan.py b/strainphlan_src/plot_tree_graphlan.py
new file mode 100755
index 0000000..f514f24
--- /dev/null
+++ b/strainphlan_src/plot_tree_graphlan.py
@@ -0,0 +1,177 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+__author__ = 'Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '0.1'
+__date__ = '4 May 2015'
+
+import sys
+import os
+import argparse as ap
+import dendropy
+from StringIO import StringIO
+import re
+from collections import defaultdict
+import ConfigParser
+import matplotlib.colors as colors
+import subprocess
+
+
+def read_params():
+ p = ap.ArgumentParser()
+ p.add_argument('--ifn_tree',
+ required=True,
+ default=None,
+ type=str,
+ help='The input tree in newick format.')
+ p.add_argument('--colorized_metadata',
+ required=False,
+ default='unset',
+ type=str,
+ help='The metadata field to colorize. Default "unset".')
+ p.add_argument('--fig_size',
+ required=False,
+ default=8,
+ type=float,
+ help='The figure size. Default "8".')
+ p.add_argument('--legend_marker_size',
+ required=False,
+ default=20,
+ type=int,
+ help='The legend marker size. Default "20".'
+ )
+ p.add_argument('--legend_font_size',
+ required=False,
+ default=10,
+ type=int,
+ help='The legend font size. Default "10".'
+ )
+ p.add_argument('--legend_marker_edge_width',
+ required=False,
+ default=0.2,
+ type=float,
+ help='The legend marker edge width. Default "0.2".'
+ )
+ p.add_argument('--leaf_marker_size',
+ required=False,
+ default=20,
+ type=int,
+ help='The leaf marker size. Default "20".'
+ )
+ p.add_argument('--leaf_marker_edge_width',
+ required=False,
+ default=0.2,
+ type=float,
+ help='The leaf marker edge width. Default "0.2".'
+ )
+ p.add_argument('--dpi',
+ required=False,
+ default=300,
+ type=int,
+ help='The figure dpi.')
+ p.add_argument('--figure_extension',
+ required=False,
+ default='.png',
+ type=str,
+ help='The figure extension. Default ".png".')
+ p.add_argument('--ofn_prefix',
+ required=False,
+ default=None,
+ type=str,
+ help='The prefix of output files.')
+ return p.parse_args()
+
+
+
+
+def run(cmd):
+ print cmd
+ subprocess.call(cmd.split())
+
+
+
+
+def main(args):
+ tree = dendropy.Tree.get_from_path(args.ifn_tree, schema='newick',
+ preserve_underscores=True)
+ tree.reroot_at_midpoint()
+ count = 0
+ metadatas = set([])
+ node2metadata = {}
+ for node in tree.preorder_node_iter():
+ nodestr = node.get_node_str().strip("'")
+ if node.is_leaf():
+ if '.' in nodestr:
+ nodestr = nodestr.replace('.',',')
+ node.taxon = dendropy.Taxon(label=nodestr)
+ substrs = re.findall(
+ '%s-[a-zA-Z0-9.]*'%args.colorized_metadata,
+ nodestr)
+ if substrs:
+ md = substrs[0].replace(args.colorized_metadata + '-', '')
+ metadatas.add(md)
+ node2metadata[nodestr] = md
+ else:
+ count += 1
+ node.taxon = dendropy.Taxon(label='node_%d'%count)
+ metadatas = sorted(list(metadatas))
+ color_names = colors.cnames.keys()
+ metadata2color = {}
+ for i, md in enumerate(metadatas):
+ metadata2color[md] = color_names[i % len(color_names)]
+
+ if not args.ofn_prefix:
+ args.ofn_prefix = args.ifn_tree
+ ofn_tree = args.ofn_prefix + '.graphlantree'
+ tree.write_to_path(ofn_tree, 'newick')
+ ofn_annot = args.ofn_prefix + '.annot'
+ with open(ofn_annot, 'w') as ofile:
+ #ofile.write('clade_separation\t0\n')
+ ofile.write('branch_bracket_width\t0\n')
+ #ofile.write('clade_separation\t0.15\n')
+ ofile.write('branch_bracket_depth\t0\n')
+ #ofile.write('branch_thickness\t1.25\n')
+ ofile.write('annotation_background_width\t0\n')
+
+ # legend
+ ofile.write('#legends\n')
+ ofile.write('class_legend_font_size\t%d\n'%args.legend_font_size)
+
+ for md in metadata2color:
+ ofile.write('%s\tclade_marker_size\t%d\n'%(md, args.legend_marker_size))
+ ofile.write('%s\tclade_marker_color\t%s\n'%(md, metadata2color[md]))
+ ofile.write('%s\tclade_marker_edge_width\t%f\n'%(md, args.legend_marker_edge_width))
+
+ # remove intermediate nodes
+ for node in tree.preorder_node_iter():
+ if not node.is_leaf():
+ nodestr = node.get_node_str().strip("'")
+ ofile.write('%s\tclade_marker_size\t0\n'%(nodestr))
+
+ # colorize leaf nodes
+ for node in tree.seed_node.leaf_nodes():
+ nodestr = node.get_node_str().strip("'")
+ if nodestr in node2metadata:
+ leaf_color = metadata2color[node2metadata[nodestr]]
+ ofile.write('%s\tclade_marker_size\t%d\n'%(nodestr, args.leaf_marker_size))
+ ofile.write('%s\tclade_marker_color\t%s\n'%(nodestr, leaf_color))
+ ofile.write('%s\tclade_marker_edge_width\t%f\n'%(nodestr, args.leaf_marker_edge_width))
+
+ ofn_xml = args.ofn_prefix + '.xml'
+ cmd = 'graphlan_annotate.py --annot %s %s %s'%(ofn_annot, ofn_tree, ofn_xml)
+ run(cmd)
+
+ ofn_fig = args.ofn_prefix + args.figure_extension
+ cmd = 'graphlan.py %s %s --dpi %d --size %f'%(ofn_xml, ofn_fig, args.dpi, args.fig_size)
+ run(cmd)
+
+ print 'Output file: %s'%ofn_fig
+
+
+
+
+if __name__ == "__main__":
+ args = read_params()
+ main(args)
+ #test()
diff --git a/strainphlan_src/sam_filter.py b/strainphlan_src/sam_filter.py
new file mode 100755
index 0000000..20a90e9
--- /dev/null
+++ b/strainphlan_src/sam_filter.py
@@ -0,0 +1,59 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+__author__ = 'Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '0.1'
+__date__ = '18 Jul 2015'
+
+import sys
+import os
+import argparse
+
+
+def read_params():
+ p = argparse.ArgumentParser()
+ p.add_argument(
+ '--input_file',
+ required=False,
+ default=None,
+ type=str,
+ help='The input sam file.')
+ p.add_argument(
+ '--min_align_score',
+ required=True,
+ default=None,
+ type=int,
+ help='The sam records with alignment score smaller than this '
+ 'percentage of the read length will be discarded.')
+ p.add_argument(
+ '--verbose',
+ required=False,
+ dest='quiet',
+ action='store_false',
+ help='Show all information. Default "not set".')
+ p.set_defaults(quiet=True)
+
+ return p.parse_args()
+
+
+def main(args):
+ if args.input_file == None:
+ ifile = sys.stdin
+ else:
+ ifile = open(args.input_file, 'r')
+ for line in ifile:
+ if line[0] == '@':
+ sys.stdout.write(line)
+ else:
+ spline = line.split()
+ read_length = len(spline[9])
+ align_score = float(spline[11].replace('AS:i:', ''))
+ if align_score >= args.min_align_score * read_length / 100.0:
+ sys.stdout.write(line)
+
+
+
+if __name__ == "__main__":
+ args = read_params()
+ main(args)
diff --git a/strainphlan_src/sample2markers.py b/strainphlan_src/sample2markers.py
new file mode 100755
index 0000000..144a0a9
--- /dev/null
+++ b/strainphlan_src/sample2markers.py
@@ -0,0 +1,421 @@
+#!/usr/bin/env python
+# Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+
+import sys
+import os
+ABS_PATH = os.path.abspath(sys.argv[0])
+MAIN_DIR = os.path.dirname(ABS_PATH)
+os.environ['PATH'] += ':%s'%MAIN_DIR
+os.environ['PATH'] += ':%s'%os.path.join(MAIN_DIR, 'strainphlan_src')
+sys.path.append(MAIN_DIR)
+sys.path.append(os.path.join(MAIN_DIR, 'strainphlan_src'))
+
+import argparse as ap
+import glob
+import ooSubprocess
+from ooSubprocess import print_stderr, trace_unhandled_exceptions
+import ConfigParser
+from Bio import SeqIO, Seq, SeqRecord
+import cStringIO
+import msgpack
+import random
+import subprocess
+import bz2
+import gzip
+import logging
+import logging.config
+import tarfile
+import threading
+import multiprocessing
+import pysam
+from collections import defaultdict
+from scipy import stats
+import numpy
+
+logging.basicConfig(level=logging.DEBUG, stream=sys.stderr,
+ disable_existing_loggers=False,
+ format='%(asctime)s | %(levelname)s | %(name)s | %(funcName)s | %(lineno)d | %(message)s')
+logger = logging.getLogger(__name__)
+
+
+
+def read_params():
+ p = ap.ArgumentParser()
+ p.add_argument('--ifn_samples', nargs='+', required=True, default=None, type=str)
+ p.add_argument('--ifn_markers', required=False, default=None, type=str)
+ p.add_argument('--output_dir', required=True, default=None, type=str)
+ p.add_argument('--nprocs', required=False, default=1, type=int)
+ p.add_argument('--min_read_len', required=False, default=90, type=int)
+ p.add_argument('--min_align_score', required=False, default=None, type=int)
+ p.add_argument('--min_base_quality', required=False, default=30, type=float)
+ p.add_argument('--error_rate', required=False, default=0.01, type=float)
+ p.add_argument('--marker2file_ext', required=False, default='.markers', type=str)
+ p.add_argument('--sam2file_ext', required=False, default='.sam.bz2', type=str)
+ p.add_argument(
+ '--verbose',
+ required=False,
+ dest='quiet',
+ action='store_false',
+ help='Show all information. Default "not set".')
+ p.set_defaults(quiet=True)
+ '''
+ p.add_argument(
+ '--use_processes',
+ required=False,
+ default=False,
+ action='store_false',
+ dest='use_threads',
+ help='Use multiprocessing. Default "Use multithreading".')
+ p.set_defaults(use_threads=True)
+ '''
+ p.add_argument(
+ '--input_type',
+ required=True,
+ default=None,
+ type=str,
+ choices=['fastq', 'sam'],
+ help='The input type:'\
+ ' fastq, sam. Sam'\
+ ' files can be obtained from a previous run of'\
+ ' this script or strainphlan.py.')
+
+ return vars(p.parse_args())
+
+
+def build_bowtie2db(ifn_markers, tmp_dir, error_pipe=None):
+ # build bowtie2-index
+ if not os.path.isfile(ifn_markers):
+ error = 'ifn_markers %s does not exist!'%ifn_markers
+ logger.error(error)
+ raise Exception(error)
+
+ if not os.path.isdir(tmp_dir):
+ ooSubprocess.mkdir(tmp_dir)
+ bt2_base = ooSubprocess.splitext(ifn_markers)[0]
+ index_fns = glob.glob('%s/%s.*'%(tmp_dir, bt2_base))
+ index_path = os.path.join(tmp_dir, bt2_base)
+ oosp = ooSubprocess.ooSubprocess(tmp_dir)
+ if index_fns == []:
+ oosp.ex(
+ 'bowtie2-build',
+ ['--quiet', ifn_markers, index_path],
+ stderr=error_pipe)
+ else:
+ logger.warning('bowtie2-indexes of %s are ready, skip rebuilding!'
+ %(bt2_base))
+ return index_path
+
+
+
+def sample2markers(
+ ifn_sample,
+ min_read_len,
+ min_align_score,
+ min_base_quality,
+ error_rate,
+ ifn_markers,
+ index_path,
+ nprocs=1,
+ sam2file=None,
+ marker2file=None,
+ tmp_dir='tmp',
+ quiet=False):
+
+ '''
+ Compute the consensus markers in a sample file ifn_sample.
+
+ :param ifn_sample: the sample file in fastq format.
+ :param marker2file: the file name to store the consensus markers.
+ :param sam2file: the file name to store the sam content.
+ :returns: if marker2file==None, return the dictionary of the consensus
+ markers. Otherwise, save the result in fasta format to the file declared by
+ marker2file
+ '''
+
+ if quiet:
+ error_pipe = open(os.devnull, 'w')
+ else:
+ error_pipe = None
+ oosp = ooSubprocess.ooSubprocess(tmp_dir)
+
+ # sample to sam
+ sample_pipe = oosp.chain(
+ 'dump_file.py',
+ args=['--input_file', ifn_sample],
+ stderr=error_pipe
+ )
+ filter_length_pipe = oosp.chain(
+ 'fastx_len_filter.py',
+ args=['--min_len', str(min_read_len)],
+ in_pipe=sample_pipe,
+ stderr=error_pipe
+ )
+ bowtie2_pipe = oosp.chain(
+ 'bowtie2',
+ args=[
+ '-U', '-',
+ '-x', index_path,
+ '--very-sensitive',
+ '--no-unal',
+ '-p', str(nprocs)],
+ in_pipe=filter_length_pipe,
+ stderr=error_pipe)
+ if sam2file == None:
+ sam_pipe = bowtie2_pipe
+ else:
+ oosp.chain(
+ 'compress_file.py',
+ args=['--output_file', sam2file],
+ in_pipe=bowtie2_pipe,
+ stderr=error_pipe,
+ stop=True)
+
+ sam_pipe = oosp.chain(
+ 'dump_file.py',
+ args=['--input_file', sam2file],
+ stderr=error_pipe)
+ ofn_bam_sorted_prefix = os.path.join(
+ tmp_dir,
+ os.path.basename(ifn_sample) + '.bam.sorted')
+
+ return sam2markers(
+ sam_file=sam_pipe,
+ ofn_bam_sorted_prefix=ofn_bam_sorted_prefix,
+ min_align_score=min_align_score, min_base_quality=min_base_quality,
+ error_rate=error_rate, marker2file=marker2file,
+ oosp=oosp,
+ tmp_dir=tmp_dir, quiet=quiet)
+
+
+
+def save2file(tmp_file, ofn):
+ logger.debug('save %s'%ofn)
+ with open(ofn, 'w') as ofile:
+ for line in tmp_file:
+ ofile.write(line)
+ tmp_file.seek(0)
+
+
+
+def sam2markers(
+ sam_file,
+ ofn_bam_sorted_prefix,
+ min_align_score=None,
+ min_base_quality=30,
+ error_rate=0.01,
+ marker2file=None,
+ oosp=None,
+ tmp_dir='tmp',
+ quiet=False):
+ '''
+ Compute the consensus markers in a sample from a sam content.
+
+ :param sam_file: a file name, a file object or subprocess.Popen object
+ containing the content of a sam file.
+ :param marker2file: the file name to store the consensus genomes.
+ :param ofn_bam_sorted_prefix: the bam sorted file prefix
+ :param oosp: an instance of ooSubprocess for running a pipe
+ :returns: if marker2file==None, return the dictionary of the consensus
+ genomes. Otherwise, save the result in fasta format to the file declared by
+ marker2file
+ '''
+
+ if quiet:
+ error_pipe = open(os.devnull, 'w')
+ else:
+ error_pipe = None
+
+ # sam content to file object
+ if oosp is None:
+ oosp = ooSubprocess.ooSubprocess()
+
+ if type(sam_file) == str:
+ p1 = oosp.chain(
+ 'dump_file.py',
+ args=['--input_file', sam_file],
+ stderr=error_pipe)
+ else:
+ p1 = sam_file
+
+ # filter sam
+ if min_align_score == None:
+ p1_filtered = p1
+ else:
+ p1_filtered = oosp.chain('sam_filter.py',
+ args=['--min_align_score',
+ str(min_align_score)],
+ in_pipe=p1,
+ stderr=error_pipe)
+ # sam to bam
+ p2 = oosp.chain(
+ 'samtools',
+ args=['view', '-bS', '-'],
+ in_pipe=p1_filtered,
+ stderr=error_pipe)
+
+ # sort bam
+ tmp_fns = glob.glob('%s*'%ofn_bam_sorted_prefix)
+ for tmp_fn in tmp_fns:
+ os.remove(tmp_fn)
+ p3 = oosp.chain(
+ 'samtools',
+ args=['sort', '-o', '-', ofn_bam_sorted_prefix],
+ in_pipe=p2,
+ stderr=error_pipe)
+
+ # extract polymorphic information
+ marker2seq = defaultdict(dict)
+ pysam.index(p3.name)
+ samfile = pysam.AlignmentFile(p3.name)
+ for pileupcolumn in samfile.pileup():
+ rname = samfile.getrname(pileupcolumn.reference_id)
+ pileup = defaultdict(int)
+ for pileupread in pileupcolumn.pileups:
+ if not pileupread.is_del and not pileupread.is_refskip: # query position is None if is_del or is_refskip is set.
+ b = pileupread.alignment.query_sequence[pileupread.query_position]
+ q = pileupread.alignment.query_qualities[pileupread.query_position]
+ if q >= min_base_quality:
+ pileup[b] += 1
+ if len(pileup):
+ f = float(max(pileup.values())) / sum(pileup.values())
+ p = stats.binom.cdf(max(pileup.values()), sum(pileup.values()), 1.0 - error_rate)
+ freq = (f, sum(pileup.values()), p)
+ else:
+ freq = (0.0, 0.0, 0.0)
+ if 'freq' not in marker2seq[rname]:
+ marker2seq[rname]['freq'] = {}
+ marker2seq[rname]['freq'][pileupcolumn.pos] = freq
+ samfile.close()
+ os.remove(p3.name + '.bai')
+
+ # bam to mpileup
+ p3.seek(0)
+ p4 = oosp.chain(
+ 'samtools',
+ args=['mpileup', '-u', '-'],
+ in_pipe=p3,
+ stderr=error_pipe)
+
+ # mpileup to vcf
+ p5 = oosp.chain(
+ 'bcftools',
+ args=['view', '-c', '-g', '-p', '1.1', '-'],
+ in_pipe=p4,
+ stderr=error_pipe)
+ #stderr=open(os.devnull, 'w'))
+
+ # fix AF1=0
+ p6 = oosp.chain(
+ 'fix_AF1.py',
+ args=['--input_file', '-'],
+ in_pipe=p5,
+ stderr=error_pipe)
+
+ # vcf to fastq
+ p7 = oosp.chain(
+ 'vcfutils.pl',
+ args=['vcf2fq'],
+ in_pipe=p6,
+ get_out_pipe=True,
+ stderr=error_pipe,
+ stop=True)
+
+ try:
+ for rec in SeqIO.parse(p7, 'fastq'):
+ marker2seq[rec.name]['seq'] = str(rec.seq).upper()
+ marker2seq = dict(marker2seq)
+ except Exception as e:
+ logger.error("sam2markers failed on file %s", sam_file)
+ raise
+
+ if type(p1) == file:
+ p1.close()
+
+ if marker2file:
+ with open(marker2file, 'wb') as ofile:
+ msgpack.dump(marker2seq, ofile)
+
+ return marker2seq
+
+
+
+
+@trace_unhandled_exceptions
+def run_sample(args_list):
+ ifn_sample = args_list[0]
+ args = args_list[1]
+ base_name = ooSubprocess.splitext(ifn_sample)[0]
+ output_prefix = os.path.join(args['output_dir'], base_name)
+ if args['sam2file_ext'] != None:
+ sam2file = output_prefix + args['sam2file_ext']
+ else:
+ sam2file = None
+ marker2file = output_prefix + args['marker2file_ext']
+ if args['input_type'] == 'fastq':
+ sample2markers(
+ ifn_sample=ifn_sample,
+ min_read_len=args['min_read_len'],
+ min_align_score=args['min_align_score'],
+ min_base_quality=args['min_base_quality'],
+ error_rate=args['error_rate'],
+ ifn_markers=args['ifn_markers'],
+ index_path=args['index_path'],
+ nprocs=args['nprocs'],
+ sam2file=sam2file,
+ marker2file=marker2file,
+ tmp_dir=args['output_dir'],
+ quiet=args['quiet'])
+ else:
+ ofn_bam_sorted_prefix = os.path.join(
+ args['output_dir'],
+ os.path.basename(ifn_sample) + '.bam.sorted')
+ sam2markers(
+ sam_file=ifn_sample,
+ ofn_bam_sorted_prefix=ofn_bam_sorted_prefix,
+ min_align_score=args['min_align_score'],
+ min_base_quality=args['min_base_quality'],
+ error_rate=args['error_rate'],
+ marker2file=marker2file,
+ quiet=args['quiet'])
+ return 0
+
+
+
+
+def compute_polymorphic_sites(sample2pileup, ifn_alignment):
+    return
+
+
+
+
+def main(args):
+    ooSubprocess.mkdir(args['output_dir'])
+    manager = multiprocessing.Manager()
+
+    if args['input_type'] == 'fastq':
+        index_path = build_bowtie2db(args['ifn_markers'], args['output_dir'])
+        args['index_path'] = index_path
+
+    args_list = []
+    for ifn_sample in args['ifn_samples']:
+        args_list.append([ifn_sample, args])
+
+    #ooSubprocess.parallelize(run_sample, args_list, args['nprocs'])
+    pool = multiprocessing.Pool(args['nprocs'])
+    results = []
+    for a in args_list:
+        r = pool.apply_async(run_sample, [a])
+        results.append(r)
+    for r in results:
+        try:
+            r.get()
+        except Exception as e:
+            print e
+
+
+
+if __name__ == "__main__":
+    args = read_params()
+    main(args)
diff --git a/strainphlan_src/which.py b/strainphlan_src/which.py
new file mode 100755
index 0000000..7a7e75e
--- /dev/null
+++ b/strainphlan_src/which.py
@@ -0,0 +1,25 @@
+#!/usr/bin/env python
+# Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+
+import os
+def which(program):
+    def is_exe(fpath):
+        return os.path.isfile(fpath) and os.access(fpath, os.X_OK)
+
+    fpath, fname = os.path.split(program)
+    if fpath:
+        if is_exe(program):
+            return program
+    else:
+        for path in os.environ["PATH"].split(os.pathsep):
+            path = path.strip('"')
+            exe_file = os.path.join(path, program)
+            if is_exe(exe_file):
+                return exe_file
+
+    return None
+
+def is_exe(program):
+    return which(program) != None
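Note on which.py above: the function searches each PATH entry for an executable file, much like the shell `which` command. The following is a minimal sketch of the same search, assuming Python 3 and POSIX permission bits. The names `find_program` and `search_path` are hypothetical, chosen for illustration only, and are not part of the imported source.

```python
import os


def find_program(program, search_path=None):
    """Return the path of an executable called `program`, or None."""
    def is_exe(fpath):
        # A candidate must be a regular file with the execute bit set.
        return os.path.isfile(fpath) and os.access(fpath, os.X_OK)

    # A name that already contains a directory part is checked as given.
    if os.path.dirname(program):
        return program if is_exe(program) else None

    # Otherwise every directory listed in the search path is tried in order.
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory.strip('"'), program)
        if is_exe(candidate):
            return candidate
    return None
```

On Python 3.3 and later the standard library function `shutil.which` performs a comparable lookup, so a port of this module could probably rely on it instead.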
diff --git a/strainphlan_tutorial/step1_download.sh b/strainphlan_tutorial/step1_download.sh
new file mode 100644
index 0000000..17c40c0
--- /dev/null
+++ b/strainphlan_tutorial/step1_download.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+wget
+bunzip2
diff --git a/strainphlan_tutorial/step2_fastq2sam.sh b/strainphlan_tutorial/step2_fastq2sam.sh
new file mode 100755
index 0000000..e98004f
--- /dev/null
+++ b/strainphlan_tutorial/step2_fastq2sam.sh
@@ -0,0 +1,8 @@
+#!/bin/bash
+mkdir -p sams
+for f in $(ls fastqs/*.bz2)
+do
+ echo "Running metaphlan2 on ${f}"
+ bn=$(basename ${f} | cut -d . -f 1)
+ tar xjfO ${f} | ../metaphlan2.py --bowtie2db ../db_v20/mpa_v20_m200 --mpa_pkl ../db_v20/mpa_v20_m200.pkl --input_type multifastq --nproc 10 -s sams/${bn}.sam.bz2 --bowtie2out sams/${bn}.bowtie2_out.bz2 -o sams/${bn}.profile
+done
diff --git a/strainphlan_tutorial/step3_sam2marker.sh b/strainphlan_tutorial/step3_sam2marker.sh
new file mode 100755
index 0000000..6660d13
--- /dev/null
+++ b/strainphlan_tutorial/step3_sam2marker.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+mkdir -p consensus_markers
+cwd=$(pwd -P)
+export PATH=${cwd}/../strainphlan_src:${PATH}
+python ../strainphlan_src/sample2markers.py --ifn_samples sams/*.sam.bz2 --input_type sam --output_dir consensus_markers --nprocs 10 &> consensus_markers/log.txt
diff --git a/strainphlan_tutorial/step4_extract_db_marker.sh b/strainphlan_tutorial/step4_extract_db_marker.sh
new file mode 100755
index 0000000..99c1c35
--- /dev/null
+++ b/strainphlan_tutorial/step4_extract_db_marker.sh
@@ -0,0 +1,4 @@
+#!/bin/bash
+mkdir -p db_markers
+bowtie2-inspect ../db_v20/mpa_v20_m200 > db_markers/all_markers.fasta
+python ../strainphlan_src/extract_markers.py --mpa_pkl ../db_v20/mpa_v20_m200.pkl --ifn_markers db_markers/all_markers.fasta --clade s__Bacteroides_caccae --ofn_markers db_markers/s__Bacteroides_caccae.markers.fasta
diff --git a/strainphlan_tutorial/step5_build_tree.sh b/strainphlan_tutorial/step5_build_tree.sh
new file mode 100644
index 0000000..9600a2d
--- /dev/null
+++ b/strainphlan_tutorial/step5_build_tree.sh
@@ -0,0 +1,4 @@
+#!/bin/bash
+mkdir -p output
+python ../strainphlan.py --mpa_pkl ../db_v20/mpa_v20_m200.pkl --ifn_samples consensus_markers/*.markers --ifn_markers db_markers/s__Bacteroides_caccae.markers.fasta --ifn_ref_genomes reference_genomes/G000273725.fna.bz2 --output_dir output --nprocs_main 10 --clades s__Bacteroides_caccae &> output/log.txt
+python ../strainphlan_src/add_metadata_tree.py --ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.tree --ifn_metadatas fastqs/metadata.txt --metadatas subjectID
diff --git a/strainphlan_tutorial/step6_build_tree_single_strain.sh b/strainphlan_tutorial/step6_build_tree_single_strain.sh
new file mode 100644
index 0000000..ad99085
--- /dev/null
+++ b/strainphlan_tutorial/step6_build_tree_single_strain.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+python ../strainphlan_src/build_tree_single_strain.py --ifn_alignments output/s__Bacteroides_caccae.fasta --nprocs 10 --log_ofn output/build_tree_single_strain.log
+python ../strainphlan_src/add_metadata_tree.py --ifn_trees output/RAxML_bestTree.s__Bacteroides_caccae.remove_multiple_strains.tree --ifn_metadatas fastqs/metadata.txt --metadatas subjectID
diff --git a/utils/extract_markers.py b/utils/extract_markers.py
new file mode 100755
index 0000000..f649438
--- /dev/null
+++ b/utils/extract_markers.py
@@ -0,0 +1,49 @@
+#!/usr/bin/env python
+#Author: Duy Tin Truong (duytin.truong at unitn.it)
+# at CIBIO, University of Trento, Italy
+
+__author__ = 'Duy Tin Truong (duytin.truong at unitn.it)'
+__version__ = '0.1'
+__date__ = '1 Sep 2014'
+
+import sys
+import os
+import argparse as ap
+import pickle
+import bz2
+from Bio import SeqIO, Seq, SeqRecord
+
+def read_params():
+    p = ap.ArgumentParser()
+    p.add_argument('--mpa_pkl', required=True, default=None, type=str)
+    p.add_argument('--ifn_markers', required=False, default=None, type=str)
+    p.add_argument('--clade', required=True, default=None, type=str)
+    p.add_argument('--ofn_markers', required=True, default=None, type=str)
+    return vars(p.parse_args())
+
+
+def extract_markers(mpa_pkl, ifn_markers, clade, ofn_markers):
+    with open(mpa_pkl, 'rb') as ifile:
+        db = pickle.loads(bz2.decompress(ifile.read()))
+    markers = set([])
+    for marker in db['markers']:
+        if clade == db['markers'][marker]['taxon'].split('|')[-1]:
+            markers.add(marker)
+    print 'number of markers', len(markers)
+    with open(ofn_markers, 'w') as ofile:
+        if ifn_markers:
+            for rec in SeqIO.parse(open(ifn_markers, 'r'), 'fasta'):
+                if rec.name in markers:
+                    SeqIO.write(rec, ofile, 'fasta')
+        else:
+            for m in markers:
+                ofile.write('%s\n'%m)
+
+
+if __name__ == "__main__":
+    args = read_params()
+    extract_markers(
+        mpa_pkl=args['mpa_pkl'],
+        ifn_markers=args['ifn_markers'],
+        clade=args['clade'],
+        ofn_markers=args['ofn_markers'])
diff --git a/utils/markers_info.txt.bz2 b/utils/markers_info.txt.bz2
new file mode 100644
index 0000000..81b1804
Binary files /dev/null and b/utils/markers_info.txt.bz2 differ
diff --git a/utils/merge_metaphlan_tables.py b/utils/merge_metaphlan_tables.py
new file mode 100755
index 0000000..c65259c
--- /dev/null
+++ b/utils/merge_metaphlan_tables.py
@@ -0,0 +1,103 @@
+#!/usr/bin/env python
+
+# ==============================================================================
+# Merge script: from MetaPhlAn output on single sample to a joined "clades vs samples" table
+# Authors: Timothy Tickle (ttickle at hsph.harvard.edu) and Curtis Huttenhower (chuttenh at hsph.harvard.edu)
+# ==============================================================================
+
+import argparse
+import csv
+import os
+import sys
+
+
+def merge( aaastrIn, astrLabels, iCol, ostm ):
+    """
+    Outputs the table join of the given pre-split string collection.
+
+    :param aaastrIn: One or more split lines from which data are read.
+    :type aaastrIn: collection of collections of string collections
+    :param astrLabels: File names of input data.
+    :type astrLabels: collection of strings
+    :param iCol: Data column in which IDs are matched (zero-indexed).
+    :type iCol: int
+    :param ostm: Output stream to which matched rows are written.
+    :type ostm: output stream
+
+    """
+
+    setstrIDs = set()
+    """The final set of all IDs in any table."""
+    ahashIDs = [{} for i in range( len( aaastrIn ) )]
+    """One hash of IDs to row numbers for each input datum."""
+    aaastrData = [[] for i in range( len( aaastrIn ) )]
+    """One data table for each input datum."""
+    aastrHeaders = [[] for i in range( len( aaastrIn ) )]
+    """The list of non-ID headers for each input datum."""
+    strHeader = "ID"
+    """The ID column header."""
+
+    # For each input datum in each input stream...
+    pos = 0
+
+    for f in aaastrIn :
+        with open(f) as csvfile :
+            iIn = csv.reader(csvfile, csv.excel_tab)
+
+            # Lines from the current file, empty list to hold data, empty hash to hold ids
+            aastrData, hashIDs = (a[pos] for a in (aaastrData, ahashIDs))
+
+            iLine = -1
+            # For a line in the file
+            for astrLine in iIn:
+                iLine += 1
+
+                # ID is from first column, data are everything else
+                strID, astrData = astrLine[iCol], ( astrLine[:iCol] + astrLine[( iCol + 1 ):] )
+
+                hashIDs[strID] = iLine
+                aastrData.append( astrData )
+
+            # Batch merge every new ID key set
+            setstrIDs.update( hashIDs.keys( ) )
+
+        pos += 1
+
+    # Create writer
+    csvw = csv.writer( ostm, csv.excel_tab, lineterminator='\n' )
+
+    # Make the file names the column names
+    csvw.writerow( [strHeader] + [os.path.splitext(f)[0] for f in astrLabels] )
+
+    # Write out data
+    for strID in sorted( setstrIDs ):
+        astrOut = []
+        for iIn in range( len( aaastrIn ) ):
+            aastrData, hashIDs = (a[iIn] for a in (aaastrData, ahashIDs))
+            # Look up the row number of the current ID in the current dataset, if any
+            iID = hashIDs.get( strID )
+            # If not, start with no data; if yes, pull out stored data row
+            astrData = [0.0] if ( iID == None ) else aastrData[iID]
+            # Pad output data as needed
+            astrData += [None] * ( len( aastrHeaders[iIn] ) - len( astrData ) )
+            astrOut += astrData
+        csvw.writerow( [strID] + astrOut )
+
+
+argp = argparse.ArgumentParser( prog = "merge_metaphlan_tables.py",
+ description = """Performs a table join on one or more metaphlan output files.""")
+argp.add_argument( "aistms", metavar = "input.txt", nargs = "+",
+ help = "One or more tab-delimited text tables to join" )
+
+__doc__ = "::\n\n\t" + argp.format_help( ).replace( "\n", "\n\t" )
+
+argp.usage = argp.format_usage()[7:]+"\n\n\tPlease make sure to supply file paths to the files to combine. If combining 3 files (Table1.txt, Table2.txt, and Table3.txt) the call should be:\n\n\t\tpython merge_metaphlan_tables.py Table1.txt Table2.txt Table3.txt > output.txt\n\n\tA wildcard to indicate all .txt files that start with Table can be used as follows:\n\n\t\tpython merge_metaphlan_tables.py Table*.txt > output.txt"
+
+
+def _main( ):
+    args = argp.parse_args( )
+    merge(args.aistms, [os.path.split(os.path.basename(f))[1] for f in args.aistms], 0, sys.stdout)
+
+
+if __name__ == "__main__":
+    _main( )
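Note on merge_metaphlan_tables.py above: merge() builds one ID-to-row lookup per input file, takes the union of all IDs, and writes one output row per ID, filling in 0.0 where a file lacks that ID. The join can be sketched on in-memory rows as follows. This is an illustration under stated assumptions, not the script's own code: `join_tables` is a hypothetical helper, it takes rows that are already split rather than file names, and it fills gaps with the string "0.0" instead of the float the script passes to the csv writer.

```python
def join_tables(tables, id_col=0):
    """Join several tables (lists of rows of strings) on the ID column."""
    lookups = []
    all_ids = set()
    for rows in tables:
        lookup = {}
        for row in rows:
            # The ID comes from id_col; the remaining cells are the data.
            lookup[row[id_col]] = row[:id_col] + row[id_col + 1:]
        lookups.append(lookup)
        all_ids.update(lookup)

    joined = []
    for row_id in sorted(all_ids):
        out = [row_id]
        for lookup in lookups:
            # An ID missing from a table contributes a single "0.0" cell.
            out.extend(lookup.get(row_id, ["0.0"]))
        joined.append(out)
    return joined
```

As in the script, a later row with a repeated ID replaces the earlier one within the same table, and the output rows are sorted by ID.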
diff --git a/utils/metaphlan2krona.py b/utils/metaphlan2krona.py
new file mode 100755
index 0000000..fc1fd5a
--- /dev/null
+++ b/utils/metaphlan2krona.py
@@ -0,0 +1,49 @@
+#!/usr/bin/env python
+
+# ==============================================================================
+# Conversion script: from MetaPhlAn output to Krona text input file
+# Author: Daniel Brami (daniel.brami at gmail.com)
+# ==============================================================================
+
+import sys
+import optparse
+import re
+
+def main():
+    #Parse Command Line
+    parser = optparse.OptionParser()
+    parser.add_option( '-p', '--profile', dest='profile', default='', action='store', help='The input file is the MetaPhlAn standard result file' )
+    parser.add_option( '-k', '--krona', dest='krona', default='krona.out', action='store', help='the Krona output file name' )
+    ( options, spillover ) = parser.parse_args()
+
+    if not options.profile or not options.krona:
+        parser.print_help()
+        sys.exit()
+
+    re_candidates = re.compile(r"s__|unclassified\t")
+    re_replace = re.compile(r"\w__")
+    re_bar = re.compile(r"\|")
+
+    metaPhLan = list()
+    with open(options.profile,'r') as f:
+        metaPhLan = f.readlines()
+        f.close()
+
+    krona_tmp = options.krona
+    metaPhLan_FH = open(krona_tmp, 'w')
+
+    for aline in (metaPhLan):
+        if(re.search(re_candidates, aline)):
+            x=re.sub(re_replace, '\t', aline)
+            x=re.sub(re_bar, '', x)
+
+            x_cells = x.split('\t')
+            lineage = '\t'.join(x_cells[0:(len(x_cells) -1)])
+            abundance = float(x_cells[-1].rstrip('\n'))
+
+            metaPhLan_FH.write('%s\n'%(str(abundance) + '\t' + lineage))
+
+    metaPhLan_FH.close()
+
+if __name__ == '__main__':
+    main()
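Note on metaphlan2krona.py above: each profile line that contains a species-level clade (or an "unclassified" entry) has its rank prefixes replaced by tabs and its "|" separators removed, and the abundance is then moved to the front. A minimal sketch of that per-line transformation follows. The function name `to_krona_line` is hypothetical, and the sketch returns the converted text instead of writing it to a file.

```python
import re

# Same patterns as the script: keep lines with a species clade or an
# "unclassified" entry, and turn every rank prefix ("k__", "p__", ...) into a tab.
RE_CANDIDATES = re.compile(r"s__|unclassified\t")
RE_PREFIX = re.compile(r"\w__")


def to_krona_line(profile_line):
    """Convert one MetaPhlAn profile line to Krona text, or return None."""
    if not RE_CANDIDATES.search(profile_line):
        return None
    x = RE_PREFIX.sub("\t", profile_line).replace("|", "")
    cells = x.split("\t")
    lineage = "\t".join(cells[:-1])
    abundance = float(cells[-1].rstrip("\n"))
    return "%s\t%s" % (abundance, lineage)
```

Because the first rank prefix is also replaced by a tab, the lineage begins with an empty field, so the abundance is followed by two tabs in the result.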
diff --git a/utils/metaphlan_hclust_heatmap.py b/utils/metaphlan_hclust_heatmap.py
new file mode 100755
index 0000000..c306509
--- /dev/null
+++ b/utils/metaphlan_hclust_heatmap.py
@@ -0,0 +1,483 @@
+#!/usr/bin/env python
+
+import sys
+import numpy as np
+import matplotlib
+matplotlib.use('Agg')
+import scipy
+import pylab
+import scipy.cluster.hierarchy as sch
+from scipy import stats
+
+# User defined color maps (in addition to matplotlib ones)
+bbcyr = {'red': ( (0.0, 0.0, 0.0),
+ (0.25, 0.0, 0.0),
+ (0.50, 0.0, 0.0),
+ (0.75, 1.0, 1.0),
+ (1.0, 1.0, 1.0)),
+ 'green': ( (0.0, 0.0, 0.0),
+ (0.25, 0.0, 0.0),
+ (0.50, 1.0, 1.0),
+ (0.75, 1.0, 1.0),
+ (1.0, 0.0, 1.0)),
+ 'blue': ( (0.0, 0.0, 0.0),
+ (0.25, 1.0, 1.0),
+ (0.50, 1.0, 1.0),
+ (0.75, 0.0, 0.0),
+ (1.0, 0.0, 1.0))}
+
+bbcry = {'red': ( (0.0, 0.0, 0.0),
+ (0.25, 0.0, 0.0),
+ (0.50, 0.0, 0.0),
+ (0.75, 1.0, 1.0),
+ (1.0, 1.0, 1.0)),
+ 'green': ( (0.0, 0.0, 0.0),
+ (0.25, 0.0, 0.0),
+ (0.50, 1.0, 1.0),
+ (0.75, 0.0, 0.0),
+ (1.0, 1.0, 1.0)),
+ 'blue': ( (0.0, 0.0, 0.0),
+ (0.25, 1.0, 1.0),
+ (0.50, 1.0, 1.0),
+ (0.75, 0.0, 0.0),
+ (1.0, 0.0, 1.0))}
+my_colormaps = [ ('bbcyr',bbcyr),
+ ('bbcry',bbcry)]
+
+tax_units = "kpcofgs"
+
+def read_params(args):
+ import argparse as ap
+ import textwrap
+
+ p = ap.ArgumentParser( description= "This script generates heatmaps with hierarchical clustering \n"
+ "of both samples and microbial clades. The script can also subsample \n"
+ "the number of clades to display based on their nth percentile \n"
+ "abundance value in each sample\n" )
+
+ p.add_argument( '--in', metavar='INPUT_FILE', type=str, default=None, required = True,
+ help= "The input file of microbial relative abundances. \n"
+ "This file is typically obtained with the \"utils/merge_metaphlan_tables.py\"\n")
+
+ p.add_argument( '--out', metavar='OUTPUT_FILE', type=str, default=None, required = True,
+ help= "The output image. \n"
+ "The extension of the file determines the image format. png, pdf, and svg are the preferred formats" )
+
+ p.add_argument( '-m', type=str,
+ choices=[ "single","complete","average",
+ "weighted","centroid","median",
+ "ward" ],
+ default="average",
+ help = "The hierarchical clustering method, default is \"average\"\n" )
+
+ dist_funcs = [ "euclidean","minkowski","cityblock","seuclidean",
+ "sqeuclidean","cosine","correlation","hamming",
+ "jaccard","chebyshev","canberra","braycurtis",
+ "mahalanobis","yule","matching","dice",
+ "kulsinski","rogerstanimoto","russellrao","sokalmichener",
+ "sokalsneath","wminkowski","ward"]
+ p.add_argument( '-d', type=str, choices=dist_funcs, default="braycurtis",
+ help="The distance function for samples. Default is \"braycurtis\"")
+ p.add_argument( '-f', type=str, choices=dist_funcs, default="correlation",
+ help="The distance function for microbes. Default is \"correlation\"")
+
+ p.add_argument( '-s', metavar='scale norm', type=str,
+ default = 'lin', choices = ['log','lin'])
+
+ p.add_argument( '-x', type=float, default = 0.1,
+ help="Width of heatmap cells. Automatically set, this option should not be necessary except for very large heatmaps")
+ p.add_argument( '-y', type=float, default = 0.1,
+ help="Height of heatmap cells. Automatically set, this option should not be necessary except for very large heatmaps")
+
+ p.add_argument( '--minv', type=float, default = 0.0,
+ help="Minimum value to display. Default is 0.0, values around 0.001 are also reasonable")
+ p.add_argument( '--maxv', metavar='max value', type=float,
+ help="Maximum value to display. Default is maximum value present, can be set e.g. to 100 to display the full scale")
+
+ p.add_argument( '--tax_lev', metavar='TAXONOMIC_LEVEL', type=str,
+ choices='a'+tax_units, default='s', help =
+ "The taxonomic level to display:\n"
+ "'a' : all taxonomic levels\n"
+ "'k' : kingdoms (Bacteria and Archaea) only\n"
+ "'p' : phyla only\n"
+ "'c' : classes only\n"
+ "'o' : orders only\n"
+ "'f' : families only\n"
+ "'g' : genera only\n"
+ "'s' : species only\n"
+ "[default 's']" )
+
+ p.add_argument( '--perc', type=int, default=None,
+ help="Percentile to be used for ordering the microbes in order to select with --top the most abundant microbes only. Default is 90")
+ p.add_argument( '--top', type=int, default=None,
+ help="Display the --top most abundant microbes only (ordering based on --perc)")
+
+ p.add_argument( '--sdend_h', type=float, default = 0.1,
+ help="Set the height of the sample dendrogram. Default is 0.1")
+ p.add_argument( '--fdend_w', type=float, default = 0.1,
+ help="Set the width of the microbes dendrogram. Default is 0.1")
+ p.add_argument( '--cm_h', type=float, default = 0.03,
+ help="Set the height of the colormap. Default = 0.03" )
+ p.add_argument( '--cm_ticks', metavar='label for ticks of the colormap', type=str,
+ default = None )
+
+ p.add_argument( '--font_size', type=int, default = 7,
+ help = "Set label font sizes. Default is 7\n" )
+ p.add_argument( '--clust_line_w', type=float, default = 1.0,
+ help="Set the line width for the dendrograms" )
+
+ col_maps = ['Accent', 'Blues', 'BrBG', 'BuGn', 'BuPu', 'Dark2', 'GnBu',
+ 'Greens', 'Greys', 'OrRd', 'Oranges', 'PRGn', 'Paired',
+ 'Pastel1', 'Pastel2', 'PiYG', 'PuBu', 'PuBuGn', 'PuOr',
+ 'PuRd', 'Purples', 'RdBu', 'RdGy', 'RdPu', 'RdYlBu', 'RdYlGn',
+ 'Reds', 'Set1', 'Set2', 'Set3', 'Spectral', 'YlGn', 'YlGnBu',
+ 'YlOrBr', 'YlOrRd', 'afmhot', 'autumn', 'binary', 'bone',
+ 'brg', 'bwr', 'cool', 'copper', 'flag', 'gist_earth',
+ 'gist_gray', 'gist_heat', 'gist_ncar', 'gist_rainbow',
+ 'gist_stern', 'gist_yarg', 'gnuplot', 'gnuplot2', 'gray',
+ 'hot', 'hsv', 'jet', 'ocean', 'pink', 'prism', 'rainbow',
+ 'seismic', 'spectral', 'spring', 'summer', 'terrain', 'winter'] + [n for n,c in my_colormaps]
+ p.add_argument( '-c', type=str, choices = col_maps, default = 'jet',
+ help="Set the colormap. Default is \"jet\"." )
+
+ return vars(p.parse_args())
+
+# Predefined colors for dendrogram branches and class labels
+colors = [ "#B22222","#006400","#0000CD","#9400D3","#696969","#8B4513",
+ "#FF1493","#FF8C00","#3CB371","#00Bfff","#CDC9C9","#FFD700",
+ "#2F4F4F","#FF0000","#ADFF2F","#B03060" ]
+
+def samples2classes_panel(fig, samples, s2l, idx1, idx2, height, xsize, cols, legendon, fontsize, label2cols, legend_ncol ):
+ from matplotlib.patches import Rectangle
+ samples2labels = dict([(s,l)
+ for s,l in [ll.strip().split('\t')
+ for ll in open(s2l)]])
+
+ if label2cols:
+ labels2colors = dict([(l[0],l[1]) for l in [ll.strip().split('\t') for ll in open(label2cols)]])
+ else:
+ cs = cols if cols else colors
+ labels2colors = dict([(l,cs[i%len(cs)]) for i,l in enumerate(set(samples2labels.values()))])
+ ax1 = fig.add_axes([0.,1.0,1.0,height],frameon=False)
+ ax1.set_xticks([])
+ ax1.set_yticks([])
+ ax1.set_ylim( [0.0, height] )
+ ax1.set_xlim( [0.0, xsize] )
+ step = xsize / float(len(samples))
+ labels = set()
+ added_labels = set()
+ for i,ind in enumerate(idx2):
+ if not samples[ind] in samples2labels or \
+ not samples2labels[samples[ind]] in labels2colors:
+ fc, ll = None, None
+ else:
+ ll = samples2labels[samples[ind]]
+ ll = None if ll in added_labels else ll
+ added_labels.add( ll )
+ fc = labels2colors[samples2labels[samples[ind]]]
+
+ rect = Rectangle( [float(i)*step, 0.0], step, height,
+ facecolor = fc,
+ label = ll,
+ edgecolor='b', lw = 0.0)
+ labels.add( ll )
+ ax1.add_patch(rect)
+ ax1.autoscale_view()
+
+ if legendon:
+ ax1.legend( loc = 2, ncol = legend_ncol, bbox_to_anchor=(1.01, 3.),
+ borderpad = 0.0, labelspacing = 0.0,
+ handlelength = 0.5, handletextpad = 0.3,
+ borderaxespad = 0.0, columnspacing = 0.3,
+ prop = {'size':fontsize}, frameon = False)
+
+def samples_dend_panel( fig, Z, Z2, ystart, ylen, lw ):
+ ax2 = fig.add_axes([0.0,1.0+ystart,1.0,ylen], frameon=False)
+ Z2['color_list'] = [c.replace('b','k') for c in Z2['color_list']]
+ mh = max(Z[:,2])
+ sch._plot_dendrogram( Z2['icoord'], Z2['dcoord'], Z2['ivl'],
+ Z.shape[0] + 1, Z.shape[0] + 1,
+ mh, 'top', no_labels=True,
+ color_list=Z2['color_list'] )
+ for coll in ax2.collections:
+ coll._linewidths = (lw,)
+ ax2.set_xticks([])
+ ax2.set_yticks([])
+ ax2.set_xticklabels([])
+
+def features_dend_panel( fig, Z, Z2, width, lw ):
+ ax1 = fig.add_axes([-width,0.0,width,1.0], frameon=False)
+ Z2['color_list'] = [c.replace('b','k').replace('x','b') for c in Z2['color_list']]
+ mh = max(Z[:,2])
+ sch._plot_dendrogram(Z2['icoord'], Z2['dcoord'], Z2['ivl'], Z.shape[0] + 1, Z.shape[0] + 1, mh, 'right', no_labels=True, color_list=Z2['color_list'])
+ for coll in ax1.collections:
+ coll._linewidths = (lw,)
+ ax1.set_xticks([])
+ ax1.set_yticks([])
+ ax1.set_xticklabels([])
+
+
+def add_cmap( cmapdict, name ):
+ my_cmap = matplotlib.colors.LinearSegmentedColormap(name,cmapdict,256)
+ pylab.register_cmap(name=name,cmap=my_cmap)
+
+def init_fig(xsize,ysize,ncol):
+ fig = pylab.figure(figsize=(xsize,ysize))
+ sch._link_line_colors = colors[:ncol]
+ return fig
+
+def heatmap_panel( fig, D, minv, maxv, idx1, idx2, cm_name, scale, cols, rows, label_font_size, cb_offset, cb_l, flabelson, slabelson, cm_ticks, gridon, bar_offset ):
+ cm = pylab.get_cmap(cm_name)
+ bottom_col = [ cm._segmentdata['red'][0][1],
+ cm._segmentdata['green'][0][1],
+ cm._segmentdata['blue'][0][1] ]
+ axmatrix = fig.add_axes( [0.0,0.0,1.0,1.0],
+ axisbg=bottom_col)
+ if any([c < 0.95 for c in bottom_col]):
+ axmatrix.spines['right'].set_color('none')
+ axmatrix.spines['left'].set_color('none')
+ axmatrix.spines['top'].set_color('none')
+ axmatrix.spines['bottom'].set_color('none')
+ norm_f = matplotlib.colors.LogNorm if scale == 'log' else matplotlib.colors.Normalize
+ im = axmatrix.matshow( D, norm = norm_f( vmin=minv if minv > 0.0 else None,
+ vmax=maxv),
+ aspect='auto', origin='lower', cmap=cm, vmax=maxv)
+
+ axmatrix2 = axmatrix.twinx()
+ axmatrix3 = axmatrix.twiny()
+
+ axmatrix.set_xticks([])
+ axmatrix2.set_xticks([])
+ axmatrix3.set_xticks([])
+ axmatrix.set_yticks([])
+ axmatrix2.set_yticks([])
+ axmatrix3.set_yticks([])
+
+ axmatrix.set_xticklabels([])
+ axmatrix2.set_xticklabels([])
+ axmatrix3.set_xticklabels([])
+ axmatrix.set_yticklabels([])
+ axmatrix2.set_yticklabels([])
+ axmatrix3.set_yticklabels([])
+
+ if any([c < 0.95 for c in bottom_col]):
+ axmatrix2.spines['right'].set_color('none')
+ axmatrix2.spines['left'].set_color('none')
+ axmatrix2.spines['top'].set_color('none')
+ axmatrix2.spines['bottom'].set_color('none')
+ if any([c < 0.95 for c in bottom_col]):
+ axmatrix3.spines['right'].set_color('none')
+ axmatrix3.spines['left'].set_color('none')
+ axmatrix3.spines['top'].set_color('none')
+ axmatrix3.spines['bottom'].set_color('none')
+ if flabelson:
+ axmatrix2.set_yticks(np.arange(len(rows))+0.5)
+ axmatrix2.set_yticklabels([rows[r] for r in idx1],size=label_font_size,va='center')
+ if slabelson:
+ axmatrix.set_xticks(np.arange(len(cols)))
+ axmatrix.set_xticklabels([cols[r] for r in idx2],size=label_font_size,rotation=90,va='top',ha='center')
+ axmatrix.tick_params(length=0)
+ axmatrix2.tick_params(length=0)
+ axmatrix3.tick_params(length=0)
+ axmatrix2.set_ylim(0,len(rows))
+
+ if gridon:
+ axmatrix.set_yticks(np.arange(len(idx1)-1)+0.5)
+ axmatrix.set_xticks(np.arange(len(idx2))+0.5)
+ axmatrix.grid( True )
+ ticklines = axmatrix.get_xticklines()
+ ticklines.extend( axmatrix.get_yticklines() )
+ #gridlines = axmatrix.get_xgridlines()
+ #gridlines.extend( axmatrix.get_ygridlines() )
+
+ for line in ticklines:
+ line.set_linewidth(3)
+
+ if cb_l > 0.0:
+ axcolor = fig.add_axes([0.0,1.0+bar_offset*1.25,1.0,cb_l])
+ cbar = fig.colorbar(im, cax=axcolor, orientation='horizontal')
+ cbar.ax.tick_params(labelsize=label_font_size)
+ if cm_ticks:
+ cbar.ax.set_xticklabels( cm_ticks.split(":") )
+
+
+def read_table( fin, xstart,xstop,ystart,ystop, percentile = None, top = None, tax_lev = 's' ):
+ mat = [l.strip().split('\t') for l in open( fin ) if l.strip()]
+ if tax_lev != 'a':
+ i = tax_units.index(tax_lev)
+ mat = [m for i,m in enumerate(mat) if i == 0 or m[0].split('|')[-1][0] == tax_lev or ( len(m[0].split('|')) == i and m[0].split('|')[-1][0].endswith("unclassified"))]
+ sample_labels = mat[0][xstart:xstop]
+
+ m = [(mm[xstart-1],np.array([float(f) for f in mm[xstart:xstop]])) for mm in mat[ystart:ystop]]
+
+ if top and not percentile:
+ percentile = 90
+
+ if percentile:
+ m = sorted(m,key=lambda x:-stats.scoreatpercentile(x[1],percentile))
+ if top:
+ feat_labels = [mm[0].split("|")[-1] for mm in m[:top]]
+ m = [mm[1] for mm in m[:top]]
+ else:
+ feat_labels = [mm[0].split("|")[-1] for mm in m]
+ m = [mm[1] for mm in m]
+
+ D = np.matrix( np.array( m ) )
+
+ return D, feat_labels, sample_labels
+
+def read_dm( fin, n ):
+ mat = [[float(f) for f in l.strip().split('\t')] for l in open( fin )]
+ nc = sum([len(r) for r in mat])
+
+ if nc == n*n:
+ dm = []
+ for i in range(n):
+ dm += mat[i][i+1:]
+ return np.array(dm)
+ if nc == (n*n-n)/2:
+ dm = []
+ for i in range(n):
+ dm += mat[i]
+ return np.array(dm)
+ sys.stderr.write( "Error in reading the distance matrix\n" )
+ sys.exit()
+
+
+def hclust( fin, fout,
+ method = "average",
+ dist_func = "euclidean",
+ feat_dist_func = "d",
+ xcw = 0.1,
+ ycw = 0.1,
+ scale = 'lin',
+ minv = 0.0,
+ maxv = None,
+ xstart = 1,
+ ystart = 1,
+ xstop = None,
+ ystop = None,
+ percentile = None,
+ top = None,
+ cm_name = 'jet',
+ s2l = None,
+ label_font_size = 7,
+ feat_dend_col_th = None,
+ sample_dend_col_th = None,
+ clust_ncols = 7,
+ clust_line_w = 1.0,
+ label_cols = None,
+ sdend_h = 0.1,
+ fdend_w = 0.1,
+ cm_h = 0.03,
+ dmf = None,
+ dms = None,
+ legendon = False,
+ label2cols = None,
+ flabelon = True,
+ slabelon = True,
+ cm_ticks = None,
+ legend_ncol = 3,
+ pad_inches = None,
+ legend_font_size = 7,
+ gridon = 0,
+ tax_lev = 's'):
+
+ if label_cols and label_cols.count("-"):
+ label_cols = label_cols.split("-")
+
+ for n,c in my_colormaps:
+ add_cmap( c, n )
+
+ if feat_dist_func == 'd':
+ feat_dist_func = dist_func
+
+ D, feat_labels, sample_labels = read_table(fin,xstart,xstop,ystart,ystop,percentile,top,tax_lev=tax_lev)
+
+ ylen,xlen = D[:].shape
+ Dt = D.transpose()
+
+ size_cx, size_cy = xcw, ycw
+
+ xsize, ysize = max(xlen*size_cx,2.0), max(ylen*size_cy,2.0)
+ ydend_offset = 0.025*8.0/ysize if s2l else 0.0
+
+ fig = init_fig(xsize,ysize,clust_ncols)
+
+ nfeats, nsamples = len(D), len(Dt)
+
+ if dmf:
+ p1 = read_dm( dmf, nfeats )
+ Y1 = sch.linkage( p1, method=method )
+ else:
+ if len(D) < 2 or len(Dt) < 2:
+ Y1 = []
+ elif feat_dist_func == 'correlation':
+ Y1 = sch.linkage( D, method=method, metric=lambda x,y:max(0.0,scipy.spatial.distance.correlation(x,y)) )
+ else:
+ Y1 = sch.linkage( D, method=method, metric=feat_dist_func )
+
+ if len(Y1):
+ Z1 = sch.dendrogram(Y1, no_plot=True, color_threshold=feat_dend_col_th)
+ idx1 = Z1['leaves']
+ else:
+ idx1 = list(range(len(D)))
+
+ if dms:
+ p2 = read_dm( dms, nsamples )
+ Y2 = sch.linkage( p2, method=method )
+ else:
+ if len(Dt) < 2 or len(D) < 2:
+ Y2 = []
+ elif dist_func == 'correlation':
+ Y2 = sch.linkage( Dt, method=method, metric=lambda x,y:max(0.0,scipy.spatial.distance.correlation(x,y)) )
+ else:
+ Y2 = sch.linkage( Dt, method=method, metric=dist_func )
+
+ if len(Y2):
+ Z2 = sch.dendrogram(Y2, no_plot=True, color_threshold=sample_dend_col_th)
+ idx2 = Z2['leaves']
+ else:
+ idx2 = list(range(len(Dt)))
+ D = D[idx1,:][:,idx2]
+
+ if fdend_w > 0.0 and len(Y1):
+ features_dend_panel(fig, Y1, Z1, fdend_w*8.0/xsize, clust_line_w )
+ if sdend_h > 0.0 and len(Y2):
+ samples_dend_panel(fig, Y2, Z2, ydend_offset, sdend_h*8.0/ysize, clust_line_w)
+
+
+ if s2l:
+ samples2classes_panel( fig, sample_labels, s2l, idx1, idx2, 0.025*8.0/ysize, xsize, label_cols, legendon, legend_font_size, label2cols, legend_ncol )
+ heatmap_panel( fig, D, minv, maxv, idx1, idx2, cm_name, scale, sample_labels, feat_labels, label_font_size, -cm_h*8.0/ysize, cm_h*0.8*8.0/ysize, flabelon, slabelon, cm_ticks, gridon, ydend_offset+sdend_h*8.0/ysize )
+
+ fig.savefig( fout, bbox_inches='tight',
+ pad_inches = pad_inches,
+ dpi=300) if fout else pylab.show()
+
+if __name__ == '__main__':
+ pars = read_params( sys.argv )
+
+ hclust( fin = pars['in'],
+ fout = pars['out'],
+ method = pars['m'],
+ dist_func = pars['d'],
+ feat_dist_func = pars['f'],
+ xcw = pars['x'],
+ ycw = pars['y'],
+ scale = pars['s'],
+ minv = pars['minv'],
+ maxv = pars['maxv'],
+ percentile = pars['perc'],
+ top = pars['top'],
+ cm_name = pars['c'],
+ label_font_size = pars['font_size'],
+ clust_line_w = pars['clust_line_w'],
+ sdend_h = pars['sdend_h'],
+ fdend_w = pars['fdend_w'],
+ cm_h = pars['cm_h'],
+ cm_ticks = pars['cm_ticks'],
+ pad_inches = 0.1,
+ tax_lev = pars['tax_lev']
+ )
+
diff --git a/utils/plot_bug.py b/utils/plot_bug.py
new file mode 100755
index 0000000..f580c2f
--- /dev/null
+++ b/utils/plot_bug.py
@@ -0,0 +1,254 @@
+#!/usr/bin/env python
+
+import sys
+import numpy as np
+import scipy.spatial.distance as spd
+import scipy.cluster.hierarchy as sph
+from scipy import stats
+import matplotlib
+#matplotlib.use('Agg')
+import pylab
+import pandas as pd
+import matplotlib.pyplot as plt
+
+class ReadCmd:
+
+ def __init__( self ):
+ import argparse as ap
+ import textwrap
+
+ p = ap.ArgumentParser( description= "TBA" )
+ arg = p.add_argument
+
+ arg( '-i', '--inp', '--in', metavar='INPUT_FILE', type=str, nargs='?', default=sys.stdin,
+ help= "The input matrix" )
+ arg( '-o', '--out', metavar='OUTPUT_FILE', type=str, nargs='?', default=None,
+ help= "The output image file [image on screen if not specified]" )
+
+ arg( '-m', '--metadata_file', type=str, default='None',
+ help= "The input metadata file [default None]" )
+
+ DataMatrix.input_parameters( p )
+ BarPlot.input_parameters( p )
+ self.args = p.parse_args()
+
+ def check_consistency( self ):
+ pass
+
+ def get_args( self ):
+ return self.args
+
+class DataMatrix:
+ datatype = 'data_matrix'
+
+ @staticmethod
+ def input_parameters( parser ):
+ dm_param = parser.add_argument_group('Input data matrix parameters')
+ arg = dm_param.add_argument
+
+ arg( '--sep', type=str, default='\t' )
+ arg( '-f', '--feat', type=str, default=None, required = True,
+ help = "Name of the feature to plot"
+ "[or the ending string if --endswith is specified]")
+ arg( '--endswith', action='store_true',
+ help = "Match the ending part of the feature name" )
+ arg( '--fname_row', type=int, default=0,
+ help = "row number containing the names of the features "
+ "[default 0, specify -1 if no names are present in the matrix]")
+ arg( '--sname_row', type=int, default=0,
+ help = "column number containing the names of the samples "
+ "[default 0, specify -1 if no names are present in the matrix]")
+ arg( '--skip_rows', type=str, default=None,
+ help = "Row numbers to skip (0-indexed, comma separated) from the input file "
+ "[default None, meaning no rows skipped]")
+ arg( '--def_na', type=float, default=None,
+ help = "Set the default value for missing values [default None which means no replacement]")
+
+ def __init__( self, input_file, args ):
+ self.args = args
+ toskip = [int(l) for l in self.args.skip_rows.split(",")] if self.args.skip_rows else None
+ self.table = pd.read_table(
+ input_file, sep = self.args.sep, skipinitialspace = True, skiprows = toskip,
+ header = self.args.fname_row if self.args.fname_row > -1 else None,
+ index_col = self.args.sname_row if self.args.sname_row > -1 else None
+ )
+
+ rows = []
+
+ if self.args.endswith:
+ for n in self.table.index:
+ if n.endswith( self.args.feat ):
+ rows.append( n )
+ elif self.args.feat in self.table.index:
+ rows.append( self.args.feat )
+ self.table = self.table.reindex( index=rows )
+
+ if not len(rows):
+ sys.stderr.write("Error, feat "+self.args.feat+" not found!")
+ sys.exit()
+ if len(rows) > 1:
+ sys.stderr.write("Error, multiple features matching "+self.args.feat+" !")
+ sys.exit()
+
+ if not self.args.def_na is None:
+ self.table = self.table.fillna( self.args.def_na )
+
+ def get_numpy_matrix( self ):
+ return self.table
+
+ def get_snames( self ):
+ return list(self.table.index)
+
+ def get_fnames( self ):
+ return list(self.table.columns)
+
+ def save_matrix( self, output_file ):
+ self.table.to_csv( output_file, sep = '\t' )
+
+class MetadataMatrix:
+    datatype = 'metadata_matrix'
+
+    @staticmethod
+    def input_parameters( parser ):
+        dm_param = parser.add_argument_group('Input metadata file')
+        arg = dm_param.add_argument
+
+        arg( '--sep', type=str, default='\t' )
+        arg( '--fname_row', type=int, default=0,
+             help = "row number containing the names of the features "
+                    "[default 0, specify -1 if no names are present in the matrix]")
+        arg( '--def_na', type=float, default=None,
+             help = "Set the default value for missing values [default None which means no replacement]")
+
+    def __init__( self, input_file, args ):
+        self.args = args
+        self.table = pd.read_table(
+                input_file, sep = self.args.sep, skipinitialspace = True,
+                #header = self.args.fname_row if self.args.fname_row > -1 else None,
+                index_col = self.args.sname_row if self.args.sname_row > -1 else None
+            )
+
+        if self.args.def_na is not None:
+            self.table = self.table.fillna( self.args.def_na )
+
+    def get_snames( self ):
+        return list(self.table.index)
+
+    def get_fnames( self ):
+        return list(self.table.columns)
+
+    def get_table( self ):
+        return self.table
+
+class BarPlot:
+    datatype = 'barplot'
+
+    @staticmethod
+    def input_parameters( parser ):
+        hm_param = parser.add_argument_group('Barplot options')
+        arg = hm_param.add_argument
+
+        arg( '--dpi', type=int, default=72,
+             help = "Image resolution in dpi [default 72]")
+        arg( '-C', '--color_condition', type=str, default=None,
+             help = "The name of the metadata column used for coloring")
+        arg( '-H', '--hatch_condition', type=str, default=None,
+             help = "The name of the metadata column used for hatching")
+        arg( '-G', '--group_condition', type=str, default=None,
+             help = "The name of the metadata column used for grouping")
+        arg( '-t', '--title', type=str, default=None,
+             help = "The title of the plot [default no title]")
+        arg( '-l', '--log_scale', action='store_true',
+             help = "Log scale" )
+
+
+    def __init__( self, numpy_matrix, metadata_matrix, args = None ):
+        self.numpy_matrix = numpy_matrix
+        self.mmatrix = metadata_matrix
+        self.args = args
+
+    def draw( self ):
+
+        fig = plt.figure( figsize=(20,8) )
+        ax = fig.add_subplot(111)
+
+        width = 0.65
+
+        names = list(self.numpy_matrix.index)
+        n0 = names[0]
+
+        tp = self.numpy_matrix.to_dict()
+
+        keys = sorted(tp)
+
+        if self.args.color_condition not in self.mmatrix:
+            self.args.color_condition = None
+        cond_values = [None] if self.args.color_condition is None else sorted(set(self.mmatrix[self.args.color_condition]) )
+        if self.args.hatch_condition not in self.mmatrix:
+            self.args.hatch_condition = None
+        hatch_values = [None] if self.args.hatch_condition is None else sorted(set(self.mmatrix[self.args.hatch_condition]) )
+
+        if self.args.group_condition:
+            group_values = list(sorted(set(self.mmatrix[self.args.group_condition])))
+            keys = sorted( keys, key=lambda x:group_values.index(self.mmatrix[self.args.group_condition][x]) )
+        else:
+            keys, group_values = sorted( keys ), []
+
+        ind = np.arange( len(tp) )
+        pos = ind-width/2
+
+        hatches = ['//','\\\\','++','--','xx']
+        cols = ['r','g','c','b']
+        minv,maxv = 0.0, max([v[n0] for v in tp.values()])
+
+        bar_sets = []
+        for i,c in enumerate(cond_values):
+            for j,h in enumerate(hatch_values):
+                values = [(tp[k][n0] if (c is None or self.mmatrix[self.args.color_condition][k] == c)
+                            and (h is None or self.mmatrix[self.args.hatch_condition][k] == h) else 0.0) for k in keys]
+                b = ax.bar(pos, values, width, hatch=hatches[j%len(hatches)] if len(hatch_values) > 1 else "", color=cols[i%len(cols)])
+                cond = self.args.color_condition + " "+str(c).strip()+", " if c is not None else ""
+                hatch = self.args.hatch_condition + " "+str(h).strip()+", " if h is not None else ""
+                bar_sets.append( (b,cond+hatch) )
+
+        v0 = ind[0]-0.5
+        vm1 = v0
+        ax.plot([v0,v0],[minv,maxv],"--",linewidth=2,color='k')
+        for g in group_values:
+            vm1 = v0
+            v0 += list(self.mmatrix[self.args.group_condition]).count(g)
+            ax.plot([v0,v0],[minv,maxv],"--",linewidth=2,color='k')
+            ax.text( (vm1+v0)*0.5, maxv * 0.9, str(g), horizontalalignment='center', verticalalignment='center' )
+            #ax.text( (vm1+v0)*0.5, maxv * 0.9, str(round(g,1)), horizontalalignment='center', verticalalignment='center' )
+
+        if self.args.color_condition or self.args.hatch_condition:
+            leg = ax.legend( [b for b,_ in bar_sets], [l for _,l in bar_sets], bbox_to_anchor=(1.02, 0,0.3,1), loc=1,
+                             ncol=1, mode="expand", borderaxespad=0., frameon = False)
+
+        ax.set_xlim(-width,ind[-1]+width)
+        ax.set_ylim(0,maxv)
+        ax.set_xticks( ind )
+        ax.set_xticklabels( keys, rotation = 90 )
+        ax.set_title( self.args.title or "" )
+
+        if not self.args.out:
+            plt.show()
+        else:
+            fig.savefig( self.args.out, bbox_inches='tight', dpi = self.args.dpi,
+                         bbox_extra_artists=((fig.get_axes()[0].get_legend(),) if self.args.color_condition or self.args.hatch_condition else None) )
+
+if __name__ == '__main__':
+
+    read = ReadCmd( )
+    read.check_consistency()
+    args = read.get_args()
+
+    dm = DataMatrix( args.inp, args )
+    mdm = MetadataMatrix( args.metadata_file, args )
+
+    bp = BarPlot( dm.get_numpy_matrix(), mdm.get_table(),args )
+    bp.draw()
+
+
+
+
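Note on the row selection in DataMatrix.__init__ above: it keeps exactly one feature row, matched either by exact label or, when the endswith flag is set, by label suffix, and it aborts when nothing or more than one row matches. The sketch below isolates that rule in plain Python. The helper name select_feature_row and the example clade labels are hypothetical and not part of plot_bug.py, and the helper raises ValueError where the script writes to stderr and exits.

```python
def select_feature_row(index, feat, endswith=False):
    """Return the single label in `index` that matches `feat`.

    Hypothetical helper, not part of plot_bug.py: it mirrors the matching
    rules of DataMatrix.__init__ but raises instead of exiting.
    """
    if endswith:
        # Suffix match: useful when labels carry a full lineage prefix.
        rows = [name for name in index if name.endswith(feat)]
    else:
        # Exact match on the full label.
        rows = [feat] if feat in index else []
    if not rows:
        raise ValueError("feat %s not found" % feat)
    if len(rows) > 1:
        raise ValueError("multiple features matching %s" % feat)
    return rows[0]


# Illustrative labels only; real input comes from the abundance table index.
clades = [
    "k__Bacteria|p__Firmicutes",
    "k__Bacteria|p__Firmicutes|c__Bacilli",
]
print(select_feature_row(clades, "c__Bacilli", endswith=True))
```

With suffix matching a short clade name is enough to pick its row. An exact match needs the full lineage label.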
diff --git a/utils/species2genomes.txt b/utils/species2genomes.txt
new file mode 100644
index 0000000..5beec92
--- /dev/null
+++ b/utils/species2genomes.txt
@@ -0,0 +1,7678 @@
+s__Streptomyces_clavuligerus 3 GCF_000154925 GCF_000148465 GCF_000163875
+s__Crinalium_epipsammum 1 GCF_000317495
+s__Cronobacter_phage_CR5 1 PRJNA209076
+s__Schlesneria_paludicola 1 GCF_000255655
+s__Abiotrophia_defectiva 1 GCF_000160075
+s__Indian_peanut_clump_virus 1 PRJNA14882
+s__Pseudomonas_sp_Lz4W 1 GCF_000346225
+s__Acinetobacter_phage_AP205 1 PRJNA14710
+s__Enterobacteria_phage_BP_4795 1 PRJNA14287
+s__Cronobacter_phage_CR3 1 PRJNA167658
+s__Carrot_mottle_mimic_virus 1 PRJNA15085
+s__Hirschia_maritima 1 GCF_000378345
+s__Schizosaccharomyces_pombe 1 GCA_000002945
+s__Pseudomonas_phage_NH_4 1 PRJNA181065
+s__Candidatus_Nitrosopumilus_sp_AR2 1 GCF_000299395
+s__Kadipiro_virus 1 PRJNA14858
+s__Brachyspira_hampsonii 2 GCF_000334935 GCF_000316195
+s__Propionibacterium_phage_PHL067M10 1 PRJNA219115
+s__Staphylococcus_phage_phiETA 1 PRJNA14141
+s__Circovirus_like_genome_BBC_A 1 PRJNA39611
+s__Tomato_leaf_deformation_virus 2 PRJNA178590 PRJNA52633
+s__Pseudomonas_phage_JBD30 1 PRJNA188536
+s__Staphylococcus_phage_Twort 1 PRJNA15246
+s__Acidiphilium_multivorum 1 GCF_000202835
+s__Alistipes_onderdonkii 1 GCF_000374505
+s__Porcine_sapelovirus 1 PRJNA15400
+s__Clostridium_ljungdahlii 1 GCF_000143685
+s__Torque_teno_virus 1 PRJNA70005
+s__Methanococcus_aeolicus 1 GCF_000017185
+s__Clostridium_phage_phiZP2 1 PRJNA169232
+s__Bacillus_fordii 1 GCF_000374565
+s__Agrococcus_pavilionensis 1 GCF_000400485
+s__Pseudomonas_sp_HPB0071 1 GCF_000478505
+s__Facklamia_ignava 1 GCF_000301055
+s__Alistipes_indistinctus 1 GCF_000231275
+s__Staphylococcus_phage_StB20 1 PRJNA184156
+s__Staphylococcus_phage_StB27 1 PRJNA184157
+s__Liao_ning_virus 1 PRJNA16336
+s__Synechococcus_phage_S_PM2 1 PRJNA15223
+s__Bacillus_sonorensis 1 GCF_000342105
+s__Streptomyces_coelicolor 1 GCF_000203835
+s__Candidatus_Aquiluna_sp_IMCC13023 1 GCF_000257665
+s__Gremmeniella_abietina_RNA_virus_MS1 1 PRJNA14836
+s__Streptomyces_davawensis 1 GCF_000349325
+s__Streptococcus_equinus 2 GCF_000146405 GCF_000187265
+s__Exiguobacterium_sp_AT1b 1 GCF_000023045
+s__Leucas_zeylanica_yellow_vein_virus_satellite_DNA_beta 1 PRJNA41305
+s__Hyposoter_fugitivus_ichnovirus 1 PRJNA18779
+s__Hoeflea_sp_108 1 GCF_000372965
+s__Vallota_speciosa_virus 1 PRJNA167578
+s__Human_papillomavirus_126_like_viruses 1 PRJNA76727
+s__Salmonella_phage_RE_2010 1 PRJNA181070
+s__Lactobacillus_sakei 2 GCF_000026065 GCF_000478625
+s__Mycobacterium_phage_Hamulus 1 PRJNA215116
+s__Burkholderia_ambifaria 4 GCF_000181975 GCF_000019925 GCF_000182015 GCF_000203915
+s__Streptomyces_filamentosus 2 GCF_000156455 GCF_000156695
+s__Leptotrichia_wadei 2 GCF_000373345 GCF_000469405
+s__zeta_proteobacterium_SCGC_AB_602_C20 1 GCF_000379345
+s__Rhizobium_phage_RR1_B 1 PRJNA209212
+s__Leptospira_fainei 1 GCF_000306235
+s__Acanthocystis_turfacea_Chlorella_virus_1 1 PRJNA18527
+s__Nora_virus 1 PRJNA16656
+s__Wasabi_mottle_virus 1 PRJNA14733
+s__Papaya_leaf_curl_virus_betasatellite 1 PRJNA14448
+s__Botrytis_cinerea_mitovirus_1 1 PRJNA32247
+s__Razdan_virus 2 PRJNA225931 PRJNA226013
+s__Nectria_haematococca 1 GCA_000151355
+s__Verminephrobacter_eiseniae 1 GCF_000015565
+s__Desulfovibrio_gigas 1 GCF_000468495
+s__Paenibacillus_sp_HGF7 1 GCF_000214295
+s__Streptomyces_rimosus 1 GCF_000331185
+s__Coprothermobacter_platensis 1 GCF_000378005
+s__Sclerotinia_sclerotiorum 1 GCA_000146945
+s__Burkholderia_phage_BcepNazgul 1 PRJNA14305
+s__Candidatus_Nitrososphaera_gargensis 1 GCF_000303155
+s__Fischerella_sp_JSC_11 1 GCF_000231365
+s__Corynebacterium_efficiens 2 GCF_000011305 GCF_000160795
+s__Leptolyngbya_sp_PCC_7375 1 GCF_000316115
+s__Eubacterium_cellulosolvens 1 GCF_000183525
+s__Mycobacterium_phage_SiSi 1 PRJNA206026
+s__Leptolyngbya_sp_PCC_7376 1 GCF_000316605
+s__Oceanithermus_profundus 1 GCF_000183745
+s__Lactococcus_phage_r1t 1 PRJNA14225
+s__Chinese_wheat_mosaic_virus 1 PRJNA14694
+s__Mycobacterium_phage_Lockley 1 PRJNA30519
+s__Pseudoalteromonas_undina 1 GCF_000238275
+s__Persea_americana_endornavirus 1 PRJNA81035
+s__Pyrococcus_horikoshii 1 GCF_000011105
+s__Banana_streak_UI_virus 1 PRJNA66611
+s__Ruania_albidiflava 1 GCF_000421225
+s__Eclipta_yellow_vein_virus 1 PRJNA81215
+s__Blueberry_virus_A 1 PRJNA173920
+s__Eastern_equine_encephalitis_virus 1 PRJNA15429
+s__Nocardioidaceae_bacterium_Broad_1 1 GCF_000192415
+s__Clostridium_novyi 1 GCF_000014125
+s__Veillonella_sp_oral_taxon_780 1 GCF_000221605
+s__Thioalkalivibrio_sp_ARh3 1 GCF_000377265
+s__Grimontia_hollisae 1 GCF_000176515
+s__Thioalkalivibrio_sp_ARh5 1 GCF_000381805
+s__Thioalkalivibrio_sp_ARh4 1 GCF_000378265
+s__Meiothermus_timidus 1 GCF_000373205
+s__Niabella_aurantiaca 1 GCF_000374125
+s__Burkholderia_kururiensis 1 GCF_000341045
+s__Morogoro_virus 1 PRJNA39791
+s__Mycobacterium_phage_WIVsmall 1 PRJNA206482
+s__Montana_myotis_leukoencephalitis_virus 1 PRJNA15402
+s__Mycobacterium_phage_Trouble 1 PRJNA215119
+s__Collinsella_aerofaciens 1 GCF_000169035
+s__Vernonia_yellow_vein_Fujian_virus_betasatellite 1 PRJNA72143
+s__Phipapillomavirus_1 1 PRJNA16815
+s__Gloeobacter_kilaueensis 1 GCF_000484535
+s__Barley_yellow_mosaic_virus 1 PRJNA15362
+s__Corchorus_yellow_vein_mosaic_betasatellite 1 PRJNA192608
+s__Lactobacillus_prophage_Lj928 1 PRJNA14350
+s__Mycoplasma_bovigenitalium 1 GCF_000367805
+s__Streptomyces_sp_KhCrAH_244 1 GCF_000373505
+s__Thiorhodococcus_drewsii 1 GCF_000224065
+s__Streptomyces_ghanaensis 1 GCF_000156435
+s__Beet_black_scorch_virus_satellite_RNA 1 PRJNA14623
+s__Spodoptera_litura_nucleopolyhedrovirus 1 PRJNA14138
+s__Eubacterium_dolichum 1 GCF_000154285
+s__Burkholderia_ubonensis 1 GCF_000170335
+s__Eupatorium_vein_clearing_virus 1 PRJNA29879
+s__Roseobacter_litoralis 1 GCF_000154785
+s__Sphaerochaeta_pleomorpha 1 GCF_000236685
+s__Erwinia_phage_vB_EamP_L1 1 PRJNA181229
+s__Alloiococcus_otitis 1 GCF_000315445
+s__Minute_virus_of_mice 1 PRJNA14019
+s__Gremmeniella_abietina_RNA_virus_L1 1 PRJNA14824
+s__Bacillus_phage_BCP78 1 PRJNA177518
+s__Gremmeniella_abietina_RNA_virus_L2 1 PRJNA15230
+s__Kamiti_River_virus 1 PRJNA14896
+s__Dialister_succinatiphilus 1 GCF_000242435
+s__Hop_latent_virus 1 PRJNA15373
+s__Staphylococcus_pettenkoferi 1 GCF_000260275
+s__Poinsettia_mosaic_virus 1 PRJNA15366
+s__Corynebacterium_maris 1 GCF_000442645
+s__Thermodesulfatator_indicus 1 GCF_000217795
+s__Tomato_leaf_curl_Bangladesh_virus 1 PRJNA14245
+s__Propionibacterium_phage_P100D 1 PRJNA177534
+s__Tomato_leaf_curl_Hajipur_betasatellite 1 PRJNA175587
+s__Impatiens_necrotic_spot_virus 1 PRJNA14767
+s__Salinivibrio_costicola 1 GCF_000390145
+s__Enterobacteria_phage_mEp043_c_1 1 PRJNA183145
+s__Brochothrix_phage_BL3 1 PRJNA64549
+s__Enterococcus_sp_GMD2E 1 GCF_000296895
+s__Propionibacterium_avidum 3 GCF_000463645 GCF_000227295 GCF_000367205
+s__Haloferax_sp_ATCC_BAA_646 1 GCF_000336855
+s__Haloferax_sp_ATCC_BAA_645 1 GCF_000336835
+s__Haloferax_sp_ATCC_BAA_644 1 GCF_000336975
+s__Halosarcina_pallida 1 GCF_000337095
+s__Tobacco_curly_shoot_virus 1 PRJNA15257
+s__Tetrasphaera_elongata 1 GCF_000367525
+s__Banana_streak_UL_virus 1 PRJNA66613
+s__Broad_bean_necrosis_virus 1 PRJNA14870
+s__Capnocytophaga_sp_oral_taxon_380 1 GCF_000318255
+s__Nonlabens_dokdonensis 1 GCF_000332115
+s__Belliella_baltica 1 GCF_000265405
+s__Porcine_stool_associated_circular_virus_2 1 PRJNA202891
+s__Tomato_leaf_curl_Bangalore_virus_Ban5_satellite_DNA_beta 1 PRJNA28067
+s__Caldicellulosiruptor_owensensis 1 GCF_000166335
+s__Methanobrevibacter_smithii 23 GCF_000189975 GCF_000190115 GCF_000190035 GCF_000151245 GCF_000189915 GCF_000190135 GCF_000190095 GCF_000016525 GCF_000190075 GCF_000190015 GCF_000189955 GCF_000190175 GCF_000190055 GCF_000189935 GCF_000189875 GCF_000189795 GCF_000189815 GCF_000151225 GCF_000189855 GCF_000189995 GCF_000189895 GCF_000189835 GCF_000190155
+s__Leifsonia_aquatica 1 GCF_000469485
+s__Sphaeropsis_sapinea_RNA_virus_2 1 PRJNA14687
+s__Ancylobacter_sp_FA202 1 GCF_000380205
+s__Afipia_birgiae 1 GCF_000308295
+s__Brachybacterium_faecium 1 GCF_000023405
+s__Grapevine_leafroll_associated_virus_2 1 PRJNA15884
+s__Brucella_ovis 16 GCF_000366065 GCF_000370905 GCF_000365985 GCF_000365965 GCF_000367085 GCF_000371345 GCF_000366045 GCF_000413515 GCF_000365885 GCF_000365905 GCF_000365945 GCF_000365925 GCF_000366005 GCF_000370885 GCF_000016845 GCF_000367065
+s__Lactococcus_phage_bIL311 1 PRJNA14139
+s__Lactococcus_phage_bIL310 1 PRJNA14112
+s__Grapevine_leafroll_associated_virus_6 1 PRJNA77937
+s__Bacteroides_plebeius 1 GCF_000187895
+s__Grapevine_leafroll_associated_virus_4 1 PRJNA77935
+s__Grapevine_leafroll_associated_virus_5 1 PRJNA74429
+s__Meiothermus_ruber 2 GCF_000376665 GCF_000024425
+s__Thermus_phage_IN93 1 PRJNA14235
+s__Tiger_puffer_nervous_necrosis_virus 1 PRJNA41607
+s__Bradyrhizobium_sp_CCGE_LA001 1 GCF_000296215
+s__Eubacterium_infirmum 1 GCF_000242675
+s__Agrobacterium_sp_H13_3 1 GCF_000192635
+s__Vibrio_splendidus 13 GCF_000272105 GCF_000222625 GCF_000256485 GCF_000152765 GCF_000272345 GCF_000272245 GCF_000272225 GCF_000272125 GCF_000272285 GCF_000272265 GCF_000272305 GCF_000091465 GCF_000272325
+s__Eel_picornavirus_1 1 PRJNA219023
+s__Tomato_leaf_curl_virus_Pune_associated_DNA_beta 1 PRJNA18001
+s__Oryza_sativa_endornavirus 1 PRJNA16239
+s__Tomato_leaf_curl_Oman_virus 1 PRJNA52947
+s__Sphingomonas_sp_LH128 1 GCF_000293195
+s__Gallionella_sp_SCGC_AAA018_N21 1 GCF_000379385
+s__Acinetobacter_phage_Bphi_B1251 1 PRJNA181989
+s__Sclerophthora_macrospora_virus_B 1 PRJNA14912
+s__Sclerophthora_macrospora_virus_A 1 PRJNA14361
+s__Herbaspirillum_frisingense 1 GCF_000300975
+s__Nodularia_spumigena 2 GCF_000169135 GCF_000340565
+s__Cellulophaga_phage_phi12_2 1 PRJNA212943
+s__Fodinicurvata_sediminis 1 GCF_000420625
+s__Dickeya_sp_D_s0432_1 1 GCF_000474655
+s__Sclerotinia_sclerotiorum_hypovirus_1 1 PRJNA72389
+s__Sodalis_phage_phiSG1 1 PRJNA16583
+s__Tomato_chocolate_spot_virus 1 PRJNA39867
+s__Methylobacterium_radiotolerans 1 GCF_000019725
+s__Propionibacterium_sp_KPL1852 1 GCF_000477695
+s__Propionibacterium_sp_KPL1854 1 GCF_000477815
+s__Pelargonium_line_pattern_virus 1 PRJNA15413
+s__Citrobacter_sp_A1 1 GCF_000277565
+s__Melissococcus_plutonius 1 GCF_000270185
+s__Postia_placenta 1 GCA_000006255
+s__Sordaria_macrospora 1 GCA_000182805
+s__Vibrio_sp_624788 1 GCF_000316985
+s__Euphorbia_leaf_curl_virus 1 PRJNA14341
+s__Halomonas_smyrnensis 1 GCF_000265245
+s__actinobacterium_SCGC_AAA028_A23 1 GCF_000378905
+s__Thermus_islandicus 1 GCF_000421625
+s__Pineapple_bacilliform_comosus_virus 1 PRJNA60049
+s__Mycobacterium_phage_LeBron 1 PRJNA51673
+s__Clostridium_nexile 1 GCF_000156035
+s__Pseudomonas_phage_phi297 1 PRJNA82641
+s__Sida_yellow_net_virus 1 PRJNA189215
+s__Methanothermobacter_phage_psiM100 1 PRJNA14289
+s__Bacteroides_helcogenes 1 GCF_000186225
+s__Bacillus_phage_PBC1 1 PRJNA167662
+s__Anaerostipes_caccae 1 GCF_000154305
+s__Thermococcus_barophilus 1 GCF_000151105
+s__Rhizobium_phaseoli 1 GCF_000268285
+s__Cyprinid_herpesvirus_2 1 PRJNA181228
+s__Cyprinid_herpesvirus_3 1 PRJNA19059
+s__Himetobi_P_virus 1 PRJNA14801
+s__Cyprinid_herpesvirus_1 1 PRJNA181227
+s__Granulicella_tundricola 1 GCF_000178975
+s__Bacillus_halodurans 1 GCF_000011145
+s__Pseudomonas_phage_Phi_S1 1 PRJNA197298
+s__Treponema_vincentii 2 GCF_000412995 GCF_000175895
+s__Nitrospina_gracilis 1 GCF_000341545
+s__Ourmia_melon_virus 1 PRJNA30737
+s__Psittacid_herpesvirus_1 1 PRJNA14314
+s__Prevotella_oris 3 GCF_000142965 GCF_000162915 GCF_000377685
+s__Melon_chlorotic_mosaic_virus_associated_alphasatellite 1 PRJNA51413
+s__Ustilaginoidea_virens_RNA_virus_1 1 PRJNA196971
+s__Cyclovirus_PKgoat11_PAK_2009 1 PRJNA61949
+s__Xanthomonas_axonopodis 70 GCF_000266345 GCF_000266045 GCF_000266165 GCF_000266545 GCF_000265905 GCF_000266405 GCF_000265605 GCF_000266425 GCF_000266305 GCF_000265705 GCF_000259445 GCF_000309925 GCF_000266725 GCF_000265665 GCF_000285775 GCF_000266245 GCF_000265865 GCF_000266665 GCF_000265765 GCF_000265985 GCF_000266505 GCF_000265745 GCF_000266145 GCF_000266085 GCF_000266385 GCF_000266025 GCF_000309905 GCF_000266325 GCF_000265845 GCF_000265565 GCF_000265945 GCF_000266445 GCF_000265625 G [...]
+s__Cupriavidus_taiwanensis 2 GCF_000372525 GCF_000069785
+s__Sclerotinia_sclerotiorum_endornavirus_1 1 PRJNA210796
+s__Canine_adenovirus_A 1 PRJNA14516
+s__Rinderpest_virus 1 PRJNA15050
+s__Pseudomonas_fuscovaginae 3 GCF_000251185 GCF_000280575 GCF_000364705
+s__Arabis_mosaic_virus_large_satellite_RNA 1 PRJNA14752
+s__Lymphocystis_disease_virus_isolate_China 1 PRJNA14472
+s__Haemophilus_parasuis 13 GCF_000444625 GCF_000444605 GCF_000444585 GCF_000021885 GCF_000172375 GCF_000478405 GCF_000444685 GCF_000444705 GCF_000444545 GCF_000444645 GCF_000444665 GCF_000439395 GCF_000444565
+s__Vibrio_phage_pYD21_A 1 PRJNA195477
+s__Infectious_spleen_and_kidney_necrosis_virus 1 PRJNA14600
+s__Niastella_koreensis 1 GCF_000246855
+s__Arcobacter_butzleri 4 GCF_000014025 GCF_000215345 GCF_000284355 GCF_000185325
+s__Synechococcus_sp_WH_7805 1 GCF_000153285
+s__Rhodobacter_capsulatus 7 GCF_000506565 GCF_000021865 GCF_000506425 GCF_000505785 GCF_000506545 GCF_000506525 GCF_000506965
+s__Synechococcus_sp_WH_7803 1 GCF_000063505
+s__Borrelia_burgdorferi 16 GCF_000172315 GCF_000166635 GCF_000181575 GCF_000008685 GCF_000181855 GCF_000382565 GCF_000171755 GCF_000021405 GCF_000171735 GCF_000181555 GCF_000181715 GCF_000172255 GCF_000444465 GCF_000172335 GCF_000172295 GCF_000166655
+s__Pseudoalteromonas_tunicata 1 GCF_000153245
+s__Solitalea_canadensis 1 GCF_000242635
+s__Methylobacterium_sp_4_46 1 GCF_000019365
+s__Mouse_parvovirus_3 1 PRJNA17123
+s__Mouse_parvovirus_1 1 PRJNA14325
+s__Mouse_parvovirus_4 1 PRJNA33009
+s__Mycoplasma_wenyonii 1 GCF_000277795
+s__Desulfobacca_acetoxidans 1 GCF_000195295
+s__Agropyron_mosaic_virus 1 PRJNA15063
+s__Spodoptera_frugiperda_ascovirus_1a 1 PRJNA17721
+s__Potato_yellow_dwarf_virus 1 PRJNA74995
+s__Bacillus_phage_Fah 1 PRJNA16382
+s__Mycobacterium_sp_MCS 1 GCF_000014165
+s__Glaciecola_sp_4H_3_7_YE_5 1 GCF_000212335
+s__Canine_papillomavirus_14 1 PRJNA183910
+s__Corynebacterium_sp_KPL1859 1 GCF_000478015
+s__Grapevine_vein_clearing_virus 1 PRJNA70007
+s__Sulfurihydrogenibium_sp_YO3AOP1 1 GCF_000020325
+s__Corynebacterium_sp_KPL1855 1 GCF_000478075
+s__Corynebacterium_sp_KPL1856 1 GCF_000478055
+s__Corynebacterium_sp_KPL1857 1 GCF_000478035
+s__Aggregatibacter_segnis 1 GCF_000185305
+s__Streptomyces_sp_CNT302 1 GCF_000377525
+s__Megamonas_hypermegale 1 GCF_000209975
+s__Brucella_sp_56_94 1 GCF_000370925
+s__Clerodendrum_golden_mosaic_virus 1 PRJNA29849
+s__Candidatus_Chloracidobacterium_thermophilum 1 GCF_000226295
+s__Culex_originated_Tymoviridae_like_virus 1 PRJNA176434
+s__Pseudomonas_phage_phi_2 1 PRJNA42717
+s__Tomato_yellow_leaf_curl_Vietnam_virus 1 PRJNA19785
+s__Cryptosporidium_muris 1 GCA_000006515
+s__Alicyclobacillus_acidocaldarius 3 GCF_000024285 GCF_000173835 GCF_000219875
+s__Paramecium_bursaria_Chlorella_virus_1 1 PRJNA14564
+s__Gemella_sanguinis 1 GCF_000204335
+s__Equine_foamy_virus 1 PRJNA14738
+s__East_African_cassava_mosaic_Cameroon_virus 1 PRJNA15180
+s__Mosquito_flavivirus 1 PRJNA198479
+s__Banana_streak_Mysore_virus 1 PRJNA15234
+s__Pseudomonas_syringae 62 GCF_000416805 GCF_000416865 GCF_000282735 GCF_000344515 GCF_000012245 GCF_000344435 GCF_000245435 GCF_000416665 GCF_000233835 GCF_000416585 GCF_000344375 GCF_000416945 GCF_000344475 GCF_000145925 GCF_000177515 GCF_000416705 GCF_000344555 GCF_000245395 GCF_000416485 GCF_000344455 GCF_000416545 GCF_000412165 GCF_000333995 GCF_000344495 GCF_000416885 GCF_000344335 GCF_000416845 GCF_000416685 GCF_000416905 GCF_000233795 GCF_000331385 GCF_000344395 GCF_000416645 GCF [...]
+s__Cyanophage_P_SSP2 1 PRJNA81179
+s__Bacteroides_nordii 1 GCF_000273175
+s__Bacillus_atrophaeus 3 GCF_000385965 GCF_000264395 GCF_000165925
+s__Enterobacteria_phage_vB_EcoS_ACG_M12 1 PRJNA179414
+s__Enterobacterial_phage_mEp213 1 PRJNA183152
+s__Salinarchaeum_sp_Harcht_Bsk1 1 GCF_000403645
+s__Ponticaulis_koreensis 1 GCF_000420665
+s__Methylobacterium_sp_WSM2598 1 GCF_000379105
+s__Halorubrum_lacusprofundi 1 GCF_000022205
+s__Tobacco_rattle_virus 1 PRJNA14808
+s__Anguillid_rhabdovirus 1 PRJNA224248
+s__Arthrobacter_sp_TB_23 1 GCF_000294595
+s__Flexistipes_sinusarabici 1 GCF_000218625
+s__Bovine_parvovirus 1 PRJNA14020
+s__Frateuria_aurantia 1 GCF_000242255
+s__Rickettsia_africae 1 GCF_000023005
+s__Staphylococcus_sp_EGD_HP3 1 GCF_000463545
+s__Goose_hemorrhagic_polyomavirus 1 PRJNA14286
+s__Hoeflea_phototrophica 1 GCF_000154705
+s__Tibrogargan_virus 1 PRJNA194142
+s__Rubritalea_marina 1 GCF_000378105
+s__Streptococcus_tigurinus 4 GCF_000442155 GCF_000344275 GCF_000442175 GCF_000344255
+s__O_nyong_nyong_virus 1 PRJNA15311
+s__Shewanella_denitrificans 1 GCF_000013765
+s__Tomato_spotted_wilt_virus 1 PRJNA14997
+s__Phthorimaea_operculella_granulovirus 1 PRJNA14202
+s__Ateles_paniscus_polyomavirus_1 1 PRJNA183902
+s__Pseudomonas_phage_73 1 PRJNA16384
+s__Pepper_leaf_curl_Bangladesh_virus 1 PRJNA14218
+s__Rosellinia_necatrix_quadrivirus_1 1 PRJNA82351
+s__Ictalurid_herpesvirus_1 1 PRJNA14018
+s__Mycobacterium_colombiense 1 GCF_000222105
+s__Aeromonas_phage_25 1 PRJNA17105
+s__Pseudomonas_mandelii 2 GCF_000257545 GCF_000381285
+s__Rubella_virus 1 PRJNA15315
+s__Arcobacter_sp_L 1 GCF_000284235
+s__Synechococcus_phage_Syn19 1 PRJNA64709
+s__Oceanicola_granulosus 1 GCF_000153305
+s__Homalodisca_vitripennis_reovirus 1 PRJNA36621
+s__Thermococcus_prieurii_virus_1 1 PRJNA84407
+s__Pepper_huasteco_yellow_vein_virus 1 PRJNA14059
+s__Pseudanabaena_sp_PCC_7367 1 GCF_000317065
+s__Sphingomonas_sp_S17 1 GCF_000211795
+s__Paracoccus_sp_TRP 1 GCF_000185925
+s__Arthrobacter_aurescens 1 GCF_000014925
+s__Tomato_yellow_leaf_curl_virus 1 PRJNA15182
+s__Rickettsia_massiliae 2 GCF_000016625 GCF_000283855
+s__Synechococcus_phage_metaG_MbCM1 1 PRJNA181073
+s__Roseobacter_phage_SIO1 1 PRJNA14308
+s__Thiothrix_nivea 1 GCF_000260135
+s__Mycobacterium_phage_Porky 1 PRJNA30699
+s__Enterobacter_cancerogenus 1 GCF_000155995
+s__Streptomyces_sp_HmicA12 1 GCF_000373565
+s__Papaya_leaf_curl_Guandong_virus 1 PRJNA14537
+s__Trichomonas_vaginalis_virus_2 1 PRJNA14822
+s__Trichomonas_vaginalis_virus_3 1 PRJNA14837
+s__Clostridium_sp_ATCC_29733 1 GCF_000466605
+s__Mycobacterium_rhodesiae 2 GCF_000230895 GCF_000230935
+s__Lactobacillus_phage_phiadh 1 PRJNA14588
+s__Pseudomonas_sp_R62 1 GCF_000257605
+s__Bat_coronavirus_BM48_31_BGR_2008 1 PRJNA51751
+s__Clavibacter_phage_CMP1 1 PRJNA42947
+s__Methanosaeta_harundinacea 1 GCF_000235565
+s__Escherichia_phage_phiV10 1 PRJNA16381
+s__Desulfovibrio_oxyclinae 1 GCF_000375485
+s__Pseudomonas_sp_GM78 1 GCF_000282475
+s__Morganella_morganii 1 GCF_000286435
+s__Grapevine_leafroll_associated_virus_1 1 PRJNA80677
+s__Rhodopirellula_sp_SWK7 1 GCF_000346425
+s__Pectobacterium_phage_PP1 1 PRJNA181988
+s__Psychromonas_ossibalaenae 1 GCF_000381745
+s__Acidianus_two_tailed_virus 1 PRJNA15686
+s__Bandicoot_papillomatosis_carcinomatosis_virus_type_1 1 PRJNA27985
+s__Bandicoot_papillomatosis_carcinomatosis_virus_type_2 1 PRJNA30081
+s__Chloroflexus_aurantiacus 1 GCF_000018865
+s__Agromyces_subbeticus 1 GCF_000421565
+s__Roseomonas_cervicalis 1 GCF_000164635
+s__Acinetobacter_sp_CIP_102529 1 GCF_000368325
+s__Enterobacteria_phage_G4_sensu_lato 1 PRJNA14318
+s__Bacteroidales_bacterium_ph8 1 GCF_000311925
+s__Actinopolymorpha_alba 1 GCF_000373925
+s__Vibrio_ordalii 6 GCF_000287095 GCF_000287075 GCF_000287115 GCF_000287135 GCF_000287155 GCF_000257205
+s__actinobacterium_SCGC_AAA278_O22 1 GCF_000372185
+s__Kingella_oralis 1 GCF_000160435
+s__Acidianus_filamentous_virus_9 1 PRJNA29195
+s__Acidianus_filamentous_virus_8 1 PRJNA28079
+s__Acidianus_filamentous_virus_7 1 PRJNA28077
+s__Acidianus_filamentous_virus_6 1 PRJNA28075
+s__Atkinsonella_hypoxylon_virus 1 PRJNA14164
+s__Acidianus_filamentous_virus_3 1 PRJNA28073
+s__Acidianus_filamentous_virus_2 1 PRJNA20965
+s__Luffa_yellow_mosaic_virus 1 PRJNA14290
+s__Perina_nuda_virus 1 PRJNA14717
+s__Citrus_tristeza_virus 1 PRJNA15334
+s__Fischerella_sp_PCC_9339 1 GCF_000315585
+s__Shuni_virus 1 PRJNA173357
+s__Rickettsia_conorii 4 GCF_000007025 GCF_000263815 GCF_000257435 GCF_000261325
+s__Cotesia_congregata_bracovirus 1 PRJNA14556
+s__Sphaerochaeta_globosa 1 GCF_000190435
+s__Pseudomonas_aeruginosa 137 GCF_000481865 GCF_000481845 GCF_000481825 GCF_000480965 GCF_000467675 GCF_000258285 GCF_000481785 GCF_000482005 GCF_000481205 GCF_000480765 GCF_000480865 GCF_000481425 GCF_000481325 GCF_000480395 GCF_000480685 GCF_000480845 GCF_000481305 GCF_000481385 GCF_000481025 GCF_000359505 GCF_000265035 GCF_000480475 GCF_000290555 GCF_000480515 GCF_000259025 GCF_000481045 GCF_000481885 GCF_000481945 GCF_000481145 GCF_000481905 GCF_000480745 GCF_000481405 GCF_000226155 [...]
+s__Nitrospina_sp_AB_629_B06 1 GCF_000375745
+s__Atopobium_rimae 1 GCF_000174015
+s__Burkholderia_thailandensis 8 GCF_000170315 GCF_000012365 GCF_000170395 GCF_000179515 GCF_000170495 GCF_000266985 GCF_000385525 GCF_000152285
+s__Delftia_acidovorans 3 GCF_000411215 GCF_000018665 GCF_000411195
+s__Shigella_phage_Shfl1 1 PRJNA66345
+s__Mycobacterium_sp_141 1 GCF_000382405
+s__Xanthomonas_phage_OP2 1 PRJNA16300
+s__Xanthomonas_phage_OP1 1 PRJNA16299
+s__Nudaurelia_capensis_beta_virus 1 PRJNA14982
+s__Enterobacteria_phage_HK106 1 PRJNA183158
+s__Bacteroides_sp_9_1_42FAA 1 GCF_000157075
+s__Giardia_intestinalis 1 GCA_000002435
+s__Natronococcus_occultus 1 GCF_000328685
+s__Azoarcus_sp_KH32C 1 GCF_000349945
+s__Curtobacterium_ginsengisoli 1 GCF_000419445
+s__Tomato_chlorotic_leaf_distortion_virus 1 PRJNA72721
+s__Bdellovibrio_bacteriovorus 2 GCF_000196175 GCF_000317895
+s__Kytococcus_sedentarius 1 GCF_000023925
+s__Okra_enation_leaf_curl_betasatellite 1 PRJNA61781
+s__Thermoanaerobacter_indiensis 1 GCF_000373165
+s__Marichromatium_purpuratum 1 GCF_000224005
+s__Sweet_potato_latent_virus 1 PRJNA196180
+s__Thauera_phenylacetica 1 GCF_000310225
+s__Desulfomonile_tiedjei 1 GCF_000266945
+s__Bacillus_vallismortis 1 GCF_000245315
+s__Lachnospiraceae_bacterium_1_4_56FAA 1 GCF_000218385
+s__Pseudomonas_phage_LMA2 1 PRJNA31055
+s__Elm_mottle_virus 1 PRJNA14760
+s__Cuban_alphasatellite_1 1 PRJNA210798
+s__Rudbeckia_flower_distortion_virus 1 PRJNA33679
+s__Human_polyomavirus_9 1 PRJNA63123
+s__Yersinia_phage_phiR201 1 PRJNA184144
+s__Hollyhock_yellow_vein_mosaic_virus 1 PRJNA81151
+s__Polaromonas_sp_JS666 1 GCF_000013865
+s__Equine_pegivirus_1 1 PRJNA196421
+s__Enterovibrio_calviensis 3 GCF_000286915 GCF_000286875 GCF_000286895
+s__Actinobacillus_minor 2 GCF_000175195 GCF_000174155
+s__Brochothrix_phage_A9 1 PRJNA64547
+s__Bathycoccus_sp_RCC1105_virus_BpV 1 PRJNA61009
+s__Shewanella_putrefaciens 2 GCF_000169215 GCF_000016585
+s__Blackberry_yellow_vein_associated_virus 1 PRJNA15168
+s__Burkholderia_sp_YI23 1 GCF_000236065
+s__Honeysuckle_yellow_vein_mosaic_disease_associated_satellite_DNA_beta 1 PRJNA19863
+s__Gardnerella_vaginalis 36 GCF_000414485 GCF_000414505 GCF_000263475 GCF_000178355 GCF_000263555 GCF_000414425 GCF_000414445 GCF_000414465 GCF_000176495 GCF_000263655 GCF_000414645 GCF_000263535 GCF_000165635 GCF_000414585 GCF_000214315 GCF_000414625 GCF_000414605 GCF_000414565 GCF_000414665 GCF_000263615 GCF_000159155 GCF_000176475 GCF_000263595 GCF_000414685 GCF_000263515 GCF_000263575 GCF_000414705 GCF_000213955 GCF_000165615 GCF_000414545 GCF_000414525 GCF_000263495 GCF_000025205 GC [...]
+s__Thermotoga_naphthophila 1 GCF_000025105
+s__Bhendi_yellow_vein_betasatellite 1 PRJNA14445
+s__Pedobacter_agri 1 GCF_000258495
+s__Hymenobacter_aerophilus 1 GCF_000382225
+s__Mycobacterium_phage_Sarfire 1 PRJNA219123
+s__Mal_de_Rio_Cuarto_virus 1 PRJNA18539
+s__Oat_blue_dwarf_virus 1 PRJNA15341
+s__Roseobacter_sp_GAI101 1 GCF_000156335
+s__Imperata_yellow_mottle_virus 1 PRJNA32677
+s__Lactobacillus_helveticus 6 GCF_000422165 GCF_000160855 GCF_000165775 GCF_000189515 GCF_000015385 GCF_000195355
+s__Ferroglobus_placidus 1 GCF_000025505
+s__Pseudomonas_phage_phiCTX 1 PRJNA14415
+s__Oceanobacillus_iheyensis 1 GCF_000011245
+s__Clostridium_butyricum 4 GCF_000355785 GCF_000182605 GCF_000171115 GCF_000371625
+s__Rickettsia_montanensis 1 GCF_000284175
+s__Idiomarina_sp_A28L 1 GCF_000218785
+s__Adoxophyes_honmai_entomopoxvirus_L 1 PRJNA203665
+s__Dialister_invisus 1 GCF_000160055
+s__Azospirillum_lipoferum 2 GCF_000010725 GCF_000283655
+s__Gillisia_limnaea 1 GCF_000243235
+s__Redspotted_grouper_nervous_necrosis_virus 1 PRJNA16819
+s__Bartonella_henselae 1 GCF_000046705
+s__Desulfovibrio_longus 1 GCF_000420485
+s__Burkholderia_phage_phiE125 1 PRJNA14330
+s__Verrucosispora_maris 1 GCF_000204155
+s__Pseudomonas_coronafaciens 1 GCF_000156995
+s__Cowpea_chlorotic_mottle_virus 1 PRJNA14758
+s__Staphylococcus_phage_P68 1 PRJNA14269
+s__Lactococcus_phage_1706 1 PRJNA29283
+s__Simian_hemorrhagic_fever_virus 1 PRJNA14727
+s__Leptospira_wolffii 1 GCF_000306115
+s__Citromicrobium_sp_JLT1363 1 GCF_000186705
+s__Aquifex_aeolicus 1 GCF_000008625
+s__Bradyrhizobium_elkanii 2 GCF_000379145 GCF_000257685
+s__Enterobacteria_phage_SSL_2009a 1 PRJNA34919
+s__Desulfurococcus_fermentans 1 GCF_000231015
+s__Spirosoma_panaciterrae 1 GCF_000374025
+s__Mycobacterium_phage_Bxz2 1 PRJNA14275
+s__Mycobacterium_phage_Bxz1 1 PRJNA14309
+s__Xanthomonas_vesicatoria 1 GCF_000192025
+s__Bacillus_phage_Curly 1 PRJNA192873
+s__Brachybacterium_muris 1 GCF_000338055
+s__Pseudomonas_protegens 2 GCF_000012265 GCF_000397205
+s__Porcine_epidemic_diarrhea_virus 1 PRJNA14739
+s__Dragonfly_associated_mastrevirus 1 PRJNA181243
+s__Roseovarius_nubinhibens 1 GCF_000152625
+s__Cercopithecine_herpesvirus_9 1 PRJNA14596
+s__Cercopithecine_herpesvirus_5 1 PRJNA38429
+s__Allochromatium_vinosum 1 GCF_000025485
+s__Cercopithecine_herpesvirus_2 1 PRJNA14558
+s__Enterobacteria_phage_285P 1 PRJNA64539
+s__Flavobacterium_limnosediminis 1 GCF_000498535
+s__Torque_teno_sus_virus_1a 1 PRJNA48139
+s__Thermofilum_pendens 1 GCF_000015225
+s__Betacoronavirus_1 1 PRJNA15438
+s__Mycobacterium_phage_Corndog 1 PRJNA14272
+s__Pseudomonas_caeni 1 GCF_000421765
+s__Leptotrichia_shahii 1 GCF_000373045
+s__Marinobacter_algicola 1 GCF_000170835
+s__Mycobacterium_phage_Phelemich 1 PRJNA215112
+s__Methylovorus_glucosotrophus 1 GCF_000023745
+s__Sida_mosaic_Alagoas_virus 1 PRJNA81007
+s__Enterococcus_saccharolyticus 3 GCF_000234175 GCF_000407285 GCF_000407005
+s__Clostridium_clostridioforme 9 GCF_000371545 GCF_000371405 GCF_000371565 GCF_000371605 GCF_000371585 GCF_000371505 GCF_000371485 GCF_000371525 GCF_000234155
+s__Lactobacillus_reuteri 12 GCF_000159455 GCF_000179455 GCF_000159615 GCF_000179435 GCF_000236455 GCF_000168255 GCF_000010005 GCF_000016825 GCF_000160715 GCF_000159475 GCF_000410995 GCF_000439275
+s__Tomato_yellow_leaf_curl_Thailand_virus_associated_DNA_1 1 PRJNA14300
+s__Vibrio_sp_MED222 1 GCF_000153005
+s__Nocardiopsis_dassonvillei 1 GCF_000092985
+s__Spiroplasma_melliferum 2 GCF_000328865 GCF_000236085
+s__Agrotis_ipsilon_multiple_nucleopolyhedrovirus 1 PRJNA32171
+s__Stomatobaculum_longum 1 GCF_000242235
+s__Enterobacteria_phage_N15 1 PRJNA14086
+s__Adeno_associated_virus_7 1 PRJNA14454
+s__Adeno_associated_virus_5 1 PRJNA14426
+s__Adeno_associated_virus_4 1 PRJNA14030
+s__Acinetobacter_haemolyticus 6 GCF_000301715 GCF_000369085 GCF_000369065 GCF_000164055 GCF_000309035 GCF_000302315
+s__Adeno_associated_virus_2 1 PRJNA14060
+s__Adeno_associated_virus_1 1 PRJNA15323
+s__Adeno_associated_virus_8 1 PRJNA14455
+s__Halorubrum_kocurii 1 GCF_000337355
+s__Enterococcus_italicus 1 GCF_000185365
+s__Sorghum_mosaic_virus 1 PRJNA15098
+s__Lactococcus_phage_Tuc2009 1 PRJNA14131
+s__Natrialba_chahannaoensis 1 GCF_000337135
+s__Chelativorans_sp_BNC1 1 GCF_000014245
+s__Varibaculum_cambriense 1 GCF_000420065
+s__Dictyostelium_purpureum 1 GCA_000190715
+s__Enterobacteria_phage_mEp237 1 PRJNA183147
+s__Enterobacteria_phage_mEp235 1 PRJNA183146
+s__Corynebacterium_mastitidis 1 GCF_000375365
+s__Piper_yellow_mottle_virus 1 PRJNA219627
+s__Pantoea_ananatis 7 GCF_000285975 GCF_000285475 GCF_000270125 GCF_000233595 GCF_000025405 GCF_000475035 GCF_000283875
+s__Crenarchaeota_archaeon_SCGC_AAA471_B05 1 GCF_000380705
+s__Cronobacter_phage_ENT39118 1 PRJNA184168
+s__Rhodobacterales_bacterium_HTCC2255 1 GCF_000153745
+s__Lymphocytic_choriomeningitis_virus 1 PRJNA14862
+s__Mucilaginibacter_paludis 1 GCF_000166195
+s__Prochlorococcus_phage_MED4_213 1 PRJNA195505
+s__Hyphantria_cunea_nucleopolyhedrovirus 1 PRJNA16343
+s__Bacillus_anthracis 24 GCF_000167295 GCF_000181915 GCF_000167235 GCF_000295695 GCF_000219895 GCF_000181995 GCF_000167335 GCF_000292565 GCF_000021445 GCF_000258885 GCF_000008445 GCF_000022865 GCF_000181675 GCF_000278385 GCF_000182055 GCF_000181935 GCF_000167255 GCF_000008165 GCF_000167315 GCF_000181835 GCF_000167275 GCF_000319695 GCF_000007845 GCF_000319715
+s__Wheat_eqlid_mosaic_virus 1 PRJNA20763
+s__Burkholderia_mallei 10 GCF_000011705 GCF_000167635 GCF_000015605 GCF_000015625 GCF_000153085 GCF_000152305 GCF_000015465 GCF_000169875 GCF_000152385 GCF_000152405
+s__Diaporthe_ambigua_RNA_virus_1 1 PRJNA14962
+s__Tomato_begomovirus_satellite_DNA_beta 1 PRJNA14449
+s__Antheraea_pernyi_nucleopolyhedrovirus 1 PRJNA16793
+s__Macroptilium_yellow_net_virus 1 PRJNA124063
+s__Verrucomicrobia_bacterium_SCGC_AB_629_E09 1 GCF_000371985
+s__Nakamurella_multipartita 1 GCF_000024365
+s__Cycloclasticus_zancles 1 GCF_000442595
+s__Bacteroides_cellulosilyticus 2 GCF_000158035 GCF_000273015
+s__Equus_ferus_caballus_papillomavirus_type_4 1 PRJNA185426
+s__Equus_ferus_caballus_papillomavirus_type_5 1 PRJNA185427
+s__Pseudomonas_alcaliphila 1 GCF_000319815
+s__Equus_ferus_caballus_papillomavirus_type_7 1 PRJNA193979
+s__Frankia_sp_CN3 1 GCF_000235425
+s__Fritillary_virus_Y 1 PRJNA30175
+s__Duck_astrovirus_GII_A 1 PRJNA36399
+s__Enterobacteria_phage_NJ01 1 PRJNA177541
+s__Thioalkalivibrio_thiocyanodenitrificans 1 GCF_000378965
+s__Marinomonas_sp_MWYL1 1 GCF_000017285
+s__Polymorphum_gilvum 1 GCF_000192745
+s__Nocardioides_sp_Iso805N 1 GCF_000364605
+s__Ligustrum_necrotic_ringspot_virus 1 PRJNA28681
+s__Sphingobium_ummariense 1 GCF_000447205
+s__Brevibacillus_sp_BC25 1 GCF_000282075
+s__Thermus_phage_P74_26 1 PRJNA20767
+s__Steller_sea_lion_vesivirus 1 PRJNA30663
+s__Lolium_latent_virus 1 PRJNA28971
+s__Wheat_dwarf_India_virus 1 PRJNA162491
+s__Epinotia_aporema_granulovirus 1 PRJNA177904
+s__Cyanophage_KBS_P_1A 1 PRJNA195501
+s__Southern_tomato_virus 1 PRJNA32821
+s__Brucella_sp_UK1_97 1 GCF_000371045
+s__Mimosa_yellow_leaf_curl_virus_associated_DNA_1 1 PRJNA19817
+s__Banana_streak_OL_virus 1 PRJNA15239
+s__Bovine_herpesvirus_5 1 PRJNA14313
+s__Bovine_herpesvirus_4 1 PRJNA14110
+s__Wolbachia_endosymbiont_of_Diaphorina_citri 1 GCF_000331595
+s__Enterobacteria_phage_K1E 1 PRJNA16228
+s__Ralstonia_phage_p12J 1 PRJNA14307
+s__Enterococcus_phoeniculicola 2 GCF_000407505 GCF_000394035
+s__Pseudomonas_sp_CF161 1 GCF_000416215
+s__Human_parainfluenza_virus_1 1 PRJNA14743
+s__Human_parainfluenza_virus_2 1 PRJNA15421
+s__Human_parainfluenza_virus_3 1 PRJNA14706
+s__Calothrix_parietina 1 GCF_000317435
+s__Xanthomonas_alfalfae 1 GCF_000225915
+s__Pseudomonas_phage_tf 1 PRJNA167604
+s__Ochrobactrum_sp_CDB2 1 GCF_000344725
+s__Mopeia_Lassa_virus_reassortant_29 1 PRJNA15036
+s__Tomato_yellow_dwarf_disease_associated_satellite_DNA_beta_Kochi 1 PRJNA20983
+s__Melon_aphid_borne_yellows_virus 1 PRJNA30049
+s__Rhodococcus_qingshengii 1 GCF_000341815
+s__Turnip_mosaic_virus 1 PRJNA15408
+s__American_bat_vesiculovirus_TFFN_2013 1 PRJNA226731
+s__Lactobacillus_phage_JCL1032 1 PRJNA181076
+s__Synechococcus_sp_BL107 1 GCF_000153805
+s__Luffa_puckering_and_leaf_distortion_associated_DNA_beta 1 PRJNA15779
+s__Burkholderia_phage_BcepGomr 1 PRJNA19579
+s__Rickettsia_endosymbiont_of_Bemisia_tabaci 1 GCF_000265225
+s__Actinosynnema_mirum 1 GCF_000023245
+s__Halocynthia_phage_JM_2012 1 PRJNA167664
+s__Yellowtail_ascites_virus 1 PRJNA14852
+s__Streptomyces_pristinaespiralis 1 GCF_000154945
+s__Photobacterium_damselae 1 GCF_000176795
+s__Mesotoga_prima 1 GCF_000147715
+s__Vibrio_phage_pYD38_B 1 PRJNA209211
+s__Vibrio_phage_pYD38_A 1 PRJNA209063
+s__Synechococcus_sp_WH_5701 1 GCF_000153045
+s__Dyella_ginsengisoli 1 GCF_000334915
+s__Thalassiobium_sp_R2A62 1 GCF_000161835
+s__Diplodia_scrobiculata_RNA_virus_1 1 PRJNA43007
+s__Erwinia_pyrifoliae 2 GCF_000026985 GCF_000027265
+s__Megasphaera_sp_UPII_199_6 1 GCF_000214495
+s__Moroccan_pepper_virus 1 PRJNA185273
+s__Cucumber_mosaic_virus_satellite_RNA 1 PRJNA14568
+s__Tobacco_vein_clearing_virus 1 PRJNA14150
+s__Janthinobacterium_sp_Marseille 1 GCF_000013625
+s__Methanosphaera_stadtmanae 1 GCF_000012545
+s__Vibrio_phage_N4 1 PRJNA42785
+s__Serratia_sp_M24T3 1 GCF_000257645
+s__Mouse_kobuvirus_M_5_USA_2010 1 PRJNA72383
+s__Escherichia_sp_4_1_40B 1 GCF_000158415
+s__Sugarcane_striate_mosaic_associated_virus 1 PRJNA14819
+s__Colwellia_phage_9A 1 PRJNA169428
+s__Stackebrandtia_nassauensis 1 GCF_000024545
+s__Calothrix_sp_PCC_7507 1 GCF_000316575
+s__Lactobacillus_prophage_Lj965 1 PRJNA14351
+s__Bell_pepper_endornavirus 1 PRJNA70001
+s__Neisseria_lactamica 3 GCF_000193795 GCF_000196295 GCF_000173995
+s__Bibersteinia_trehalosi 1 GCF_000347595
+s__Burkholderia_glumae 4 GCF_000300395 GCF_000365245 GCF_000300755 GCF_000022645
+s__Hardenbergia_virus_A 1 PRJNA65813
+s__Moritella_sp_PE36 1 GCF_000170855
+s__Honeysuckle_yellow_vein_Japan_betasatellite 1 PRJNA19603
+s__Lactobacillus_mali 2 GCF_000260415 GCF_000276905
+s__Peptoniphilus_indolicus 1 GCF_000227315
+s__Propionibacterium_sp_409_HC1 1 GCF_000214515
+s__Prevotella_loescheii 1 GCF_000378085
+s__Thauera_sp_27 1 GCF_000310125
+s__Propionibacterium_freudenreichii 1 GCF_000091725
+s__Drosophila_x_virus 1 PRJNA14853
+s__Encephalomyocarditis_virus 1 PRJNA15307
+s__Mulberry_small_circular_viroid_like_RNA_1 1 PRJNA32241
+s__Corynebacterium_crenatum 1 GCF_000380545
+s__Pseudomonas_chlororaphis 4 GCF_000281915 GCF_000506385 GCF_000264555 GCF_000237045
+s__Murray_Valley_encephalitis_virus 1 PRJNA15430
+s__Verrucomicrobium_spinosum 1 GCF_000172155
+s__Clostridium_hathewayi 3 GCF_000235505 GCF_000160095 GCF_000371445
+s__Streptococcus_pyogenes_phage_315_3 1 PRJNA14529
+s__Streptococcus_pyogenes_phage_315_2 1 PRJNA14528
+s__Streptococcus_pyogenes_phage_315_1 1 PRJNA14533
+s__Pyrobaculum_spherical_virus 1 PRJNA14374
+s__Streptococcus_pyogenes_phage_315_6 1 PRJNA14532
+s__Streptococcus_pyogenes_phage_315_5 1 PRJNA14531
+s__Streptococcus_pyogenes_phage_315_4 1 PRJNA14530
+s__Cardiobacterium_hominis 1 GCF_000160655
+s__Human_papillomavirus_132_like_viruses 1 PRJNA62179
+s__Tomato_leaf_curl_Hsinchu_virus 1 PRJNA18627
+s__Porcine_coronavirus_HKU15 1 PRJNA109271
+s__Streptomyces_vitaminophilus 1 GCF_000380165
+s__Penaeus_monodon_hepatopancreatic_parvovirus 1 PRJNA32695
+s__Corynebacterium_doosanense 1 GCF_000372245
+s__Sporosarcina_newyorkensis 1 GCF_000220335
+s__Burkholderia_dolosa 1 GCF_000152585
+s__Salinicoccus_albus 1 GCF_000385175
+s__Pseudogulbenkiania_sp_NH8B 1 GCF_000283535
+s__Sodalis_phage_SO_1 1 PRJNA42597
+s__Serratia_phage_Eta 1 PRJNA209364
+s__Burkholderia_phage_BcepIL02 1 PRJNA38297
+s__Goose_adenovirus_A 1 PRJNA167579
+s__Vibrio_phage_Vf33 1 PRJNA14384
+s__Mycobacterium_phage_Adawi 1 PRJNA219121
+s__Tobacco_bushy_top_virus_satellite_like_RNA 1 PRJNA14511
+s__Mycobacterium_phage_Che12 1 PRJNA17143
+s__Burkholderia_phage_BcepC6B 1 PRJNA14379
+s__Acidothermus_cellulolyticus 1 GCF_000015025
+s__Veillonella_parvula 3 GCF_000177435 GCF_000024945 GCF_000215025
+s__Nitrosopumilus_sp_SJ 1 GCF_000328945
+s__Salmonella_phage_epsilon34 1 PRJNA33779
+s__Vibrio_shilonii 1 GCF_000181535
+s__Pantoea_dispersa 1 GCF_000465555
+s__Lily_mottle_virus 1 PRJNA15495
+s__Pepper_golden_mosaic_virus 1 PRJNA14210
+s__Invertebrate_iridescent_virus_9 1 PRJNA69999
+s__Coprobacter_fastidiosus 1 GCF_000473955
+s__Invertebrate_iridescent_virus_6 1 PRJNA14124
+s__Tsukamurella_phage_TPA2 1 PRJNA63439
+s__Invertebrate_iridescent_virus_3 1 PRJNA17099
+s__Bacillus_phage_GA_1 1 PRJNA15202
+s__Pseudomonas_umsongensis 1 GCF_000377725
+s__Thioalkalivibrio_sp_ALgr3 1 GCF_000377325
+s__Thioalkalivibrio_sp_ALgr1 1 GCF_000378285
+s__Thioalkalivibrio_sp_ALgr5 1 GCF_000381485
+s__Bacillus_phage_0305phi8_36 1 PRJNA20653
+s__Listeria_grayi 1 GCF_000148995
+s__Desulfovibrio_sp_Dsv1 1 GCF_000403945
+s__Trichodesmium_erythraeum 1 GCF_000014265
+s__Piliocolobus_rufomitratus_polyomavirus_1 1 PRJNA183909
+s__Streptococcus_gordonii 1 GCF_000017005
+s__Botrytis_porri_RNA_virus_1 1 PRJNA167870
+s__Bacillus_sp_916 1 GCF_000275785
+s__Tobacco_leaf_curl_Yunnan_virus_satellite_DNA_beta 1 PRJNA14539
+s__Staphylothermus_hellenicus 1 GCF_000092465
+s__Streptomyces_sp_Amel2xE9 1 GCF_000383935
+s__Rose_yellow_mosaic_virus 1 PRJNA178589
+s__Synechococcus_sp_PCC_6312 1 GCF_000316685
+s__Sulfolobus_islandicus_filamentous_virus 1 PRJNA14132
+s__Vervet_monkey_polyomavirus_1 1 PRJNA183709
+s__Porcine_cytomegalovirus 1 PRJNA217990
+s__Panax_virus_Y 1 PRJNA49715
+s__Pelagibacter_phage_HTVC011P 1 PRJNA192867
+s__Corynebacterium_ulceribovis 1 GCF_000372445
+s__Microbacterium_paraoxydans 1 GCF_000380465
+s__Bifidobacterium_thermophilum 1 GCF_000347695
+s__Parvimonas_micra 1 GCF_000154405
+s__Rift_Valley_fever_virus 1 PRJNA14631
+s__Ochrobactrum_intermedium 3 GCF_000332835 GCF_000182645 GCF_000472165
+s__Passion_fruit_woodiness_virus 1 PRJNA61089
+s__Tobacco_leaf_curl_PUSA_alphasatellite 1 PRJNA56023
+s__Gallibacterium_anatis 3 GCF_000379785 GCF_000464615 GCF_000209675
+s__Anaeromusa_acidaminophila 1 GCF_000374545
+s__Chilli_leaf_curl_betasatellite 1 PRJNA14441
+s__Mycobacterium_phage_KayaCho 1 PRJNA215111
+s__Lactobacillus_oris 2 GCF_000221505 GCF_000180015
+s__Poinsettia_latent_virus 1 PRJNA32691
+s__Haemophilus_parainfluenzae 4 GCF_000191405 GCF_000259485 GCF_000210895 GCF_000261285
+s__Mycobacterium_phage_ScottMcG 1 PRJNA31283
+s__Gracilimonas_tropica 1 GCF_000375425
+s__Selenomonas_sp_FOBRC6 1 GCF_000286455
+s__Junin_virus 1 PRJNA15028
+s__Tepidanaerobacter_acetatoxydans 2 GCF_000213235 GCF_000328765
+s__Mosquito_densovirus_BR_07 1 PRJNA62639
+s__Beet_chlorosis_virus 1 PRJNA14712
+s__Brugmansia_mild_mottle_virus 1 PRJNA30157
+s__Mycoplasma_auris 1 GCF_000367765
+s__Hippeastrum_mosaic_virus 1 PRJNA167580
+s__Chickpea_chlorotic_stunt_virus 1 PRJNA17363
+s__Candidatus_Puniceispirillum_marinum 1 GCF_000024465
+s__Lagos_bat_virus 1 PRJNA194143
+s__Hepatitis_C_virus 1 PRJNA15432
+s__Lactobacillus_salivarius 7 GCF_000215465 GCF_000143435 GCF_000260335 GCF_000008925 GCF_000179475 GCF_000159395 GCF_000217735
+s__Ornithinibacillus_scapharcae 1 GCF_000190475
+s__Puumala_virus 1 PRJNA14930
+s__Tomato_yellow_margin_leaf_curl_virus 1 PRJNA14371
+s__Moorea_producens 1 GCF_000211815
+s__Cronobacter_phage_ENT47670 1 PRJNA184169
+s__Cellulophaga_phage_phi4_1 1 PRJNA212952
+s__Oscillatoria_sp_PCC_10802 1 GCF_000332335
+s__Phormidium_phage_Pf_WMP3 1 PRJNA19801
+s__Corynebacterium_striatum 1 GCF_000159135
+s__Cucumber_mottle_virus 1 PRJNA18331
+s__Robiginitomaculum_antarcticum 1 GCF_000365025
+s__Eubacterium_sp_14_2 1 GCF_000403845
+s__Acidovorax_ebreus 1 GCF_000022305
+s__Thermus_phage_P23_77 1 PRJNA40235
+s__Pelagibacterium_halotolerans 1 GCF_000230555
+s__Oryctes_rhinoceros_virus 1 PRJNA32781
+s__Propionibacterium_phage_PA6 1 PRJNA19767
+s__Sphingomonas_wittichii 2 GCF_000259955 GCF_000016765
+s__Lactobacillus_mucosae 1 GCF_000248095
+s__Streptococcus_sp_C150 1 GCF_000187445
+s__Sida_yellow_vein_Madurai_virus 1 PRJNA19405
+s__Epsilonpapillomavirus_1 1 PRJNA14220
+s__Vibrio_breoganii 3 GCF_000280885 GCF_000286995 GCF_000286975
+s__Streptococcus_phage_ALQ13_2 1 PRJNA42593
+s__Stenotrophomonas_sp_SKA14 1 GCF_000158575
+s__Acinetobacter_sp_NIPH_2100 1 GCF_000369805
+s__Vibrio_phage_nt_1 1 PRJNA209064
+s__Polaribacter_sp_MED152 1 GCF_000152945
+s__Sporolactobacillus_laevolacticus 1 GCF_000497245
+s__Desulfovibrio_salexigens 1 GCF_000023445
+s__Yersinia_bercovieri 1 GCF_000167975
+s__Murine_pneumonia_virus 1 PRJNA15251
+s__Skua_adenovirus_A 1 PRJNA78801
+s__Mycobacterium_tuberculosis_bovis_africanum_canetti 82 GCF_000008585 GCF_000304555 GCF_000195835 GCF_000177835 GCF_000177855 GCF_000177975 GCF_000328825 GCF_000184005 GCF_000454345 GCF_000270365 GCF_000155245 GCF_000155225 GCF_000389945 GCF_000389925 GCF_000177895 GCF_000331445 GCF_000328805 GCF_000190335 GCF_000159755 GCF_000253355 GCF_000328785 GCF_000193185 GCF_000454325 GCF_000234725 GCF_000016145 GCF_000155305 GCF_000488915 GCF_000154605 GCF_000155165 GCF_000184025 GCF_000152105 G [...]
+s__Sandfly_fever_Naples_virus 1 PRJNA15053
+s__Sida_yellow_vein_Vietnam_alphasatellite 1 PRJNA19815
+s__Lettuce_yellow_mottle_virus 1 PRJNA32669
+s__Sulfobacillus_acidophilus 2 GCF_000219855 GCF_000237975
+s__Clostridium_bifermentans 2 GCF_000452225 GCF_000452245
+s__Phaeospirillum_molischianum 1 GCF_000294655
+s__Paenibacillus_sp_JCM_10914 1 GCF_000509425
+s__Chamaesiphon_minutus 1 GCF_000317145
+s__Desulfosporosinus_youngiae 1 GCF_000244895
+s__Haloferax_prahovense 1 GCF_000336815
+s__Kalanchoe_top_spotting_virus 1 PRJNA14236
+s__Staphylococcus_phage_phiETA2 1 PRJNA18669
+s__Staphylococcus_phage_phiETA3 1 PRJNA18671
+s__Rubrivivax_benzoatilyticus 1 GCF_000190375
+s__Acetonema_longum 1 GCF_000219125
+s__Enterobacteria_phage_vB_EcoM_VR7 1 PRJNA61099
+s__Enterococcus_sp_7L76 1 GCF_000210115
+s__Phaeocystis_globosa_virus_virophage 1 PRJNA206475
+s__Enterobacteria_phage_JSE 1 PRJNA38263
+s__KI_polyomavirus 1 PRJNA19155
+s__Halothece_sp_PCC_7418 1 GCF_000317635
+s__Enterobacteria_phage_JK06 1 PRJNA15569
+s__Bacteroides_sp_3_1_40A 1 GCF_000186105
+s__Enterococcus_phage_phiFL2A 1 PRJNA42795
+s__Pea_early_browning_virus 1 PRJNA15067
+s__Thielavia_terrestris 1 GCA_000226115
+s__Legionella_pneumophila 13 GCF_000465915 GCF_000306865 GCF_000306845 GCF_000092545 GCF_000347615 GCF_000465695 GCF_000048665 GCF_000465675 GCF_000404245 GCF_000455845 GCF_000239175 GCF_000048645 GCF_000092625
+s__Mexican_papita_viroid 1 PRJNA14773
+s__Mycobacterium_tusciae 1 GCF_000243415
+s__Planctomyces_brasiliensis 1 GCF_000165715
+s__Bacillus_phage_BCJA1c 1 PRJNA14548
+s__Eremothecium_gossypii 1 GCA_000091025
+s__Bacillus_methanolicus 2 GCF_000262755 GCF_000262735
+s__Squash_leaf_curl_virus 1 PRJNA14038
+s__Desulfotomaculum_hydrothermale 1 GCF_000315365
+s__Bhendi_yellow_vein_Bhubhaneswar_virus 1 PRJNA33885
+s__Budgerigar_fledgling_disease_polyomavirus 1 PRJNA14284
+s__Tobacco_curly_shoot_alphasatellite 1 PRJNA15480
+s__Synechococcus_phage_P60 1 PRJNA14628
+s__Cucumber_Bulgarian_virus 1 PRJNA14881
+s__Psychromonas_hadalis 1 GCF_000420245
+s__Kokobera_virus 1 PRJNA18843
+s__Thermanaerovibrio_velox 1 GCF_000237825
+s__Tobacco_leaf_curl_Yunnan_virus 1 PRJNA15258
+s__Streptomyces_flavogriseus 1 GCF_000176115
+s__Thiobacillus_denitrificans 2 GCF_000376425 GCF_000012745
+s__Francisella_novicida 7 GCF_000156415 GCF_000195555 GCF_000014645 GCF_000154265 GCF_000154185 GCF_000155755 GCF_000195535
+s__Nocardia_cyriacigeorgica 1 GCF_000284035
+s__Haemophilus_ducreyi 1 GCF_000007945
+s__Pseudomonas_sp_CF149 1 GCF_000416155
+s__Lactobacillus_johnsonii 6 GCF_000159355 GCF_000091405 GCF_000008065 GCF_000219475 GCF_000204985 GCF_000498675
+s__Halovirus_HCTV_5 1 PRJNA206497
+s__Halovirus_HCTV_2 1 PRJNA206498
+s__Sida_leaf_curl_virus 1 PRJNA16225
+s__Halovirus_HCTV_1 1 PRJNA206499
+s__Brucella_sp_04_5288 1 GCF_000480195
+s__Gordonia_amarae 1 GCF_000241345
+s__Spissistilus_festinus_reovirus 1 PRJNA83187
+s__Akabane_virus 1 PRJNA20971
+s__Grapevine_geminivirus 1 PRJNA165741
+s__Spirochaeta_alkalica 1 GCF_000373545
+s__Eupatorium_yellow_vein_betasatellite 1 PRJNA14447
+s__Chlamydia_trachomatis 73 GCF_000026905 GCF_000318645 GCF_000304495 GCF_000318785 GCF_000318985 GCF_000318905 GCF_000318885 GCF_000319045 GCF_000226605 GCF_000092725 GCF_000092665 GCF_000173535 GCF_000008725 GCF_000441635 GCF_000174055 GCF_000092805 GCF_000318845 GCF_000441815 GCF_000012125 GCF_000092485 GCF_000319005 GCF_000348845 GCF_000026925 GCF_000441655 GCF_000318545 GCF_000318665 GCF_000441615 GCF_000318605 GCF_000318825 GCF_000319125 GCF_000441755 GCF_000092685 GCF_000318565 GC [...]
+s__Spiroplasma_syrphidicola 1 GCF_000400955
+s__Suid_herpesvirus_1 1 PRJNA14424
+s__Sinorhizobium_meliloti 19 GCF_000287375 GCF_000287475 GCF_000287575 GCF_000304415 GCF_000287415 GCF_000375585 GCF_000287535 GCF_000287435 GCF_000147795 GCF_000287515 GCF_000287555 GCF_000287455 GCF_000346065 GCF_000147775 GCF_000320385 GCF_000218265 GCF_000006965 GCF_000236945 GCF_000287495
+s__Lactobacillus_murinus 1 GCF_000364205
+s__Caulobacter_sp_JGI_0001010_J14 1 GCF_000382625
+s__Enterobacteria_phage_13a 1 PRJNA30603
+s__Acinetobacter_guillouiae 2 GCF_000368145 GCF_000368485
+s__Rhodopirellula_baltica 4 GCF_000195185 GCF_000304635 GCF_000330745 GCF_000196115
+s__Crenarchaeota_archaeon_SCGC_AAA471_L14 1 GCF_000399825
+s__Influenza_B_virus 1 PRJNA14656
+s__Bovine_immunodeficiency_virus 1 PRJNA14634
+s__Fowl_adenovirus_A 2 PRJNA14522 PRJNA40323
+s__Fowl_adenovirus_C 1 PRJNA65223
+s__Fowl_adenovirus_B 1 PRJNA203280
+s__Fowl_adenovirus_E 1 PRJNA62241
+s__Fowl_adenovirus_D 2 PRJNA14523 PRJNA40321
+s__Eremothecium_cymbalariae 1 GCA_000235365
+s__Peptostreptococcus_stomatis 1 GCF_000147675
+s__Tomato_golden_leaf_spot_virus 1 PRJNA209726
+s__Subterranean_clover_mottle_virus_satellite_RNA 1 PRJNA14503
+s__Desulfovibrio_piezophilus 1 GCF_000341895
+s__Propionibacterium_phage_PHL010M04 1 PRJNA219117
+s__Enterobacteria_phage_Ike 1 PRJNA14627
+s__Streptococcus_parauberis 4 GCF_000213825 GCF_000343855 GCF_000187935 GCF_000342505
+s__Sphingomonas_melonis 2 GCF_000371765 GCF_000379045
+s__Thermosphaera_aggregans 1 GCF_000092185
+s__Clostridium_bolteae 6 GCF_000371665 GCF_000371645 GCF_000154365 GCF_000371685 GCF_000371705 GCF_000371725
+s__Watermelon_silver_mottle_virus 1 PRJNA15176
+s__Selenomonas_flueggei 1 GCF_000160695
+s__Shallot_yellow_stripe_virus 1 PRJNA15745
+s__Spilanthes_yellow_vein_virus 1 PRJNA19779
+s__Blautia_sp_KLE_1732 1 GCF_000466565
+s__Bacillus_phage_BMBtp2 1 PRJNA184152
+s__Halobacteroides_halobius 1 GCF_000328625
+s__Wolbachia_endosymbiont_of_Culex_quinquefasciatus 2 GCF_000156735 GCF_000073005
+s__Enterococcus_phage_EF62phi 1 PRJNA159663
+s__Aeromonas_hydrophila 9 GCF_000354635 GCF_000354675 GCF_000350405 GCF_000014805 GCF_000315835 GCF_000354695 GCF_000401555 GCF_000298055 GCF_000354715
+s__Desulfovibrio_desulfuricans 4 GCF_000384815 GCF_000022125 GCF_000420465 GCF_000189295
+s__Citrobacter_freundii 5 GCF_000208765 GCF_000238735 GCF_000388155 GCF_000342325 GCF_000312465
+s__Mesocricetus_auratus_papillomavirus_1 1 PRJNA226107
+s__Cotton_leaf_curl_virus_associated_DNA_1_isolate_Lucknow 1 PRJNA65305
+s__Scheffersomyces_stipitis 1 GCA_000209165
+s__Subdoligranulum_variabile 1 GCF_000157955
+s__Virgibacillus_halodenitrificans 1 GCF_000294755
+s__Streptomyces_phage_phiC31 1 PRJNA14606
+s__Erwinia_phage_phiEa21_4 2 PRJNA33537 PRJNA64759
+s__Desulfobulbus_propionicus 1 GCF_000186885
+s__Riemerella_phage_RAP44 1 PRJNA181081
+s__Ectothiorhodospira_sp_PHS_1 1 GCF_000225005
+s__Rice_dwarf_virus 1 PRJNA14797
+s__Bromus_catharticus_striate_mosaic_virus 1 PRJNA61437
+s__Bagaza_virus 1 PRJNA36619
+s__Paenibacillus_elgii 1 GCF_000213315
+s__Caviid_herpesvirus_2 1 PRJNA188730
+s__Satellites_of_Trichomonas_vaginalis_T1_virus 1 PRJNA14201
+s__Louping_ill_virus 1 PRJNA15343
+s__zeta_proteobacterium_SCGC_AB_602_E04 1 GCF_000379265
+s__Catellicoccus_marimammalium 1 GCF_000313915
+s__Francisella_sp_TX077308 1 GCF_000219045
+s__Hirschia_baltica 1 GCF_000023785
+s__Streptococcus_ferus 1 GCF_000372425
+s__Vibrio_campbellii 5 GCF_000334195 GCF_000259875 GCF_000154025 GCF_000464435 GCF_000017705
+s__Aedes_pseudoscutellaris_reovirus 1 PRJNA16243
+s__Anaplasma_centrale 1 GCF_000024505
+s__Candidatus_Arthromitus_sp_SFB_rat_Yit 1 GCF_000283555
+s__Soybean_chlorotic_blotch_virus 1 PRJNA48595
+s__Nesterenkonia_sp_F 1 GCF_000220985
+s__Citrus_yellow_mosaic_virus 1 PRJNA14153
+s__candidate_division_TM7_genomosp_GTL1 1 GCF_000169295
+s__Dahlia_mosaic_virus 1 PRJNA175589
+s__Anaerococcus_lactolyticus 1 GCF_000156575
+s__Amycolatopsis_methanolica 1 GCF_000371885
+s__Leuconostoc_fallax 1 GCF_000165675
+s__Arcticibacter_svalbardensis 1 GCF_000403135
+s__Tupaia_virus 1 PRJNA15415
+s__Borrelia_garinii 4 GCF_000172275 GCF_000172595 GCF_000300045 GCF_000239475
+s__Idiomarina_loihiensis 2 GCF_000008465 GCF_000401175
+s__Pseudomonas_gingeri 1 GCF_000280765
+s__Pelargonium_flower_break_virus 1 PRJNA14928
+s__Marinilabilia_salmonicolor 1 GCF_000259075
+s__Actinobaculum_sp_oral_taxon_183 1 GCF_000466165
+s__Bartonella_washoensis 2 GCF_000278195 GCF_000278135
+s__Acidaminococcus_sp_HPA0509 1 GCF_000411395
+s__Eikenella_corrodens 2 GCF_000504685 GCF_000158615
+s__Enterobacteria_phage_HK140 1 PRJNA183139
+s__Rhodococcus_wratislaviensis 1 GCF_000325625
+s__Acholeplasma_phage_L2 1 PRJNA14066
+s__Bacillus_phage_TP21_L 1 PRJNA33139
+s__Bacillus_phage_phiAGATE 1 PRJNA185318
+s__Thrush_coronavirus_HKU12 1 PRJNA32701
+s__Nilaparvata_lugens_honeydew_virus_2 1 PRJNA209359
+s__Nilaparvata_lugens_honeydew_virus_3 1 PRJNA209357
+s__Apple_latent_spherical_virus 1 PRJNA15367
+s__Malvastrum_yellow_mosaic_alphasatellite 1 PRJNA18129
+s__Guanarito_virus 1 PRJNA14939
+s__Acinetobacter_sp_CIP_102129 1 GCF_000368305
+s__Methylotenera_mobilis 2 GCF_000023705 GCF_000384255
+s__Nostoc_sp_PCC_7107 1 GCF_000316625
+s__Indibacter_alkaliphilus 1 GCF_000295935
+s__Pleurocapsa_sp_PCC_7319 1 GCF_000332195
+s__Prevotella_intermedia 1 GCF_000261025
+s__Radish_leaf_curl_virus 1 PRJNA28279
+s__Propionibacterium_acidipropionici 1 GCF_000310065
+s__Apple_dimple_fruit_viroid 1 PRJNA14971
+s__Aquamavirus_A 1 PRJNA20985
+s__Gloeocapsa_sp_PCC_7428 1 GCF_000317555
+s__Anatid_herpesvirus_1 1 PRJNA39725
+s__Staphylococcus_simiae 1 GCF_000235645
+s__Salmonella_phage_epsilon15 1 PRJNA14285
+s__Butyrivibrio_sp_AE3009 1 GCF_000420845
+s__Acinetobacter_oleivorans 2 GCF_000488235 GCF_000196795
+s__Sida_yellow_mosaic_China_virus 1 PRJNA167735
+s__Lactobacillus_parabrevis 1 GCF_000383435
+s__Streptomyces_sp_SA3_actG 1 GCF_000179195
+s__Streptomyces_sp_SA3_actF 1 GCF_000179215
+s__Marinococcus_halotolerans 1 GCF_000420725
+s__Torque_teno_mini_virus_5 1 PRJNA48177
+s__Bifidobacterium_dentium 4 GCF_000172135 GCF_000149165 GCF_000024445 GCF_000146775
+s__Sutterella_parvirubra 1 GCF_000250875
+s__Granulicatella_adiacens 1 GCF_000160675
+s__Torque_teno_mini_virus_7 1 PRJNA48163
+s__Vibrio_tasmaniensis 5 GCF_000272445 GCF_000272385 GCF_000272405 GCF_000272425 GCF_000272365
+s__Bartonella_taylorii 1 GCF_000278295
+s__Enterococcus_sp_C1 1 GCF_000277605
+s__Streptomyces_sp_FxanaD5 1 GCF_000373465
+s__Mycobacterium_phage_CASbig 1 PRJNA206483
+s__Chino_del_tomate_virus 1 PRJNA14183
+s__Frankia_sp_EuI1c 1 GCF_000166135
+s__Mint_virus_1 1 PRJNA15210
+s__Clostridium_phage_phi24R 1 PRJNA181218
+s__Sphingomonas_elodea 1 GCF_000226955
+s__Curtobacterium_flaccumfaciens 1 GCF_000349565
+s__Ryegrass_mottle_virus 1 PRJNA15375
+s__Pelargonium_chlorotic_ring_pattern_virus 1 PRJNA14922
+s__Lactobacillus_phage_Lv_1 1 PRJNA33535
+s__Infectious_pancreatic_necrosis_virus 1 PRJNA15024
+s__Bacteroides_salyersiae 2 GCF_000381365 GCF_000273235
+s__Thermotoga_thermarum 1 GCF_000217815
+s__Enterobacteria_phage_JL1 1 PRJNA179426
+s__Torque_teno_mini_virus_9 1 PRJNA14058
+s__Tomato_leaf_curl_Karnataka_alphasatellite 1 PRJNA181995
+s__Thioalkalivibrio_sp_HL_Eb18 1 GCF_000364985
+s__Mint_virus_X 1 PRJNA15160
+s__Streptococcus_uberis 1 GCF_000009545
+s__Cotton_leaf_curl_virus 1 PRJNA162501
+s__Azotobacter_vinelandii 2 GCF_000380365 GCF_000021045
+s__Geobacillus_sp_GHH01 1 GCF_000336445
+s__Dyadobacter_fermentans 1 GCF_000023125
+s__Pseudomonas_nitroreducens 1 GCF_000313755
+s__Pepper_mild_mottle_virus 1 PRJNA15148
+s__Human_papillomavirus_type_131 1 PRJNA62177
+s__Human_papillomavirus_type_137 1 PRJNA167867
+s__Sphaeropsis_sapinea_RNA_virus_1 1 PRJNA14722
+s__Human_papillomavirus_type_135 1 PRJNA167865
+s__Human_papillomavirus_type_134 1 PRJNA62181
+s__Strawberry_necrotic_shock_virus 1 PRJNA18507
+s__Potato_yellow_mosaic_Trinidad_virus 1 PRJNA14256
+s__Salmonella_phage_S16 1 PRJNA191122
+s__Bordetella_bronchiseptica_parapertussis 11 GCF_000318015 GCF_000313065 GCF_000195695 GCF_000312945 GCF_000306945 GCF_000317955 GCF_000195675 GCF_000313085 GCF_000479655 GCF_000479735 GCF_000317935
+s__Pseudomonas_phage_vB_Pae_Kakheti25 1 PRJNA167052
+s__Fructobacillus_fructosus 1 GCF_000185045
+s__Vibrio_phage_VvAW1 1 PRJNA192871
+s__Alcanivorax_dieselolei 1 GCF_000300005
+s__Paramecium_bursaria_Chlorella_virus_A1 1 PRJNA18305
+s__Mycobacterium_phage_SDcharge11 1 PRJNA206028
+s__Rhizosolenia_setigera_RNA_virus_01 1 PRJNA175590
+s__Enterobacteria_phage_WA13_sensu_lato 1 PRJNA16595
+s__Cupriavidus_metallidurans 1 GCF_000196015
+s__Vibrio_phage_VCY_phi 1 PRJNA76737
+s__Octadecabacter_antarcticus 1 GCF_000155675
+s__Amsacta_moorei_entomopoxvirus_L 1 PRJNA14097
+s__Synechococcus_phage_S_SSM7 1 PRJNA64711
+s__Synechococcus_phage_S_SSM5 1 PRJNA64715
+s__Rhizobium_freirei 1 GCF_000359745
+s__Lettuce_infectious_yellows_virus 1 PRJNA14768
+s__Labidocera_aestiva_circovirus 1 PRJNA186433
+s__Meleagrid_herpesvirus_1 1 PRJNA14106
+s__Deinococcus_peraridilitoris 1 GCF_000317835
+s__Pyrococcus_furiosus 2 GCF_000275605 GCF_000007305
+s__Pseudonocardia_sp_P1 1 GCF_000178675
+s__Pseudonocardia_sp_P2 1 GCF_000179835
+s__Prochlorococcus_phage_P_GSP1 1 PRJNA195518
+s__Cecembia_lonarensis 1 GCF_000298295
+s__Quang_Binh_virus 1 PRJNA37969
+s__Grapevine_leafroll_associated_virus_3 1 PRJNA14906
+s__Bacteroides_sp_2_1_33B 1 GCF_000162175
+s__Polyomavirus_HPyV6 1 PRJNA51559
+s__Polyomavirus_HPyV7 1 PRJNA51557
+s__White_bream_virus 1 PRJNA18013
+s__Lactobacillus_amylovorus 2 GCF_000194115 GCF_000182855
+s__Pusillimonas_noertemannii 1 GCF_000308195
+s__Sepik_virus 1 PRJNA18513
+s__Cetacean_morbillivirus 1 PRJNA15215
+s__Grapevine_leafroll_associated_virus_7 1 PRJNA78707
+s__Parabacteroides_distasonis 3 GCF_000012845 GCF_000307435 GCF_000307455
+s__Sphaerochaeta_coccoides 1 GCF_000208385
+s__Porcine_bocavirus_5_JS677 1 PRJNA81033
+s__Acidovorax_citrulli 2 GCF_000316055 GCF_000015325
+s__Peruvian_horse_sickness_virus 1 PRJNA16337
+s__Cellulosilyticum_lentocellum 1 GCF_000178835
+s__Mycoplasma_moatsii 1 GCF_000420225
+s__Xanthomonas_phage_vB_XveM_DIBBI 1 PRJNA167661
+s__Enterobacter_sp_MGH_22 1 GCF_000493055
+s__Ageratum_enation_virus 1 PRJNA15192
+s__Streptococcus_sp_GMD6S 1 GCF_000297015
+s__Staphylococcus_phage_phi13 1 PRJNA14248
+s__Sporolactobacillus_vineae 2 GCF_000377985 GCF_000246965
+s__Streptomyces_sp_LaPpAH_202 1 GCF_000373225
+s__Anaerobaculum_hydrogeniformans 1 GCF_000160455
+s__Acinetobacter_sp_NIPH_2168 1 GCF_000369705
+s__Cellvibrio_japonicus 1 GCF_000019225
+s__Chlorobaculum_tepidum 1 GCF_000006985
+s__Tomato_mottle_Taino_virus 1 PRJNA14082
+s__Cytophaga_aurantiaca 1 GCF_000379725
+s__Magnaporthe_oryzae_chrysovirus_1 1 PRJNA51685
+s__Streptococcus_phage_PH10 1 PRJNA38365
+s__Xanthomonas_citri 3 GCF_000349225 GCF_000007165 GCF_000263335
+s__Streptococcus_phage_PH15 1 PRJNA30161
+s__Burkholderia_phage_BcepMu 1 PRJNA14376
+s__Sweet_potato_leaf_curl_Georgia_virus 1 PRJNA14257
+s__Bovine_rhinitis_B_virus 1 PRJNA28835
+s__Deinococcus_deserti 1 GCF_000020685
+s__Thioalkalivibrio_sp_AL5 1 GCF_000378565
+s__Salsuginibacillus_kocurii 1 GCF_000377705
+s__Exiguobacterium_pavilionensis 1 GCF_000416965
+s__Astrovirus_VA1 1 PRJNA39811
+s__Pea_enation_mosaic_virus_satellite_RNA 1 PRJNA14432
+s__Megamonas_funiformis 1 GCF_000245775
+s__Enterobacteria_phage_HK629 1 PRJNA183144
+s__Sandarakinorhabdus_sp_AAP62 1 GCF_000331225
+s__Clostridium_cellulolyticum 1 GCF_000022065
+s__Mycobacterium_phage_Job42 1 PRJNA209072
+s__Bermanella_marisrubri 1 GCF_000153565
+s__Lactobacillus_kisonensis 1 GCF_000242275
+s__Arthrospira_platensis 3 GCF_000210375 GCF_000175415 GCF_000307915
+s__Acinetobacter_brisouii 2 GCF_000368645 GCF_000488275
+s__Prochlorococcus_phage_P_HM1 1 PRJNA64697
+s__Prochlorococcus_phage_P_HM2 1 PRJNA64705
+s__Tomato_leaf_curl_Sri_Lanka_virus 1 PRJNA14259
+s__Streptomyces_mobaraensis 1 GCF_000342125
+s__STL_polyomavirus 1 PRJNA186434
+s__Pseudacidovorax_intermedius 1 GCF_000333675
+s__Cyanothece_sp_ATCC_51142 1 GCF_000017845
+s__Santeuil_nodavirus 1 PRJNA62547
+s__Desmospora_sp_8437 1 GCF_000213595
+s__Blattabacterium_punctulatus 1 GCF_000236405
+s__Onion_yellows_phytoplasma 1 GCF_000009845
+s__Echinicola_vietnamensis 1 GCF_000325705
+s__Tomato_yellow_leaf_curl_Malaga_virus 1 PRJNA14239
+s__Veillonella_sp_HPA0037 1 GCF_000411535
+s__Fusobacterium_russii 1 GCF_000381725
+s__Porcine_teschovirus 1 PRJNA15092
+s__Bulleidia_extructa 1 GCF_000177375
+s__Leuconostoc_citreum 4 GCF_000026405 GCF_000239935 GCF_000239895 GCF_000239915
+s__Bacteriophage_APSE_2 1 PRJNA32705
+s__Parainfluenza_virus_5 1 PRJNA15014
+s__Clostridium_phage_phiCP7R 1 PRJNA167663
+s__Bordetella_phage_BIP_1 1 PRJNA14359
+s__Bordetella_phage_BPP_1 1 PRJNA14353
+s__Ustilago_maydis_virus_H1 1 PRJNA14812
+s__Artibeus_jamaicensis_parvovirus_1 1 PRJNA81739
+s__Pseudoramibacter_alactolyticus 1 GCF_000185505
+s__halophilic_archaeon_DL31 1 GCF_000224475
+s__Ralstonia_phage_RSM1 1 PRJNA18239
+s__Clostridium_leptum 1 GCF_000154345
+s__Commensalibacter_intestini 1 GCF_000231445
+s__Streptomyces_violaceusniger 2 GCF_000478605 GCF_000147815
+s__Methylomicrobium_buryatense 1 GCF_000341735
+s__Mycobacterium_phage_Boomer 1 PRJNA30693
+s__Trichomonas_vaginalis 1 GCA_000002825
+s__Lactobacillus_suebicus 1 GCF_000260395
+s__Citrus_variegation_virus 1 PRJNA19747
+s__Elizabethkingia_anophelis 2 GCF_000331815 GCF_000240095
+s__Acinetobacter_bacteriophage_AP22 1 PRJNA167576
+s__Kribbella_catacumbae 1 GCF_000372465
+s__Serratia_plymuthica 5 GCF_000438825 GCF_000261045 GCF_000214235 GCF_000300895 GCF_000176835
+s__Circovirus_like_genome_RW_E 1 PRJNA39625
+s__Artichoke_mottled_crinkle_virus 1 PRJNA15517
+s__Circovirus_like_genome_RW_A 1 PRJNA39617
+s__Circovirus_like_genome_RW_C 1 PRJNA39621
+s__Circovirus_like_genome_RW_B 1 PRJNA39619
+s__Cucumber_necrosis_virus 1 PRJNA14638
+s__Velvet_tobacco_mottle_virus 1 PRJNA52631
+s__Sebaldella_termitidis 1 GCF_000024405
+s__Candidatus_Symbiobacter_mobilis 1 GCF_000477435
+s__Methanocaldococcus_fervens 1 GCF_000023985
+s__Mycobacterium_phage_Wanda 1 PRJNA215122
+s__Pseudomonas_stutzeri 15 GCF_000195105 GCF_000263395 GCF_000282955 GCF_000280555 GCF_000279165 GCF_000416345 GCF_000267545 GCF_000235745 GCF_000013785 GCF_000327065 GCF_000237885 GCF_000219605 GCF_000307775 GCF_000341615 GCF_000455665
+s__Ehrlichia_ruminantium 3 GCF_000050405 GCF_000050425 GCF_000026005
+s__Edwardsiella_phage_PEi2 1 PRJNA226729
+s__Mycobacterium_phage_Bobi 1 PRJNA215126
+s__Habenaria_mosaic_virus 1 PRJNA212951
+s__Streptomyces_xinghaiensis 1 GCF_000220705
+s__Bacillus_phage_BtCS33 1 PRJNA169233
+s__Psychrobacter_aquaticus 1 GCF_000471625
+s__Mycobacterium_phage_Butters 1 PRJNA197297
+s__Propionimicrobium_lymphophilum 1 GCF_000411175
+s__Lactococcus_phage_bIL67 1 PRJNA32321
+s__Desulfovibrio_termitidis 1 GCF_000504305
+s__Halomonas_sp_GFAJ_1 1 GCF_000236625
+s__Murine_polyomavirus 1 PRJNA15489
+s__Lautropia_mirabilis 1 GCF_000186425
+s__Chitinophaga_pinensis 1 GCF_000024005
+s__Mungbean_yellow_mosaic_India_virus 1 PRJNA15259
+s__Edwardsiella_ictaluri 1 GCF_000022885
+s__Tomato_leaf_curl_Vietnam_virus 1 PRJNA14214
+s__Cypovirus_14 1 PRJNA15250
+s__Rhodocyclus_sp_UW_659_1_F08 1 GCF_000375925
+s__Thioalkalivibrio_sp_ALE10 1 GCF_000381385
+s__Thioalkalivibrio_sp_ALE11 1 GCF_000381205
+s__Thioalkalivibrio_sp_ALE12 1 GCF_000381105
+s__Thioalkalivibrio_sp_ALE14 1 GCF_000376845
+s__Bradyrhizobium_sp_STM_3843 1 GCF_000239815
+s__Thioalkalivibrio_sp_ALE16 1 GCF_000381305
+s__Thioalkalivibrio_sp_ALE18 1 GCF_000381465
+s__Mycobacterium_phage_Spud 1 PRJNA31285
+s__Cotton_leaf_curl_Gezira_alphasatellite 1 PRJNA42507
+s__Borrelia_afzelii 3 GCF_000304735 GCF_000170935 GCF_000222835
+s__Treponema_saccharophilum 1 GCF_000255555
+s__Lactococcus_phage_ul36 1 PRJNA14331
+s__Aquareovirus_C 1 PRJNA14900
+s__Nocardioides_sp_CF8 1 GCF_000389985
+s__Streptocarpus_flower_break_virus 1 PRJNA17803
+s__Human_respiratory_syncytial_virus 1 PRJNA15003
+s__Finch_circovirus 1 PRJNA18021
+s__Vibrio_sp_N418 1 GCF_000222565
+s__Lactobacillus_florum 1 GCF_000304715
+s__gamma_proteobacterium_BDW918 1 GCF_000259575
+s__Spissistilus_festinus_virus_1 1 PRJNA51181
+s__Mycoplasma_penetrans 1 GCF_000011225
+s__Streptococcus_sp_SK140 1 GCF_000259525
+s__Spinach_curly_top_Arizona_virus 1 PRJNA62497
+s__Thioalkalivibrio_sp_ALE30 1 GCF_000377465
+s__Erysipelothrix_rhusiopathiae 2 GCF_000270085 GCF_000160815
+s__Helicobacter_phage_1961P 1 PRJNA181239
+s__Lactobacillus_ruminis 4 GCF_000224985 GCF_000225845 GCF_000217755 GCF_000159375
+s__Streptomyces_sp_TOR3209 1 GCF_000259895
+s__Candidatus_Poribacteria_sp_WGA_4G 1 GCF_000364585
+s__Bifidobacterium_longum 18 GCF_000261265 GCF_000196575 GCF_000092325 GCF_000166315 GCF_000196555 GCF_000155415 GCF_000269965 GCF_000261205 GCF_000478525 GCF_000261225 GCF_000007525 GCF_000261245 GCF_000497735 GCF_000210755 GCF_000219455 GCF_000020425 GCF_000003135 GCF_000008945
+s__Herbaspirillum_sp_YR522 1 GCF_000282575
+s__Mesorhizobium_sp_WSM4349 1 GCF_000373125
+s__Aeromonas_phage_Aes012 1 PRJNA195532
+s__Clitocybe_odora_virus 1 PRJNA129589
+s__Photorhabdus_asymbiotica 1 GCF_000196475
+s__Ornithinimicrobium_pekingense 1 GCF_000421185
+s__Cellulophaga_phage_phi12_1 1 PRJNA212966
+s__Barmah_Forest_virus 1 PRJNA14679
+s__Clostera_anachoreta_granulovirus 1 PRJNA65819
+s__Lactococcus_raffinolactis 1 GCF_000327305
+s__Pseudomonas_phage_PAK_P1 1 PRJNA64763
+s__Actinomyces_sp_oral_taxon_180 1 GCF_000185285
+s__Porphyromonas_endodontalis 1 GCF_000174815
+s__Okra_mosaic_virus 1 PRJNA19761
+s__Heterocapsa_circularisquama_RNA_virus 1 PRJNA16157
+s__Florida_woods_cockroach_associated_cyclovirus 1 PRJNA188548
+s__Naumovozyma_dairenensis 1 GCA_000227115
+s__Tobacco_etch_virus 1 PRJNA15325
+s__Rubber_viroid_India_2009 1 PRJNA48423
+s__Tomato_leaf_curl_Ghana_virus 1 PRJNA28699
+s__Thiomonas_sp_FB_6 1 GCF_000377645
+s__Pseudoalteromonas_sp_NJ631 1 GCF_000276645
+s__Herbaspirillum_sp_GW103 1 GCF_000261365
+s__Deinococcus_maricopensis 1 GCF_000186385
+s__Actinomyces_phage_Av_1 1 PRJNA20057
+s__Kocuria_rhizophila 2 GCF_000010285 GCF_000214115
+s__Acinetobacter_sp_ANC_3994 1 GCF_000367925
+s__Natronococcus_amylolyticus 1 GCF_000337675
+s__Cotton_leaf_curl_Rajasthan_virus 1 PRJNA14130
+s__Holophaga_foetida 1 GCF_000242615
+s__Feline_foamy_virus 1 PRJNA15219
+s__Sida_yellow_mosaic_Alagoas_virus 1 PRJNA189217
+s__Nitratiruptor_sp_SB155_2 1 GCF_000010325
+s__Xanthomonas_sp_SHU166 1 GCF_000364685
+s__Pseudocowpox_virus 1 PRJNA45973
+s__Candidatus_Liberibacter_asiaticus 2 GCF_000023765 GCF_000346595
+s__Ketogulonicigenium_vulgare 2 GCF_000223375 GCF_000164885
+s__Snakehead_virus 1 PRJNA14689
+s__Anaerobaculum_mobile 1 GCF_000266925
+s__Bacteriovorax_sp_BSW11_IV 1 GCF_000447755
+s__Odoribacter_splanchnicus 1 GCF_000190535
+s__Ageratum_yellow_vein_China_virus 1 PRJNA14490
+s__Marinobacterium_stanieri 1 GCF_000220545
+s__Cellulophaga_phage_phi13_2 1 PRJNA212953
+s__Veillonella_sp_oral_taxon_158 1 GCF_000183505
+s__Tomato_leaf_curl_Java_betasatellite 1 PRJNA14452
+s__Tomato_leaf_curl_Togo_betasatellite_Togo_2006 1 PRJNA60629
+s__Felid_herpesvirus_1 1 PRJNA42429
+s__Cucumber_fruit_mottle_mosaic_virus 1 PRJNA14709
+s__Tomato_necrotic_stunt_virus 1 PRJNA162495
+s__Thermococcus_zilligii 1 GCF_000258515
+s__Shallot_virus_X 1 PRJNA14805
+s__Acinetobacter_sp_CIP_102143 1 GCF_000369865
+s__Leishmania_major 1 GCA_000002725
+s__Alicycliphilus_sp_CRZ1 1 GCF_000282995
+s__Zymophilus_raffinosivorans 1 GCF_000381065
+s__Salmonella_bongori 2 GCF_000439255 GCF_000252995
+s__Nostoc_sp_PCC_7120 1 GCF_000009705
+s__Salmonella_phage_PhiSH19 1 PRJNA181236
+s__Helleborus_net_necrosis_virus 1 PRJNA33877
+s__Melioribacter_roseus 1 GCF_000279145
+s__Pegivirus_A 1 PRJNA14647
+s__Squash_vein_yellowing_virus 1 PRJNA29107
+s__Enterobacter_lignolyticus 1 GCF_000164865
+s__Cryphonectria_hypovirus_4 1 PRJNA15007
+s__Clostridium_sp_HGF2 1 GCF_000183585
+s__Cryphonectria_hypovirus_1 1 PRJNA14664
+s__Cryphonectria_hypovirus_3 1 PRJNA14690
+s__Cryphonectria_hypovirus_2 1 PRJNA14754
+s__Bifidobacterium_pseudolongum 1 GCF_000421365
+s__Sphingobium_quisquiliarum 1 GCF_000445065
+s__Deinococcus_radiodurans 1 GCF_000008565
+s__Klebsiella_phage_KP27 1 PRJNA185314
+s__Hylemonella_gracilis 1 GCF_000211835
+s__Staphylococcus_haemolyticus 2 GCF_000009865 GCF_000261465
+s__Senecio_yellow_mosaic_virus 1 PRJNA15233
+s__Chicory_yellow_mottle_virus_satellite_RNA 1 PRJNA14988
+s__Carnation_ringspot_virus 1 PRJNA14753
+s__Alloprevotella_rava 1 GCF_000234115
+s__Bundibugyo_ebolavirus 1 PRJNA51245
+s__Salmonella_phage_Vi06 1 PRJNA64609
+s__Streptococcus_intermedius 8 GCF_000463355 GCF_000234015 GCF_000306805 GCF_000258445 GCF_000313655 GCF_000234035 GCF_000413475 GCF_000463385
+s__Serratia_sp_S4 1 GCF_000347995
+s__Eubacterium_biforme 1 GCF_000156655
+s__Propionibacterium_phage_PHL037M02 1 PRJNA219116
+s__Tulip_virus_X 1 PRJNA14865
+s__Burkholderia_bryophila 1 GCF_000383275
+s__Synergistes_sp_3_1_syn1 1 GCF_000238615
+s__Frankia_alni 1 GCF_000058485
+s__Malvastrum_yellow_mosaic_virus 1 PRJNA18131
+s__Bartonella_schoenbuchensis 1 GCF_000385435
+s__zeta_proteobacterium_SCGC_AB_133_C04 1 GCF_000379285
+s__Fragaria_chiloensis_latent_virus 1 PRJNA15122
+s__Photobacterium_sp_SKA34 1 GCF_000153325
+s__Bacteroides_faecis 1 GCF_000226135
+s__Rodent_herpesvirus_Peru 1 PRJNA62491
+s__Desulfovibrio_africanus 2 GCF_000344315 GCF_000212675
+s__Polaribacter_franzmannii 1 GCF_000377865
+s__Pseudoalteromonas_ruthenica 1 GCF_000336495
+s__Pseudochrobactrum_sp_AO18b 1 GCF_000409565
+s__Anaerotruncus_sp_G3_2012 1 GCF_000403395
+s__Rhizobium_sp_BR816 1 GCF_000378985
+s__Streptomyces_albus 2 GCF_000156475 GCF_000359525
+s__Lentibacillus_jeotgali 1 GCF_000224785
+s__Natrinema_altunense 1 GCF_000337155
+s__Rickettsia_rickettsii 8 GCF_000283835 GCF_000283775 GCF_000018225 GCF_000283795 GCF_000283935 GCF_000283815 GCF_000017445 GCF_000283955
+s__Brucella_sp_UK38_05 1 GCF_000367125
+s__Halomonas_jeotgali 1 GCF_000334215
+s__Acinetobacter_indicus 2 GCF_000413875 GCF_000488255
+s__Providencia_phage_Redjac 1 PRJNA177540
+s__Adlercreutzia_equolifaciens 1 GCF_000478885
+s__Coprococcus_eutactus 1 GCF_000154425
+s__Pelargonium_zonate_spot_virus 1 PRJNA14774
+s__Apricot_latent_virus 1 PRJNA61427
+s__Streptococcus_phage_2972 1 PRJNA15254
+s__Acinetobacter_beijerinckii 2 GCF_000369005 GCF_000368985
+s__Saccharomonospora_paurometabolica 1 GCF_000231035
+s__Avian_encephalomyelitis_virus 1 PRJNA15360
+s__Roseburia_inulinivorans 1 GCF_000174195
+s__Barnesiella_intestinihominis 1 GCF_000296465
+s__Lactobacillus_fructivorans 1 GCF_000185465
+s__Glaciecola_punicea 1 GCF_000252165
+s__Brazoran_virus 1 PRJNA214783
+s__Peste_des_petits_ruminants_virus 1 PRJNA15499
+s__Corynebacterium_phage_P1201 1 PRJNA20781
+s__Crocuta_crocuta_papillomavirus_1 1 PRJNA174774
+s__Cyclobacterium_marinum 1 GCF_000222485
+s__Natronococcus_jeotgali 1 GCF_000337695
+s__Ruminococcus_sp_5_1_39BFAA 1 GCF_000159975
+s__Streptococcus_sp_GMD4S 1 GCF_000296955
+s__Torque_teno_virus_8 1 PRJNA48167
+s__Torque_teno_virus_7 1 PRJNA48159
+s__Torque_teno_virus_6 1 PRJNA48187
+s__Fenneropenaeus_chinensis_hepatopancreatic_densovirus 1 PRJNA51177
+s__Torque_teno_virus_4 1 PRJNA48137
+s__Torque_teno_virus_3 1 PRJNA48161
+s__Torque_teno_virus_2 1 PRJNA51893
+s__Torque_teno_virus_1 1 PRJNA15247
+s__Myroides_injenensis 1 GCF_000246945
+s__Brucella_neotomae 1 GCF_000158715
+s__Mycobacterium_phage_vB_MapS_FF47 1 PRJNA197296
+s__Acinetobacter_sp_NIPH_236 1 GCF_000367965
+s__Caldanaerobacter_subterraneus 3 GCF_000156275 GCF_000007085 GCF_000473865
+s__Flavobacteria_bacterium_BBFL7 1 GCF_000153385
+s__Thiothrix_disciformis 1 GCF_000371925
+s__Xanthomonas_phage_Cf1c 1 PRJNA14329
+s__Ornithogalum_mosaic_virus 1 PRJNA179428
+s__Candidatus_Protochlamydia_amoebophila 1 GCF_000011565
+s__Methanoculleus_bourgensis 1 GCF_000304355
+s__Xanthomonas_oryzae 6 GCF_000007385 GCF_000212755 GCF_000168315 GCF_000212775 GCF_000010025 GCF_000019585
+s__Lactobacillus_curvatus 1 GCF_000235705
+s__Brucella_phage_Pr 1 PRJNA181064
+s__Campylobacter_phage_CP30A 1 PRJNA177545
+s__Duck_hepatitis_B_virus 1 PRJNA14576
+s__Ralstonia_pickettii 4 GCF_000471925 GCF_000020205 GCF_000372665 GCF_000023425
+s__Chayote_yellow_mosaic_virus 1 PRJNA15193
+s__Callitrichine_herpesvirus_3 1 PRJNA14324
+s__Acinetobacter_tjernbergiae 2 GCF_000488175 GCF_000374425
+s__Rhodopirellula_europaea 2 GCF_000346315 GCF_000338295
+s__Treponema_sp_JC4 1 GCF_000260795
+s__Sorghum_chlorotic_spot_virus 1 PRJNA14835
+s__Pseudoalteromonas_marina 1 GCF_000238335
+s__Peach_latent_mosaic_viroid 1 PRJNA14772
+s__Tomato_infectious_chlorosis_virus 1 PRJNA40419
+s__Vibrio_kanaloae 1 GCF_000272165
+s__Thermoplasma_acidophilum 1 GCF_000195915
+s__Venezuelan_equine_encephalitis_virus 1 PRJNA15302
+s__Bacteriovorax_marinus 1 GCF_000210915
+s__Bacteroides_phage_B40_8 1 PRJNA31249
+s__Acinetobacter_sp_ATCC_27244 1 GCF_000156555
+s__Bacillus_phage_Troll 1 PRJNA215668
+s__Lactobacillus_antri 1 GCF_000160835
+s__Microcystis_aeruginosa_phage_Ma_LMM01 1 PRJNA18127
+s__Thottapalayam_virus 1 PRJNA29841
+s__Citrus_psorosis_virus 1 PRJNA15060
+s__Citreicella_sp_SE45 1 GCF_000161755
+s__Ceratocystis_resinifera_virus_1 1 PRJNA29901
+s__Acetobacteraceae_bacterium_AT_5844 1 GCF_000245075
+s__Thermus_phage_phi_OH2 1 PRJNA212950
+s__Lactobacillus_phage_LP65 1 PRJNA14547
+s__Prochlorococcus_phage_Syn33 1 PRJNA64707
+s__Rickettsia_bellii 2 GCF_000012385 GCF_000018245
+s__Kalanchoe_latent_virus 1 PRJNA39583
+s__Campylobacter_curvus 2 GCF_000017465 GCF_000376325
+s__Cyanophage_Syn30 1 PRJNA198437
+s__Tomato_leaf_curl_Nigeria_virus_Nigeria_2006 1 PRJNA34815
+s__Caprine_arthritis_encephalitis_virus 1 PRJNA15243
+s__Hepatitis_delta_virus 1 PRJNA15032
+s__Photobacterium_angustum 1 GCF_000153265
+s__Helicobacter_bizzozeronii 2 GCF_000263275 GCF_000237285
+s__Nocardiopsis_halophila 1 GCF_000341245
+s__Neisseria_wadsworthii 1 GCF_000227765
+s__Idiomarina_baltica 1 GCF_000152885
+s__Streptomyces_phage_phiHau3 1 PRJNA177522
+s__Aeromonas_phage_phiAS5 1 PRJNA59729
+s__Aeromonas_phage_phiAS4 1 PRJNA59727
+s__Aeromonas_phage_phiAS7 1 PRJNA181221
+s__Hop_latent_viroid 1 PRJNA14970
+s__Shewanella_sp_W3_18_1 1 GCF_000015185
+s__Peptoniphilus_timonensis 1 GCF_000312025
+s__Yersinia_phage_phiA1122 1 PRJNA14332
+s__Bacteroides_finegoldii 3 GCF_000269545 GCF_000156195 GCF_000304195
+s__Ruminococcus_sp_JC304 1 GCF_000285855
+s__Methylobacterium_sp_77 1 GCF_000372825
+s__Treponema_phagedenis 1 GCF_000187105
+s__Pseudomonas_alcaligenes 2 GCF_000455385 GCF_000467105
+s__Leptonema_illini 1 GCF_000243335
+s__Chickpea_chlorosis_virus 1 PRJNA60627
+s__Aminomonas_paucivorans 1 GCF_000165795
+s__Spiroplasma_taiwanense 1 GCF_000439435
+s__Bhendi_yellow_vein_India_betasatellite 1 PRJNA61557
+s__Prochlorococcus_marinus 13 GCF_000015705 GCF_000015965 GCF_000015645 GCF_000011465 GCF_000007925 GCF_000011485 GCF_000015685 GCF_000012465 GCF_000018585 GCF_000158595 GCF_000015665 GCF_000012645 GCF_000018065
+s__Vibrio_sp_712i1 1 GCF_000316925
+s__Corynebacterium_terpenotabidum 1 GCF_000418365
+s__Alicycliphilus_denitrificans 2 GCF_000204645 GCF_000179015
+s__Myroides_odoratus 1 GCF_000243275
+s__Providencia_alcalifaciens 2 GCF_000173415 GCF_000314875
+s__Tomato_yellow_leaf_curl_Guangdong_virus 1 PRJNA17801
+s__Shuttleworthia_satelles 1 GCF_000160115
+s__Haliangium_ochraceum 1 GCF_000024805
+s__Carrot_yellow_leaf_virus 1 PRJNA39585
+s__Garlic_common_latent_virus 1 PRJNA78925
+s__Streptomyces_sp_S4 1 GCF_000297715
+s__Orgyia_pseudotsugata_multiple_nucleopolyhedrovirus 1 PRJNA14084
+s__Helicoverpa_zea_single_nucleopolyhedrovirus 1 PRJNA14148
+s__Tobacco_ringspot_virus 1 PRJNA14933
+s__Corynebacterium_amycolatum 1 GCF_000173655
+s__Mopeia_virus 1 PRJNA15037
+s__Acidimicrobium_ferrooxidans 1 GCF_000023265
+s__Leifsonia_rubra 1 GCF_000477555
+s__Cadicivirus_A 1 PRJNA201444
+s__Streptomyces_sp_SS 1 GCF_000302615
+s__Leptolyngbya_sp_PCC_6406 1 GCF_000332095
+s__Tetrasphaera_phage_TJE1 1 PRJNA184167
+s__Pestivirus_strain_Aydin_04_TR 1 PRJNA176618
+s__Spring_viraemia_of_carp_virus 1 PRJNA14726
+s__Bacillus_sp_WBUNB004 1 GCF_000319755
+s__Halorubrum_sp_T3 1 GCF_000296615
+s__Tomato_leaf_curl_Sinaloa_virus 1 PRJNA19971
+s__Mesorhizobium_australicum 1 GCF_000230995
+s__Porphyrobacter_sp_AAP82 1 GCF_000331285
+s__Haloferax_gibbonsii 1 GCF_000336775
+s__Yellow_fever_virus 1 PRJNA15284
+s__Pteronotus_polyomavirus 1 PRJNA185190
+s__Cucurbit_leaf_crumple_virus 1 PRJNA14121
+s__Leptospira_wolbachii 1 GCF_000332515
+s__Stachytarpheta_leaf_curl_virus 1 PRJNA14412
+s__Desulfarculus_baarsii 1 GCF_000143965
+s__Canine_minute_virus 1 PRJNA15465
+s__endosymbiont_of_Tevnia_jerichonana 1 GCF_000224925
+s__Leptospira_vanthielii 1 GCF_000332455
+s__Desulfotomaculum_nigrificans 1 GCF_000189755
+s__Apple_chlorotic_leaf_spot_virus 1 PRJNA14658
+s__Xylanimonas_cellulosilytica 1 GCF_000024965
+s__Joostella_marina 1 GCF_000260115
+s__Escherichia_phage_Lw1 1 PRJNA206486
+s__Haloarcula_amylolytica 1 GCF_000336615
+s__Agrobacterium_sp_224MFTsu3_1 1 GCF_000384555
+s__Ignisphaera_aggregans 1 GCF_000145985
+s__Acinetobacter_sp_NIPH_758 1 GCF_000368345
+s__Lysinibacillus_fusiformis 2 GCF_000178135 GCF_000313955
+s__Borna_disease_virus 1 PRJNA14675
+s__Thioalkalivibrio_sp_ALE31 1 GCF_000377405
+s__Capnocytophaga_sp_oral_taxon_329 1 GCF_000213295
+s__Papaya_leaf_curl_China_virus 1 PRJNA14536
+s__Actinomyces_sp_oral_taxon_181 1 GCF_000318335
+s__Methanocaldococcus_jannaschii 1 GCF_000091665
+s__Salicola_phage_CGphi29 1 PRJNA195485
+s__Tomato_yellow_leaf_curl_Thailand_virus 1 PRJNA15179
+s__Chrysodeixis_chalcites_nucleopolyhedrovirus 1 PRJNA15469
+s__Human_parvovirus_B19 1 PRJNA14090
+s__Actinoplanes_missouriensis 1 GCF_000284295
+s__Algerian_watermelon_mosaic_virus 1 PRJNA29883
+s__Crocosphaera_watsonii 2 GCF_000235665 GCF_000167195
+s__Escherichia_phage_rv5 1 PRJNA30613
+s__Clostridium_tunisiense 1 GCF_000300195
+s__Thiothrix_flexilis 1 GCF_000380185
+s__Zinnia_leaf_curl_virus_associated_DNA_beta 1 PRJNA14538
+s__Leptospira_santarosai 24 GCF_000217455 GCF_000306615 GCF_000332435 GCF_000244795 GCF_000244655 GCF_000244555 GCF_000348015 GCF_000306575 GCF_000246375 GCF_000244575 GCF_000306475 GCF_000332395 GCF_000346915 GCF_000244475 GCF_000244615 GCF_000243835 GCF_000244735 GCF_000343395 GCF_000246395 GCF_000346995 GCF_000306455 GCF_000216275 GCF_000313175 GCF_000244675
+s__Simkania_negevensis 1 GCF_000237205
+s__Caulobacter_sp_AP07 1 GCF_000281955
+s__Orchid_fleck_virus 1 PRJNA19969
+s__Desulfonatronospira_thiodismutans 1 GCF_000174435
+s__Aconitum_latent_virus 1 PRJNA15382
+s__Brucella_sp_NVSL_07_0026 1 GCF_000163135
+s__Zygosaccharomyces_bailii_virus_Z 1 PRJNA14823
+s__Woodsholea_maritima 1 GCF_000382325
+s__Sinorhizobium_fredii 3 GCF_000283895 GCF_000018545 GCF_000265205
+s__Salmonella_phage_SETP13 1 PRJNA226727
+s__Flavobacteria_bacterium_MS024_2A 1 GCF_000173095
+s__Methylobacterium_mesophilicum 1 GCF_000364445
+s__Porphyromonas_catoniae 1 GCF_000318215
+s__Magnetospirillum_magneticum 1 GCF_000009985
+s__Lachnospiraceae_bacterium_oral_taxon_082 1 GCF_000242315
+s__Flavobacterium_antarcticum 1 GCF_000419685
+s__Pseudomonas_phage_JG024 1 PRJNA181067
+s__Babesia_bovis 1 GCA_000165395
+s__Paenibacillus_curdlanolyticus 1 GCF_000179615
+s__Sclerotinia_sclerotiorum_dsRNA_mycovirus_L 1 PRJNA165743
+s__Listeria_phage_B054 1 PRJNA20797
+s__Collinsella_tanakaei 1 GCF_000225705
+s__Mycobacterium_gilvum 2 GCF_000184435 GCF_000016365
+s__Luffa_begomovirus_associated_DNA_beta 1 PRJNA16795
+s__Sida_mottle_virus 1 PRJNA14255
+s__Capnocytophaga_sp_oral_taxon_324 1 GCF_000318315
+s__Thauera_sp_MZ1T 1 GCF_000021765
+s__Capnocytophaga_sp_oral_taxon_326 1 GCF_000318295
+s__Strawberry_mild_yellow_edge_virus 1 PRJNA14999
+s__Tomato_leaf_curl_Cebu_virus 1 PRJNA28987
+s__Riemerella_anatipestifer 7 GCF_000331695 GCF_000183155 GCF_000252855 GCF_000184135 GCF_000191565 GCF_000321285 GCF_000295655
+s__Croton_yellow_vein_virus 1 PRJNA51789
+s__Pepino_mosaic_virus 1 PRJNA15125
+s__Methanosarcina_acetivorans 1 GCF_000007345
+s__Actinobacillus_succinogenes 1 GCF_000017245
+s__Bradyrhizobium_sp_ORS_278 1 GCF_000026145
+s__Turneriella_parva 1 GCF_000266885
+s__Staphylococcus_vitulinus 1 GCF_000286335
+s__Torque_teno_zalophus_virus_1 1 PRJNA34735
+s__Pseudomonas_phage_F116 1 PRJNA15127
+s__Thermodesulfovibrio_yellowstonii 1 GCF_000020985
+s__Okra_leaf_curl_Mali_virus_satellite_DNA_beta 1 PRJNA20323
+s__Yersinia_phage_Yepe2 1 PRJNA62965
+s__Allpahuayo_virus 1 PRJNA28323
+s__Bacteroides_massiliensis 3 GCF_000403195 GCF_000382445 GCF_000373085
+s__Ribgrass_mosaic_virus 1 PRJNA14980
+s__Microbacterium_sp_11MF 1 GCF_000383475
+s__Plutella_xylostella_multiple_nucleopolyhedrovirus 1 PRJNA17671
+s__Candidatus_Blochmannia_vafer 1 GCF_000185985
+s__Synechococcus_phage_S_RSM4 1 PRJNA39923
+s__Sugarcane_streak_virus 1 PRJNA14177
+s__Salmonella_phage_PVP_SE1 1 PRJNA74359
+s__Vanderwaltozyma_polyspora 1 GCA_000150035
+s__Cotton_leaf_curl_Multan_virus 2 PRJNA14242 PRJNA33487
+s__Streptomyces_tsukubaensis 1 GCF_000297155
+s__Rhodococcus_sp_AW25M09 1 GCF_000333955
+s__Coprococcus_catus 1 GCF_000210555
+s__Riemerella_columbina 1 GCF_000374405
+s__Vibrio_sp_Ex25 2 GCF_000152485 GCF_000024825
+s__Mycobacterium_phage_Troll4 1 PRJNA32011
+s__Measles_virus 1 PRJNA15025
+s__Clostridium_papyrosolvens 2 GCF_000421965 GCF_000175795
+s__Phytophthora_infestans_RNA_virus_1 1 PRJNA40329
+s__Prevotella_histicola 1 GCF_000234055
+s__Tobacco_vein_mottling_virus 1 PRJNA15348
+s__Maize_white_line_mosaic_virus 1 PRJNA19755
+s__Snake_adenovirus_A 1 PRJNA27899
+s__Sodalis_glossinidius 1 GCF_000010085
+s__Streptococcus_oligofermentans 1 GCF_000385925
+s__Geobacillus_sp_WSUCF1 1 GCF_000422025
+s__Pseudomonas_sp_CFT9 1 GCF_000416255
+s__Tomato_leaf_curl_Mayotte_virus 1 PRJNA15212
+s__Cronobacter_phage_vB_CskP_GAP227 1 PRJNA185316
+s__Mycobacterium_phage_CrimD 1 PRJNA51669
+s__Phaeobacter_inhibens 1 GCF_000154765
+s__Listeria_phage_2389 1 PRJNA14142
+s__Chryseobacterium_sp_CF314 1 GCF_000282115
+s__Vibrio_phage_douglas_12A4 1 PRJNA198432
+s__Butyricicoccus_pullicaecorum 1 GCF_000398925
+s__Soybean_chlorotic_mottle_virus 1 PRJNA14594
+s__Thermotoga_sp_EMP 1 GCF_000294555
+s__Bacillus_sp_1NLA3E 1 GCF_000242895
+s__Corchorus_yellow_vein_mosaic_virus 1 PRJNA192607
+s__Avian_endogenous_retrovirus_EAV_HP 1 PRJNA15213
+s__Hemileuca_sp_nucleopolyhedrovirus 1 PRJNA214353
+s__Candidatus_Azobacteroides_pseudotrichonymphae 1 GCF_000010645
+s__Leadbetterella_byssophila 1 GCF_000166395
+s__Paenibacillus_massiliensis 1 GCF_000377505
+s__Holospora_obtusa 1 GCF_000469665
+s__Fusobacterium_nucleatum 22 GCF_000158255 GCF_000273625 GCF_000218645 GCF_000479185 GCF_000178895 GCF_000158275 GCF_000182945 GCF_000220825 GCF_000400875 GCF_000162235 GCF_000279975 GCF_000242975 GCF_000273605 GCF_000153625 GCF_000479225 GCF_000158535 GCF_000218655 GCF_000163915 GCF_000479205 GCF_000234075 GCF_000007325 GCF_000162355
+s__Martelella_mediterranea 1 GCF_000376125
+s__Geobacillus_sp_Y412MC61 1 GCF_000024705
+s__Avian_carcinoma_virus 1 PRJNA14632
+s__Catelliglobosispora_koreensis 1 GCF_000379685
+s__Enhydrobacter_aerosaccus 1 GCF_000175915
+s__Dobrava_Belgrade_virus 1 PRJNA15319
+s__Fischerella_thermalis 1 GCF_000317225
+s__Thermoanaerobacter_ethanolicus 2 GCF_000192295 GCF_000175815
+s__Ilumatobacter_coccineus 1 GCF_000348785
+s__Microbacterium_sp_oral_taxon_186 1 GCF_000411455
+s__Macroptilium_yellow_mosaic_virus 1 PRJNA14598
+s__Equine_polyomavirus 1 PRJNA167666
+s__Parabacteroides_merdae 3 GCF_000154105 GCF_000307495 GCF_000307345
+s__Burkholderia_sp_JPY347 1 GCF_000373005
+s__Prevotella_saccharolytica 1 GCF_000318195
+s__Microbacterium_sp_292MF 1 GCF_000380605
+s__Pseudomonas_sp_2_1_26 1 GCF_000233495
+s__Rotavirus_D 1 PRJNA52635
+s__Citrus_dwarfing_viroid 1 PRJNA14974
+s__Lachnospiraceae_bacterium_7_1_58FAA 1 GCF_000242155
+s__Alphapapillomavirus_3 1 PRJNA15451
+s__Alphapapillomavirus_1 1 PRJNA15508
+s__Alphapapillomavirus_6 1 PRJNA15510
+s__Alphapapillomavirus_7 1 PRJNA15506
+s__Alphapapillomavirus_4 1 PRJNA15512
+s__Alphapapillomavirus_5 1 PRJNA15507
+s__Streptococcus_marimammalium 1 GCF_000380045
+s__Enterobacter_sp_R4_368 1 GCF_000410515
+s__Alphapapillomavirus_8 1 PRJNA15450
+s__Alphapapillomavirus_9 1 PRJNA15505
+s__Asystasia_begomovirus_1 1 PRJNA81011
+s__Mannheimia_haemolytica 16 GCF_000443105 GCF_000153645 GCF_000443205 GCF_000176255 GCF_000443225 GCF_000176275 GCF_000349785 GCF_000427275 GCF_000376645 GCF_000439735 GCF_000422145 GCF_000349765 GCF_000341635 GCF_000443085 GCF_000443185 GCF_000422095
+s__Bacteroides_sp_1_1_6 1 GCF_000159875
+s__Pseudomonas_phage_JBD5 1 PRJNA188543
+s__Vibrio_phage_VP5 1 PRJNA14382
+s__Vibrio_phage_VP2 1 PRJNA14473
+s__Streptococcus_sp_GMD2S 1 GCF_000296995
+s__Acinetobacter_sp_TG2027 1 GCF_000302435
+s__Deinococcus_geothermalis 1 GCF_000196275
+s__Maize_fine_streak_virus 1 PRJNA15216
+s__Mycobacterium_phage_Bethlehem 1 PRJNA20945
+s__Starkeya_novella 1 GCF_000092925
+s__Penaeid_shrimp_infectious_myonecrosis_virus 1 PRJNA16652
+s__Maruca_vitrata_nucleopolyhedrovirus 1 PRJNA18533
+s__Vibrio_sp_16 1 GCF_000158115
+s__Natronobacterium_gregoryi 2 GCF_000230715 GCF_000337655
+s__Narcissus_mosaic_virus 1 PRJNA14660
+s__Deinococcus_proteolyticus 1 GCF_000190555
+s__Listeria_phage_A511 1 PRJNA20793
+s__Veillonella_dispar 1 GCF_000160015
+s__Sclerotinia_sclerotiorum_partitivirus_S 1 PRJNA39595
+s__Sida_yellow_vein_virus_satellite_DNA_beta 1 PRJNA15562
+s__Nocardiopsis_sp_CNS639 1 GCF_000381685
+s__Lentisphaera_araneosa 1 GCF_000170755
+s__Henriciella_marina 1 GCF_000376805
+s__Vibrio_albensis 1 GCF_000174235
+s__Alteromonas_macleodii 12 GCF_000300175 GCF_000439595 GCF_000310085 GCF_000439575 GCF_000299955 GCF_000020585 GCF_000439515 GCF_000299995 GCF_000439475 GCF_000172635 GCF_000439535 GCF_000439555
+s__Porphyromonas_uenonis 1 GCF_000174775
+s__Propionibacterium_sp_HGH0353 1 GCF_000413335
+s__Clostridium_sp_L2_50 1 GCF_000154245
+s__Staphylococcus_phage_GH15 1 PRJNA181069
+s__Thermus_sp_CCB_US3_UF1 1 GCF_000236585
+s__Culex_pipiens_densovirus 1 PRJNA37995
+s__zeta_proteobacterium_SCGC_AB_137_J06 1 GCF_000379245
+s__Haemophilus_haemolyticus 1 GCF_000262285
+s__Gammapapillomavirus_10 1 PRJNA49377
+s__Mamastrovirus_1 1 PRJNA15436
+s__Polynucleobacter_necessarius 2 GCF_000016345 GCF_000019745
+s__Neisseria_cinerea 1 GCF_000173895
+s__Ajellomyces_capsulatus 1 GCA_000149585
+s__Janthinobacterium_sp_CG3 1 GCF_000344615
+s__Xanthomonas_citri_phage_CP2 1 PRJNA188546
+s__Modestobacter_multiseptatus 1 GCF_000306785
+s__Desulfitobacterium_dehalogenans 1 GCF_000243155
+s__Dickeya_zeae 4 GCF_000382585 GCF_000400525 GCF_000264075 GCF_000023565
+s__Jaagsiekte_sheep_retrovirus 1 PRJNA14665
+s__Bradyrhizobium_sp_WSM1253 1 GCF_000244935
+s__Talaromyces_marneffei 1 GCA_000001985
+s__Maize_streak_Reunion_virus 1 PRJNA165745
+s__Nocardia_sp_BMG111209 1 GCF_000381925
+s__Mycobacterium_phage_Kostya 1 PRJNA30695
+s__Miniopterus_bat_coronavirus_HKU8 1 PRJNA29245
+s__Tobacco_yellow_crinkle_virus 1 PRJNA67693
+s__Tomato_leaf_curl_Yemen_betasatellite 1 PRJNA177643
+s__Clostridium_ramosum 1 GCF_000154485
+s__Pea_seed_borne_mosaic_virus 1 PRJNA15295
+s__Candidatus_Amoebophilus_asiaticus 1 GCF_000020565
+s__Candidatus_Tremblaya_princeps 2 GCF_000219195 GCF_000220965
+s__Ralstonia_phage_RSS1 1 PRJNA18291
+s__Beet_western_yellows_ST9_associated_virus 1 PRJNA14910
+s__Maracuja_mosaic_virus 1 PRJNA18531
+s__Streptococcus_sp_oral_taxon_058 1 GCF_000235485
+s__Streptococcus_sp_oral_taxon_056 1 GCF_000220065
+s__Enterobacteria_phage_HK225 1 PRJNA183140
+s__Nocardiopsis_valliformis 1 GCF_000340985
+s__Penicillium_chrysogenum_virus 1 PRJNA16141
+s__Synechococcus_phage_S_CAM1 1 PRJNA195484
+s__Synechococcus_phage_S_CAM8 1 PRJNA209065
+s__Corynebacterium_pilosum 1 GCF_000373805
+s__Mouse_parvovirus_2 1 PRJNA17125
+s__Acinetobacter_rudis 1 GCF_000413895
+s__Sphaerobacter_thermophilus 1 GCF_000024985
+s__Brevibacillus_agri 1 GCF_000328345
+s__Pyrobaculum_islandicum 1 GCF_000015205
+s__Campylobacter_upsaliensis 2 GCF_000185345 GCF_000167395
+s__Tomato_leaf_curl_Philippines_virus 1 PRJNA14297
+s__Potato_yellow_mosaic_Panama_virus 1 PRJNA14013
+s__Tioman_virus 1 PRJNA14846
+s__Mokola_virus 1 PRJNA15013
+s__Rhodanobacter_thiooxydans 1 GCF_000264375
+s__Pseudomonas_plecoglossicida 1 GCF_000412715
+s__Marine_RNA_virus_SOG 1 PRJNA20647
+s__Staphylococcus_phage_ROSA 1 PRJNA15274
+s__Weissella_phage_phiYS61 1 PRJNA171973
+s__Streptomyces_sp_CNB091 1 GCF_000377965
+s__Mannheimia_phage_phiMHaA1 1 PRJNA17103
+s__Strawberry_chlorotic_fleck_associated_virus 1 PRJNA17741
+s__Shamonda_virus 1 PRJNA173358
+s__Mycobacterium_phage_Phaedrus 1 PRJNA30697
+s__Cotton_leaf_curl_virus_betasatellite 1 PRJNA162497
+s__Mycobacterium_phage_Breezona 1 PRJNA206035
+s__Ross_River_virus 1 PRJNA15314
+s__Prochlorococcus_sp_W9 1 GCF_000291925
+s__Streptomyces_sp_SPB78 1 GCF_000158855
+s__Streptomyces_sp_SPB74 1 GCF_000154905
+s__Brassica_yellows_virus 1 PRJNA73689
+s__Streptococcus_phage_7201 1 PRJNA14051
+s__Alcaligenes_faecalis 1 GCF_000275465
+s__Bat_coronavirus_CDPHE15_USA_2006 1 PRJNA215863
+s__Okra_enation_leaf_curl_virus 1 PRJNA61775
+s__Azospira_oryzae 1 GCF_000236665
+s__Holdemania_filiformis 1 GCF_000157995
+s__Legionella_drancourtii 1 GCF_000162755
+s__Flavobacteriaceae_bacterium_HQM9 1 GCF_000218485
+s__Natronomonas_pharaonis 1 GCF_000026045
+s__Sunn_hemp_leaf_distortion_virus 1 PRJNA39609
+s__Canine_papillomavirus_10 1 PRJNA74355
+s__Treponema_azotonutricium 1 GCF_000214355
+s__Bacillus_phage_MG_B1 1 PRJNA206485
+s__Parabacteroides_goldsteinii 2 GCF_000307395 GCF_000403825
+s__Prevotella_copri 1 GCF_000157935
+s__Infectious_hypodermal_and_hematopoietic_necrosis_virus 1 PRJNA14436
+s__Salmonella_phage_vB_SenS_Ent1 1 PRJNA181993
+s__Halorubrum_californiense 1 GCF_000336875
+s__Clostridiaceae_bacterium_JC118 1 GCF_000313565
+s__Sweetpotato_badnavirus_A 1 PRJNA68017
+s__Acinetobacter_tandoii 1 GCF_000400735
+s__Bat_coronavirus_BtCoV_133_2005 1 PRJNA17585
+s__Advenella_kashmirensis 2 GCF_000219915 GCF_000506985
+s__Lachnospiraceae_bacterium_4_1_37FAA 1 GCF_000191805
+s__Pantoea_vagans 1 GCF_000148935
+s__Microlunatus_phosphovorus 1 GCF_000270245
+s__Candidatus_Moranella_endobia 2 GCF_000219175 GCF_000364725
+s__Ureaplasma_parvum 4 GCF_000006625 GCF_000019345 GCF_000171355 GCF_000169895
+s__White_ash_mosaic_virus 1 PRJNA32671
+s__Opitutaceae_bacterium_TAV5 1 GCF_000242935
+s__Bacillus_sp_M_2_6 1 GCF_000264255
+s__Oscillibacter_valericigenes 1 GCF_000283575
+s__Lactobacillus_sp_7_1_47FAA 1 GCF_000227195
+s__Leptospira_alstoni 2 GCF_000347175 GCF_000332555
+s__Clostridium_botulinum 29 GCF_000171055 GCF_000020285 GCF_000204395 GCF_000307635 GCF_000204375 GCF_000063585 GCF_000439615 GCF_000017025 GCF_000171095 GCF_000175335 GCF_000309805 GCF_000092345 GCF_000353835 GCF_000020345 GCF_000171075 GCF_000204565 GCF_000175395 GCF_000439815 GCF_000017045 GCF_000019305 GCF_000219255 GCF_000022765 GCF_000019545 GCF_000307655 GCF_000253195 GCF_000020165 GCF_000439635 GCF_000439655 GCF_000017065
+s__Sporolactobacillus_inulinus 1 GCF_000222445
+s__Streptococcus_thoraltensis 1 GCF_000380145
+s__Turdivirus_1 1 PRJNA51587
+s__Turdivirus_2 1 PRJNA51589
+s__Mycoplasma_capricolum 2 GCF_000012765 GCF_000192395
+s__Fervidicoccus_fontis 1 GCF_000258425
+s__Alistipes_sp_JC136 1 GCF_000285455
+s__Parabacteroides_sp_D13 1 GCF_000162275
+s__Cyclovirus_VN 1 PRJNA210797
+s__Geobacillus_virus_E2 1 PRJNA19797
+s__Enterobacterial_phage_mEp390 1 PRJNA183154
+s__Propionibacterium_phage_P100_A 1 PRJNA177535
+s__Leptospira_sp_serovar_Kenya 1 GCF_000347195
+s__Honeysuckle_yellow_vein_virus 1 PRJNA15224
+s__Sacbrood_virus 1 PRJNA14688
+s__Thermotoga_sp_RQ2 1 GCF_000019625
+s__Turkey_adenovirus_4 1 PRJNA225922
+s__Leptospira_borgpetersenii 21 GCF_000306375 GCF_000346975 GCF_000355135 GCF_000306415 GCF_000342885 GCF_000353225 GCF_000244535 GCF_000246695 GCF_000246555 GCF_000244215 GCF_000244495 GCF_000244075 GCF_000013945 GCF_000306335 GCF_000244255 GCF_000013965 GCF_000306675 GCF_000243795 GCF_000244835 GCF_000306315 GCF_000243775
+s__Pantoea_sp_A4 1 GCF_000295955
+s__Cacao_swollen_shoot_virus 1 PRJNA14534
+s__Propionibacterium_phage_B5 1 PRJNA14163
+s__Clostridium_tyrobutyricum 3 GCF_000359585 GCF_000392375 GCF_000332015
+s__Edwardsiella_phage_MSW_3 1 PRJNA185428
+s__Serratia_phage_phiMAM1 1 PRJNA185777
+s__Burkholderia_phage_BcepNY3 1 PRJNA19963
+s__Humibacter_albus 1 GCF_000421825
+s__Carnobacterium_sp_AT7 1 GCF_000171855
+s__Coprococcus_sp_HPP0048 1 GCF_000411355
+s__Thiomicrospira_crunogena 1 GCF_000012605
+s__SAR324_cluster_bacterium_SCGC_AAA240_J09 1 GCF_000213355
+s__Honeysuckle_yellow_vein_Kagoshima_virus 1 PRJNA18657
+s__Pseudomonas_phage_OBP 1 PRJNA81003
+s__Thermobacillus_composti 1 GCF_000227705
+s__Enterobacteria_phage_P22 1 PRJNA14478
+s__Flavobacterium_sp_WG21 1 GCF_000335775
+s__Tomato_leaf_curl_Ranchi_virus 1 PRJNA89399
+s__Patulibacter_medicamentivorans 1 GCF_000240225
+s__Nitrosospira_multiformis 1 GCF_000196355
+s__Rhodothermus_phage_RM378 1 PRJNA14420
+s__Peru_tomato_mosaic_virus 1 PRJNA15406
+s__Bacillus_phage_Wip1 1 PRJNA215653
+s__Halobacterium_sp_DL1 1 GCF_000230955
+s__Acidithiobacillus_sp_GGI_221 1 GCF_000179815
+s__Pseudomonas_phage_JG004 1 PRJNA181068
+s__Streptomyces_cattleya 2 GCF_000240165 GCF_000237305
+s__Neospora_caninum 1 GCA_000208865
+s__Xanthomonas_sp_NCPPB1131 1 GCF_000226895
+s__Xanthomonas_sp_NCPPB1132 1 GCF_000226915
+s__Lysinibacillus_boronitolerans 1 GCF_000286375
+s__Lambdapapillomavirus_1 1 PRJNA14421
+s__Lambdapapillomavirus_2 1 PRJNA14326
+s__Lambdapapillomavirus_3 1 PRJNA40369
+s__Lambdapapillomavirus_4 1 PRJNA15468
+s__Rickettsia_helvetica 1 GCF_000255355
+s__Geobacillus_sp_Y4_1MC1 1 GCF_000166075
+s__Debaryomyces_hansenii 1 GCA_000006445
+s__Ageratum_yellow_vein_virus 1 PRJNA15203
+s__Koala_retrovirus 1 PRJNA210799
+s__Mycobacterium_phage_Butterscotch 1 PRJNA32007
+s__Botryotinia_fuckeliana_partitivirus_1 1 PRJNA28759
+s__Halobacillus_halophilus 1 GCF_000284515
+s__Alistipes_senegalensis 1 GCF_000312145
+s__Lactobacillus_phage_LL_H 1 PRJNA19803
+s__Arenimonas_oryziterrae 1 GCF_000420545
+s__Enterococcus_sp_GMD5E 1 GCF_000302675
+s__Weissella_confusa 1 GCF_000239955
+s__Grapevine_Pinot_gris_virus 1 PRJNA70003
+s__Scrophularia_mottle_virus 1 PRJNA32679
+s__Clostridium_phage_phiCD38_2 1 PRJNA67249
+s__Staphylococcus_phage_66 1 PRJNA15263
+s__Lactobacillus_johnsonii_prophage_Lj771 1 PRJNA28145
+s__Cellulophaga_phage_phi46_1 1 PRJNA212961
+s__Desulfovibrio_fructosivorans 1 GCF_000179555
+s__Cellulophaga_phage_phi46_3 1 PRJNA212963
+s__Clostridium_sp_SS2_1 1 GCF_000154545
+s__Bacillus_phage_Andromeda 1 PRJNA192872
+s__Paenibacillus_polymyxa 5 GCF_000217775 GCF_000164985 GCF_000265445 GCF_000146875 GCF_000237325
+s__Helicobacter_phage_KHP30 1 PRJNA184163
+s__Abutilon_mosaic_Bolivia_virus 1 PRJNA62479
+s__Bat_adenovirus_B 1 PRJNA72369
+s__Leptotrichia_goodfellowii 1 GCF_000176335
+s__Mamestra_configurata_nucleopolyhedrovirus_A 1 PRJNA14168
+s__Calicivirus_isolate_TCG 1 PRJNA15123
+s__Methanobacterium_formicicum 1 GCF_000302455
+s__Bacillus_sp_95MFCvi2_1 1 GCF_000374965
+s__Acidithiobacillus_caldus 2 GCF_000175575 GCF_000221025
+s__Cotton_leaf_curl_Alabad_virus 1 PRJNA14240
+s__Perch_rhabdovirus 1 PRJNA194138
+s__Thermococcus_sp_4557 1 GCF_000221185
+s__Dysgonomonas_gadei 1 GCF_000213555
+s__Roseovarius_sp_217 1 GCF_000152845
+s__gamma_proteobacterium_HIMB30 1 GCF_000227525
+s__Helicobacter_mustelae 1 GCF_000091985
+s__Tetrahymena_thermophila 1 GCA_000189635
+s__Pelagibacter_phage_HTVC019P 1 PRJNA192868
+s__Dictyoglomus_thermophilum 1 GCF_000020965
+s__Chlorobium_phaeobacteroides 2 GCF_000015125 GCF_000020545
+s__Shigella_phage_EP23 1 PRJNA80919
+s__Beet_soil_borne_mosaic_virus 1 PRJNA14750
+s__Desulfovibrio_hydrothermalis 1 GCF_000331025
+s__Pseudomonas_extremaustralis 1 GCF_000242115
+s__Catenibacterium_mitsuokai 1 GCF_000173795
+s__Eilat_virus 1 PRJNA175588
+s__Spodoptera_litura_granulovirus 1 PRJNA19695
+s__Pseudomonas_sp_CFII64 1 GCF_000416235
+s__Neisseria_gonorrhoeae 18 GCF_000185865 GCF_000156815 GCF_000156935 GCF_000156955 GCF_000156975 GCF_000156775 GCF_000006845 GCF_000156915 GCF_000273665 GCF_000159935 GCF_000156875 GCF_000156795 GCF_000273685 GCF_000156895 GCF_000156755 GCF_000156835 GCF_000020105 GCF_000163535
+s__Enterobacteria_phage_PsP3 1 PRJNA14345
+s__Nostoc_sp_PCC_7524 1 GCF_000316645
+s__Methanolobus_psychrophilus 1 GCF_000306725
+s__Deinococcus_aquatilis 1 GCF_000378445
+s__Sphingomonas_phyllosphaerae 1 GCF_000419605
+s__Sathuperi_virus 1 PRJNA173356
+s__Human_papillomavirus_type_154 1 PRJNA208538
+s__Pyrenophora_tritici_repentis 1 GCA_000149985
+s__Burkholderia_sp_Ch1_1 1 GCF_000178415
+s__Simian_immunodeficiency_virus 2 PRJNA14872 PRJNA15501
+s__Columnea_latent_viroid 1 PRJNA14756
+s__Sindbis_virus 1 PRJNA15316
+s__Eubacterium_saphenum 1 GCF_000161975
+s__Zucchini_yellow_mosaic_virus 1 PRJNA15390
+s__Meno_virus 1 PRJNA196419
+s__Helicoverpa_armigera_multiple_nucleopolyhedrovirus 1 PRJNA33003
+s__Turnip_yellow_mosaic_virus 1 PRJNA15293
+s__Zymoseptoria_tritici 1 GCA_000219625
+s__Fer_de_Lance_paramyxovirus 1 PRJNA14985
+s__Merremia_mosaic_virus 1 PRJNA16699
+s__Vibrio_halioticoli 1 GCF_000496695
+s__Aspergillus_fumigatus 1 GCA_000002655
+s__Burkholderia_phage_ST79 1 PRJNA206488
+s__Nesterenkonia_alba 1 GCF_000421745
+s__H_1_parvovirus 1 PRJNA14578
+s__Gordonia_soli 1 GCF_000334455
+s__Bacteroides_sp_2_2_4 1 GCF_000157055
+s__Metascardovia_criceti 1 GCF_000376885
+s__Gluconacetobacter_sp_SXCC_1 1 GCF_000208635
+s__Scallion_mosaic_virus 1 PRJNA15190
+s__Keunjorong_mosaic_virus 1 PRJNA76731
+s__Desulfurobacterium_thermolithotrophum 1 GCF_000191045
+s__Halorubrum_phage_CGphi46 1 PRJNA209066
+s__Mycobacterium_sp_KMS 1 GCF_000015405
+s__Clanis_bilineata_nucleopolyhedrovirus 1 PRJNA17485
+s__Bear_Canyon_virus 1 PRJNA28325
+s__Dehalococcoides_mccartyi 6 GCF_000025025 GCF_000011905 GCF_000009025 GCF_000499365 GCF_000025585 GCF_000016705
+s__Corynebacterium_pseudodiphtheriticum 1 GCF_000466825
+s__Gull_circovirus 1 PRJNA18019
+s__Radish_leaf_curl_betasatellite 1 PRJNA28281
+s__Streptococcus_phage_Dp_1 1 PRJNA64617
+s__Rhizobium_sp_PDO1_076 1 GCF_000247475
+s__Tobacco_leaf_chlorosis_betasatellite 1 PRJNA178075
+s__Wheat_yellow_mosaic_virus 1 PRJNA15358
+s__Foxtail_mosaic_virus 1 PRJNA14640
+s__Lactobacillus_kefiranofaciens 1 GCF_000214785
+s__Thermotoga_maritima 3 GCF_000230655 GCF_000008545 GCF_000390265
+s__Anoxybacillus_sp_DT3_1 1 GCF_000346275
+s__Asparagus_virus_2 1 PRJNA33493
+s__Streptomyces_sp_KhCrAH_340 1 GCF_000373445
+s__Klebsiella_pneumoniae 150 GCF_000406725 GCF_000406745 GCF_000281535 GCF_000492195 GCF_000409105 GCF_000406425 GCF_000465975 GCF_000281615 GCF_000406385 GCF_000493075 GCF_000313365 GCF_000417205 GCF_000406505 GCF_000309505 GCF_000406865 GCF_000474885 GCF_000492295 GCF_000163455 GCF_000281355 GCF_000474015 GCF_000492315 GCF_000406685 GCF_000493155 GCF_000346145 GCF_000417045 GCF_000492535 GCF_000492415 GCF_000409085 GCF_000417545 GCF_000281495 GCF_000294365 GCF_000283455 GCF_000412575 G [...]
+s__Sesbania_mosaic_virus 1 PRJNA15372
+s__Begomovirus_associated_DNA_II 1 PRJNA15161
+s__Bacteroides_propionicifaciens 1 GCF_000375405
+s__Chlorobium_limicola 1 GCF_000020465
+s__Escherichia_sp_TW09231 1 GCF_000208465
+s__Hibiscus_latent_Singapore_virus 1 PRJNA17573
+s__Vibrio_phage_VHML 1 PRJNA14234
+s__Citreicella_sp_357 1 GCF_000259095
+s__Monkeypox_virus 1 PRJNA15142
+s__Astrovirus_MLB3 1 PRJNA178563
+s__Astrovirus_MLB2 1 PRJNA76723
+s__Synechococcus_phage_S_MbCM6 1 PRJNA181072
+s__Halorubrum_sp_J07HR59 1 GCF_000416045
+s__Arabis_mosaic_virus 1 PRJNA14932
+s__Mycobacterium_phage_Brujita 1 PRJNA32005
+s__Lactococcus_phage_phiLC3 1 PRJNA14362
+s__Mycobacterium_phage_Catera 1 PRJNA17141
+s__Vibrio_phage_VpV262 1 PRJNA14316
+s__Beet_cryptic_virus_1 1 PRJNA32709
+s__Lactococcus_lactis 20 GCF_000006865 GCF_000192705 GCF_000468955 GCF_000447885 GCF_000025045 GCF_000447905 GCF_000143205 GCF_000284735 GCF_000488975 GCF_000312685 GCF_000344575 GCF_000479375 GCF_000447825 GCF_000014545 GCF_000236475 GCF_000348965 GCF_000447925 GCF_000447845 GCF_000447985 GCF_000447965
+s__Lactobacillus_vini 1 GCF_000255495
+s__Kyuri_green_mottle_mosaic_virus 1 PRJNA15140
+s__Candidatus_Nitrosoarchaeum_koreensis 2 GCF_000299365 GCF_000220175
+s__Morganella_phage_MmP1 1 PRJNA30793
+s__Roseiflexus_sp_RS_1 1 GCF_000016665
+s__Dasheen_mosaic_virus 1 PRJNA15388
+s__Ilyobacter_polytropus 1 GCF_000165505
+s__Pseudomonas_phage_phiKZ 1 PRJNA14251
+s__Cardiobacterium_valvarum 1 GCF_000239355
+s__Enterobacteria_phage_T5 1 PRJNA15143
+s__Actinomyces_coleocanis 1 GCF_000159015
+s__Clover_yellow_vein_virus 1 PRJNA15353
+s__Pseudanabaena_sp_PCC_6802 1 GCF_000332175
+s__Enterobacteria_phage_T1 1 PRJNA14496
+s__Ostreid_herpesvirus_1 1 PRJNA14552
+s__Meganema_perideroedes 1 GCF_000374145
+s__Human_adenovirus_B 3 PRJNA14607 PRJNA15150 PRJNA31177
+s__Human_adenovirus_A 2 PRJNA14517 PRJNA40315
+s__Human_adenovirus_G 1 PRJNA14626
+s__Human_adenovirus_F 1 PRJNA14487
+s__Human_adenovirus_E 2 PRJNA15152 PRJNA162489
+s__Human_adenovirus_D 3 PRJNA14535 PRJNA15105 PRJNA39353
+s__Bacteroides_barnesiae 1 GCF_000374585
+s__Streptomyces_sp_ScaeMP_e10 1 GCF_000373405
+s__Wigeon_coronavirus_HKU20 1 PRJNA109279
+s__Amphritea_japonica 1 GCF_000381785
+s__Tomato_yellow_spot_virus 1 PRJNA16327
+s__Rickettsia_philipii 1 GCF_000283995
+s__Macaque_simian_foamy_virus 1 PRJNA30115
+s__Hendra_virus 1 PRJNA14911
+s__Afipia_sp_1NLS2 1 GCF_000178995
+s__Phage_Gifsy_1 1 PRJNA32269
+s__Streptococcus_sp_oral_taxon_071 1 GCF_000146755
+s__Dioscorea_bacilliform_virus 1 PRJNA18829
+s__Phage_Gifsy_2 1 PRJNA32271
+s__Methylacidiphilum_fumariolicum 1 GCF_000297415
+s__Cyanothece_sp_ATCC_51472 1 GCF_000231425
+s__Enterobacter_sp_MR1 1 GCF_000390385
+s__Natranaerobius_thermophilus 1 GCF_000020005
+s__Acinetobacter_phage_Acj9 1 PRJNA60121
+s__Paenibacillus_sp_PAMC_26794 1 GCF_000316035
+s__Planococcus_halocryophilus 1 GCF_000342445
+s__Beet_black_scorch_virus 1 PRJNA14949
+s__Methylobacterium_sp_88A 1 GCF_000376345
+s__Candidatus_Hodgkinia_cicadicola 1 GCF_000021505
+s__Candidatus_Phytoplasma_australiense 2 GCF_000069925 GCF_000397185
+s__Rose_rosette_virus 1 PRJNA64937
+s__Peptoniphilus_sp_oral_taxon_836 1 GCF_000179335
+s__Streptococcus_vestibularis 2 GCF_000180075 GCF_000188295
+s__Lujo_virus 1 PRJNA38405
+s__Paenibacillus_sp_HGH0039 1 GCF_000411255
+s__Circulifer_tenellus_virus_1 1 PRJNA51183
+s__Vibrio_phage_K139 1 PRJNA14144
+s__alpha_proteobacterium_SCGC_AAA027_J10 1 GCF_000371825
+s__Acinetobacter_schindleri 3 GCF_000368465 GCF_000301815 GCF_000368625
+s__Fischerella_muscicola 2 GCF_000317245 GCF_000317205
+s__Pyrobaculum_oguniense 1 GCF_000247545
+s__Donkey_orchid_virus_A 1 PRJNA202316
+s__Microbacterium_sp_B19 1 GCF_000333395
+s__Lactobacillus_casei_paracasei 56 GCF_000309725 GCF_000309785 GCF_000410255 GCF_000410235 GCF_000309685 GCF_000410355 GCF_000410015 GCF_000409875 GCF_000410415 GCF_000410295 GCF_000410375 GCF_000410315 GCF_000309765 GCF_000026485 GCF_000409955 GCF_000410175 GCF_000410475 GCF_000410135 GCF_000194785 GCF_000309665 GCF_000309585 GCF_000409975 GCF_000410435 GCF_000410455 GCF_000418515 GCF_000410155 GCF_000410495 GCF_000309625 GCF_000155515 GCF_000019245 GCF_000409995 GCF_000309565 GCF_0004 [...]
+s__Propionibacterium_phage_PHL060L00 1 PRJNA219122
+s__Escherichia_albertii 3 GCF_000208505 GCF_000155105 GCF_000208425
+s__Bacillus_phage_Eoghan 1 PRJNA192874
+s__Deinococcus_wulumuqiensis 1 GCF_000348665
+s__actinobacterium_SCGC_AAA044_N04 1 GCF_000378885
+s__Selenomonas_sp_F0473 1 GCF_000315545
+s__Solenopsis_invicta_densovirus 1 PRJNA226730
+s__Streptomyces_sp_AA0539 1 GCF_000297635
+s__Goose_paramyxovirus_SF02 1 PRJNA14895
+s__Anaerostipes_hadrus 1 GCF_000332875
+s__Dinoroseobacter_shibae 1 GCF_000018145
+s__Pseudomonas_phage_14_1 1 PRJNA33265
+s__Azospirillum_phage_Cd 1 PRJNA28841
+s__Vibrio_tubiashii 2 GCF_000222665 GCF_000259295
+s__Ahrensia_sp_R2A130 1 GCF_000179775
+s__Chipapillomavirus_2 1 PRJNA28243
+s__Burkholderia_sp_BT03 1 GCF_000281995
+s__Clostridium_phage_phiMMP02 1 PRJNA179416
+s__Rhodospirillum_centenum 1 GCF_000016185
+s__Clostridium_phage_phiMMP04 1 PRJNA179417
+s__Methylovulum_miyakonense 1 GCF_000384075
+s__Lachnospiraceae_bacterium_6_1_63FAA 1 GCF_000209425
+s__Grapevine_berry_inner_necrosis_virus 1 PRJNA63625
+s__Cesiribacter_andamanensis 1 GCF_000348925
+s__Bacillus_marmarensis 1 GCF_000474275
+s__Leuconostoc_kimchii 1 GCF_000092505
+s__Mycobacterium_phage_Wildcat 1 PRJNA17175
+s__Thermoanaerobacterium_thermosaccharolyticum 2 GCF_000145615 GCF_000328545
+s__Erysipelotrichaceae_bacterium_3_1_53 1 GCF_000165065
+s__Tomato_leaf_curl_Philippine_betasatellite 1 PRJNA19865
+s__Tetrapisispora_phaffii 1 GCA_000236905
+s__Pseudomonas_pelagia 1 GCF_000410875
+s__Campylobacter_jejuni 81 GCF_000466065 GCF_000254715 GCF_000254575 GCF_000254855 GCF_000254475 GCF_000254935 GCF_000468915 GCF_000285755 GCF_000017485 GCF_000163995 GCF_000017905 GCF_000255075 GCF_000242395 GCF_000254635 GCF_000254555 GCF_000168195 GCF_000466075 GCF_000254895 GCF_000285695 GCF_000254975 GCF_000285715 GCF_000254315 GCF_000254275 GCF_000184085 GCF_000254415 GCF_000254675 GCF_000254695 GCF_000254755 GCF_000255095 GCF_000168135 GCF_000184825 GCF_000302555 GCF_000493495 GCF [...]
+s__Mycobacterium_phage_Adjutor 1 PRJNA29919
+s__Herbaspirillum_lusitanum 1 GCF_000256565
+s__Xanthobacter_autotrophicus 1 GCF_000017645
+s__Mycobacterium_mageritense 1 GCF_000233935
+s__Chlamydia_pneumoniae_phage_CPAR39 1 PRJNA57809
+s__Richelia_intracellularis 2 GCF_000350105 GCF_000350125
+s__Thiocystis_violascens 1 GCF_000227745
+s__Microbacterium_testaceum 1 GCF_000202635
+s__Pseudoalteromonas_phage_pYD6_A 1 PRJNA195478
+s__Vagococcus_lutrae 1 GCF_000498295
+s__Mycobacterium_phage_Pacc40 1 PRJNA32017
+s__Bartonella_rattimassiliensis 2 GCF_000312605 GCF_000278215
+s__Apple_scar_skin_viroid 1 PRJNA14967
+s__Deerpox_virus_W_1170_84 1 PRJNA32597
+s__Natrinema_pallidum 1 GCF_000337615
+s__Mycobacterium_thermoresistibile 1 GCF_000234585
+s__Clostridium_phage_phiCP39_O 1 PRJNA32103
+s__Streptomyces_sp_CNS335 1 GCF_000377125
+s__Saccharomyces_20S_RNA_narnavirus 1 PRJNA14841
+s__Banana_streak_CA_virus 1 PRJNA66617
+s__alpha_proteobacterium_SCGC_AAA024_N17 1 GCF_000372045
+s__Mycoplasma_phage_MAV1 1 PRJNA14395
+s__Oscillochloris_trichoides 1 GCF_000152145
+s__Alphapapillomavirus_2 1 PRJNA15504
+s__Sida_leaf_curl_virus_associated_DNA_1 1 PRJNA16227
+s__Alphapapillomavirus_10 1 PRJNA15454
+s__Alphapapillomavirus_11 1 PRJNA15509
+s__Alphapapillomavirus_12 1 PRJNA14025
+s__Alphapapillomavirus_13 1 PRJNA15466
+s__Alphapapillomavirus_14 1 PRJNA15424
+s__Saccharopolyspora_spinosa 1 GCF_000194155
+s__Lactobacillus_pobuzihii 1 GCF_000349725
+s__Acinetobacter_sp_NIPH_713 1 GCF_000369445
+s__Enterobacteria_phage_K1_5 1 PRJNA17059
+s__Glaciecola_mesophila 1 GCF_000315015
+s__Lettuce_chlorosis_virus 1 PRJNA38899
+s__Brevibacterium_linens 1 GCF_000167575
+s__zeta_proteobacterium_SCGC_AB_137_C09 1 GCF_000379225
+s__Tomato_leaf_curl_Taiwan_virus 1 PRJNA14193
+s__Nitrosomonas_sp_Is79A3 1 GCF_000219585
+s__Chloroherpeton_thalassium 1 GCF_000020525
+s__Erectites_yellow_mosaic_virus_satellite_DNA_beta 1 PRJNA19827
+s__Wolbachia_pipientis 2 GCF_000242415 GCF_000333775
+s__Sowbane_mosaic_virus 1 PRJNA31125
+s__Prochlorococcus_phage_P_RSM4 1 PRJNA64703
+s__Streptomyces_sp_PAMC26508 1 GCF_000364805
+s__Actinoplanes_globisporus 1 GCF_000379645
+s__Leeia_oryzae 1 GCF_000376945
+s__St_Augustine_decline_satellite_virus 1 PRJNA14898
+s__Halosimplex_carlsbadense 1 GCF_000337455
+s__Trichomonas_vaginalis_virus 1 PRJNA14813
+s__Geobacter_sp_M18 1 GCF_000175115
+s__Kordia_algicida 1 GCF_000154725
+s__Natronorubrum_bangense 1 GCF_000337715
+s__Staphylococcus_phage_44AHJD 1 PRJNA14268
+s__Sphingobacterium_paucimobilis 1 GCF_000416985
+s__Rhizobium_gallicum 1 GCF_000373025
+s__Borrelia_recurrentis 1 GCF_000019705
+s__Planktothrix_phage_PaV_LD 1 PRJNA80915
+s__Tuber_aestivum_mitovirus 1 PRJNA67889
+s__Rhodococcus_sp_114MFTsu3_1 1 GCF_000383555
+s__Mesorhizobium_metallidurans 1 GCF_000350085
+s__Saccharomyces_23S_RNA_narnavirus 1 PRJNA14840
+s__Taro_vein_chlorosis_virus 1 PRJNA15163
+s__Sulfitobacter_phage_pCB2047_B 1 PRJNA195474
+s__Sulfitobacter_phage_pCB2047_C 1 PRJNA195472
+s__Sulfitobacter_phage_pCB2047_A 1 PRJNA195473
+s__Yersinia_ruckeri 1 GCF_000173755
+s__Rhodococcus_sp_R04 1 GCF_000219395
+s__Fibrobacter_succinogenes 2 GCF_000024665 GCF_000146505
+s__Aeromonas_phage_CC2 1 PRJNA181987
+s__Chlorobium_ferrooxidans 1 GCF_000168715
+s__Flock_house_virus 1 PRJNA15075
+s__Prochlorococcus_sp_W11 1 GCF_000291945
+s__Prochlorococcus_sp_W10 1 GCF_000291845
+s__Prochlorococcus_sp_W12 1 GCF_000291965
+s__Bacillus_acidiproducens 1 GCF_000374345
+s__Enterococcus_sulfureus 2 GCF_000407605 GCF_000407025
+s__Aphid_lethal_paralysis_virus 1 PRJNA14867
+s__Bacillus_phage_phIS3501 1 PRJNA181213
+s__Shigella_phage_Ag3 1 PRJNA42937
+s__Calibrachoa_mottle_virus 1 PRJNA214239
+s__Bifidobacterium_gallicum 1 GCF_000173375
+s__Zavarzinella_formosa 1 GCF_000255705
+s__Hemidesmus_yellow_mosaic_virus 1 PRJNA215128
+s__Actinomyces_europaeus 1 GCF_000411155
+s__Clostridiales_bacterium_1_7_47FAA 1 GCF_000155435
+s__Stenotrophomonas_phage_phiSMA7 1 PRJNA209360
+s__Salisaeta_icosahedral_phage_1 1 PRJNA167575
+s__Cellulophaga_phage_phi17_2 1 PRJNA212965
+s__Candidatus_Nitrosopumilus_salaria 1 GCF_000242875
+s__Cupriavidus_sp_HMR_1 1 GCF_000319775
+s__Sulfolobus_spindle_shaped_virus_5 1 PRJNA31219
+s__Francisella_philomiragia 2 GCF_000019285 GCF_000156715
+s__Acinetobacter_pittii_calcoaceticus_nosocomialis 28 GCF_000162375 GCF_000248235 GCF_000368605 GCF_000399705 GCF_000399665 GCF_000309015 GCF_000368085 GCF_000248315 GCF_000302375 GCF_000163635 GCF_000341835 GCF_000191145 GCF_000399685 GCF_000302295 GCF_000162035 GCF_000368965 GCF_000301775 GCF_000301695 GCF_000301675 GCF_000369045 GCF_000472005 GCF_000369025 GCF_000368945 GCF_000300635 GCF_000248335 GCF_000248175 GCF_000367865 GCF_000230465
+s__African_oil_palm_ringspot_virus 1 PRJNA36557
+s__Fervidobacterium_pennivorans 1 GCF_000235405
+s__Okra_leaf_curl_betasatellite 1 PRJNA14209
+s__Simian_virus_41 1 PRJNA15220
+s__Pseudomonas_sp_GM30 1 GCF_000282275
+s__Pseudomonas_sp_GM33 1 GCF_000282295
+s__Tomato_apical_stunt_viroid 1 PRJNA14670
+s__Listeria_innocua 3 GCF_000195795 GCF_000183885 GCF_000241405
+s__Rice_stripe_virus 1 PRJNA14795
+s__Flavobacterium_sp_SCGC_AAA536_P05 1 GCF_000384835
+s__Pelobacter_carbinolicus 1 GCF_000012885
+s__Candidatus_Arthromitus_sp_SFB_mouse 3 GCF_000284435 GCF_000270205 GCF_000225365
+s__Wenxinia_marina 1 GCF_000379485
+s__Staphylococcus_phage_TEM123 1 PRJNA167573
+s__Blautia_producta 1 GCF_000373885
+s__Xanthomonas_albilineans 1 GCF_000087965
+s__Acinetobacter_gerneri 1 GCF_000368565
+s__Macroptilium_yellow_vein_virus 1 PRJNA124061
+s__Pseudoalteromonas_sp_BSi20652 1 GCF_000239855
+s__Corynebacterium_phage_BFK20 1 PRJNA20757
+s__Segetibacter_koreensis 1 GCF_000374045
+s__Rickettsia_typhi 3 GCF_000277305 GCF_000008045 GCF_000277285
+s__Mycoplasma_canis 5 GCF_000258965 GCF_000258985 GCF_000258945 GCF_000259005 GCF_000258925
+s__Serratia_symbiotica 2 GCF_000186485 GCF_000238975
+s__Nautilia_profundicola 1 GCF_000021725
+s__Mycobacterium_phage_Barnyard 1 PRJNA14274
+s__Acinetobacter_sp_ANC_4105 1 GCF_000369485
+s__Tomato_leaf_curl_Guangdong_virus 1 PRJNA17805
+s__Diascia_yellow_mottle_virus 1 PRJNA30795
+s__Grapevine_leafroll_associated_virus_10 1 PRJNA33263
+s__Clostridium_sp_D5 1 GCF_000190355
+s__Fragaria_chiloensis_cryptic_virus 1 PRJNA19741
+s__Thermodesulfobium_narugense 1 GCF_000212395
+s__Slackia_exigua 1 GCF_000162875
+s__Synechococcus_sp_CC9902 1 GCF_000012505
+s__Sinorhizobium_phage_PBC5 1 PRJNA14146
+s__Beijerinckia_indica 1 GCF_000019845
+s__Synechococcus_sp_PCC_7002 1 GCF_000019485
+s__Burkholderia_phage_Bcep781 1 PRJNA14405
+s__Mycobacteriophage_Velveteen 1 PRJNA215123
+s__Rubidibacter_lacunae 1 GCF_000473895
+s__Bacteroides_oleiciplenus 1 GCF_000315485
+s__Enterobacteria_phage_IME08 1 PRJNA50177
+s__Mycobacterium_phage_PhrostyMug 1 PRJNA219114
+s__Oscillatoriales_cyanobacterium_JSC_12 1 GCF_000309945
+s__Lassa_virus 1 PRJNA14864
+s__Desulfovibrio_magneticus 2 GCF_000010665 GCF_000307955
+s__Tamiami_virus 1 PRJNA29831
+s__Oliveros_virus 1 PRJNA28319
+s__Bacteroides_xylanisolvens 4 GCF_000273315 GCF_000210075 GCF_000178295 GCF_000178215
+s__Piscirickettsia_salmonis 2 GCF_000300295 GCF_000401515
+s__Thermus_scotoductus 2 GCF_000187005 GCF_000381045
+s__Rose_yellow_vein_virus 1 PRJNA196972
+s__Clostridium_acidurici 1 GCF_000299355
+s__Barley_yellow_dwarf_virus_PAV 1 PRJNA15196
+s__Aeromonas_aquariorum 1 GCF_000315195
+s__Pseudomonas_synxantha 1 GCF_000263715
+s__Gossypium_punctatum_mild_leaf_curl_virus 1 PRJNA33489
+s__Tomato_leaf_curl_Gujarat_virus 1 PRJNA14238
+s__Lactobacillus_crispatus 11 GCF_000162255 GCF_000177575 GCF_000497065 GCF_000165885 GCF_000162315 GCF_000091765 GCF_000301135 GCF_000160515 GCF_000301115 GCF_000466885 GCF_000161915
+s__Microcystis_sp_T1_4 1 GCF_000297435
+s__Blattabacterium_sp_Periplaneta_americana 1 GCF_000093165
+s__Halanaerobium_saccharolyticum 1 GCF_000350165
+s__Porphyromonas_bennonis 1 GCF_000375645
+s__Avian_metapneumovirus 1 PRJNA16240
+s__Bean_yellow_dwarf_virus 1 PRJNA14605
+s__Coriobacteriaceae_bacterium_phI 1 GCF_000311845
+s__Psychrobacter_sp_G 1 GCF_000418305
+s__Peptoniphilus_harei 1 GCF_000183565
+s__Rabbit_hemorrhagic_disease_virus 1 PRJNA15313
+s__Mesorhizobium_amorphae 1 GCF_000233995
+s__Desulfocapsa_sulfexigens 1 GCF_000341395
+s__Methanothermococcus_thermolithotrophicus 1 GCF_000376965
+s__Reyranella_massiliensis 1 GCF_000312425
+s__Raspberry_ringspot_virus 1 PRJNA14934
+s__Brevibacterium_mcbrellneri 1 GCF_000178455
+s__Phenylobacterium_zucineum 1 GCF_000017265
+s__Sida_mosaic_Sinaloa_virus 1 PRJNA16937
+s__Aspergillus_clavatus 1 GCA_000002715
+s__Brucella_canis 12 GCF_000018525 GCF_000370605 GCF_000370585 GCF_000292185 GCF_000298575 GCF_000367285 GCF_000480295 GCF_000367305 GCF_000480275 GCF_000367265 GCF_000238195 GCF_000366825
+s__Pyrococcus_sp_NA2 1 GCF_000211475
+s__Prevotella_baroniae 1 GCF_000468635
+s__Tomato_blistering_mosaic_virus 1 PRJNA213013
+s__Clostridium_phytofermentans 1 GCF_000018685
+s__Thermoanaerobacterium_xylanolyticum 1 GCF_000189775
+s__Halalkalicoccus_jeotgali 2 GCF_000337255 GCF_000196895
+s__Tomato_leaf_curl_Pune_virus 1 PRJNA18015
+s__Enterobacteria_phage_RTP 1 PRJNA16178
+s__Oat_dwarf_virus 1 PRJNA30037
+s__Thermomonospora_curvata 1 GCF_000024385
+s__Brucella_sp_F23_97 1 GCF_000370965
+s__Acetobacter_pasteurianus 9 GCF_000010845 GCF_000010865 GCF_000010885 GCF_000010945 GCF_000010925 GCF_000010825 GCF_000285315 GCF_000010905 GCF_000010965
+s__Mycoplasma_gallisepticum 11 GCF_000025385 GCF_000286735 GCF_000286815 GCF_000286675 GCF_000092585 GCF_000286695 GCF_000286775 GCF_000286795 GCF_000286715 GCF_000025365 GCF_000286755
+s__Deerpox_virus_W_848_83 1 PRJNA15462
+s__Equine_papillomavirus_type_6 1 PRJNA193978
+s__Banna_virus 1 PRJNA15178
+s__Semliki_forest_virus 1 PRJNA15282
+s__Thermosipho_melanesiensis 1 GCF_000016905
+s__Nitrosopumilus_maritimus 1 GCF_000018465
+s__Atopobium_sp_oral_taxon_810 1 GCF_000466405
+s__Rhizoctonia_cerealis_endornavirus_1 1 PRJNA225929
+s__Frankia_symbiont_of_Datisca_glomerata 1 GCF_000177615
+s__Lewinella_cohaerens 1 GCF_000379805
+s__Cycloclasticus_sp_P1 1 GCF_000299965
+s__Lactococcus_phage_P680 1 PRJNA213080
+s__delta_proteobacterium_NaphS2 1 GCF_000179315
+s__Citrus_leaf_rugose_virus 1 PRJNA14759
+s__Roseobacter_sp_SK209_2_6 1 GCF_000169455
+s__Halovivax_ruber 1 GCF_000328525
+s__Lachnospiraceae_bacterium_3_1_57FAA_CT1 1 GCF_000218405
+s__Red_clover_mottle_virus 1 PRJNA15291
+s__Mycoplasma_ovipneumoniae 1 GCF_000218525
+s__Anabaena_sp_90 1 GCF_000312705
+s__Xanthomonas_phage_CP1 1 PRJNA184158
+s__His2_virus 1 PRJNA16651
+s__Enterobacteria_phage_phiP27 1 PRJNA14599
+s__Rhodococcus_sp_P27 1 GCF_000454285
+s__Cosavirus_A 1 PRJNA38497
+s__Alkalibacillus_haloalkaliphilus 1 GCF_000269905
+s__Pseudomonas_phage_LIT1 1 PRJNA42949
+s__Streptococcus_equi 4 GCF_000026585 GCF_000219765 GCF_000445225 GCF_000020765
+s__Curvularia_thermal_tolerance_virus 1 PRJNA30363
+s__Clostridium_carboxidivorans 2 GCF_000175595 GCF_000163855
+s__Theilovirus 2 PRJNA15292 PRJNA30053
+s__Kappapapillomavirus_1 1 PRJNA14057
+s__Ralstonia_sp_5_7_47FAA 1 GCF_000165085
+s__Prevotella_sp_oral_taxon_306 1 GCF_000257925
+s__Mycoplasma_synoviae 2 GCF_000385095 GCF_000008245
+s__Bean_golden_mosaic_virus 1 PRJNA14199
+s__Rhodococcus_triatomae 1 GCF_000341795
+s__Methylococcus_capsulatus 2 GCF_000297615 GCF_000008325
+s__Shigella_sp_D9 1 GCF_000158395
+s__Bifidobacterium_asteroides 1 GCF_000304215
+s__Segniliparus_rotundus 1 GCF_000092825
+s__Aeromicrobium_marinum 1 GCF_000160775
+s__Pseudomonas_phage_LUZ24 1 PRJNA28739
+s__Ahrensia_kielensis 1 GCF_000374465
+s__Cyanophage_MED4_117 1 PRJNA195503
+s__Red_clover_necrotic_mosaic_virus 1 PRJNA14796
+s__Clostridium_tetani 1 GCF_000007625
+s__Mycobacterium_phage_Pipefish 1 PRJNA17171
+s__Saccharomonospora_glauca 1 GCF_000243395
+s__Phlebiopsis_gigantea_mycovirus_dsRNA_1 1 PRJNA46855
+s__Brucella_sp_83_13 1 GCF_000157875
+s__Natrinema_pellirubrum 2 GCF_000337635 GCF_000230735
+s__Micromonas_pusilla_reovirus 1 PRJNA17091
+s__Cryphonectria_parasitica_mitovirus_1_NB631 1 PRJNA14838
+s__Bacillus_selenitireducens 1 GCF_000093085
+s__Rabbit_fibroma_virus 1 PRJNA14590
+s__Operophtera_brumata_reovirus 1 PRJNA16145
+s__Bacteroides_sp_3_1_33FAA 1 GCF_000162195
+s__Cassava_mosaic_Madagascar_alphasatellite 1 PRJNA175666
+s__African_swine_fever_virus 1 PRJNA15242
+s__Megavirus_lba 1 PRJNA188728
+s__Propionibacterium_phage_ATCC29399B_T 1 PRJNA177538
+s__Haloferax_mucosum 1 GCF_000337815
+s__Acinetobacter_sp_CIP_64_7 1 GCF_000369745
+s__Acinetobacter_sp_CIP_64_2 1 GCF_000369645
+s__Pseudomonas_syringae_group_genomosp_3 6 GCF_000145845 GCF_000177455 GCF_000177475 GCF_000007805 GCF_000172895 GCF_000177495
+s__Halorubrum_lipolyticum 1 GCF_000337375
+s__Candidatus_Nitrosoarchaeum_limnia 2 GCF_000241145 GCF_000204585
+s__Panine_herpesvirus_2 1 PRJNA14404
+s__Wolbachia_endosymbiont_of_Brugia_malayi 1 GCF_000008385
+s__Staphylococcus_equorum 2 GCF_000467635 GCF_000297455
+s__Methanoculleus_marisnigri 1 GCF_000015825
+s__Lactococcus_phage_phi7 1 PRJNA213073
+s__Chiltepin_yellow_mosaic_virus 1 PRJNA48419
+s__Acidianus_bottle_shaped_virus 1 PRJNA19605
+s__Kaistia_granuli 1 GCF_000380505
+s__Erwinia_phage_vB_EamM_Y2 1 PRJNA181231
+s__Enterococcus_columbae 3 GCF_000406925 GCF_000407225 GCF_000373065
+s__Cryptophlebia_leucotreta_granulovirus 1 PRJNA14302
+s__Chthoniobacter_flavus 1 GCF_000173075
+s__Renibacterium_salmoninarum 1 GCF_000018885
+s__Tobacco_necrosis_satellite_virus 1 PRJNA14672
+s__Capnocytophaga_sp_oral_taxon_332 1 GCF_000318275
+s__Marinimicrobia_bacterium_SCGC_AAA160_I06 1 GCF_000402815
+s__Potato_aucuba_mosaic_virus 1 PRJNA14771
+s__Thermodesulfatator_atlanticus 1 GCF_000421585
+s__Pseudomonas_mosselii 1 GCF_000498975
+s__Fusarium_graminearum_dsRNA_mycovirus_3 1 PRJNA41629
+s__Pseudomonas_sp_35MFCvi1_1 1 GCF_000378525
+s__Fusarium_graminearum_dsRNA_mycovirus_4 1 PRJNA41631
+s__Helminthosporium_victoriae_virus_190S 1 PRJNA14763
+s__Methanospirillum_hungatei 1 GCF_000013445
+s__Methanofollis_liminatans 1 GCF_000275865
+s__Synechococcus_phage_KBS_M_1A 1 PRJNA195500
+s__Methylobacterium_nodulans 1 GCF_000022085
+s__Clostridium_citroniae 1 GCF_000233455
+s__Thermovirga_lienii 1 GCF_000233775
+s__Desulfosporosinus_acidiphilus 1 GCF_000255115
+s__Soybean_mild_mottle_virus 1 PRJNA48593
+s__Candidatus_Mycoplasma_haemolamae 1 GCF_000281235
+s__Providencia_burhodogranariea 1 GCF_000314855
+s__Halovirus_HSTV_1 1 PRJNA207837
+s__Aravan_virus 1 PRJNA194139
+s__Enterobacteria_phage_RB51 1 PRJNA37819
+s__Emilia_yellow_vein_virus 1 PRJNA28689
+s__Tomato_leaf_curl_China_virus_OX2 1 PRJNA202888
+s__Raspberry_bushy_dwarf_virus 1 PRJNA14791
+s__Rhizobium_sp_AP16 1 GCF_000281735
+s__Drosophila_obscura_sigmavirus 1 PRJNA224247
+s__Spleen_focus_forming_virus 1 PRJNA14641
+s__Avocado_sunblotch_viroid 1 PRJNA14908
+s__Rice_black_streaked_dwarf_virus 1 PRJNA14790
+s__Tomato_leaf_curl_Joydebpur_virus 1 PRJNA16324
+s__Corynebacterium_casei 1 GCF_000234765
+s__Bovine_leukemia_virus 1 PRJNA14916
+s__Chickpea_redleaf_virus 1 PRJNA60625
+s__Enterobacteria_phage_ime09 1 PRJNA181233
+s__Candidatus_Blochmannia_chromaiodes 1 GCF_000331065
+s__Pig_stool_associated_circular_ssDNA_virus 1 PRJNA165737
+s__Rose_cryptic_virus_1 1 PRJNA28761
+s__Leptotrichia_buccalis 1 GCF_000023905
+s__Eubacterium_yurii 1 GCF_000146855
+s__Pseudomonas_phage_MR299_2 1 PRJNA183543
+s__Vibrio_ezurae 1 GCF_000467185
+s__Spirochaeta_thermophila 2 GCF_000147075 GCF_000184345
+s__Alistipes_sp_AP11 1 GCF_000321205
+s__Haloarcula_hispanica 1 GCF_000223905
+s__Geobacillus_thermoleovorans 1 GCF_000236605
+s__Reinekea_blandensis 1 GCF_000153185
+s__Cronobacter_phage_vB_CsaP_GAP52 1 PRJNA179411
+s__Mannheimia_succiniciproducens 1 GCF_000007745
+s__Pseudomonas_thermotolerans 1 GCF_000364625
+s__Oyster_mushroom_spherical_virus 1 PRJNA14951
+s__Curvibacter_lanceolatus 1 GCF_000381265
+s__Xanthomonas_perforans 1 GCF_000192045
+s__Torulaspora_delbrueckii 1 GCA_000243375
+s__Bergeyella_zoohelcum 2 GCF_000301095 GCF_000301075
+s__Rhodovulum_phage_RS1 1 PRJNA195480
+s__Torque_teno_sus_virus_k2 1 PRJNA48301
+s__Radish_mosaic_virus 1 PRJNA29843
+s__Neodiprion_abietis_NPV 1 PRJNA17361
+s__Clostridium_sp_ASF502 1 GCF_000364245
+s__Streptomyces_sp_Wigar10 1 GCF_000226995
+s__Galleria_mellonella_densovirus 1 PRJNA14221
+s__Escherichia_sp_TW11588 1 GCF_000208585
+s__Mycobacterium_phage_Dumbo 1 PRJNA206034
+s__Salivirus_A 2 PRJNA39349 PRJNA39553
+s__Natrinema_gari 1 GCF_000337175
+s__Yersinia_phage_phiR1_37 1 PRJNA76739
+s__Chapare_virus 1 PRJNA29223
+s__Pseudomonas_phage_vB_PaeM_C2_10_Ab1 1 PRJNA184146
+s__Corynebacterium_sp_KPL1998 1 GCF_000477895
+s__Methylomonas_sp_MK1 1 GCF_000365425
+s__Sugarcane_yellow_leaf_virus 1 PRJNA15363
+s__Tobacco_streak_virus 1 PRJNA15472
+s__Corynebacterium_sp_KPL1995 1 GCF_000477935
+s__Corynebacterium_sp_KPL1996 1 GCF_000477915
+s__Pseudomonas_sp_GM17 1 GCF_000282175
+s__Pseudomonas_sp_GM16 1 GCF_000282155
+s__Acinetobacter_bereziniae 3 GCF_000368505 GCF_000368925 GCF_000248295
+s__Mycobacterium_phage_Catdawg 1 PRJNA215124
+s__Pseudomonas_sp_GM18 1 GCF_000282195
+s__Maize_yellow_dwarf_virus_RMV 1 PRJNA208537
+s__Atopobium_vaginae 3 GCF_000179715 GCF_000159235 GCF_000178335
+s__Aestuariimicrobium_kwangyangense 1 GCF_000421525
+s__Ralstonia_phage_RSS0 1 PRJNA181985
+s__Chlamydia_ibidis 1 GCF_000454725
+s__Kluyveromyces_lactis 1 GCA_000002515
+s__Bacillus_isronensis 1 GCF_000298255
+s__Citrus_viroid_V 1 PRJNA28115
+s__Cherry_mottle_leaf_virus 1 PRJNA14695
+s__Cotton_leaf_curl_Bangalore_betasatellite 1 PRJNA15557
+s__JC_polyomavirus 1 PRJNA15477
+s__Staphylococcus_phage_phiPVL_CN125 1 PRJNA38431
+s__Candidatus_Pelagibacter_sp_IMCC9063 1 GCF_000195085
+s__Squash_mosaic_virus 1 PRJNA15384
+s__Streptococcus_phage_Abc2 1 PRJNA42791
+s__Ralstonia_phage_PE226 1 PRJNA64769
+s__Chlamydia_muridarum 2 GCF_000175535 GCF_000174995
+s__Aeromonas_phage_vB_AsaM_56 1 PRJNA181214
+s__Streptomyces_rapamycinicus 1 GCF_000418455
+s__Vibrio_phage_martha_12B12 1 PRJNA198434
+s__Arthrobacter_sp_135MFCol5_1 1 GCF_000374865
+s__Edwardsiella_phage_KF_1 1 PRJNA179430
+s__Acinetobacter_phage_Ac42 1 PRJNA60115
+s__Ktedonobacter_racemifer 1 GCF_000178855
+s__Streptococcus_phage_Sfi21 1 PRJNA14133
+s__Mycobacterium_phage_PattyP 1 PRJNA206030
+s__Broad_bean_wilt_virus_2 1 PRJNA15380
+s__Broad_bean_wilt_virus_1 1 PRJNA14905
+s__Vibrio_phage_VBP32 1 PRJNA195492
+s__Streptococcus_phage_MM1 1 PRJNA14601
+s__Mythimna_separata_entomopoxvirus_L 1 PRJNA203667
+s__Enterococcus_phage_phiFL4A 1 PRJNA42793
+s__Streptococcus_sp_2_1_36FAA 1 GCF_000161955
+s__Deltapapillomavirus_1 1 PRJNA15453
+s__gamma_proteobacterium_NOR5_3 1 GCF_000158155
+s__Deltapapillomavirus_3 1 PRJNA15460
+s__Deltapapillomavirus_4 1 PRJNA15513
+s__Deltapapillomavirus_5 1 PRJNA30665
+s__Bacteroides_sp_3_1_23 1 GCF_000162555
+s__delta_proteobacterium_MLMS_1 1 GCF_000168275
+s__Loktanella_hongkongensis 1 GCF_000365005
+s__Cotton_leaf_curl_Burewala_virus 1 PRJNA34757
+s__Wolbachia_endosymbiont_of_Onchocerca_ochengi 1 GCF_000306885
+s__Cowpea_aphid_borne_mosaic_virus 1 PRJNA15394
+s__Salmonella_phage_FSL_SP_031 1 PRJNA212717
+s__Salmonella_phage_FSL_SP_030 1 PRJNA212718
+s__Pleurotus_ostreatus_virus_1 1 PRJNA15169
+s__Gordonia_polyisoprenivorans 3 GCF_000385355 GCF_000241325 GCF_000247715
+s__Sida_golden_mottle_virus 1 PRJNA48421
+s__Dyodeltapapillomavirus_1 1 PRJNA32003
+s__Thermococcus_kodakarensis 1 GCF_000009965
+s__Stenotrophomonas_phage_phiSMA9 1 PRJNA15493
+s__Acetivibrio_cellulolyticus 1 GCF_000179595
+s__Hibiscus_chlorotic_ringspot_virus 1 PRJNA15208
+s__Southern_elephant_seal_virus 1 PRJNA88117
+s__Trypanosoma_cruzi 1 GCA_000209065
+s__Frankia_sp_BMG5_12 1 GCF_000374165
+s__Wolbachia_endosymbiont_of_Culex_pipiens_molestus 1 GCF_000208785
+s__Treponema_denticola 17 GCF_000191825 GCF_000338455 GCF_000338615 GCF_000340725 GCF_000340745 GCF_000340605 GCF_000338475 GCF_000340705 GCF_000413095 GCF_000413075 GCF_000340685 GCF_000338595 GCF_000340645 GCF_000008185 GCF_000338635 GCF_000338515 GCF_000413115
+s__Nafulsella_turpanensis 1 GCF_000346615
+s__Epinephelus_tauvina_nervous_necrosis_virus 1 PRJNA14849
+s__Digitaria_didactyla_striate_mosaic_virus 1 PRJNA53503
+s__Xanthomonas_euvesicatoria 1 GCF_000009165
+s__Blueberry_shock_virus 1 PRJNA218015
+s__Newbury_1_virus 2 PRJNA14845 PRJNA16653
+s__Mycobacterium_fortuitum 1 GCF_000295855
+s__Infectious_bursal_disease_virus 1 PRJNA14990
+s__Geobacter_metallireducens 2 GCF_000243475 GCF_000012925
+s__Clostridium_phage_PhiS63 1 PRJNA167577
+s__Sweet_potato_leaf_curl_China_Henan_virus 1 PRJNA210929
+s__Arthrospira_sp_PCC_8005 1 GCF_000176895
+s__Rhodomicrobium_vannielii 1 GCF_000166055
+s__Tomato_leaf_curl_Arusha_virus 1 PRJNA18861
+s__Shewanella_oneidensis 1 GCF_000146165
+s__Rice_yellow_mottle_virus 1 PRJNA15327
+s__Pediococcus_phage_clP1 1 PRJNA76735
+s__Thermosipho_africanus 2 GCF_000021285 GCF_000300715
+s__Mycoplasma_flocculare 1 GCF_000367185
+s__Leptospira_meyeri 2 GCF_000304275 GCF_000347075
+s__Methylobacterium_extorquens 5 GCF_000021845 GCF_000018845 GCF_000243435 GCF_000083545 GCF_000022685
+s__Heliothis_virescens_ascovirus_3a 1 PRJNA19151
+s__Atopobium_sp_ICM58 1 GCF_000283035
+s__Paenibacillus_sp_Aloe_11 1 GCF_000245715
+s__Burkholderia_phage_phi644_2 1 PRJNA62941
+s__Butyrivibrio_sp_XPD2006 1 GCF_000420865
+s__Methylobacterium_sp_MB200 1 GCF_000333655
+s__Asticcacaulis_sp_AC460 1 GCF_000495795
+s__Gallionella_capsiferriformans 1 GCF_000145255
+s__Turicibacter_sp_HGF1 1 GCF_000191865
+s__Pseudaminobacter_salicylatoxidans 1 GCF_000304395
+s__Maize_dwarf_mosaic_virus 1 PRJNA15355
+s__Vibrio_azureus 1 GCF_000467165
+s__Corynebacterium_genitalium 1 GCF_000143825
+s__Staphylococcus_hominis 4 GCF_000183685 GCF_000247085 GCF_000174735 GCF_000269685
+s__Kenaf_leaf_curl_virus 1 PRJNA28991
+s__zeta_proteobacterium_SCGC_AB_604_O16 1 GCF_000372125
+s__Macroptilium_yellow_mosaic_Florida_virus 1 PRJNA14399
+s__Campylobacter_fetus 3 GCF_000015085 GCF_000222425 GCF_000174675
+s__Marburg_marburgvirus 1 PRJNA15199
+s__Escherichia_sp_TW09276 1 GCF_000208445
+s__Enterobacteria_phage_Min27 1 PRJNA29143
+s__Bifidobacterium_minimum 1 GCF_000421685
+s__Toxoplasma_gondii 1 GCA_000006565
+s__Syntrophothermus_lipocalidus 1 GCF_000092405
+s__Cassava_brown_streak_virus 1 PRJNA38085
+s__Lettuce_virus_X 1 PRJNA30177
+s__Ethanoligenens_harbinense 1 GCF_000178115
+s__Rickettsia_heilongjiangensis 1 GCF_000221205
+s__Thermus_sp_RL 1 GCF_000252835
+s__Chocolate_lily_virus_A 1 PRJNA78931
+s__Geitlerinema_sp_PCC_7105 1 GCF_000332355
+s__Rabbit_calicivirus_Australia_1_MIC_07 1 PRJNA33267
+s__Bacillus_sp_WBUNB009 1 GCF_000319735
+s__Acidovorax_sp_NO_1 1 GCF_000238595
+s__Eidolon_helvum_parvovirus_1 1 PRJNA81567
+s__Blattabacterium_sp_Mastotermes_darwiniensis 1 GCF_000233435
+s__Acidianus_filamentous_virus_1 1 PRJNA14363
+s__Apple_green_crinkle_associated_virus 1 PRJNA176615
+s__Dragonfly_associated_alphasatellite 1 PRJNA181244
+s__Streptococcus_macacae 1 GCF_000187995
+s__Agrotis_segetum_nucleopolyhedrovirus 1 PRJNA16661
+s__secondary_endosymbiont_of_Heteropsylla_cubana 1 GCF_000287355
+s__Thermovibrio_ammonificans 1 GCF_000185805
+s__Enterobacteria_phage_SP6 1 PRJNA14291
+s__Fiji_disease_virus 1 PRJNA15473
+s__Lactococcus_phage_asccphi28 1 PRJNA28985
+s__Actinobacillus_suis 1 GCF_000307145
+s__Selenomonas_sputigena 2 GCF_000160495 GCF_000208405
+s__Acholeplasma_phage_MV_L1 1 PRJNA14573
+s__Shewanella_decolorationis 1 GCF_000485795
+s__Propionibacterium_phage_P1_1 1 PRJNA177537
+s__Spiroplasma_phage_1_C74 1 PRJNA14178
+s__Sweet_potato_leaf_curl_Spain_virus 1 PRJNA30673
+s__Enterobacteria_phage_TLS 1 PRJNA19775
+s__Streptomyces_sviceus 1 GCF_000154965
+s__Burkholderia_sp_SJ98 1 GCF_000256585
+s__Sugarcane_bacilliform_MO_virus 1 PRJNA16750
+s__Marinimicrobia_bacterium_JGI_0000039_D08 1 GCF_000405265
+s__Methylobacterium_sp_GXF4 1 GCF_000272495
+s__Microbacterium_phage_Min1 1 PRJNA19961
+s__Tomato_mild_mosaic_virus 1 PRJNA30187
+s__Leptotrichia_sp_oral_taxon_215 1 GCF_000469505
+s__Actinoplanes_sp_SE50_110 1 GCF_000237145
+s__Pseudomonas_phage_PT5 1 PRJNA30847
+s__Pseudomonas_phage_PT2 1 PRJNA30851
+s__Leucobacter_salsicius 1 GCF_000350525
+s__Tobacco_mosaic_virus 1 PRJNA15071
+s__Rio_Bravo_virus 1 PRJNA15368
+s__Prevotella_pleuritidis 1 GCF_000468135
+s__Succinatimonas_hippei 1 GCF_000188195
+s__Puniceispirillum_phage_HMO_2011 1 PRJNA213071
+s__Scytonema_hofmanni 1 GCF_000346485
+s__Lactobacillus_brevis 3 GCF_000014465 GCF_000469365 GCF_000159175
+s__Rhizobium_mongolense 1 GCF_000419765
+s__Mycobacterium_phage_Dylan 1 PRJNA219120
+s__Lachnospiraceae_bacterium_5_1_57FAA 1 GCF_000218425
+s__Melon_yellow_spot_virus 1 PRJNA17545
+s__Citrobacter_koseri 1 GCF_000018045
+s__Bluetongue_virus 1 PRJNA14938
+s__Beet_soil_borne_virus 1 PRJNA14751
+s__Methanococcus_maripaludis 5 GCF_000220645 GCF_000017225 GCF_000011585 GCF_000018485 GCF_000016125
+s__Prevotella_veroralis 2 GCF_000162935 GCF_000377625
+s__Fusobacterium_periodonticum 4 GCF_000297655 GCF_000158215 GCF_000163935 GCF_000160475
+s__Selenomonas_bovis 1 GCF_000381005
+s__Lymantria_xylina_MNPV 1 PRJNA46671
+s__Cellulophaga_phage_phi12a_1 1 PRJNA212956
+s__Thermotoga_neapolitana 1 GCF_000018945
+s__Hantaan_virus 1 PRJNA14929
+s__Mycobacterium_phage_Qyrzula 1 PRJNA17173
+s__Phaius_virus_X 1 PRJNA28617
+s__Bordetella_phage_BMP_1 1 PRJNA14358
+s__Nitrosococcus_halophilus 1 GCF_000024725
+s__Mycoplasma_columbinum 1 GCF_000222995
+s__Paprika_mild_mottle_virus 1 PRJNA14935
+s__Mycobacterium_phage_Myrna 1 PRJNA31279
+s__Euphorbia_mosaic_virus_associated_DNA_1 1 PRJNA59505
+s__Streptococcus_ratti 2 GCF_000286075 GCF_000347915
+s__Thermoanaerobacter_sp_X561 1 GCF_000175775
+s__Burkholderia_phage_phiE202 1 PRJNA19163
+s__Vibrio_anguillarum 4 GCF_000217675 GCF_000462975 GCF_000257165 GCF_000257185
+s__Microchaete_sp_PCC_7126 1 GCF_000332295
+s__Equine_papillomavirus_2 1 PRJNA34709
+s__Equine_papillomavirus_3 1 PRJNA163309
+s__Ageratum_enation_alphasatellite 1 PRJNA181994
+s__Prochlorococcus_phage_P_SSP10 1 PRJNA195499
+s__South_African_cassava_mosaic_virus 1 PRJNA14179
+s__Actinomyces_sp_oral_taxon_849 1 GCF_000239715
+s__Actinomyces_sp_oral_taxon_848 1 GCF_000162895
+s__Oscillibacter_sp_1_3 1 GCF_000403435
+s__Synechococcus_sp_JA_2_3B_a_2_13 1 GCF_000013225
+s__Pedobacter_sp_BAL39 1 GCF_000170795
+s__actinobacterium_SCGC_AAA027_L06 1 GCF_000294575
+s__Haemophilus_parahaemolyticus 1 GCF_000262265
+s__Bordetella_pertussis 35 GCF_000479895 GCF_000479415 GCF_000193515 GCF_000479835 GCF_000212975 GCF_000479395 GCF_000479715 GCF_000479675 GCF_000479455 GCF_000479695 GCF_000193535 GCF_000193595 GCF_000479535 GCF_000479555 GCF_000479495 GCF_000479595 GCF_000479795 GCF_000193575 GCF_000195715 GCF_000479475 GCF_000479855 GCF_000479915 GCF_000479635 GCF_000479815 GCF_000504325 GCF_000479875 GCF_000479575 GCF_000193555 GCF_000479755 GCF_000479515 GCF_000479775 GCF_000479615 GCF_000479435 GCF [...]
+s__Caulobacter_sp_JGI_0001013_O16 1 GCF_000376365
+s__Scardovia_wiggsiae 2 GCF_000275805 GCF_000269605
+s__Brucella_inopinata 1 GCF_000182725
+s__Cellvibrio_sp_BR 1 GCF_000263355
+s__Sulfolobales_Mexican_rudivirus_1 1 PRJNA179431
+s__Rosellinia_necatrix_victorivirus_1 1 PRJNA209362
+s__planctomycete_KSU_1 1 GCF_000296795
+s__Halonotius_sp_J07HN4 1 GCF_000416065
+s__Pseudomonas_phage_MP38 1 PRJNA32995
+s__Halonotius_sp_J07HN6 1 GCF_000416025
+s__Vibrio_phage_VBP47 1 PRJNA195493
+s__Facklamia_languida 1 GCF_000245795
+s__Torque_teno_canis_virus 1 PRJNA48141
+s__Porcine_circovirus_type_1_2a 1 PRJNA45807
+s__Actinomyces_massiliensis 2 GCF_000296275 GCF_000269805
+s__Ideonella_sp_B508_1 1 GCF_000333615
+s__Tobacco_leaf_curl_Yunnan_virus_associated_DNA_1 1 PRJNA15482
+s__Paenibacillus_popilliae 1 GCF_000315235
+s__Thalassolituus_oleivorans 1 GCF_000355675
+s__Porcine_circovirus_2 1 PRJNA15442
+s__Porcine_circovirus_1 1 PRJNA14053
+s__Exiguobacterium_sp_S17 1 GCF_000411915
+s__Trichophyton_verrucosum 1 GCA_000151505
+s__Bacteroides_sp_D2 1 GCF_000159075
+s__Bacteroides_sp_D1 1 GCF_000157095
+s__gamma_proteobacterium_IMCC1989 1 GCF_000209515
+s__Marinithermus_hydrothermalis 1 GCF_000195335
+s__Halomonas_sp_TD01 1 GCF_000219565
+s__Burkholderia_pyrrocinia 1 GCF_000297475
+s__Neisseria_sp_oral_taxon_014 1 GCF_000090875
+s__Megasphaera_sp_BV3C16_1 1 GCF_000478965
+s__European_brown_hare_syndrome_virus 1 PRJNA15087
+s__Pandoravirus_salinus 1 PRJNA215788
+s__Cellulophaga_phage_phi38_1 1 PRJNA212958
+s__Shewanella_sp_MR_7 1 GCF_000014665
+s__Shewanella_sp_MR_4 1 GCF_000014685
+s__Flexithrix_dorotheae 1 GCF_000379765
+s__Mycoplasma_alkalescens 1 GCF_000367445
+s__Shigella_phage_Shfl2 1 PRJNA66347
+s__Halobiforma_lacisalsi 2 GCF_000226975 GCF_000336655
+s__Prevotella_nanceiensis 1 GCF_000379965
+s__Desulfohalobium_retbaense 1 GCF_000024325
+s__Nocardia_phage_NBR1 1 PRJNA80925
+s__Saccharopolyspora_erythraea 3 GCF_000171635 GCF_000448385 GCF_000062885
+s__Acidithiobacillus_ferrivorans 1 GCF_000214095
+s__Bacillus_megaterium 4 GCF_000025805 GCF_000225265 GCF_000025825 GCF_000334875
+s__Spiroplasma_chrysopicola 1 GCF_000400935
+s__Drosophila_melanogaster_sigmavirus 1 PRJNA40127
+s__Alkalilimnicola_ehrlichii 1 GCF_000014785
+s__Mesorhizobium_loti 51 GCF_000504265 GCF_000502715 GCF_000502415 GCF_000502835 GCF_000502355 GCF_000503035 GCF_000502915 GCF_000009625 GCF_000502995 GCF_000502975 GCF_000502955 GCF_000502575 GCF_000502675 GCF_000502935 GCF_000502735 GCF_000502475 GCF_000502375 GCF_000502555 GCF_000502215 GCF_000503135 GCF_000503155 GCF_000502895 GCF_000502795 GCF_000503015 GCF_000503095 GCF_000502335 GCF_000502815 GCF_000502455 GCF_000502615 GCF_000502495 GCF_000502435 GCF_000502515 GCF_000502295 GCF_0 [...]
+s__Coconut_tinangaja_viroid 1 PRJNA14662
+s__Rhodobacterales_bacterium_Y4I 1 GCF_000156135
+s__J_virus 1 PRJNA15892
+s__Mycobacterium_phage_Murphy 1 PRJNA206024
+s__Mycobacterium_phage_HINdeR 1 PRJNA206031
+s__Odontoglossum_ringspot_virus 1 PRJNA15201
+s__Ludwigia_yellow_vein_virus_associated_DNA_beta 1 PRJNA15561
+s__Mycobacterium_vaccae 1 GCF_000295825
+s__Wongabel_virus 1 PRJNA33129
+s__Rickettsia_japonica 1 GCF_000283595
+s__Japanese_holly_fern_mottle_virus 1 PRJNA40117
+s__Eubacterium_cylindroides 1 GCF_000469305
+s__Bacillus_sp_10403023 1 GCF_000285535
+s__Burkholderia_sp_CCGE1001 1 GCF_000176935
+s__Burkholderia_sp_CCGE1002 1 GCF_000092885
+s__Burkholderia_sp_CCGE1003 1 GCF_000148685
+s__Halobacterium_salinarum 2 GCF_000069025 GCF_000006805
+s__Cherry_necrotic_rusty_mottle_virus 1 PRJNA14729
+s__Labrenzia_aggregata 1 GCF_000168975
+s__Pseudomonas_sp_HYS 1 GCF_000259195
+s__Cellulophaga_lytica 1 GCF_000190595
+s__Acinetobacter_phage_Abp1 1 PRJNA206470
+s__Digitaria_ciliaris_striate_mosaic_virus 1 PRJNA174778
+s__Rickettsia_canadensis 2 GCF_000283915 GCF_000014345
+s__Neurospora_crassa 1 GCA_000182925
+s__Acetobacter_aceti 2 GCF_000379545 GCF_000193495
+s__Anaerotruncus_colihominis 1 GCF_000154565
+s__Janthinobacterium_sp_HH01 1 GCF_000335815
+s__Bacillus_phage_Finn 1 PRJNA192875
+s__Geovibrio_sp_L21_Ace_BES 1 GCF_000421105
+s__Felis_catus_papillomavirus_4 1 PRJNA221115
+s__Thermosynechococcus_elongatus 1 GCF_000011345
+s__Felis_catus_papillomavirus_3 1 PRJNA207833
+s__Burkholderia_vietnamiensis 1 GCF_000016205
+s__Prevotella_timonensis 1 GCF_000177055
+s__Cowpea_mild_mottle_virus 1 PRJNA60623
+s__Pseudomonas_sp_GM74 1 GCF_000282455
+s__Pseudomonas_sp_GM79 1 GCF_000282495
+s__Nitrosopumilus_sp_AR 1 GCF_000328925
+s__Leuconostoc_pseudomesenteroides 2 GCF_000185065 GCF_000297375
+s__Commelina_yellow_mottle_virus 1 PRJNA14575
+s__Reston_ebolavirus 1 PRJNA15006
+s__Walleye_dermal_sarcoma_virus 1 PRJNA14718
+s__Acidithiobacillus_thiooxidans 1 GCF_000227215
+s__Fibrella_aestuarina 1 GCF_000331105
+s__Acinetobacter_sp_ANC_3862 1 GCF_000369565
+s__Thermobaculum_terrenum 1 GCF_000025005
+s__Peptoniphilus_sp_oral_taxon_386 1 GCF_000090945
+s__Arthrospira_maxima 1 GCF_000173555
+s__Gordonia_amicalis 1 GCF_000332995
+s__Ruminococcus_flavefaciens 2 GCF_000174895 GCF_000247525
+s__Vibrio_phage_VP93 1 PRJNA37885
+s__Leishmania_donovani 1 GCA_000227135
+s__Enterobacteria_phage_HK544 1 PRJNA183160
+s__Enterobacteria_phage_HK542 1 PRJNA183159
+s__Buchnera_aphidicola 13 GCF_000007725 GCF_000007365 GCF_000217635 GCF_000225465 GCF_000183245 GCF_000183305 GCF_000183285 GCF_000090965 GCF_000174075 GCF_000021065 GCF_000021085 GCF_000225445 GCF_000183225
+s__Dorea_longicatena 1 GCF_000154065
+s__Goose_circovirus 1 PRJNA14125
+s__Enterobacteria_phage_PRD1 1 PRJNA14062
+s__Ageratum_yellow_vein_Hualian_virus 1 PRJNA30057
+s__Sweet_potato_virus_2 1 PRJNA167581
+s__Lactobacillus_phage_Lc_Nu 2 PRJNA14475 PRJNA16114
+s__Pseudoalteromonas_spongiae 1 GCF_000238255
+s__Rhodococcus_phage_REQ1 1 PRJNA81177
+s__Snake_parvovirus_1 1 PRJNA14477
+s__Parabacteroides_sp_ASF519 1 GCF_000364265
+s__Nodosilinea_nodulosa 1 GCF_000309385
+s__Vibrio_coralliilyticus 3 GCF_000461895 GCF_000195475 GCF_000176135
+s__Catenulispora_acidiphila 1 GCF_000024025
+s__Phocoena_phocoena_papillomavirus_1 1 PRJNA168666
+s__Phocoena_phocoena_papillomavirus_2 1 PRJNA168667
+s__Phocoena_phocoena_papillomavirus_4 1 PRJNA168668
+s__Enterococcus_sp_GMD3E 1 GCF_000296915
+s__Sweet_potato_virus_G 1 PRJNA169624
+s__Sweet_potato_virus_C 1 PRJNA60649
+s__Enterobacteria_phage_K1F 1 PRJNA15880
+s__Desulfobulbus_sp_oral_taxon_041 2 GCF_000349365 GCF_000349345
+s__Geminocystis_herdmanii 1 GCF_000332235
+s__Cycad_leaf_necrosis_virus 1 PRJNA30835
+s__Aureimonas_ureilytica 1 GCF_000382705
+s__gamma_proteobacterium_HIMB55 1 GCF_000227505
+s__Anaeromyxobacter_sp_K 1 GCF_000020805
+s__Bovine_parainfluenza_virus_3 1 PRJNA15001
+s__Beggiatoa_sp_SS 1 GCF_000170695
+s__Feline_bocavirus 1 PRJNA162493
+s__Sheeppox_virus 1 PRJNA14196
+s__Gordonibacter_pamelaeae 1 GCF_000210055
+s__Solibacillus_silvestris 1 GCF_000271325
+s__Bovine_respiratory_syncytial_virus 1 PRJNA14697
+s__Streptococcus_sp_AS14 1 GCF_000286495
+s__Listeria_phage_B025 1 PRJNA20795
+s__Selenomonas_noxia 2 GCF_000160555 GCF_000234135
+s__Mycoplasma_bovis 3 GCF_000219375 GCF_000270525 GCF_000183385
+s__Flavobacteriales_bacterium_ALC_1 1 GCF_000171875
+s__Stigmatella_aurantiaca 1 GCF_000165485
+s__Mycoplasma_agalactiae 3 GCF_000063605 GCF_000266865 GCF_000089865
+s__Acinetobacter_junii 5 GCF_000162075 GCF_000368665 GCF_000302355 GCF_000368745 GCF_000368765
+s__Salinibacter_ruber 2 GCF_000090405 GCF_000013045
+s__Helcococcus_kunzii 1 GCF_000245755
+s__Saimiriine_herpesvirus_1 1 PRJNA54017
+s__Streptomyces_sp_HPH0547 1 GCF_000411495
+s__Pseudomonas_phage_M6 1 PRJNA16387
+s__Staphylococcus_phage_vB_SauM_Romulus 1 PRJNA195528
+s__Gemmata_obscuriglobus 1 GCF_000171775
+s__Methanothermus_fervidus 1 GCF_000166095
+s__Campylobacter_coli 50 GCF_000254015 GCF_000253535 GCF_000253595 GCF_000253455 GCF_000470055 GCF_000253515 GCF_000253895 GCF_000254035 GCF_000254195 GCF_000253435 GCF_000254155 GCF_000253415 GCF_000254235 GCF_000254175 GCF_000253635 GCF_000254075 GCF_000253475 GCF_000253655 GCF_000253575 GCF_000464875 GCF_000253695 GCF_000253915 GCF_000253975 GCF_000254095 GCF_000253995 GCF_000253875 GCF_000253815 GCF_000253835 GCF_000254135 GCF_000253555 GCF_000253495 GCF_000505625 GCF_000254215 GCF_0 [...]
+s__Marinomonas_mediterranea 1 GCF_000192865
+s__Lucerne_transient_streak_virus 1 PRJNA15337
+s__Gordonia_rubripertincta 1 GCF_000327325
+s__Spiroplasma_diminutum 1 GCF_000439455
+s__Chaetoceros_socialis_f_radians_RNA_virus_01 1 PRJNA34845
+s__Shigella_flexneri 25 GCF_000213755 GCF_000252895 GCF_000022245 GCF_000007405 GCF_000013585 GCF_000213475 GCF_000268165 GCF_000268085 GCF_000267985 GCF_000183785 GCF_000213715 GCF_000213435 GCF_000213675 GCF_000213495 GCF_000268025 GCF_000193935 GCF_000213695 GCF_000281795 GCF_000296305 GCF_000268065 GCF_000217895 GCF_000268245 GCF_000006925 GCF_000213735 GCF_000213455
+s__Bacillus_sp_B14905 1 GCF_000169315
+s__Thermocrinis_albus 1 GCF_000025605
+s__Methanolobus_tindarius 1 GCF_000504205
+s__Halanaerobium_praevalens 1 GCF_000165465
+s__Microviridae_phi_CA82 1 PRJNA70009
+s__Pantoea_phage_LIMElight 1 PRJNA181079
+s__Enterobacteria_phage_ES18 1 PRJNA15174
+s__Sida_golden_mosaic_Florida_virus 1 PRJNA51627
+s__Lactobacillus_amylolyticus 1 GCF_000178475
+s__Sweetpotato_badnavirus_B 1 PRJNA38241
+s__Corynebacterium_ciconiae 1 GCF_000372385
+s__Streptococcus_pseudopneumoniae 7 GCF_000506745 GCF_000258265 GCF_000506665 GCF_000506705 GCF_000257825 GCF_000221985 GCF_000506685
+s__Rhodanobacter_fulvus 1 GCF_000264315
+s__Anaerococcus_obesiensis 1 GCF_000311745
+s__Desulfovibrio_sp_X2 1 GCF_000422205
+s__Nocardiopsis_alba 2 GCF_000341225 GCF_000294515
+s__Legionella_longbeachae 2 GCF_000091785 GCF_000176095
+s__Fusobacterium_sp_oral_taxon_370 1 GCF_000235465
+s__Mycobacterium_phage_Contagion 1 PRJNA215114
+s__Lactococcus_phage_bIL312 1 PRJNA14113
+s__Arthrobacter_sp_SJCon 1 GCF_000332815
+s__Artemisia_virus_A 1 PRJNA165739
+s__African_green_monkey_polyomavirus 1 PRJNA15320
+s__Soybean_mosaic_virus 1 PRJNA15377
+s__Palyam_virus 1 PRJNA14923
+s__Bean_chlorosis_virus 1 PRJNA182753
+s__Mycobacterium_phage_SargentShorty9 1 PRJNA219108
+s__Deformed_wing_virus 2 PRJNA14891 PRJNA14957
+s__Human_T_lymphotropic_virus_4 1 PRJNA33481
+s__Methyloversatilis_universalis 3 GCF_000385375 GCF_000214035 GCF_000378945
+s__Black_raspberry_necrosis_virus 1 PRJNA17093
+s__Bradyrhizobium_sp_BTAi1 1 GCF_000015165
+s__Torque_teno_mini_virus_4 1 PRJNA48179
+s__Capnocytophaga_ochracea 3 GCF_000023285 GCF_000277585 GCF_000183985
+s__Torque_teno_mini_virus_6 1 PRJNA48189
+s__Sphingopyxis_baekryungensis 1 GCF_000420305
+s__Torque_teno_mini_virus_1 1 PRJNA48193
+s__Torque_teno_mini_virus_2 1 PRJNA48171
+s__Torque_teno_mini_virus_3 1 PRJNA48175
+s__Torque_teno_mini_virus_8 1 PRJNA48135
+s__Dyadobacter_beijingensis 1 GCF_000382205
+s__Bovine_viral_diarrhea_virus_3 1 PRJNA38557
+s__Bovine_viral_diarrhea_virus_2 1 PRJNA15089
+s__Bovine_viral_diarrhea_virus_1 1 PRJNA15305
+s__Herbaspirillum_seropedicae 3 GCF_000300435 GCF_000143225 GCF_000300415
+s__Providence_virus 1 PRJNA48417
+s__Clostridium_phage_phiSM101 1 PRJNA58117
+s__Arthrobacter_sp_161MFSha2_1 1 GCF_000374945
+s__Marinomonas_posidonica 1 GCF_000214215
+s__Pectobacterium_carotovorum 4 GCF_000173135 GCF_000294535 GCF_000023605 GCF_000173155
+s__Gastropod_associated_circular_ssDNA_virus 1 PRJNA192606
+s__Thioalkalivibrio_sp_ALMg13_2 1 GCF_000381185
+s__Western_equine_encephalitis_virus 1 PRJNA14831
+s__Taura_syndrome_virus 1 PRJNA14713
+s__Carnation_Italian_ringspot_virus 1 PRJNA15077
+s__Pseudoalteromonas_flavipulchra 1 GCF_000259115
+s__Staphylococcus_phage_phi5967PVL 1 PRJNA184165
+s__Mycobacterium_phage_DNAIII 1 PRJNA213079
+s__Pepper_yellow_mosaic_virus 1 PRJNA50567
+s__Pepper_severe_mosaic_virus 1 PRJNA17809
+s__Tomato_mosaic_Havana_virus 1 PRJNA14188
+s__East_African_cassava_mosaic_Zanzibar_virus 1 PRJNA14526
+s__Agrobacterium_phage_7_7_1 1 PRJNA181226
+s__Dragonfly_associated_microphage_1 1 PRJNA177547
+s__Vibrio_phage_vB_VchM_138 1 PRJNA181217
+s__Klebsiella_sp_OBRC7 1 GCF_000293135
+s__Bacillus_weihenstephanensis 1 GCF_000018825
+s__Bean_calico_mosaic_virus 1 PRJNA14165
+s__Longispora_albida 1 GCF_000379825
+s__Photorhabdus_luminescens 1 GCF_000196155
+s__Escherichia_phage_Cba120 1 PRJNA81001
+s__Firmicutes_bacterium_M10_2 1 GCF_000403415
+s__Staphylococcus_delphini 1 GCF_000308115
+s__Veillonella_sp_ACP1 1 GCF_000286635
+s__Acinetobacter_lwoffii 9 GCF_000369145 GCF_000369105 GCF_000368165 GCF_000248355 GCF_000301755 GCF_000369125 GCF_000487975 GCF_000219275 GCF_000162095
+s__Cupriavidus_sp_BIS7 1 GCF_000292345
+s__Anabaena_sp_PCC_7108 1 GCF_000332135
+s__Passiflora_latent_carlavirus 1 PRJNA17487
+s__Hop_mosaic_virus 1 PRJNA29191
+s__Burkholderia_phymatum 1 GCF_000020045
+s__Alistipes_shahii 1 GCF_000210575
+s__Nipah_virus 1 PRJNA15443
+s__Tomato_leaf_curl_Java_virus 1 PRJNA14296
+s__Oceanobacillus_sp_Ndiop 1 GCF_000285495
+s__Geopsychrobacter_electrodiphilus 1 GCF_000384395
+s__Psychrobacter_sp_PAMC_21119 1 GCF_000247495
+s__Alicyclobacillus_acidoterrestris 1 GCF_000444055
+s__Pantoea_stewartii 1 GCF_000248395
+s__Paenibacillus_sanguinis 1 GCF_000374825
+s__Acinetobacter_sp_TG19627 1 GCF_000302415
+s__Oscillatoria_acuminata 1 GCF_000317105
+s__Ambystoma_tigrinum_virus 1 PRJNA14364
+s__Tomato_leaf_curl_China_virus 1 PRJNA14342
+s__American_plum_line_pattern_virus 1 PRJNA14742
+s__Streptococcus_urinalis 2 GCF_000188055 GCF_000314815
+s__Metallosphaera_yellowstonensis 1 GCF_000243315
+s__Sideroxydans_lithotrophicus 1 GCF_000025705
+s__Haloarcula_marismortui 1 GCF_000011085
+s__Mobuck_virus 1 PRJNA225930
+s__Banana_bunchy_top_virus 1 PRJNA14621
+s__Helicobacter_felis 1 GCF_000200595
+s__Pipapillomavirus_1 1 PRJNA18011
+s__Pipapillomavirus_2 2 PRJNA18259 PRJNA50561
+s__Vulcanisaeta_distributa 1 GCF_000148385
+s__Salmonella_phage_SFP10 1 PRJNA74351
+s__Passion_fruit_mosaic_virus 1 PRJNA67109
+s__Pepper_leaf_curl_Lahore_virus 1 PRJNA89655
+s__Pseudomonas_brassicacearum 1 GCF_000194805
+s__Candiru_virus 1 PRJNA65423
+s__Neodiprion_sertifer_nucleopolyhedrovirus 1 PRJNA14383
+s__Helicobasidium_mompa_totivirus_1_17 1 PRJNA14918
+s__Porcine_bocavirus_4 1 PRJNA73549
+s__Acidocella_sp_MX_AZ02 1 GCF_000306035
+s__Bacillus_phage_SPbeta 1 PRJNA14034
+s__Paralichthys_olivaceus_birnavirus 1 PRJNA21035
+s__Acinetobacter_sp_NIPH_1867 1 GCF_000369545
+s__Staphylococcus_pasteuri 1 GCF_000494875
+s__Bacillus_phage_PM1 1 PRJNA195536
+s__Blackcurrant_reversion_virus 1 PRJNA14749
+s__Astrovirus_wild_boar_WBAstV_1_2011_HUN 1 PRJNA84401
+s__Leifsonia_xyli 2 GCF_000470775 GCF_000007665
+s__Rhodococcus_imtechensis 1 GCF_000260815
+s__Caulobacter_phage_CcrSwift 1 PRJNA179423
+s__Halobacillus_sp_BAB_2008 1 GCF_000328325
+s__Clerodendron_yellow_mosaic_virus 1 PRJNA19599
+s__Okra_yellow_vein_mosaic_virus 1 PRJNA14266
+s__Acinetobacter_sp_NIPH_809 1 GCF_000367945
+s__Dickeya_phage_Limestone 1 PRJNA185317
+s__Proteus_penneri 1 GCF_000155835
+s__Camelus_dromedarius_papillomavirus_type_2 1 PRJNA64599
+s__Camelus_dromedarius_papillomavirus_type_1 1 PRJNA64597
+s__Asclepias_asymptomatic_virus 1 PRJNA66899
+s__Caulobacter_phage_CcrRogue 1 PRJNA179422
+s__Ferrimonas_balearica 1 GCF_000148645
+s__Bacillus_sp_123MFChir2 1 GCF_000383235
+s__Lactobacillus_ingluviei 1 GCF_000312405
+s__Mycobacterium_phage_LittleCherry 1 PRJNA215674
+s__Peptoniphilus_sp_ph5 1 GCF_000311825
+s__Enterobacteria_phage_RB16 1 PRJNA51699
+s__Ruegeria_lacuscaerulensis 1 GCF_000161775
+s__Enterobacteria_phage_RB14 1 PRJNA37825
+s__Tomato_yellow_leaf_curl_virus_associated_DNA_beta 1 PRJNA28045
+s__Prevotella_disiens 2 GCF_000467875 GCF_000179675
+s__Acinetobacter_parvus 3 GCF_000248155 GCF_000368005 GCF_000368025
+s__Rhodopirellula_maiorica 1 GCF_000346295
+s__Vibrio_sp_RC586 1 GCF_000176715
+s__Acinetobacter_phage_133 1 PRJNA64541
+s__Brevibacterium_casei 1 GCF_000314575
+s__Rhodococcus_rhodochrous 1 GCF_000239135
+s__Mycobacterium_phage_Newman 1 PRJNA206032
+s__Prevotella_sp_BV3P1 1 GCF_000479005
+s__Chimpanzee_polyomavirus 1 PRJNA60731
+s__Veillonella_sp_3_1_44 1 GCF_000163715
+s__Mushroom_bacilliform_virus 1 PRJNA14676
+s__Machupo_virus 1 PRJNA14931
+s__Agrotis_segetum_granulovirus 1 PRJNA14481
+s__Nocardiopsis_chromatogenes 1 GCF_000341185
+s__Clavispora_lusitaniae 1 GCA_000003835
+s__Sphingobium_sp_YL23 1 GCF_000412635
+s__Tomato_leaf_curl_Patna_betasatellite 1 PRJNA36541
+s__Paracoccus_sp_N5 1 GCF_000371965
+s__Pepper_leaf_curl_Yunnan_virus_satellite_DNA_beta 1 PRJNA29415
+s__Grapevine_fanleaf_virus 1 PRJNA15286
+s__Parrot_hepatitis_B_virus 1 PRJNA80909
+s__Cherry_rusty_mottle_associated_virus 1 PRJNA196970
+s__Streptococcus_minor 1 GCF_000377005
+s__Desulfitobacterium_dichloroeliminans 1 GCF_000243135
+s__Sulfolobus_islandicus 20 GCF_000245095 GCF_000245155 GCF_000024305 GCF_000245215 GCF_000364745 GCF_000022385 GCF_000022405 GCF_000245235 GCF_000245275 GCF_000189555 GCF_000022425 GCF_000022445 GCF_000245135 GCF_000022465 GCF_000245195 GCF_000245255 GCF_000245175 GCF_000022485 GCF_000245115 GCF_000189575
+s__Afipia_clevelandensis 1 GCF_000336555
+s__Nitrobacter_hamburgensis 1 GCF_000013885
+s__Deinococcus_sp_2009 1 GCF_000419625
+s__Narcissus_symptomless_virus 1 PRJNA18071
+s__Acinetobacter_sp_GG2 1 GCF_000292385
+s__Peristrophe_mosaic_virus 1 PRJNA178459
+s__Barbel_circovirus 1 PRJNA65821
+s__Pseudoalteromonas_piscicida 2 GCF_000238315 GCF_000382005
+s__Colombian_datura_virus 1 PRJNA185274
+s__Thioalkalivibrio_sp_ALD1 1 GCF_000381245
+s__Nocardiopsis_gilva 1 GCF_000341165
+s__Salmonella_phage_L13 1 PRJNA206468
+s__Cellulophaga_phage_phi48_2 1 PRJNA212947
+s__Lactobacillus_equi 1 GCF_000504525
+s__Physalis_mottle_virus 1 PRJNA15090
+s__Phaeospirillum_fulvum 1 GCF_000442515
+s__Enterobacteria_phage_mEp460 1 PRJNA183148
+s__Feldmannia_species_virus 1 PRJNA31093
+s__Methylobacterium_populi 1 GCF_000019945
+s__Blainvillea_yellow_spot_virus 1 PRJNA30183
+s__Miscanthus_streak_virus 1 PRJNA14151
+s__Cyanothece_sp_PCC_8801 1 GCF_000021805
+s__Cyanothece_sp_PCC_8802 1 GCF_000024045
+s__Sugarcane_bacilliform_virus 1 PRJNA41599
+s__Maize_chlorotic_mottle_virus 1 PRJNA15117
+s__Bacteroides_sp_2_1_56FAA 1 GCF_000218345
+s__Pseudomonas_sp_GM50 1 GCF_000282375
+s__Beet_pseudoyellows_virus 1 PRJNA14901
+s__Methanoregula_boonei 1 GCF_000017625
+s__Pseudomonas_sp_GM55 1 GCF_000282395
+s__Nitrosospira_sp_APG3 1 GCF_000355765
+s__Rhodobacter_sp_SW2 1 GCF_000176015
+s__Clostridium_saccharoperbutylacetonicum 2 GCF_000334435 GCF_000340885
+s__Glaciecola_pallidula 1 GCF_000315035
+s__Potato_spindle_tuber_viroid 1 PRJNA14966
+s__Leishmania_mexicana 1 GCA_000234665
+s__Enterobacteria_phage_Hgal1 1 PRJNA184161
+s__Tetragenococcus_halophilus 1 GCF_000283615
+s__Corynebacterium_massiliense 1 GCF_000420605
+s__Frog_virus_3 1 PRJNA14560
+s__Ectromelia_virus 1 PRJNA14211
+s__Microscilla_marina 1 GCF_000169175
+s__Campylobacter_concisus 1 GCF_000017725
+s__Soil_borne_cereal_mosaic_virus 1 PRJNA14693
+s__Apple_stem_pitting_virus 1 PRJNA14744
+s__Brevibacterium_sp_JC43 1 GCF_000285835
+s__Brugmansia_suaveolens_mottle_virus 1 PRJNA52813
+s__Bombyx_mori_cypovirus_1_satellite_RNA 1 PRJNA14557
+s__Enterobacteria_phage_K30 1 PRJNA68413
+s__Pyrolobus_fumarii 1 GCF_000223395
+s__Pseudomonas_phage_F10 1 PRJNA16383
+s__Listeria_phage_P35 1 PRJNA20799
+s__Enterobacter_sp_MGH_34 1 GCF_000492895
+s__Paracoccus_aminophilus 1 GCF_000444995
+s__Sauropus_leaf_curl_disease_associated_DNA_beta 1 PRJNA176433
+s__Enterobacter_sp_MGH_38 1 GCF_000492855
+s__Gillisia_marina 1 GCF_000258765
+s__Enterobacteria_phage_HX01 1 PRJNA177542
+s__Actinobacillus_pleuropneumoniae 15 GCF_000178575 GCF_000016685 GCF_000178655 GCF_000020405 GCF_000178495 GCF_000178535 GCF_000179295 GCF_000178615 GCF_000295915 GCF_000178635 GCF_000015885 GCF_000178515 GCF_000179275 GCF_000178555 GCF_000178595
+s__Nocardiopsis_potens 1 GCF_000341105
+s__Thermoanaerobacter_mathranii 1 GCF_000092965
+s__Freesia_mosaic_virus 1 PRJNA48387
+s__Yersinia_kristensenii 1 GCF_000173715
+s__Salmonella_phage_FSL_SP_076 1 PRJNA212719
+s__Pelotomaculum_thermopropionicum 1 GCF_000010565
+s__Siniperca_chuatsi_rhabdovirus 1 PRJNA18009
+s__Bifidobacterium_breve 7 GCF_000411435 GCF_000158015 GCF_000213865 GCF_000226175 GCF_000220135 GCF_000247755 GCF_000466545
+s__Porphyromonas_levii 1 GCF_000379925
+s__Spodoptera_litura_nucleopolyhedrovirus_II 1 PRJNA33005
+s__Corynebacterium_diphtheriae 17 GCF_000255155 GCF_000255235 GCF_000455785 GCF_000241915 GCF_000455805 GCF_000257885 GCF_000195815 GCF_000241935 GCF_000255215 GCF_000242775 GCF_000241895 GCF_000255255 GCF_000241875 GCF_000255175 GCF_000255195 GCF_000255275 GCF_000263415
+s__Torque_teno_midi_virus_2 1 PRJNA48185
+s__Torque_teno_midi_virus_1 1 PRJNA19131
+s__Amphibacillus_xylanus 1 GCF_000307165
+s__Black_queen_cell_virus 1 PRJNA14803
+s__Pseudomonas_phage_vB_PaeS_PMG1 1 PRJNA82649
+s__Peptostreptococcus_anaerobius 3 GCF_000381525 GCF_000178095 GCF_000318115
+s__Tsukamurella_paurometabola 1 GCF_000092225
+s__Burkholderia_gladioli 2 GCF_000365265 GCF_000194745
+s__Bean_yellow_disorder_virus 1 PRJNA29237
+s__Megavirus_chiliensis 1 PRJNA74349
+s__Mavirus 1 PRJNA64497
+s__Cryphonectria_parasitica_bipartite_mycovirus_1 1 PRJNA203281
+s__Klebsiella_sp_MS_92_3 1 GCF_000195655
+s__Leishmania_infantum 1 GCA_000002875
+s__Methanocaldococcus_infernus 1 GCF_000092305
+s__Bifidobacterium_pseudocatenulatum 1 GCF_000173435
+s__Lactococcus_phage_P087 1 PRJNA37887
+s__Cydia_pomonella_granulovirus 1 PRJNA14118
+s__Escherichia_phage_PhaxI 1 PRJNA181080
+s__Brevundimonas_sp_BAL3 1 GCF_000155575
+s__Pseudomonas_sp_CBZ_4 1 GCF_000346755
+s__Melandrium_yellow_fleck_virus 1 PRJNA40661
+s__Infectious_flacherie_virus 1 PRJNA14800
+s__Sweet_potato_leaf_curl_Lanzarote_virus 1 PRJNA41625
+s__Pseudoxanthomonas_spadix 1 GCF_000233915
+s__Trichodysplasia_spinulosa_associated_polyomavirus 1 PRJNA51185
+s__Asticcacaulis_sp_AC466 1 GCF_000495815
+s__Iresine_viroid_1 1 PRJNA14765
+s__Bacillus_sp_JS 1 GCF_000259365
+s__Microbacterium_yannicii 1 GCF_000304335
+s__Vibrio_phage_VFJ 1 PRJNA209358
+s__Wolinella_succinogenes 1 GCF_000196135
+s__Aurantiochytrium_single_stranded_RNA_virus_01 1 PRJNA16134
+s__Pseudomonas_phage_LKD16 1 PRJNA21043
+s__Vibrio_phage_ICP3 1 PRJNA63233
+s__Macrococcus_caseolyticus 1 GCF_000010585
+s__Pseudomonas_monteilii 1 GCF_000262005
+s__Halococcus_hamelinensis 2 GCF_000259215 GCF_000336675
+s__Tomato_mottle_virus 1 PRJNA14079
+s__Clostridium_phage_c_st 1 PRJNA16151
+s__Musca_hytrovirus 1 PRJNA29631
+s__Rhodopseudomonas_sp_B29 1 GCF_000333455
+s__Heterosigma_akashiwo_RNA_virus 1 PRJNA15425
+s__Enterobacter_aerogenes 3 GCF_000215745 GCF_000334515 GCF_000383335
+s__Salinimonas_chungwhensis 1 GCF_000378185
+s__Tomato_aspermy_virus 1 PRJNA14815
+s__Nocardia_sp_348MFTsu5_1 1 GCF_000383535
+s__Enterobacter_cloacae 19 GCF_000025565 GCF_000210775 GCF_000422225 GCF_000492455 GCF_000264705 GCF_000492715 GCF_000467655 GCF_000492495 GCF_000496775 GCF_000492435 GCF_000286275 GCF_000492675 GCF_000492575 GCF_000492615 GCF_000315775 GCF_000390425 GCF_000300455 GCF_000235765 GCF_000239975
+s__Spirosoma_linguale 1 GCF_000024525
+s__Amycolatopsis_benzoatilytica 1 GCF_000383915
+s__Acinetobacter_towneri 1 GCF_000368785
+s__Hibiscus_green_spot_virus 1 PRJNA76343
+s__Isfahan_virus 1 PRJNA194141
+s__Donghicola_sp_S598 1 GCF_000308135
+s__Mycobacterium_phage_Redno2 1 PRJNA215125
+s__Rousettus_bat_coronavirus_HKU9 1 PRJNA18867
+s__Rhodococcus_phage_REQ3 1 PRJNA81175
+s__Rhodococcus_phage_REQ2 1 PRJNA81171
+s__Bradyrhizobium_japonicum 3 GCF_000284375 GCF_000374205 GCF_000379585
+s__Pelistega_sp_HM_7 1 GCF_000506865
+s__Vibrio_phage_helene_12B3 1 PRJNA198433
+s__Pseudomonas_sp_PAMC_26793 1 GCF_000313235
+s__Rhizobium_sp_CF080 1 GCF_000282095
+s__Thermoanaerobacter_siderophilus 1 GCF_000262445
+s__Kudzu_mosaic_virus 1 PRJNA20053
+s__Pantoea_sp_At_9b 1 GCF_000175935
+s__Gill_associated_virus 1 PRJNA28679
+s__Clostridium_phage_phiCD6356 1 PRJNA64557
+s__Rhizobium_sp_CCGE_510 1 GCF_000292525
+s__Rickettsia_felis 1 GCF_000012145
+s__Candidatus_Portiera_aleyrodidarum 5 GCF_000349745 GCF_000300075 GCF_000300035 GCF_000298385 GCF_000292685
+s__Watermelon_mosaic_virus 1 PRJNA15046
+s__Halovirus_HGTV_1 1 PRJNA206496
+s__Staphylococcus_arlettae 1 GCF_000295715
+s__Peptoniphilus_lacrimalis 2 GCF_000176955 GCF_000378725
+s__Encephalitozoon_hellem 1 GCA_000277815
+s__Sphingomonas_sp_MM_1 1 GCF_000347675
+s__Aggregatibacter_sp_oral_taxon_458 1 GCF_000466335
+s__Aleutian_mink_disease_virus 1 PRJNA14077
+s__Serratia_proteamaculans 1 GCF_000018085
+s__Mycobacterium_phlei 1 GCF_000257725
+s__Xanthomonas_phage_Xp10 1 PRJNA14292
+s__Mycobacterium_phage_244 1 PRJNA17115
+s__Xanthomonas_phage_Xp15 1 PRJNA15255
+s__Lactobacillus_jensenii 7 GCF_000161895 GCF_000466805 GCF_000175035 GCF_000159335 GCF_000155915 GCF_000162435 GCF_000162335
+s__Ralstonia_solanacearum 13 GCF_000212635 GCF_000215325 GCF_000285815 GCF_000223115 GCF_000331875 GCF_000167955 GCF_000283475 GCF_000348545 GCF_000197855 GCF_000009125 GCF_000331895 GCF_000427195 GCF_000430925
+s__Acinetobacter_sp_NCTC_7422 1 GCF_000248195
+s__Polaromonas_naphthalenivorans 1 GCF_000015505
+s__Kocuria_sp_UCD_OTCP 1 GCF_000349605
+s__Chaetoceros_tenuissimus_DNA_virus 1 PRJNA60753
+s__Halovirus_PH1 1 PRJNA196975
+s__Archaeoglobus_fulgidus 1 GCF_000008665
+s__Borrelia_spielmanii 1 GCF_000181895
+s__Methanomassiliicoccus_luminyensis 1 GCF_000308215
+s__Choristoneura_fumiferana_multiple_nucleopolyhedrovirus 1 PRJNA15133
+s__Euphorbia_mosaic_virus 1 PRJNA17551
+s__Sulfolobus_virus_Kamchatka_1 1 PRJNA14355
+s__White_clover_mosaic_virus 1 PRJNA15069
+s__Pseudomonas_phage_PP7 1 PRJNA15076
+s__Papaya_leaf_crumple_virus 1 PRJNA60361
+s__Paspalum_dilatatum_striate_mosaic_virus 1 PRJNA174777
+s__Honeysuckle_yellow_vein_mosaic_virus 1 PRJNA14172
+s__Roseburia_intestinalis 3 GCF_000209995 GCF_000156535 GCF_000210655
+s__Euprosterna_elaeasa_virus 1 PRJNA14737
+s__Acidithiobacillus_ferrooxidans 2 GCF_000021485 GCF_000020825
+s__Powassan_virus 1 PRJNA15304
+s__Mycobacterium_phage_Llij 1 PRJNA17149
+s__Actinomyces_oris 1 GCF_000180155
+s__Enzootic_nasal_tumour_virus_of_goats 1 PRJNA14893
+s__Human_herpesvirus_8 1 PRJNA14158
+s__Microbacterium_laevaniformans 1 GCF_000255595
+s__Human_herpesvirus_4 2 PRJNA14413 PRJNA20959
+s__Human_herpesvirus_5 1 PRJNA14559
+s__Human_herpesvirus_7 1 PRJNA14625
+s__Pelosinus_fermentans 6 GCF_000271525 GCF_000271505 GCF_000271465 GCF_000271485 GCF_000271545 GCF_000271665
+s__Human_herpesvirus_1 1 PRJNA15217
+s__Human_herpesvirus_2 1 PRJNA15218
+s__Human_herpesvirus_3 1 PRJNA15198
+s__Brenneria_sp_EniD312 1 GCF_000225565
+s__Sphingomonas_sp_PAMC_26617 1 GCF_000242835
+s__Pseudoalteromonas_sp_BSi20429 1 GCF_000238895
+s__Begomovirus_associated_DNA_III 1 PRJNA15162
+s__Sweet_potato_caulimo_like_virus 1 PRJNA65307
+s__Metallosphaera_cuprina 1 GCF_000204925
+s__Avian_sapelovirus 1 PRJNA15039
+s__Desulfotomaculum_acetoxidans 1 GCF_000024205
+s__Rickettsiella_grylli 1 GCF_000168295
+s__Escherichia_phage_wV8 1 PRJNA38281
+s__Cricket_paralysis_virus 1 PRJNA14832
+s__Escherichia_phage_wV7 1 PRJNA181232
+s__Loktanella_cinnabarina 1 GCF_000466965
+s__Xanthomonas_sp_SHU199 1 GCF_000364665
+s__Haloferax_elongans 1 GCF_000336755
+s__Tobacco_vein_banding_mosaic_virus 1 PRJNA27895
+s__Corynebacterium_glutamicum 7 GCF_000404145 GCF_000404185 GCF_000224315 GCF_000417765 GCF_000233355 GCF_000445015 GCF_000010225
+s__Pseudomonas_veronii 1 GCF_000350565
+s__Corynebacterium_accolens 2 GCF_000159115 GCF_000146485
+s__Spiroplasma_phage_1_R8A2B 1 PRJNA14580
+s__Jonesia_denitrificans 1 GCF_000024065
+s__Listeria_fleischmannii 2 GCF_000344175 GCF_000252625
+s__Halorubrum_distributum 2 GCF_000337055 GCF_000337335
+s__Acinetobacter_sp_NIPH_3623 1 GCF_000369785
+s__Eubacteriaceae_bacterium_OBRC8 1 GCF_000293035
+s__Chili_leaf_curl_Bhatinda_betasatellite 1 PRJNA206467
+s__Nocardiopsis_lucentensis 1 GCF_000341125
+s__Konjac_mosaic_virus 1 PRJNA16643
+s__Streptomyces_bottropensis 2 GCF_000383595 GCF_000340335
+s__Bradyrhizobium_oligotrophicum 1 GCF_000344805
+s__Maize_white_line_mosaic_satellite_virus 1 PRJNA14770
+s__Paenibacillus_sp_JC66 1 GCF_000285515
+s__Actinomyces_odontolyticus 2 GCF_000154225 GCF_000163415
+s__Acinetobacter_sp_NIPH_1847 1 GCF_000369605
+s__Peanut_chlorotic_streak_virus 1 PRJNA14388
+s__Plasmodium_knowlesi 1 GCA_000006355
+s__Naegleria_gruberi 1 GCA_000004985
+s__Nitrospina_sp_SCGC_AAA288_L16 1 GCF_000372225
+s__Wheat_yellow_dwarf_virus_GPV 1 PRJNA39307
+s__Ureaplasma_urealyticum 14 GCF_000169935 GCF_000171395 GCF_000255395 GCF_000169595 GCF_000021265 GCF_000169535 GCF_000169575 GCF_000169955 GCF_000255375 GCF_000171555 GCF_000255415 GCF_000169915 GCF_000255435 GCF_000169555
+s__Lausannevirus 1 PRJNA65279
+s__Olive_latent_virus_3 1 PRJNA46223
+s__Olive_latent_virus_2 1 PRJNA14778
+s__Olive_latent_virus_1 1 PRJNA15084
+s__Xenorhabdus_nematophila 1 GCF_000252955
+s__Burkholderia_phage_BcepMigl 1 PRJNA184149
+s__Pseudomonas_sp_EGD_AK9 1 GCF_000465935
+s__Erwinia_phage_phiEaH2 1 PRJNA184155
+s__Melegrivirus_A 1 PRJNA202886
+s__Potato_virus_M 1 PRJNA15324
+s__Potato_virus_H 1 PRJNA171010
+s__Cucurbita_yellow_vein_virus_associated_DNA_beta 1 PRJNA14525
+s__Neisseria_macacae 1 GCF_000220865
+s__Potato_virus_A 1 PRJNA15376
+s__Leptospira_interrogans 189 GCF_000244135 GCF_000343435 GCF_000246595 GCF_000246475 GCF_000246455 GCF_000343085 GCF_000343715 GCF_000343145 GCF_000216815 GCF_000246215 GCF_000306135 GCF_000347115 GCF_000342865 GCF_000216635 GCF_000217495 GCF_000347055 GCF_000217095 GCF_000342925 GCF_000342705 GCF_000244375 GCF_000243955 GCF_000216355 GCF_000246275 GCF_000244635 GCF_000217575 GCF_000342665 GCF_000343795 GCF_000343675 GCF_000216975 GCF_000343775 GCF_000343595 GCF_000343165 GCF_000231175 [...]
+s__Potato_virus_X 1 PRJNA15503
+s__Potato_virus_Y 1 PRJNA15290
+s__Potato_virus_V 1 PRJNA15379
+s__Potato_virus_T 1 PRJNA30735
+s__Xanthomonas_sacchari 1 GCF_000225975
+s__Potato_virus_S 1 PRJNA15574
+s__Hippea_maritima 1 GCF_000194135
+s__Clostridium_sticklandii 1 GCF_000196455
+s__Brucella_sp_UK5_01 1 GCF_000367105
+s__Oribacterium_sp_ACB1 1 GCF_000238055
+s__Streptomyces_sp_MspMP_M5 1 GCF_000373585
+s__Pepper_leaf_curl_virus 1 PRJNA14046
+s__Rice_gall_dwarf_virus 1 PRJNA19149
+s__Candidatus_Photodesmus_katoptron 1 GCF_000478685
+s__Candidatus_Ruthia_magnifica 1 GCF_000015105
+s__Methylocystis_rosea 1 GCF_000372845
+s__Circovirus_like_genome_SAR_A 1 PRJNA39631
+s__Mycobacterium_phage_Halo 1 PRJNA17147
+s__Circovirus_like_genome_SAR_B 1 PRJNA39607
+s__Synechococcus_phage_S_ShM2 1 PRJNA64699
+s__Black_beetle_virus 1 PRJNA14961
+s__Streptomyces_roseochromogenes 1 GCF_000497445
+s__Usutu_virus 1 PRJNA15047
+s__Atopobium_sp_BV3Ac4 1 GCF_000468815
+s__Orgyia_leucostigma_NPV 1 PRJNA28501
+s__Brachyspira_innocens 1 GCF_000384655
+s__Acinetobacter_johnsonii 5 GCF_000302335 GCF_000162055 GCF_000368045 GCF_000301735 GCF_000368805
+s__Bacillus_phage_CampHawk 1 PRJNA227116
+s__Vibrio_phage_VfO4K68 1 PRJNA14094
+s__Hollyhock_leaf_crumple_virus 1 PRJNA14206
+s__Mycobacterium_sp_JLS 1 GCF_000016005
+s__Porcine_parvovirus 1 PRJNA14055
+s__Sudan_ebolavirus 1 PRJNA15012
+s__Bacillus_sp_ZYK 1 GCF_000331575
+s__Halogeometricum_borinquense 2 GCF_000337855 GCF_000172995
+s__Haemophilus_pittmaniae 1 GCF_000223275
+s__Oribacterium_sp_oral_taxon_108 1 GCF_000214455
+s__Malvastrum_yellow_vein_Changa_Manga_virus 1 PRJNA60047
+s__Beet_curly_top_Iran_virus 1 PRJNA28973
+s__Gordonia_phage_GTE7 1 PRJNA76745
+s__Chicory_yellow_mottle_virus_large_satellite_RNA 1 PRJNA14798
+s__Gordonia_phage_GTE5 1 PRJNA78693
+s__Gordonia_phage_GTE2 1 PRJNA68415
+s__Candidatus_Methylomirabilis_oxyfera 1 GCF_000091165
+s__Taylorella_equigenitalis 2 GCF_000276685 GCF_000185745
+s__Cellulophaga_phage_phi14_2 1 PRJNA212954
+s__Equine_rhinitis_A_virus 1 PRJNA15205
+s__Microvirga_sp_WSM3557 1 GCF_000262405
+s__Dickeya_dadantii 3 GCF_000025065 GCF_000023545 GCF_000147055
+s__Bean_pod_mottle_virus 1 PRJNA15294
+s__Streptomyces_somaliensis 1 GCF_000258595
+s__Casphalia_extranea_densovirus 1 PRJNA14222
+s__Dill_cryptic_virus_1 1 PRJNA225921
+s__Clostridium_sartagoforme 1 GCF_000401215
+s__Dill_cryptic_virus_2 1 PRJNA198774
+s__Thioalkalivibrio_sp_ALJ3 1 GCF_000377205
+s__Thioalkalivibrio_sp_ALJ2 1 GCF_000378325
+s__Lachnospiraceae_bacterium_2_1_46FAA 1 GCF_000209385
+s__Thioalkalivibrio_sp_ALJ7 1 GCF_000376865
+s__Thioalkalivibrio_sp_ALJ6 1 GCF_000377365
+s__Thioalkalivibrio_sp_ALJ5 1 GCF_000377245
+s__Thioalkalivibrio_sp_ALJ4 1 GCF_000377225
+s__Thioalkalivibrio_sp_ALJ9 1 GCF_000380585
+s__Thioalkalivibrio_sp_ALJ8 1 GCF_000377385
+s__Abutilon_mosaic_virus 1 PRJNA14603
+s__Staphylococcus_phage_SAP_26 1 PRJNA51671
+s__Photobacterium_profundum 2 GCF_000196255 GCF_000153425
+s__TYLCCNV_Y322_satellite_DNA_beta 1 PRJNA16338
+s__Bacteroides_fragilis 14 GCF_000009925 GCF_000273095 GCF_000269525 GCF_000273765 GCF_000210835 GCF_000273115 GCF_000025985 GCF_000273155 GCF_000297695 GCF_000263115 GCF_000157015 GCF_000297755 GCF_000297735 GCF_000273135
+s__Streptomyces_viridosporus 1 GCF_000316095
+s__Xanthophyllomyces_dendrorhous_virus_L1A 1 PRJNA196417
+s__Bean_yellow_mosaic_virus 1 PRJNA15339
+s__Bifidobacterium_angulatum 1 GCF_000156635
+s__Desulfospira_joergensenii 1 GCF_000420085
+s__Haemophilus_sputorum 2 GCF_000287615 GCF_000238795
+s__Acinetobacter_sp_CIP_101966 1 GCF_000369725
+s__Grapevine_Syrah_virus_1 1 PRJNA36515
+s__Ageratum_leaf_curl_Cameroon_betasatellite 1 PRJNA36669
+s__Streptomyces_prunicolor 1 GCF_000367365
+s__Natrialba_phage_PhiCh1 1 PRJNA14207
+s__Candidatus_Microgenomatus_auricola 1 GCF_000380825
+s__Euphorbia_yellow_mosaic_virus 1 PRJNA36655
+s__Thioalkalivibrio_sp_ALJT 1 GCF_000381825
+s__Veillonella_atypica 3 GCF_000179735 GCF_000318355 GCF_000179755
+s__Streptomyces_sp_AA1529 1 GCF_000280905
+s__Mycobacterium_phage_Muddy 1 PRJNA215120
+s__Coleus_vein_necrosis_virus 1 PRJNA20665
+s__Tomato_leaf_curl_Cameroon_virus 1 PRJNA42743
+s__Lactobacillus_acidipiscis 1 GCF_000260635
+s__Ruegeria_sp_R11 1 GCF_000156255
+s__Corynebacterium_sp_KPL1824 1 GCF_000478095
+s__Bartonella_tribocorum 1 GCF_000196435
+s__Human_endogenous_retrovirus_K 1 PRJNA222261
+s__Endoriftia_persephone 1 GCF_000168735
+s__Coconut_cadang_cadang_viroid 1 PRJNA14629
+s__Canary_circovirus 1 PRJNA14513
+s__Kingella_kingae 3 GCF_000213535 GCF_000255635 GCF_000283375
+s__Yersinia_mollaretii 1 GCF_000167995
+s__Alcanivorax_hongdengensis 1 GCF_000300995
+s__Janibacter_hoylei 1 GCF_000297495
+s__Streptococcus_cristatus 2 GCF_000187855 GCF_000222765
+s__Panicum_mosaic_satellite_virus 1 PRJNA14816
+s__Mayaro_virus 1 PRJNA15392
+s__Fusobacterium_mortiferum 1 GCF_000158195
+s__Eggplant_latent_viroid 1 PRJNA14977
+s__Ovine_enzootic_nasal_tumor_virus 1 PRJNA15410
+s__Burkholderia_phytofirmans 1 GCF_000020125
+s__Enterobacter_sp_MGH_16 1 GCF_000493175
+s__Enterobacter_sp_MGH_14 1 GCF_000474785
+s__Staphylococcus_phage_phiPVL108 1 PRJNA18463
+s__Otomops_polyomavirus_1 1 PRJNA185192
+s__Otomops_polyomavirus_2 1 PRJNA185193
+s__Tomato_yellow_leaf_curl_Vietnam_virus_satellite_DNA_beta 1 PRJNA19829
+s__Hahella_chejuensis 1 GCF_000012985
+s__Pyrococcus_sp_ST04 1 GCF_000263735
+s__Salmonella_phage_FSL_SP_058 1 PRJNA212712
+s__Ralstonia_phage_RSB1 1 PRJNA31163
+s__Pseudomonas_savastanoi 6 GCF_000225805 GCF_000187065 GCF_000012205 GCF_000187045 GCF_000143005 GCF_000164015
+s__Catonella_morbi 1 GCF_000160035
+s__Caldalkalibacillus_thermarum 1 GCF_000218765
+s__Human_cyclovirus_VS5700009 1 PRJNA209365
+s__Acartia_tonsa_copepod_circovirus 1 PRJNA186432
+s__Great_Island_virus 1 PRJNA52641
+s__Filifactor_alocis 1 GCF_000163895
+s__Aeromonas_molluscorum 1 GCF_000388115
+s__Cupriavidus_pinatubonensis 1 GCF_000203875
+s__Klebsiella_variicola 1 GCF_000025465
+s__Tomato_yellow_leaf_curl_Kanchanaburi_virus 1 PRJNA14360
+s__Okra_yellow_vein_disease_associated_sequence 1 PRJNA14443
+s__Syntrophus_aciditrophicus 1 GCF_000013405
+s__Paenibacillus_daejeonensis 1 GCF_000378385
+s__Armigeres_subalbatus_virus_SaX06_AK20 1 PRJNA56065
+s__Serratia_sp_DD3 1 GCF_000496755
+s__Enterobacteria_phage_Chi 1 PRJNA206471
+s__Paracoccus_denitrificans 2 GCF_000203895 GCF_000219825
+s__Phascolarctobacterium_succinatutens 1 GCF_000188175
+s__Aeromonas_diversa 1 GCF_000367845
+s__Simian_foamy_virus 1 PRJNA14699
+s__Bacteriovorax_sp_BAL6_X 1 GCF_000443995
+s__Asticcacaulis_sp_AC402 1 GCF_000495835
+s__Methylocella_silvestris 1 GCF_000021745
+s__Staphylococcus_phage_SA13 1 PRJNA213078
+s__Staphylococcus_phage_SA12 1 PRJNA212955
+s__Staphylococcus_phage_SA11 1 PRJNA181242
+s__Kitasatospora_setae 1 GCF_000269985
+s__Xanthomonas_hortorum 1 GCF_000505565
+s__Aspergillus_flavus 1 GCA_000006275
+s__Sonchus_yellow_net_virus 1 PRJNA14642
+s__Papio_hamadryas_papillomavirus_type_1 1 PRJNA159111
+s__Paenibacillus_ginsengihumi 1 GCF_000380965
+s__Staphylococcus_sp_AL1 1 GCF_000292305
+s__Banana_streak_virus_strain_Acuminata_Vietnam 1 PRJNA15240
+s__Lactobacillus_sp_66c 1 GCF_000312625
+s__Thioalkalivibrio_sp_ALJ17 1 GCF_000377945
+s__Thioalkalivibrio_sp_ALJ16 1 GCF_000377345
+s__Thioalkalivibrio_sp_ALJ15 1 GCF_000383695
+s__Thioalkalivibrio_sp_ALJ12 1 GCF_000378305
+s__Thioalkalivibrio_sp_ALJ11 1 GCF_000376925
+s__Thioalkalivibrio_sp_ALJ10 1 GCF_000377305
+s__Acaryochloris_marina 1 GCF_000018105
+s__Sulfolobus_spindle_shaped_virus_4 1 PRJNA27893
+s__Enterobacteria_phage_RB32 1 PRJNA17997
+s__Staphylococcus_prophage_phiPV83 1 PRJNA14135
+s__Lactobacillus_saerimneri 1 GCF_000317165
+s__Feline_leukemia_virus 1 PRJNA14686
+s__Caulobacter_sp_JGI_0001013_D04 1 GCF_000376305
+s__Robiginitalea_biformata 1 GCF_000024125
+s__Mycobacterium_phage_DD5 1 PRJNA30513
+s__Corynebacterium_halotolerans 1 GCF_000341345
+s__Merkel_cell_polyomavirus 1 PRJNA28509
+s__Beluga_Whale_coronavirus_SW1 1 PRJNA29509
+s__Pseudomonas_phage_AF 1 PRJNA184151
+s__Desulfurispora_thermophila 1 GCF_000376385
+s__Pantoea_phage_LIMEzero 1 PRJNA67417
+s__Leishmania_RNA_virus_2_1 1 PRJNA14696
+s__Vibrio_rumoiensis 1 GCF_000286955
+s__Streptomyces_griseoaurantiacus 1 GCF_000204605
+s__Actinomyces_timonensis 1 GCF_000295095
+s__Afipia_broomeae 1 GCF_000314675
+s__Campylobacter_sp_FOBRC14 1 GCF_000287855
+s__Methanosaeta_concilii 1 GCF_000204415
+s__Brome_streak_mosaic_virus 1 PRJNA15336
+s__Vibrio_owensii 1 GCF_000400385
+s__Yokose_virus 1 PRJNA15118
+s__Truepera_radiovictrix 1 GCF_000092425
+s__Adeno_associated_virus_3 1 PRJNA14319
+s__Tetraselmis_viridis_virus_S1 1 PRJNA195496
+s__Cotton_leaf_curl_Kokhran_virus 1 PRJNA14241
+s__Neisseria_flavescens 2 GCF_000175275 GCF_000173935
+s__Okra_yellow_mosaic_Mexico_virus 1 PRJNA48103
+s__Bartonella_bovis 2 GCF_000385395 GCF_000384965
+s__Cocksfoot_mottle_virus 1 PRJNA15078
+s__Influenza_C_virus 1 PRJNA15055
+s__Methylohalobius_crimeensis 1 GCF_000421465
+s__Capsicum_chlorosis_virus 1 PRJNA17547
+s__Streptomyces_lividans 2 GCF_000403665 GCF_000158935
+s__Halogeometricum_pleomorphic_virus_1 1 PRJNA157263
+s__Simian_adenovirus_18 1 PRJNA218146
+s__Wisteria_vein_mosaic_virus 1 PRJNA15532
+s__Theileria_annulata 1 GCA_000003225
+s__Sheldgoose_hepatitis_B_virus 1 PRJNA14618
+s__Mycobacterium_phage_Gizmo 1 PRJNA206479
+s__Subterranean_clover_stunt_virus 1 PRJNA14180
+s__Oceanimonas_smirnovii 1 GCF_000381965
+s__Enterococcus_faecium 241 GCF_000394435 GCF_000394695 GCF_000396765 GCF_000250945 GCF_000295015 GCF_000392195 GCF_000396925 GCF_000322045 GCF_000395465 GCF_000396965 GCF_000415285 GCF_000415365 GCF_000394715 GCF_000394555 GCF_000295395 GCF_000295275 GCF_000321805 GCF_000295215 GCF_000394655 GCF_000321765 GCF_000392105 GCF_000392165 GCF_000395885 GCF_000295435 GCF_000321685 GCF_000295055 GCF_000295575 GCF_000397025 GCF_000392065 GCF_000392085 GCF_000295455 GCF_000294815 GCF_000407105 GC [...]
+s__Ectropis_obliqua_virus 1 PRJNA14953
+s__Ralstonia_sp_GA3_3 1 GCF_000389805
+s__Bradyrhizobium_sp_WSM471 1 GCF_000244915
+s__Corynebacterium_capitovis 1 GCF_000372085
+s__Amycolicicoccus_subflavus 1 GCF_000214175
+s__Candidatus_Desulforudis_audaxviator 1 GCF_000018425
+s__Thin_paspalum_asymptomatic_virus 1 PRJNA210800
+s__Horseradish_latent_virus 1 PRJNA177549
+s__Lactobacillus_plantarum 16 GCF_000474695 GCF_000347515 GCF_000410795 GCF_000412205 GCF_000469115 GCF_000507045 GCF_000247735 GCF_000338115 GCF_000203855 GCF_000463075 GCF_000143745 GCF_000466845 GCF_000148815 GCF_000392485 GCF_000023085 GCF_000466905
+s__Halorhodospira_halophila 1 GCF_000015585
+s__Pelodictyon_luteolum 1 GCF_000012485
+s__Idiomarina_xiamenensis 1 GCF_000299895
+s__Banana_streak_virus 1 PRJNA16747
+s__Corchorus_golden_mosaic_virus 1 PRJNA20051
+s__Tomato_golden_mosaic_virus 1 PRJNA14072
+s__Campylobacter_phage_NCTC12673 1 PRJNA66395
+s__Rice_grassy_stunt_virus 1 PRJNA14692
+s__Rhodobacteraceae_bacterium_HTCC2083 1 GCF_000156115
+s__Flavobacteriaceae_bacterium_3519_10 1 GCF_000023725
+s__Thermoanaerobacterium_saccharolyticum 1 GCF_000307585
+s__Methylopila_sp_M107 1 GCF_000384475
+s__Wolbachia_sp_wRi 1 GCF_000022285
+s__Cyanophage_SS120_1 1 PRJNA195516
+s__Nocardia_farcinica 1 GCF_000009805
+s__Campylobacter_lari 1 GCF_000019205
+s__Marinobacter_sp_ELB17 1 GCF_000169375
+s__Adoxophyes_orana_nucleopolyhedrovirus 1 PRJNA32387
+s__Halorubrum_litoreum 1 GCF_000337395
+s__Streptococcus_thermophilus 9 GCF_000284675 GCF_000011825 GCF_000014485 GCF_000253395 GCF_000011845 GCF_000182875 GCF_000262675 GCF_000335515 GCF_000335495
+s__Clover_yellow_mosaic_virus 1 PRJNA14645
+s__Tai_Forest_ebolavirus 1 PRJNA51257
+s__Emilia_yellow_vein_virus_associated_DNA_beta 1 PRJNA37893
+s__Wohlfahrtiimonas_chitiniclastica 2 GCF_000334955 GCF_000375345
+s__Candidatus_Glomeribacter_gigasporarum 1 GCF_000227585
+s__Atlantic_salmon_swim_bladder_sarcoma_virus 1 PRJNA16247
+s__Actinomyces_georgiae 1 GCF_000277685
+s__Azoarcus_sp_BH72 1 GCF_000061505
+s__Oribacterium_sp_ACB7 1 GCF_000238075
+s__Pontibacter_roseus 1 GCF_000373265
+s__Fusobacterium_ulcerans 2 GCF_000158315 GCF_000242995
+s__Apple_mosaic_virus 1 PRJNA14745
+s__Eubacterium_limosum 1 GCF_000152245
+s__Mycoplasma_pneumoniae 7 GCF_000319655 GCF_000143945 GCF_000331085 GCF_000319675 GCF_000387745 GCF_000027345 GCF_000283755
+s__Streptomyces_sulphureus 2 GCF_000262345 GCF_000381025
+s__Scardovia_inopinata 1 GCF_000163755
+s__Desulfovibrio_aespoeensis 1 GCF_000176915
+s__Aggregatibacter_phage_S1249 1 PRJNA41333
+s__Bacillus_siamensis 1 GCF_000262045
+s__Tomato_common_mosaic_virus 1 PRJNA30185
+s__Bartonella_birtlesii 3 GCF_000296235 GCF_000273375 GCF_000278095
+s__Mesorhizobium_ciceri 1 GCF_000185905
+s__Arthrobacter_sp_Rue61a 1 GCF_000294695
+s__Megasphaera_sp_UPII_135_E 1 GCF_000221545
+s__Neosartorya_fischeri 1 GCA_000149645
+s__Heliobacterium_modesticaldum 1 GCF_000019165
+s__Anaerofustis_stercorihominis 1 GCF_000154825
+s__Oribacterium_sp_ACB8 1 GCF_000277505
+s__Citrobacter_rodentium 1 GCF_000027085
+s__Pelobacter_propionicus 1 GCF_000015045
+s__Parasutterella_excrementihominis 1 GCF_000205025
+s__Anaeromyxobacter_sp_Fw109_5 1 GCF_000017505
+s__Photorhabdus_temperata 2 GCF_000478765 GCF_000447415
+s__Streptococcus_gallolyticus 4 GCF_000203195 GCF_000146525 GCF_000027185 GCF_000270145
+s__Mycoplasma_hominis 2 GCF_000085865 GCF_000385075
+s__Atopobium_minutum 1 GCF_000364325
+s__Lactobacillus_coryniformis 3 GCF_000283115 GCF_000166795 GCF_000184285
+s__Anaeromyxobacter_dehalogenans 2 GCF_000013385 GCF_000022145
+s__Flavobacterium_columnare 1 GCF_000240075
+s__Nyamanini_virus 1 PRJNA38109
+s__Arthrobacter_sp_131MFCol6_1 1 GCF_000374925
+s__Staphylococcus_sp_OJ82 1 GCF_000294465
+s__Figwort_mosaic_virus 1 PRJNA14512
+s__Chromohalobacter_salexigens 1 GCF_000055785
+s__Bean_dwarf_mosaic_virus 1 PRJNA14037
+s__Pseudaletia_unipuncta_granulovirus 1 PRJNA43731
+s__Gayadomonas_joobiniege 1 GCF_000300815
+s__Vibrio_phage_CTX 1 PRJNA63437
+s__Tobacco_leaf_curl_betasatellite 1 PRJNA45925
+s__Etapapillomavirus_1 1 PRJNA14205
+s__Tobacco_mild_green_mosaic_virus 1 PRJNA14671
+s__alpha_proteobacterium_BAL199 1 GCF_000171835
+s__Royal_Farm_virus 1 PRJNA15149
+s__Lactobacillus_coleohominis 1 GCF_000161935
+s__Granulibacter_bethesdensis 1 GCF_000014285
+s__Acinetobacter_sp_NBRC_100985 1 GCF_000241225
+s__Parabacteroides_sp_20_3 1 GCF_000162535
+s__Prevotella_oralis 3 GCF_000185145 GCF_000507905 GCF_000413355
+s__Methanosaeta_thermophila 1 GCF_000014945
+s__Pseudomonas_sp_CFII68 1 GCF_000416195
+s__Curtobacterium_sp_B18 1 GCF_000333375
+s__Tobacco_leaf_curl_disease_associated_sequence 1 PRJNA14442
+s__Francisella_tularensis 26 GCF_000016105 GCF_000009245 GCF_000305875 GCF_000380385 GCF_000248415 GCF_000009325 GCF_000023305 GCF_000168775 GCF_000017785 GCF_000380405 GCF_000155535 GCF_000380425 GCF_000380445 GCF_000154165 GCF_000305915 GCF_000154145 GCF_000170295 GCF_000018925 GCF_000305835 GCF_000305855 GCF_000014605 GCF_000313275 GCF_000305895 GCF_000346525 GCF_000153845 GCF_000313385
+s__Frankia_sp_QA3 1 GCF_000262465
+s__Cronobacter_phage_ESP2949_1 1 PRJNA181234
+s__Mycoplasma_phage_phiMFV1 1 PRJNA14387
+s__Triticum_mosaic_virus 1 PRJNA38495
+s__Proteus_mirabilis 7 GCF_000372565 GCF_000313255 GCF_000160755 GCF_000444425 GCF_000297835 GCF_000069965 GCF_000297815
+s__Pseudomonas_phage_Pf1 1 PRJNA14571
+s__Pseudomonas_phage_phiKMV 1 PRJNA15226
+s__Pseudomonas_phage_Pf3 1 PRJNA14061
+s__Ilheus_virus 1 PRJNA18845
+s__Geobacter_daltonii 1 GCF_000022265
+s__Ludwigia_yellow_vein_virus 1 PRJNA15559
+s__Maize_streak_virus 1 PRJNA14577
+s__Mycobacterium_phage_Bane1 1 PRJNA219118
+s__Mycobacterium_phage_Bane2 1 PRJNA219119
+s__Flavobacterium_cauense 1 GCF_000498475
+s__Synechococcus_phage_S_RIP1 1 PRJNA195487
+s__Synechococcus_phage_S_RIP2 1 PRJNA195486
+s__Beet_western_yellows_virus 1 PRJNA14885
+s__Pyrococcus_abyssi 1 GCF_000195935
+s__Olsenella_profusa 1 GCF_000468755
+s__Calothrix_sp_PCC_7103 1 GCF_000331305
+s__Discula_destructiva_virus_2 1 PRJNA14787
+s__Ostreococcus_virus_OsV5 1 PRJNA28159
+s__Discula_destructiva_virus_1 1 PRJNA14117
+s__Haloferax_sulfurifontis 1 GCF_000337835
+s__Taylorella_asinigenitalis 1 GCF_000226625
+s__Plautia_stali_intestine_virus 1 PRJNA14799
+s__Tacaribe_virus 1 PRJNA14863
+s__Lachnospiraceae_bacterium_9_1_43BFAA 1 GCF_000209445
+s__Tomato_chlorotic_mottle_virus 1 PRJNA14175
+s__Pelodictyon_phaeoclathratiforme 1 GCF_000020645
+s__Enterobacter_mori 1 GCF_000211415
+s__Beak_and_feather_disease_virus 1 PRJNA14453
+s__Turicibacter_sanguinis 1 GCF_000178255
+s__Salimicrobium_sp_MJ3 1 GCF_000299295
+s__Bdellovibrio_phage_phi1422 1 PRJNA181215
+s__Apium_virus_Y 1 PRJNA61905
+s__Tomato_mosaic_leaf_curl_virus 1 PRJNA14370
+s__Amasya_cherry_disease_associated_chrysovirus 1 PRJNA21113
+s__Rhizobium_sp_JGI_0001005_H05 1 GCF_000375385
+s__Mycobacterium_phage_Cooper 1 PRJNA17145
+s__Cylindrospermum_stagnale 1 GCF_000317535
+s__Xanthomonas_phage_Xop411 1 PRJNA19771
+s__Irkut_virus 1 PRJNA194140
+s__Acinetobacter_sp_ANC_3880 1 GCF_000369845
+s__Staphylococcus_phage_phiSauS_IPLA88 1 PRJNA33001
+s__Humulus_japonicus_latent_virus 1 PRJNA14958
+s__Staphylococcus_phage_JS01 1 PRJNA212710
+s__Marine_Group_II_euryarchaeote_SCGC_AB_629_J06 1 GCF_000376045
+s__Streptomyces_sp_LaPpAH_108 1 GCF_000373625
+s__Anabaena_variabilis 1 GCF_000204075
+s__Glaciecola_arctica 1 GCF_000314995
+s__Desulfuromonas_acetoxidans 1 GCF_000167355
+s__Banana_bract_mosaic_virus 1 PRJNA20617
+s__Citrobacter_sp_KTE151 1 GCF_000398845
+s__Loktanella_phage_pCB2051_A 1 PRJNA195476
+s__Streptococcus_mitis_oralis_pneumoniae 301 GCF_000251625 GCF_000185265 GCF_000252305 GCF_000251285 GCF_000251565 GCF_000278885 GCF_000251825 GCF_000506605 GCF_000232325 GCF_000019265 GCF_000252265 GCF_000257495 GCF_000171655 GCF_000506765 GCF_000170035 GCF_000232725 GCF_000147095 GCF_000232645 GCF_000334555 GCF_000385715 GCF_000232785 GCF_000232765 GCF_000385835 GCF_000334735 GCF_000210995 GCF_000232945 GCF_000430345 GCF_000180575 GCF_000222785 GCF_000210955 GCF_000334695 GCF_000251085 [...]
+s__Halorubrum_phage_HF2 1 PRJNA14147
+s__Eubacterium_sp_3_1_31 2 GCF_000273585 GCF_000242955
+s__Rothia_aeria 2 GCF_000479025 GCF_000258205
+s__Vibrio_phage_VPUSM_8 1 PRJNA227006
+s__Narcissus_common_latent_virus 1 PRJNA17373
+s__Brucella_sp_BO2 1 GCF_000177135
+s__Listeria_phage_P70 1 PRJNA177526
+s__Penaeus_vannamei_nodavirus 1 PRJNA62263
+s__Celery_mosaic_virus 1 PRJNA65809
+s__Bacteroides_sp_4_1_36 1 GCF_000185585
+s__Mycobacterium_chubuense 1 GCF_000266905
+s__Gossypium_davidsonii_symptomless_alphasatellite 1 PRJNA39589
+s__Salmonella_phage_SPN1S 1 PRJNA82639
+s__Candida_tropicalis 1 GCA_000006335
+s__Erwinia_toletana 1 GCF_000336255
+s__Streptomyces_sp_SirexAA_E 1 GCF_000177195
+s__Cyanophage_KBS_S_2A 1 PRJNA195502
+s__Amycolatopsis_vancoresmycina 1 GCF_000388135
+s__Propionibacterium_phage_P9_1 1 PRJNA177529
+s__Caulobacter_phage_phiCb5 1 PRJNA181078
+s__Nocardiopsis_salina 1 GCF_000341025
+s__Hantavirus_Z10 1 PRJNA15044
+s__Salmonella_phage_SPN19 1 PRJNA179408
+s__Sphingomonas_sp_KC8 1 GCF_000214335
+s__Silicibacter_phage_DSS3phi2 1 PRJNA38081
+s__Alicyclobacillus_hesperidum 1 GCF_000294675
+s__Ralstonia_phage_RSL1 1 PRJNA30059
+s__Myxococcus_xanthus 3 GCF_000278585 GCF_000340515 GCF_000012685
+s__Shigella_dysenteriae 6 GCF_000193895 GCF_000012005 GCF_000164925 GCF_000268105 GCF_000168075 GCF_000211935
+s__Turkey_astrovirus 1 PRJNA15096
+s__Roseobacter_denitrificans 1 GCF_000014045
+s__Mycobacterium_phage_Chy5 1 PRJNA206476
+s__Hepatitis_B_virus 1 PRJNA15428
+s__Chickpea_chlorosis_Australia_virus 1 PRJNA216948
+s__Streptomyces_phage_Zemlya 1 PRJNA206481
+s__Squirrel_monkey_retrovirus 1 PRJNA14914
+s__Andean_potato_latent_virus 1 PRJNA192611
+s__Ferret_papillomavirus 1 PRJNA218024
+s__Oligella_ureolytica 1 GCF_000373745
+s__Caulobacter_phage_CcrMagneto 1 PRJNA179421
+s__Streptococcus_parasanguinis 7 GCF_000260695 GCF_000262145 GCF_000180035 GCF_000187505 GCF_000507765 GCF_000164675 GCF_000222725
+s__Sphingobium_yanoikuyae 2 GCF_000315525 GCF_000224695
+s__Thermoanaerobacter_italicus 1 GCF_000025645
+s__Leptolyngbya_boryana 1 GCF_000353285
+s__Coleofasciculus_chthonoplastes 1 GCF_000155555
+s__Diadromus_pulchellus_ascovirus_4a 1 PRJNA32133
+s__Mycobacterium_phage_Cjw1 1 PRJNA14270
+s__Pseudomonas_tolaasii 2 GCF_000316215 GCF_000276565
+s__Ehrlichia_chaffeensis 2 GCF_000013145 GCF_000167655
+s__Solanum_nodiflorum_mottle_virus_satellite_RNA 1 PRJNA14184
+s__Zymomonas_mobilis 6 GCF_000218875 GCF_000024245 GCF_000303025 GCF_000007105 GCF_000277755 GCF_000175255
+s__Citrobacter_sp_L17 1 GCF_000313895
+s__Flavonifractor_plautii 1 GCF_000239295
+s__Phlox_virus_S 1 PRJNA19427
+s__Mycoplasma_hyorhinis 5 GCF_000383515 GCF_000241125 GCF_000145705 GCF_000211295 GCF_000313635
+s__Papiine_herpesvirus_2 1 PRJNA16246
+s__Bordetella_holmesii 3 GCF_000341485 GCF_000341465 GCF_000317335
+s__Rothia_mucilaginosa 3 GCF_000011025 GCF_000175615 GCF_000231235
+s__Enterococcus_phage_BC_611 1 PRJNA169229
+s__Mycobacterium_phage_ET08 1 PRJNA42783
+s__Ikoma_lyssavirus 1 PRJNA175665
+s__Methyloferula_stellata 1 GCF_000385335
+s__Pseudomonas_sp_45MFCol3_1 1 GCF_000382025
+s__Banana_streak_GF_virus 1 PRJNA15411
+s__Paracoccidioides_sp_lutzii 1 GCA_000150705
+s__Bat_picornavirus_2 1 PRJNA72393
+s__Bat_picornavirus_3 1 PRJNA72379
+s__Bat_picornavirus_1 1 PRJNA72391
+s__Rhodopirellula_sallentina 1 GCF_000346505
+s__Peptoniphilus_sp_oral_taxon_375 1 GCF_000221565
+s__Olsenella_sp_oral_taxon_809 1 GCF_000233535
+s__Salmonella_phage_phiSG_JL2 1 PRJNA30063
+s__Plasmodium_berghei 1 GCA_000005395
+s__Jannaschia_sp_CCS1 1 GCF_000013565
+s__Burkholderia_phage_KS14 1 PRJNA64613
+s__Escherichia_sp_3_2_53FAA 1 GCF_000157115
+s__Microplitis_demolitor_bracovirus 1 PRJNA15245
+s__Burkholderia_phage_KS10 1 PRJNA31221
+s__Verrucomicrobia_bacterium_SCGC_AAA300_K03 1 GCF_000382665
+s__Yersinia_enterocolitica 12 GCF_000285015 GCF_000192105 GCF_000401995 GCF_000253175 GCF_000284995 GCF_000009345 GCF_000401935 GCF_000401955 GCF_000230775 GCF_000330605 GCF_000401975 GCF_000297175
+s__Pseudomonas_sp_S13_1_2 1 GCF_000292285
+s__Caulobacter_vibrioides 2 GCF_000372645 GCF_000022005
+s__Rhynchosia_golden_mosaic_Yucatan_virus 1 PRJNA36505
+s__Staphylococcus_phage_P954 1 PRJNA40231
+s__Lactococcus_phage_P335_sensu_lato 1 PRJNA14281
+s__Wolbachia_endosymbiont_of_Nasonia_vitripennis 1 GCF_000204545
+s__Taro_bacilliform_virus 1 PRJNA14233
+s__Bombyx_mori_Macula_like_virus 1 PRJNA66973
+s__Blueberry_latent_virus 1 PRJNA56015
+s__Actinomyces_sp_ICM47 1 GCF_000278725
+s__Amycolatopsis_halophila 1 GCF_000504245
+s__Mycobacterium_phage_DrDrey 1 PRJNA215108
+s__Thermus_phage_phiYS40 1 PRJNA18277
+s__Frankia_sp_EAN1pec 1 GCF_000018005
+s__Propionibacterium_phage_PAD20 1 PRJNA66341
+s__Pineapple_mealybug_wilt_associated_virus_1 1 PRJNA28147
+s__Bacteroides_salanitronis 1 GCF_000190575
+s__Candidatus_Poribacteria_sp_WGA_A3 1 GCF_000177275
+s__Acidovorax_avenae 2 GCF_000218805 GCF_000176855
+s__Fusobacterium_varium 1 GCF_000159915
+s__Actinomyces_sp_oral_taxon_172 1 GCF_000466265
+s__Actinomyces_sp_oral_taxon_171 1 GCF_000186965
+s__Actinomyces_sp_oral_taxon_170 1 GCF_000195595
+s__Citrobacter_youngae 1 GCF_000155975
+s__Methylobacter_sp_UW_659_2_H10 1 GCF_000375885
+s__Actinomyces_sp_oral_taxon_175 1 GCF_000223355
+s__Actinomyces_sp_oral_taxon_178 1 GCF_000186685
+s__Flavobacterium_phage_6H 1 PRJNA213018
+s__Chloris_striate_mosaic_virus 1 PRJNA14068
+s__Vibrio_mimicus 8 GCF_000176415 GCF_000338875 GCF_000175975 GCF_000473785 GCF_000222145 GCF_000176375 GCF_000473825 GCF_000175995
+s__Sida_mottle_Alagoas_virus 1 PRJNA189218
+s__Natronolimnobius_innermongolicus 1 GCF_000337215
+s__Roseobacter_phage_RDJL_Phi_1 1 PRJNA66399
+s__Spirochaeta_smaragdinae 1 GCF_000143985
+s__Petrotoga_mobilis 1 GCF_000018605
+s__Akkermansia_muciniphila 1 GCF_000020225
+s__Actinomyces_viscosus 1 GCF_000175315
+s__Clostridium_sp_7_2_43FAA 1 GCF_000158375
+s__Enterobacteria_phage_MS2 1 PRJNA14659
+s__Helicobacter_winghamensis 1 GCF_000158455
+s__White_clover_cryptic_virus_1 1 PRJNA15061
+s__Human_papillomavirus_161_like_viruses 1 PRJNA178458
+s__Yoka_poxvirus 1 PRJNA72715
+s__Pseudomonas_phage_KPP10 1 PRJNA64611
+s__Pseudomonas_phage_KPP12 1 PRJNA184164
+s__Anaerostipes_sp_3_2_56FAA 1 GCF_000185825
+s__Methylibium_petroleiphilum 1 GCF_000015725
+s__Bdellovibrio_phage_phi1402 1 PRJNA68417
+s__Youcai_mosaic_virus 1 PRJNA14869
+s__Sagittula_stellata 1 GCF_000169415
+s__Lactobacillus_phage_phig1e 1 PRJNA14315
+s__Ipomoea_begomovirus_satellite_DNA_beta 1 PRJNA80873
+s__Eimeria_tenella 1 GCA_000002835
+s__Cebus_albifrons_polyomavirus_1 1 PRJNA183903
+s__Brucella_ceti 5 GCF_000158775 GCF_000157855 GCF_000157835 GCF_000158755 GCF_000182425
+s__Xipapillomavirus_1 1 PRJNA15452
+s__Tobacco_vein_distorting_virus 1 PRJNA29875
+s__Clostridium_autoethanogenum 1 GCF_000484505
+s__Pseudoalteromonas_arctica 1 GCF_000238395
+s__Nitratifractor_salsuginis 1 GCF_000186245
+s__Kyasanur_forest_disease_virus 1 PRJNA15387
+s__Honeysuckle_yellow_vein_beta 1 PRJNA19601
+s__Ageratum_leaf_curl_betasatellite 1 PRJNA195929
+s__Selenomonas_sp_oral_taxon_149 1 GCF_000146365
+s__Methylomicrobium_album 1 GCF_000214275
+s__Mycoplasma_putrefaciens 2 GCF_000224105 GCF_000376625
+s__Pediococcus_acidilactici 3 GCF_000235805 GCF_000146325 GCF_000163095
+s__Asticcacaulis_benevestitus 2 GCF_000495775 GCF_000376105
+s__Rhizobium_sp_2MFCol3_1 1 GCF_000377565
+s__Streptomyces_sp_R1_NS_10 1 GCF_000376565
+s__Tula_virus 1 PRJNA14936
+s__gamma_proteobacterium_IMCC3088 1 GCF_000204315
+s__Anaerolinea_thermophila 1 GCF_000199675
+s__Helicoverpa_armigera_nucleopolyhedrovirus 3 PRJNA14108 PRJNA14615 PRJNA32205
+s__Ehrlichia_canis 1 GCF_000012565
+s__Mobiluncus_mulieris 4 GCF_000148485 GCF_000160615 GCF_000146895 GCF_000176775
+s__Clostridium_stercorarium 1 GCF_000331995
+s__Torque_teno_virus_26 1 PRJNA48157
+s__Enterobacteria_phage_vB_EcoS_Rogue1 1 PRJNA183151
+s__Woolly_monkey_sarcoma_virus 1 PRJNA19547
+s__Torque_teno_virus_25 1 PRJNA48165
+s__Streptomyces_sp_HCCB10043 1 GCF_000498935
+s__Torque_teno_virus_28 1 PRJNA48145
+s__Staphylococcus_phage_G1 1 PRJNA15261
+s__Erwinia_phage_PEp14 1 PRJNA82653
+s__Chalara_elegans_RNA_Virus_1 1 PRJNA15126
+s__Candidatus_Riesia_pediculicola 1 GCF_000093065
+s__Veillonella_ratti 1 GCF_000315505
+s__Strawberry_pallidosis_associated_virus 1 PRJNA15058
+s__Human_adenovirus_C 3 PRJNA14518 PRJNA15106 PRJNA15107
+s__Enterobacteria_phage_phiEcoM_GJ1 1 PRJNA27979
+s__Canine_circovirus 1 PRJNA196432
+s__Lactococcus_phage_bIL285 1 PRJNA14111
+s__Lactococcus_phage_bIL286 1 PRJNA14397
+s__Pedobacter_arcticus 1 GCF_000302595
+s__Acinetobacter_phage_Acj61 1 PRJNA60117
+s__Blackberry_vein_banding_associated_virus 1 PRJNA215129
+s__Tomato_leaf_curl_virus 1 PRJNA14191
+s__Novosphingobium_sp_PP1Y 1 GCF_000253255
+s__Raphanus_sativus_cryptic_virus_2 1 PRJNA28757
+s__Raphanus_sativus_cryptic_virus_3 1 PRJNA33269
+s__Raphanus_sativus_cryptic_virus_1 1 PRJNA17127
+s__Halopiger_xanaduensis 1 GCF_000217715
+s__Methanocella_conradii 1 GCF_000251105
+s__Candidatus_Accumulibacter_phosphatis 1 GCF_000024165
+s__Cellulophaga_phage_phi39_1 1 PRJNA212957
+s__Pseudomonas_fluorescens 21 GCF_000275925 GCF_000009225 GCF_000292795 GCF_000297195 GCF_000012445 GCF_000285355 GCF_000275905 GCF_000263695 GCF_000285955 GCF_000281895 GCF_000237065 GCF_000263675 GCF_000293885 GCF_000334015 GCF_000465595 GCF_000166515 GCF_000308175 GCF_000262325 GCF_000280805 GCF_000276585 GCF_000217955
+s__Salmonella_phage_SE1 1 PRJNA33483
+s__Haloterrigena_thermotolerans 1 GCF_000337115
+s__Aster_yellows_witches_broom_phytoplasma 1 GCF_000012225
+s__Burkholderia_phage_Bcep22 1 PRJNA14335
+s__Mycobacterium_phage_PG1 1 PRJNA14357
+s__Chinese_yam_necrotic_mosaic_virus 1 PRJNA173355
+s__Natronorubrum_sulfidifaciens 1 GCF_000337735
+s__Zantedeschia_mild_mosaic_virus 1 PRJNA32715
+s__Haloferax_lucentense 1 GCF_000336795
+s__Leptospira_terpstrae 1 GCF_000332495
+s__Listeria_phage_LP_125 1 PRJNA212716
+s__Staphylococcus_phage_SAP_2 1 PRJNA20925
+s__Alishewanella_agri 1 GCF_000272005
+s__Marinobacterium_rhizophilum 1 GCF_000378045
+s__Vibrio_nigripulchritudo 1 GCF_000222685
+s__Bovine_herpesvirus_1 1 PRJNA14585
+s__Grapevine_red_blotch_associated_virus 1 PRJNA214508
+s__Bacillus_sp_7_6_55CFAA_CT2 1 GCF_000238655
+s__Acinetobacter_sp_528 1 GCF_000302395
+s__Cardioderma_polyomavirus 1 PRJNA185188
+s__Gordonia_otitidis 1 GCF_000248075
+s__Clostridium_sp_BNL1100 1 GCF_000244875
+s__Murine_coronavirus 3 PRJNA15138 PRJNA15350 PRJNA39313
+s__Pseudomonas_sp_313 1 GCF_000316965
+s__Acinetobacter_baumannii 191 GCF_000184495 GCF_000314635 GCF_000372585 GCF_000188215 GCF_000341985 GCF_000314655 GCF_000241665 GCF_000332855 GCF_000189695 GCF_000309235 GCF_000413915 GCF_000309135 GCF_000021145 GCF_000297535 GCF_000305255 GCF_000282795 GCF_000309115 GCF_000335615 GCF_000309195 GCF_000163375 GCF_000369225 GCF_000301515 GCF_000369165 GCF_000441955 GCF_000353995 GCF_000369265 GCF_000301995 GCF_000302015 GCF_000302075 GCF_000069245 GCF_000354075 GCF_000241685 GCF_000314615 [...]
+s__Psychromonas_ingrahamii 1 GCF_000015285
+s__Pseudomonas_phage_PB1 1 PRJNA33499
+s__Mobiluncus_curtisii 4 GCF_000146285 GCF_000196535 GCF_000185445 GCF_000185425
+s__Oribacterium_sinus 1 GCF_000160635
+s__SAR116_cluster_alpha_proteobacterium_HIMB100 1 GCF_000238815
+s__Titi_monkey_adenovirus_ECC_2011 1 PRJNA192854
+s__Pigeon_picornavirus_B 1 PRJNA67691
+s__Choristoneura_fumiferana_DEF_multiple_nucleopolyhedrovirus 1 PRJNA15137
+s__Bacteriovorax_sp_DB6_IX 1 GCF_000447775
+s__Flavobacterium_branchiophilum 1 GCF_000253275
+s__Clostridium_asparagiforme 1 GCF_000158075
+s__Ruminococcus_bromii 1 GCF_000209875
+s__Acinetobacter_baylyi 2 GCF_000302115 GCF_000368685
+s__Sulfolobus_virus_STSV1 1 PRJNA14561
+s__Mycobacterium_phage_Ardmore 1 PRJNA46607
+s__Sulfolobus_virus_STSV2 1 PRJNA185313
+s__Leucania_separata_nucleopolyhedrovirus 1 PRJNA17669
+s__Ralstonia_phage_RSS30 1 PRJNA213021
+s__Japanese_yam_mosaic_virus 1 PRJNA15365
+s__Camelpox_virus 1 PRJNA14156
+s__Bacillus_sp_EGD_AK10 1 GCF_000465855
+s__Burkholderia_sp_RPE64 1 GCF_000402035
+s__Myroides_odoratimimus 6 GCF_000242095 GCF_000413415 GCF_000297855 GCF_000242135 GCF_000242075 GCF_000297875
+s__Tuber_melanosporum 1 GCA_000151645
+s__Horseradish_curly_top_virus 1 PRJNA14100
+s__Bean_common_mosaic_virus 1 PRJNA15183
+s__Pear_blister_canker_viroid 1 PRJNA14965
+s__Anagyris_vein_yellowing_virus 1 PRJNA32713
+s__Mimosa_yellow_leaf_curl_virus_satellite_DNA_beta 1 PRJNA19821
+s__Ononis_yellow_mosaic_virus 1 PRJNA14669
+s__Epulopiscium_sp_N_t_morphotype_B 1 GCF_000171335
+s__SAR202_cluster_bacterium_SCGC_AAA240_N13 1 GCF_000372165
+s__Butterbur_mosaic_virus 1 PRJNA42145
+s__Synechococcus_sp_CC9605 1 GCF_000012625
+s__Salmonella_phage_HK620 1 PRJNA14115
+s__Gloeobacter_violaceus 1 GCF_000011385
+s__Halorubrum_coriense 1 GCF_000337035
+s__Streptomyces_avermitilis 1 GCF_000009765
+s__Gluconobacter_morbifer 1 GCF_000234355
+s__Methylotenera_sp_73s 1 GCF_000384435
+s__Tomato_zonate_spot_virus 1 PRJNA29091
+s__Asparagus_virus_3 1 PRJNA28979
+s__Selenomonas_infelix 1 GCF_000234095
+s__Sunflower_leaf_curl_Karnataka_alphasatellite 1 PRJNA181246
+s__Sabia_virus 1 PRJNA15054
+s__Borrelia_valaisiana 1 GCF_000170955
+s__Turnip_curly_top_virus 1 PRJNA50429
+s__Cucumber_green_mottle_mosaic_virus 1 PRJNA14681
+s__Eggerthella_lenta 1 GCF_000024265
+s__Thermoanaerobacter_brockii 1 GCF_000175295
+s__Gemella_morbillorum 1 GCF_000185645
+s__Haloferax_volcanii 2 GCF_000025685 GCF_000337315
+s__Rous_sarcoma_virus 1 PRJNA14978
+s__Mycobacterium_liflandii 1 GCF_000026445
+s__Anoxybacillus_flavithermus 4 GCF_000019045 GCF_000327465 GCF_000353425 GCF_000367505
+s__Tomato_rugose_mosaic_virus 1 PRJNA14101
+s__Tembusu_virus 1 PRJNA70159
+s__Candidatus_Tremblaya_phenacola 1 GCF_000412755
+s__Corynebacterium_sp_KPL1860 1 GCF_000477995
+s__Bean_golden_yellow_mosaic_virus 1 PRJNA14200
+s__Bacteroides_sp_1_1_14 1 GCF_000162515
+s__Pseudomonas_phage_JBD24 1 PRJNA188535
+s__Clostridium_sp_DL_VIII 1 GCF_000230835
+s__alpha_proteobacterium_SCGC_AAA300_J04 1 GCF_000382645
+s__Feline_picornavirus 1 PRJNA76725
+s__Mycobacterium_phage_Quink 1 PRJNA219113
+s__Bacillus_cytotoxicus 1 GCF_000017425
+s__Bartonella_tamiae 2 GCF_000278275 GCF_000279995
+s__Persimmon_viroid_2 1 PRJNA210930
+s__Pseudomonas_amygdali 10 GCF_000163275 GCF_000145945 GCF_000145765 GCF_000145885 GCF_000159835 GCF_000275945 GCF_000146005 GCF_000163255 GCF_000145685 GCF_000145745
+s__Tobacco_curly_shoot_betasatellite 1 PRJNA14446
+s__Rudanella_lutea 1 GCF_000383955
+s__Visna_maedi_virus 1 PRJNA14636
+s__Halorhabdus_utahensis 1 GCF_000023945
+s__Maize_chlorotic_dwarf_virus 1 PRJNA15345
+s__gamma_proteobacterium_HTCC5015 1 GCF_000155715
+s__Turdivirus_3 1 PRJNA51591
+s__Bacillus_sp_2_A_57_CT2 1 GCF_000186145
+s__Mycoplasma_phage_P1 1 PRJNA14136
+s__Patulibacter_americanus 1 GCF_000420025
+s__Treponema_succinifaciens 1 GCF_000195275
+s__Pyrobaculum_arsenaticum 1 GCF_000016385
+s__Gordonia_phage_GRU1 1 PRJNA78691
+s__Tomato_yellow_vein_streak_virus 1 PRJNA30171
+s__Glaciecola_nitratireducens 1 GCF_000226565
+s__Plasmodium_falciparum 1 GCA_000002765
+s__Pseudomonas_phage_phi12 1 PRJNA14855
+s__Pseudomonas_phage_phi13 1 PRJNA14854
+s__Bacteroides_sp_2_1_16 1 GCF_000162135
+s__Lettuce_ring_necrosis_virus 1 PRJNA14959
+s__Tomato_leaf_curl_Malaysia_virus 1 PRJNA14260
+s__Clostridium_phage_phiCTP1 1 PRJNA51665
+s__Bacillus_phage_W_Ph 1 PRJNA80913
+s__Collimonas_fungivorans 1 GCF_000221045
+s__Micromonospora_sp_CNB394 1 GCF_000374985
+s__Saccharomonospora_azurea 2 GCF_000231055 GCF_000236985
+s__Smaragdicoccus_niigatensis 1 GCF_000380645
+s__Asticcacaulis_biprosthecium 1 GCF_000204015
+s__Roseomonas_sp_B5 1 GCF_000292225
+s__Bartonella_clarridgeiae 1 GCF_000253015
+s__Methylomonas_methanica 1 GCF_000214665
+s__Salmonella_phage_SE2 1 PRJNA82643
+s__Propionibacterium_sp_oral_taxon_192 1 GCF_000413315
+s__Cellulophaga_phage_phi17_1 1 PRJNA212962
+s__Pseudomonas_phage_UFV_P2 1 PRJNA177548
+s__Psychrobacter_sp_1501_2011 1 GCF_000213615
+s__Sida_golden_mosaic_Buckup_virus_Jamaica_St_Elizabeth_2004 1 PRJNA61135
+s__Glossina_hytrovirus 1 PRJNA28839
+s__Mycobacterium_phage_Wee 1 PRJNA61859
+s__Carnobacterium_maltaromaticum 2 GCF_000317975 GCF_000238575
+s__Rickettsia_peacockii 1 GCF_000021525
+s__Acanthamoeba_polyphaga_mimivirus 1 PRJNA60053
+s__Astrovirus_MLB1 2 PRJNA32327 PRJNA50359
+s__Mycoplasma_haemocanis 1 GCF_000238995
+s__Prochlorococcus_phage_P_SSP3 1 PRJNA195517
+s__Prochlorococcus_phage_P_SSP7 1 PRJNA15134
+s__Bradyrhizobium_sp_YR681 1 GCF_000282615
+s__Changuinola_virus 1 PRJNA226021
+s__Mycobacterium_phage_Rosebush 1 PRJNA14304
+s__Corynebacterium_pyruviciproducens 1 GCF_000411375
+s__Banana_streak_UA_virus 1 PRJNA66609
+s__Stenotrophomonas_phage_phiSHP2 1 PRJNA67419
+s__Sulfuricurvum_kujiense 1 GCF_000183725
+s__Limnobacter_sp_MED105 1 GCF_000170915
+s__Butyrivibrio_fibrisolvens 3 GCF_000420985 GCF_000420965 GCF_000209815
+s__Stanieria_cyanosphaera 1 GCF_000317575
+s__Erwinia_amylovora_phage_Era103 1 PRJNA18839
+s__Plasmodium_yoelii 1 GCA_000003085
+s__Pseudomonas_phage_EL 1 PRJNA16199
+s__Alcanivorax_borkumensis 1 GCF_000009365
+s__Myxococcus_stipitatus 1 GCF_000331735
+s__Sphingomonas_sp_SKA58 1 GCF_000153545
+s__Barley_mild_mosaic_virus 1 PRJNA15338
+s__Corynebacterium_urealyticum 2 GCF_000069945 GCF_000338095
+s__Cardamine_chlorotic_fleck_virus 1 PRJNA14674
+s__Pseudomonas_sp_GM102 1 GCF_000282555
+s__Nocardiopsis_ganjiahuensis 1 GCF_000341085
+s__Vibrio_phage_PWH3a_P1 1 PRJNA195481
+s__Chlamydophila_felis 1 GCF_000009945
+s__Halobiforma_nitratireducens 1 GCF_000337895
+s__Sulfurovum_sp_NBC37_1 1 GCF_000010345
+s__Prevotella_buccalis 1 GCF_000177075
+s__Escherichia_phage_HK75 1 PRJNA76733
+s__Japanese_iris_necrotic_ring_virus 1 PRJNA15094
+s__Diatraea_saccharalis_densovirus 1 PRJNA14036
+s__Bacillus_sp_FJAT_13831 1 GCF_000299035
+s__Oxalobacter_formigenes 2 GCF_000158475 GCF_000158495
+s__Brevibacillus_sp_CF112 1 GCF_000282015
+s__Human_enteric_coronavirus_strain_4408 1 PRJNA39335
+s__Crimean_Congo_hemorrhagic_fever_virus 1 PRJNA15026
+s__Lactococcus_garvieae 14 GCF_000407645 GCF_000213885 GCF_000236535 GCF_000305995 GCF_000236515 GCF_000212475 GCF_000269705 GCF_000407125 GCF_000305975 GCF_000269925 GCF_000504505 GCF_000205485 GCF_000300795 GCF_000269945
+s__Actinidia_virus_B 1 PRJNA77137
+s__Yersinia_frederiksenii 1 GCF_000168015
+s__Actinomyces_sp_HPA0247 1 GCF_000411415
+s__Tomato_leaf_curl_Mindanao_virus 1 PRJNA29011
+s__Synechococcus_phage_S_SKS1 1 PRJNA195489
+s__Lloviu_virus 1 PRJNA76475
+s__Potato_apical_leaf_curl_disease_associated_satellite_DNA_beta 1 PRJNA18323
+s__gamma_proteobacterium_IMCC2047 1 GCF_000211335
+s__Staphylococcus_phage_S24_1 1 PRJNA80917
+s__Finegoldia_magna 5 GCF_000159695 GCF_000179495 GCF_000010185 GCF_000179695 GCF_000221585
+s__Enterobacteria_phage_EcoDS1 1 PRJNA30601
+s__Turnip_rosette_virus 1 PRJNA14876
+s__Actinomadura_atramentaria 1 GCF_000381885
+s__Thermincola_potens 1 GCF_000092945
+s__Enterobacteria_phage_ID18_sensu_lato 1 PRJNA16628
+s__French_bean_leaf_curl_virus_Kanpur 1 PRJNA169555
+s__Clostridium_glycolicum 1 GCF_000373865
+s__Enterococcus_phage_phiEF24C 1 PRJNA21009
+s__Paenibacillus_fonticola 1 GCF_000381905
+s__Miniopterus_polyomavirus 1 PRJNA185189
+s__Oat_golden_stripe_virus 1 PRJNA15093
+s__Mycobacterium_phage_Jobu08 1 PRJNA209074
+s__Carrot_red_leaf_virus 1 PRJNA15057
+s__Anaerophaga_thermohalophila 2 GCF_000191885 GCF_000250735
+s__Macaca_fascicularis_polyomavirus_1 1 PRJNA183904
+s__Corynebacterium_caspium 1 GCF_000379705
+s__Methylosarcina_fibrata 1 GCF_000372865
+s__Agrobacterium_albertimagni 1 GCF_000300855
+s__Bacillus_virus_1 1 PRJNA20397
+s__Bat_circovirus 1 PRJNA202887
+s__Neodiprion_lecontei_nucleopolyhedrovirus 1 PRJNA14617
+s__Halomonas_zhanjiangensis 1 GCF_000377665
+s__Pectobacterium_phage_ZF40 1 PRJNA181216
+s__Staphylococcus_phage_PVL 1 PRJNA14392
+s__Serratia_sp_AS13 1 GCF_000214805
+s__Serratia_sp_AS12 1 GCF_000214195
+s__MW_polyomavirus 1 XXX
+s__Rickettsia_slovaca 2 GCF_000252365 GCF_000237845
+s__Feline_astrovirus_2 1 PRJNA218014
+s__Anaplasma_marginale 7 GCF_000172515 GCF_000172475 GCF_000495535 GCF_000495495 GCF_000172495 GCF_000011945 GCF_000020305
+s__Methanocella_arvoryzae 1 GCF_000063445
+s__Prune_dwarf_virus 1 PRJNA16818
+s__Lactobacillus_shenzhenensis 1 GCF_000469325
+s__Chlamydia_psittaci 46 GCF_000211155 GCF_000415685 GCF_000298475 GCF_000415525 GCF_000415665 GCF_000298515 GCF_000270445 GCF_000415565 GCF_000270405 GCF_000204175 GCF_000415805 GCF_000417695 GCF_000298535 GCF_000298555 GCF_000417565 GCF_000298455 GCF_000298435 GCF_000417585 GCF_000270385 GCF_000417825 GCF_000417845 GCF_000417715 GCF_000417655 GCF_000298375 GCF_000417735 GCF_000317995 GCF_000415645 GCF_000415625 GCF_000191925 GCF_000338695 GCF_000415845 GCF_000298495 GCF_000415545 GCF_0 [...]
+s__Porphyromonas_macacae 1 GCF_000379945
+s__Aotine_herpesvirus_1 1 PRJNA78945
+s__Pseudoalteromonas_sp_BSi20480 1 GCF_000241365
+s__Aspergillus_foetidus_dsRNA_mycovirus 1 PRJNA186431
+s__Mycobacterium_phage_PBI1 1 PRJNA17165
+s__Bacteroidetes_oral_taxon_274 1 GCF_000163695
+s__Acheta_domesticus_volvovirus 1 PRJNA198480
+s__Clostridium_cellulovorans 2 GCF_000180115 GCF_000145275
+s__Bacillus_phage_B103 1 PRJNA14216
+s__Gordonia_paraffinivorans 1 GCF_000344155
+s__Methanosalsum_zhilinae 1 GCF_000217995
+s__Pyrobaculum_aerophilum 1 GCF_000007225
+s__Burkholderia_cepacia 1 GCF_000292915
+s__Marine_Group_I_thaumarchaeote_SCGC_AB_629_A13 1 GCF_000399745
+s__Ground_squirrel_hepatitis_virus 1 PRJNA14070
+s__Helminthosporium_victoriae_145S_virus 1 PRJNA14945
+s__Ignicoccus_hospitalis 1 GCF_000017945
+s__Bovine_respiratory_coronavirus_AH187 1 PRJNA39331
+s__Caldisericum_exile 1 GCF_000284335
+s__Primula_malacoides_virus_China_Mar2007 1 PRJNA39975
+s__Streptomyces_sp_C 1 GCF_000158895
+s__Malassezia_globosa 1 GCA_000181695
+s__Mesoflavibacter_zeaxanthinifaciens 1 GCF_000220585
+s__Blueberry_red_ringspot_virus 1 PRJNA14129
+s__Caldicellulosiruptor_lactoaceticus 1 GCF_000193435
+s__Erwinia_tracheiphila 1 GCF_000404125
+s__Thermococcus_sibiricus 1 GCF_000022545
+s__Actinomyces_turicensis 1 GCF_000296505
+s__Stenotrophomonas_phage_S1 1 PRJNA32787
+s__Haloarcula_vallismortis 1 GCF_000337775
+s__Shewanella_halifaxensis 1 GCF_000019185
+s__Eragrostis_curvula_streak_virus 1 PRJNA37889
+s__Cotton_leaf_curl_betasatellite 1 PRJNA14438
+s__Tobacco_leaf_curl_Thailand_virus 1 PRJNA19799
+s__Actinopolyspora_halophila 1 GCF_000371785
+s__Halanaerobium_hydrogeniformans 1 GCF_000166415
+s__Acinetobacter_sp_CIP_110321 1 GCF_000400715
+s__Propionibacterium_sp_KPL1849 1 GCF_000477835
+s__Enterobacter_sp_638 1 GCF_000016325
+s__Sorangium_cellulosum 2 GCF_000418325 GCF_000067165
+s__Propionibacterium_sp_KPL1847 1 GCF_000477855
+s__Providencia_stuartii 2 GCF_000259175 GCF_000154865
+s__Propionibacterium_sp_KPL1844 1 GCF_000477715
+s__Frangipani_mosaic_virus 1 PRJNA53499
+s__Fusarium_graminearum 1 GCA_000240135
+s__Brachyspira_hyodysenteriae 2 GCF_000383255 GCF_000022105
+s__Leptotrichia_sp_oral_taxon_879 1 GCF_000469385
+s__Cronobacter_malonaticus 2 GCF_000319555 GCF_000319535
+s__Thiobacillus_thioparus 1 GCF_000373385
+s__Nocardiopsis_baichengensis 1 GCF_000341205
+s__Rosa_rugosa_leaf_distortion_virus 1 PRJNA191123
+s__Corynebacterium_glucuronolyticum 2 GCF_000156595 GCF_000159595
+s__Jonquetella_anthropi 2 GCF_000161995 GCF_000237805
+s__Vibrio_phage_vB_VpaS_MAR10 1 PRJNA183157
+s__Johnsonella_ignava 1 GCF_000235445
+s__Bdellovibrio_phage_phiMH2K 1 PRJNA14107
+s__Enterobacteria_phage_HK022 1 PRJNA14048
+s__Mycobacterium_phage_Adzzy 1 PRJNA215109
+s__Enterococcus_raffinosus 2 GCF_000393895 GCF_000407525
+s__Enterobacteria_phage_vB_KleM_RaK2 1 PRJNA181223
+s__Calyptogena_okutanii_thioautotrophic_gill_symbiont 1 GCF_000010405
+s__Endosymbiont_phage_APSE_1 1 PRJNA14047
+s__Rhizobium_lupini 1 GCF_000304595
+s__Pantoea_sp_YR343 1 GCF_000282695
+s__Zygosaccharomyces_rouxii 1 GCA_000026365
+s__Anaerococcus_prevotii 2 GCF_000191725 GCF_000024105
+s__Butyricimonas_synergistica 1 GCF_000379665
+s__Mycobacterium_phage_D29 1 PRJNA14203
+s__Shigella_phage_SP18 1 PRJNA56019
+s__Borrelia_sp_SV1 1 GCF_000181875
+s__Pusillimonas_sp_T7_7 1 GCF_000209655
+s__Enterobacter_phage_IME11 1 PRJNA179425
+s__Ippy_virus 1 PRJNA16633
+s__Prevotella_buccae 2 GCF_000162455 GCF_000184945
+s__Corynebacterium_jeikeium 2 GCF_000163435 GCF_000006605
+s__European_bat_lyssavirus_2 1 PRJNA19759
+s__European_bat_lyssavirus_1 1 PRJNA19757
+s__Candidatus_Phytoplasma_mali 1 GCF_000026205
+s__Halococcus_thailandensis 1 GCF_000336715
+s__Mycobacterium_phage_TM4 1 PRJNA14154
+s__Pseudomonas_phage_PaBG 1 PRJNA215670
+s__Klebsiella_sp_KTE92 1 GCF_000398905
+s__Streptococcus_phage_TP_778L 1 PRJNA227111
+s__Lutibaculum_baratangense 1 GCF_000496075
+s__Spinach_severe_curly_top_virus 1 PRJNA59507
+s__Pepper_curly_top_virus 1 PRJNA19745
+s__Espirito_Santo_virus 1 PRJNA80737
+s__Methylovorus_sp_MP688 1 GCF_000183115
+s__Selenomonas_sp_FOBRC9 1 GCF_000287655
+s__Streptococcus_australis 2 GCF_000186465 GCF_000222745
+s__Listeria_marthii 1 GCF_000183865
+s__Croton_yellow_vein_mosaic_virus 1 PRJNA15195
+s__Variovorax_paradoxus 5 GCF_000463015 GCF_000023345 GCF_000377585 GCF_000184745 GCF_000382045
+s__Serratia_liquefaciens 1 GCF_000422085
+s__Cowpox_virus 1 PRJNA14174
+s__Streptomyces_sp_ATexAB_D23 1 GCF_000373645
+s__Coprobacillus_sp_29_1 1 GCF_000186525
+s__Selenomonas_artemidis 1 GCF_000187125
+s__Granulicatella_elegans 1 GCF_000162475
+s__Streptococcus_macedonicus 1 GCF_000283635
+s__Pseudanabaena_biceps 1 GCF_000332215
+s__Pseudomonas_phage_LUZ7 1 PRJNA42951
+s__Clostridium_phage_phiCD119 1 PRJNA16662
+s__Capnocytophaga_sp_oral_taxon_863 1 GCF_000466425
+s__Bhendi_yellow_vein_mosaic_virus 1 PRJNA14159
+s__Torque_teno_virus_10 1 PRJNA48151
+s__Helicobacter_canadensis 2 GCF_000162575 GCF_000155455
+s__Beet_virus_Q 1 PRJNA15091
+s__Tomato_black_ring_virus_satellite_RNA 1 PRJNA15016
+s__Helicobacter_cetorum 2 GCF_000259275 GCF_000259255
+s__Gramella_forsetii 1 GCF_000060345
+s__Night_heron_coronavirus_HKU19 1 PRJNA109277
+s__Rubellimicrobium_thermophilum 1 GCF_000442315
+s__Vernonia_yellow_vein_Fujian_virus_alphasatellite 1 PRJNA72145
+s__Pseudomonas_phage_LKA1 1 PRJNA21045
+s__Malvastrum_yellow_vein_betasatellite 1 PRJNA15317
+s__Chilli_leaf_curl_virus 1 PRJNA14250
+s__Paenibacillus_sp_HW567 1 GCF_000374185
+s__Cassava_mosaic_Madagascar_virus 1 PRJNA129597
+s__Pseudovibrio_sp_FO_BEG1 1 GCF_000236645
+s__Fort_Morgan_virus 1 PRJNA42147
+s__Austwickia_chelonae 1 GCF_000298175
+s__Ageratum_yellow_vein_betasatellite 1 PRJNA14444
+s__Desulfatibacillum_alkenivorans 1 GCF_000021905
+s__Yersinia_phage_phiR1_RT 1 PRJNA184143
+s__Leptospira_licerasiae 3 GCF_000244755 GCF_000216455 GCF_000244715
+s__Propionibacterium_phage_P105 1 PRJNA177533
+s__Alteromonas_sp_SN2 1 GCF_000213655
+s__Mannheimia_phage_vB_MhM_1152AP 1 PRJNA212715
+s__Staphylococcus_phage_StB12 1 PRJNA192927
+s__Sida_golden_yellow_vein_virus 1 PRJNA14253
+s__Legionella_tunisiensis 1 GCF_000308315
+s__Streptomyces_phage_R4 1 PRJNA179407
+s__Bacteroides_sp_1_1_30 1 GCF_000218365
+s__Murrumbidgee_virus 1 PRJNA225034
+s__Vicia_cryptic_virus 1 PRJNA15555
+s__Aedes_aegypti_densovirus 1 PRJNA37821
+s__Thysanoplusia_orichalcea_nucleopolyhedrovirus 1 PRJNA184813
+s__Yersinia_phage_PY54 1 PRJNA15227
+s__zeta_proteobacterium_SCGC_AB_137_I08 1 GCF_000379305
+s__Northern_cereal_mosaic_virus 1 PRJNA14984
+s__Pepper_veinal_mottle_virus 1 PRJNA33675
+s__Haloquadratum_walsbyi 4 GCF_000415965 GCF_000009185 GCF_000415985 GCF_000237865
+s__Paenibacillus_terrae 1 GCF_000235585
+s__Winogradskyella_psychrotolerans 1 GCF_000427335
+s__Methanococcoides_burtonii 1 GCF_000013725
+s__Dorea_sp_5_2 1 GCF_000403455
+s__Kluyvera_phage_Kvp1 1 PRJNA32673
+s__Microbacterium_barkeri 1 GCF_000299315
+s__Enterobacteria_phage_BA14 1 PRJNA30599
+s__Micavibrio_aeruginosavorus 1 GCF_000226315
+s__Sphingobium_sp_AP49 1 GCF_000281715
+s__Switchgrass_mosaic_virus 1 PRJNA66897
+s__Cyanophage_S_TIM5 1 PRJNA181237
+s__Erysipelotrichaceae_bacterium_5_2_54FAA 1 GCF_000163515
+s__Propionibacterium_phage_ATCC29399B_C 1 PRJNA177539
+s__Eggerthella_sp_YY7918 1 GCF_000270285
+s__Tobacco_ringspot_virus_satellite_RNA 1 PRJNA14189
+s__Spiroplasma_phage_SVTS2 1 PRJNA14032
+s__Blastomonas_sp_CACIA14H2 1 GCF_000503195
+s__Enterococcus_cecorum 3 GCF_000492155 GCF_000379745 GCF_000407565
+s__Prevotella_stercorea 1 GCF_000235885
+s__Chlorobium_chlorochromatii 1 GCF_000012585
+s__African_elephant_polyomavirus_1 1 PRJNA222309
+s__Lactobacillus_phage_c5 1 PRJNA181077
+s__Bacillus_alcalophilus 1 GCF_000292245
+s__Turnip_crinkle_virus_satellite_RNA 2 PRJNA14433 PRJNA14506
+s__New_World_begomovirus_associated_satellite_DNA 1 PRJNA88123
+s__Streptomyces_sp_303MFCol5_2 1 GCF_000383635
+s__Oceanicaulis_alexandrii 1 GCF_000420265
+s__Bacillus_oceanisediminis 1 GCF_000294775
+s__Laceyella_sacchari 1 GCF_000421885
+s__Scallion_virus_X 1 PRJNA15099
+s__Asticcacaulis_sp_YBE204 1 GCF_000495855
+s__Siegesbeckia_yellow_vein_virus 1 PRJNA17267
+s__Brevibacillus_sp_phR 1 GCF_000311785
+s__Beilong_virus 1 PRJNA16630
+s__Aeromonas_phage_31 1 PRJNA15416
+s__Kotonkan_virus 1 PRJNA159107
+s__Dethiobacter_alkaliphilus 1 GCF_000174415
+s__Andes_virus 1 PRJNA14746
+s__zeta_proteobacterium_SCGC_AB_604_B04 1 GCF_000379205
+s__Tylonycteris_bat_coronavirus_HKU4 1 PRJNA18863
+s__Lettuce_necrotic_yellows_virus 1 PRJNA16236
+s__Streptococcus_constellatus 5 GCF_000223295 GCF_000463395 GCF_000463445 GCF_000463425 GCF_000257785
+s__Kurthia_massiliensis 1 GCF_000285555
+s__Methylophilales_bacterium_HTCC2181 1 GCF_000168995
+s__Pseudomonas_sp_M47T1 1 GCF_000263855
+s__Chlorogloeopsis_sp_PCC_9212 1 GCF_000317265
+s__Burkholderia_phage_phiE12_2 1 PRJNA19161
+s__Nocardiopsis_synnemataformans 1 GCF_000340945
+s__Dokdonia_donghaensis 1 GCF_000152925
+s__Lysinibacillus_sphaericus 2 GCF_000017965 GCF_000392615
+s__Sclerotinia_sclerotiorum_debilitation_associated_RNA_virus 1 PRJNA15717
+s__SAR86_cluster_bacterium_SAR86E 1 GCF_000307935
+s__SAR86_cluster_bacterium_SAR86D 1 GCF_000252585
+s__SAR86_cluster_bacterium_SAR86C 1 GCF_000252565
+s__Tupaiid_herpesvirus_1 1 PRJNA14597
+s__Pelagibacter_phage_HTVC008M 1 PRJNA192865
+s__Staphylococcus_warneri 4 GCF_000332735 GCF_000321085 GCF_000175175 GCF_000211215
+s__Ageratum_yellow_vein_Singapore_alphasatellite 1 PRJNA14232
+s__Cyanothece_sp_PCC_7822 1 GCF_000147335
+s__Seneca_valley_virus 1 PRJNA32193
+s__Rhodococcus_erythropolis 5 GCF_000010105 GCF_000454045 GCF_000174835 GCF_000225665 GCF_000454425
+s__Methylophilus_sp_42 1 GCF_000384155
+s__Flavobacterium_enshiense 1 GCF_000498495
+s__Mycobacterium_phage_Whirlwind 1 PRJNA215117
+s__Vibrio_genomosp_F10 4 GCF_000287015 GCF_000287055 GCF_000287035 GCF_000287195
+s__Streptococcus_sp_BS35b 1 GCF_000286475
+s__Propionibacterium_phage_PHL071N05 1 PRJNA219109
+s__Thielaviopsis_basicola_mitovirus 1 PRJNA37715
+s__Tolumonas_auensis 1 GCF_000023065
+s__Acetohalobium_arabaticum 1 GCF_000144695
+s__Paramecium_bursaria_Chlorella_virus_AR158 1 PRJNA20991
+s__Pyrobaculum_calidifontis 1 GCF_000015805
+s__Wild_tomato_mosaic_virus 1 PRJNA20625
+s__Enterobacteriaceae_bacterium_9_2_54FAA 1 GCF_000185685
+s__Rhizobium_mesoamericanum 1 GCF_000312665
+s__Alloscardovia_omnicolens 2 GCF_000420505 GCF_000466365
+s__Aeromonas_phage_44RR2_8t 1 PRJNA14321
+s__Sulfurimonas_denitrificans 1 GCF_000012965
+s__Paenibacillus_sp_JDR_2 1 GCF_000023585
+s__Halogranum_salarium 1 GCF_000283335
+s__Clavibacter_michiganensis 3 GCF_000069225 GCF_000355695 GCF_000063485
+s__Natronorubrum_tibetense 2 GCF_000383975 GCF_000337235
+s__Xylella_fastidiosa 11 GCF_000019325 GCF_000219235 GCF_000007245 GCF_000006725 GCF_000506905 GCF_000166835 GCF_000148405 GCF_000466025 GCF_000506405 GCF_000166855 GCF_000019765
+s__Synechococcus_sp_CB0101 1 GCF_000179235
+s__Hydrogenophaga_sp_PBC 1 GCF_000263795
+s__Tomato_leaf_curl_Patna_virus 1 PRJNA36527
+s__Limnohabitans_sp_Rim47 1 GCF_000292865
+s__Cellulophaga_algicola 1 GCF_000186265
+s__Turkey_adenovirus_A 2 PRJNA14524 PRJNA15112
+s__Novosphingobium_sp_Rr_2_17 1 GCF_000272475
+s__Acyrthosiphon_pisum_virus 1 PRJNA40357
+s__Zucchini_green_mottle_mosaic_virus 1 PRJNA15189
+s__Methylosinus_trichosporium 1 GCF_000178815
+s__Mycobacterium_sp_155 1 GCF_000373905
+s__Yersinia_pestis 123 GCF_000268485 GCF_000323285 GCF_000268865 GCF_000268825 GCF_000269145 GCF_000475135 GCF_000324885 GCF_000269465 GCF_000323645 GCF_000268445 GCF_000324785 GCF_000007885 GCF_000022805 GCF_000268545 GCF_000268505 GCF_000323365 GCF_000269125 GCF_000269325 GCF_000268665 GCF_000269025 GCF_000169655 GCF_000268685 GCF_000268745 GCF_000169635 GCF_000169615 GCF_000170275 GCF_000323785 GCF_000324465 GCF_000324025 GCF_000018805 GCF_000323505 GCF_000022825 GCF_000268905 GCF_000 [...]
+s__Tomato_yellow_leaf_curl_China_virus 1 PRJNA15318
+s__Enterococcus_villorum 2 GCF_000407205 GCF_000393935
+s__Pantoea_sp_Sc1 1 GCF_000255315
+s__Caldilinea_aerophila 1 GCF_000281175
+s__Halovivax_asiaticus 1 GCF_000337515
+s__Macroptilium_golden_mosaic_virus 1 PRJNA30169
+s__Clostridium_pasteurianum 3 GCF_000389635 GCF_000506785 GCF_000330945
+s__Prochlorothrix_hollandica 2 GCF_000332315 GCF_000341585
+s__Enterovirus_A 1 PRJNA15445
+s__Blattabacterium_sp_Blaberus_giganteus 1 GCF_000262715
+s__Prochlorococcus_sp_W8 1 GCF_000291825
+s__Bacillus_phage_WBeta 1 PRJNA16329
+s__Epirus_cherry_virus 1 PRJNA30739
+s__Prochlorococcus_sp_W3 1 GCF_000291905
+s__Prochlorococcus_sp_W2 1 GCF_000291885
+s__Prochlorococcus_sp_W4 1 GCF_000291785
+s__Prochlorococcus_sp_W7 1 GCF_000291805
+s__Rhodobacter_sphaeroides 7 GCF_000012905 GCF_000273405 GCF_000015985 GCF_000212605 GCF_000021005 GCF_000269625 GCF_000016405
+s__Amycolatopsis_sp_ATCC_39116 1 GCF_000231075
+s__Bat_adeno_associated_virus_YNM 1 PRJNA51735
+s__Paenibacillus_peoriae 1 GCF_000236805
+s__Nitrospina_sp_AB_629_B18 1 GCF_000375765
+s__Turkey_adenovirus_5 1 PRJNA225923
+s__Thermus_thermophilus 4 GCF_000091545 GCF_000008125 GCF_000258245 GCF_000214845
+s__Demetria_terragena 1 GCF_000376825
+s__Streptomyces_sp_GBA_94_10 1 GCF_000495635
+s__Ruminococcus_champanellensis 1 GCF_000210095
+s__Moroccan_watermelon_mosaic_virus 1 PRJNA27897
+s__Myxococcus_fulvus 1 GCF_000219105
+s__Shewanella_loihica 1 GCF_000016065
+s__Terriglobus_roseus 1 GCF_000265425
+s__Cyclovirus_bat_USA_2009 1 PRJNA61951
+s__Bat_adenovirus_A 1 PRJNA84399
+s__Stx2_converting_phage_1717 1 PRJNA32213
+s__Kedougou_virus 1 PRJNA36617
+s__Candidatus_Korarchaeum_cryptofilum 1 GCF_000019605
+s__Paenibacillus_sp_ICGEB2008 1 GCF_000307675
+s__Bell_pepper_mottle_virus 1 PRJNA20059
+s__Actinobacillus_ureae 1 GCF_000188255
+s__Lactococcus_phage_bIL309 1 PRJNA14338
+s__Thermacetogenium_phaeum 1 GCF_000305935
+s__Salinispora_tropica 3 GCF_000016425 GCF_000377085 GCF_000377065
+s__Streptomyces_sp_W007 1 GCF_000239075
+s__Potato_yellow_vein_virus 1 PRJNA14924
+s__Moritella_dasanensis 1 GCF_000276805
+s__Malva_mosaic_virus 1 PRJNA17349
+s__Encephalitozoon_intestinalis 1 GCA_000146465
+s__Sciscionella_marina 1 GCF_000379465
+s__Zea_mosaic_virus 1 PRJNA177544
+s__Burkholderia_phage_phi1026b 1 PRJNA14410
+s__Orientia_tsutsugamushi 2 GCF_000063545 GCF_000010205
+s__Methanocaldococcus_sp_FS406_22 1 GCF_000025525
+s__Rice_tungro_bacilliform_virus 1 PRJNA14579
+s__Corchorus_yellow_spot_virus 1 PRJNA17993
+s__Enterobacteria_phage_C_1_INW_2012 1 PRJNA184162
+s__Cleome_leaf_crumple_virus_associated_DNA_1 1 PRJNA60045
+s__Rattail_cactus_necrosis_associated_virus 1 PRJNA78929
+s__Sphingobium_chlorophenolicum 1 GCF_000147835
+s__Bacillus_phage_BPS13 1 PRJNA177519
+s__Lactococcus_phage_949 1 PRJNA64559
+s__Chicken_astrovirus 1 PRJNA14804
+s__Bamboo_mosaic_virus_satellite_RNA 1 PRJNA14748
+s__Enterovibrio_norvegicus 3 GCF_000286855 GCF_000286835 GCF_000264435
+s__Brachymonas_chironomi 1 GCF_000374625
+s__Mesorhizobium_alhagi 1 GCF_000236565
+s__Vibrio_phage_CP_T1 1 PRJNA181062
+s__Streptomyces_sp_FxanaC1 1 GCF_000375625
+s__Choristoneura_biennis_entomopoxvirus_L 1 PRJNA203666
+s__Prevotella_denticola 2 GCF_000191765 GCF_000193395
+s__Blattabacterium_sp_Blatta_orientalis 1 GCF_000334405
+s__Streptococcus_porcinus 1 GCF_000187955
+s__Ignavibacterium_album 1 GCF_000258405
+s__Mycoplasma_arginini 1 GCF_000367785
+s__Saccharophagus_degradans 1 GCF_000013665
+s__Centipeda_periodontii 1 GCF_000213975
+s__RD114_retrovirus 1 PRJNA20979
+s__Bacillus_phage_IEBH 1 PRJNA31057
+s__African_cassava_mosaic_virus 1 PRJNA15175
+s__Tomato_yellow_leaf_curl_Indonesia_virus_Lembang 1 PRJNA17387
+s__Butyrivibrio_sp_AE2015 1 GCF_000420825
+s__Allamanda_leaf_curl_virus 1 PRJNA30179
+s__Oscillibacter_sp_KLE_1728 1 GCF_000469425
+s__Mycobacteriophage_Daenerys 1 PRJNA215121
+s__Desulfurispirillum_indicum 1 GCF_000177635
+s__Leuconostoc_sp_C2 1 GCF_000219785
+s__Lachnoanaerobaculum_saburreum 2 GCF_000185385 GCF_000257705
+s__Choristoneura_occidentalis_alphabaculovirus 1 PRJNA214177
+s__Syntrophomonas_wolfei 1 GCF_000014725
+s__Parastagonospora_nodorum 1 GCA_000146915
+s__Actinomyces_johnsonii 2 GCF_000466205 GCF_000466245
+s__Pseudomonas_sp_M1 1 GCF_000317185
+s__Rhodococcus_sp_R1101 1 GCF_000278445
+s__Mycobacterium_phage_Gumball 1 PRJNA32009
+s__Galinsoga_mosaic_virus 1 PRJNA15209
+s__Streptomyces_viridochromogenes 1 GCF_000158955
+s__Thermus_phage_P23_45 1 PRJNA20765
+s__Chilli_leaf_curl_Multan_alphasatellite 1 PRJNA39933
+s__Wolbachia_endosymbiont_of_Drosophila_simulans 2 GCF_000376585 GCF_000376605
+s__Desulfococcus_oleovorans 1 GCF_000018405
+s__Brevibacillus_panacihumi 1 GCF_000503775
+s__Zinnia_leaf_curl_disease_associated_sequence 1 PRJNA14440
+s__Galbibacter_marinus 1 GCF_000300875
+s__Bifidobacterium_sp_12_1_47BFAA 1 GCF_000185665
+s__Mycobacterium_phage_Phlyer 1 PRJNA33871
+s__Lyngbya_sp_PCC_8106 1 GCF_000169095
+s__Tomato_leaf_curl_Bangalore_virus 1 PRJNA14190
+s__Krokinobacter_sp_4H_3_7_5 1 GCF_000212355
+s__Lactococcus_phage_BM13 1 PRJNA213076
+s__Dulcamara_mottle_virus 1 PRJNA16188
+s__Malvastrum_yellow_vein_Baoshan_virus 1 PRJNA37891
+s__Shallot_latent_virus 1 PRJNA15426
+s__Pseudomonas_phage_phi2954 1 PRJNA34533
+s__Dyella_japonica 1 GCF_000292265
+s__Aedes_flavivirus 1 PRJNA39601
+s__Actinoalloteichus_spitiensis 1 GCF_000239155
+s__Streptococcus_phage_TP_J34 1 PRJNA188154
+s__Sugarcane_mosaic_virus 1 PRJNA14994
+s__Mycobacterium_phage_Severus 1 PRJNA206027
+s__Pea_enation_mosaic_virus_1 1 PRJNA14769
+s__Methyloglobulus_morosus 1 GCF_000496735
+s__Mapuera_virus 1 PRJNA19651
+s__Pea_enation_mosaic_virus_2 1 PRJNA14818
+s__Bacillus_mojavensis 1 GCF_000245335
+s__Xanthomonas_translucens 3 GCF_000334075 GCF_000331775 GCF_000313775
+s__Enterococcus_moraviensis 2 GCF_000407445 GCF_000394015
+s__Rhodobacteraceae_bacterium_KLH11 1 GCF_000158135
+s__Clostridium_phage_phiCD27 1 PRJNA32323
+s__Bacillus_phage_vB_BceM_Bc431v3 1 PRJNA195534
+s__Helicobacter_hepaticus 1 GCF_000007905
+s__Escherichia_phage_P13374 1 PRJNA177543
+s__Thioalkalivibrio_sp_ALMg2 1 GCF_000381145
+s__Thioalkalivibrio_sp_ALMg3 1 GCF_000381225
+s__Picrophilus_torridus 1 GCF_000008265
+s__Bacteroides_ovatus 5 GCF_000273195 GCF_000273215 GCF_000218325 GCF_000178275 GCF_000154125
+s__Thioalkalivibrio_sp_AL21 1 GCF_000381325
+s__Thioalkalivibrio_sp_ALMg9 1 GCF_000380625
+s__Indian_cassava_mosaic_virus 1 PRJNA14483
+s__Rhodococcus_opacus 3 GCF_000264745 GCF_000234335 GCF_000010805
+s__Campoletis_sonorensis_ichnovirus 1 PRJNA16738
+s__Caulobacter_sp_K31 1 GCF_000019145
+s__Halococcus_salifodinae 1 GCF_000336935
+s__Thauera_terpenica 1 GCF_000443165
+s__Borrelia_bavariensis 1 GCF_000196215
+s__Murine_astrovirus 1 PRJNA176429
+s__Oceanicaulis_sp_HTCC2633 1 GCF_000152745
+s__Neisseria_subflava 1 GCF_000173955
+s__Cassava_common_mosaic_virus 1 PRJNA14705
+s__Streptomyces_sp_LaPpAH_165 1 GCF_000373525
+s__Maize_mosaic_virus 1 PRJNA14920
+s__Southern_bean_mosaic_virus 1 PRJNA15356
+s__Phaeobacter_gallaeciensis 2 GCF_000154745 GCF_000203975
+s__secondary_endosymbiont_of_Ctenarytaina_eucalypti 1 GCF_000287335
+s__Ochrobactrum_sp_EGD_AQ16 1 GCF_000465835
+s__Hafnia_alvei 1 GCF_000239255
+s__Mycoplasma_alligatoris 1 GCF_000178375
+s__Enterobacter_asburiae 1 GCF_000224675
+s__St_Louis_encephalitis_virus 1 PRJNA16150
+s__Human_cosavirus_D 1 PRJNA38501
+s__Streptomyces_zinciresistens 1 GCF_000225525
+s__Weissella_halotolerans 1 GCF_000420365
+s__Vibrio_phage_vB_VpaM_MAR 1 PRJNA183156
+s__Clostridium_colicanis 1 GCF_000371465
+s__Drosophila_melanogaster_totivirus_SW_2009a 1 PRJNA41725
+s__Gammapapillomavirus_9 1 PRJNA39691
+s__Gammapapillomavirus_8 1 PRJNA36517
+s__Gammapapillomavirus_5 1 PRJNA28737
+s__Cellulophaga_phage_phiSM 1 PRJNA195497
+s__Gammapapillomavirus_7 1 PRJNA36519
+s__Gammapapillomavirus_6 3 PRJNA17119 PRJNA17121 PRJNA34847
+s__Gammapapillomavirus_1 1 PRJNA15492
+s__Lachancea_thermotolerans 1 GCA_000142805
+s__Cellulophaga_phage_phiST 1 PRJNA195498
+s__Ovine_herpesvirus_2 1 PRJNA16234
+s__Pseudomonas_fragi 2 GCF_000250615 GCF_000250595
+s__Leptospira_broomii 1 GCF_000243715
+s__Microcoleus_vaginatus 1 GCF_000214075
+s__Listeria_phage_A118 1 PRJNA14589
+s__Desulfurococcus_mucosus 1 GCF_000186365
+s__Pepper_yellow_vein_Mali_virus 1 PRJNA14348
+s__Aeromonas_sp_159 1 GCF_000292325
+s__Shewanella_sp_ANA_3 1 GCF_000203935
+s__Tetraselmis_viridis_virus_SI1 1 PRJNA195491
+s__Vibrio_scophthalmi 1 GCF_000222585
+s__Pyrococcus_abyssi_virus_1 1 PRJNA19929
+s__Enterobacteria_phage_alpha3 1 PRJNA14570
+s__Candidatus_Rickettsia_amblyommii 1 GCF_000284055
+s__Lactobacillus_pasteurii 1 GCF_000297025
+s__Lactobacillus_sanfranciscensis 1 GCF_000225325
+s__Grapevine_fanleaf_virus_satellite_RNA 1 PRJNA14986
+s__Methylophaga_aminisulfidivorans 1 GCF_000214595
+s__Squirrel_monkey_polyomavirus 1 PRJNA27775
+s__Bacillus_phage_phBC6A52 1 PRJNA15022
+s__Mirabilis_jalapa_mottle_virus 1 PRJNA74427
+s__Lachnospiraceae_bacterium_ICM7 1 GCF_000287675
+s__Bacillus_phage_phBC6A51 1 PRJNA15021
+s__Sweet_potato_leaf_curl_Uganda_virus_Uganda_Kampala_2008 1 PRJNA62213
+s__Leuconostoc_gasicomitatum 1 GCF_000196855
+s__Streptococcus_phage_P9 1 PRJNA20785
+s__Pestivirus_Giraffe_1 1 PRJNA14780
+s__Bacteroides_fluxus 1 GCF_000195635
+s__Methanomethylovorans_hollandica 1 GCF_000328665
+s__Potato_Virus_P 1 PRJNA20657
+s__Vibrio_metschnikovii 1 GCF_000176155
+s__Gordonia_neofelifaecis 1 GCF_000192435
+s__Avian_adeno_associated_virus 1 PRJNA14463
+s__Staphylococcus_phage_187 1 PRJNA15264
+s__Turnip_crinkle_virus 1 PRJNA14811
+s__Coprobacillus_sp_3_3_56FAA 1 GCF_000239735
+s__Staphylococcus_caprae_capitis 5 GCF_000174135 GCF_000263775 GCF_000160215 GCF_000183705 GCF_000221525
+s__Pseudoxanthomonas_sp_GW2 1 GCF_000283075
+s__Lindernia_anagallis_yellow_vein_virus 1 PRJNA19777
+s__Halovirus_HVTV_1 1 PRJNA186952
+s__Thermoanaerobacter_wiegelii 1 GCF_000147695
+s__Lachnospiraceae_bacterium_2_1_58FAA 1 GCF_000218465
+s__Marinitoga_piezophila 1 GCF_000255135
+s__Pontibacter_sp_BAB1700 1 GCF_000277005
+s__Aromatoleum_aromaticum 1 GCF_000025965
+s__Bacteroides_coprocola 1 GCF_000154845
+s__Halorubrum_arcis 1 GCF_000337015
+s__Streptomyces_sp_HGB0020 1 GCF_000411315
+s__Caldimonas_manganoxidans 1 GCF_000381125
+s__Middle_East_respiratory_syndrome_coronavirus 1 PRJNA183710
+s__Sida_golden_mosaic_Costa_Rica_virus 1 PRJNA14262
+s__Eggerthia_catenaformis 1 GCF_000340375
+s__Clostridium_innocuum 1 GCF_000371425
+s__Vibrionales_bacterium_SWAT_3 1 GCF_000169995
+s__Capnocytophaga_granulosa 1 GCF_000411115
+s__Lactococcus_phage_KSY1 1 PRJNA20783
+s__Streptomyces_acidiscabies 1 GCF_000242715
+s__Enterobacteria_phage_JS10 1 PRJNA38265
+s__Omegapapillomavirus_1 1 PRJNA29915
+s__Thermoanaerobacter_pseudethanolicus 1 GCF_000019085
+s__Paenisporosarcina_sp_HGH0030 1 GCF_000411295
+s__Actinopolyspora_mortivallis 1 GCF_000384035
+s__Clostridium_clariflavum 1 GCF_000237085
+s__Lisianthus_necrosis_virus 1 PRJNA16737
+s__Pedilanthus_leaf_curl_virus 1 PRJNA34665
+s__Rhizobium_phage_16_3 1 PRJNA30845
+s__Gammapapillomavirus_HPV127 1 PRJNA51741
+s__Pepper_vein_yellows_virus 1 PRJNA62493
+s__Bocavirus_gorilla_GBoV1_2009 1 PRJNA51179
+s__Haloarcula_californiae 1 GCF_000337755
+s__Listeria_phage_LP_030_2 1 PRJNA209078
+s__Hyphomicrobium_sp 1 GCF_000253295
+s__Molluscum_contagiosum_virus 1 PRJNA14328
+s__Xanthomonas_gardneri 1 GCF_000192065
+s__Psychrobacter_cryohalolentis 1 GCF_000013905
+s__Alphamesonivirus_1 2 PRJNA68059 PRJNA71143
+s__Elizabethkingia_meningoseptica 3 GCF_000367325 GCF_000447375 GCF_000401415
+s__Thermococcus_gammatolerans 1 GCF_000022365
+s__Sphingobacterium_spiritivorum 2 GCF_000143765 GCF_000159515
+s__Actinobaculum_schaalii 1 GCF_000411135
+s__Mycobacterium_smegmatis 4 GCF_000331165 GCF_000283295 GCF_000328565 GCF_000015005
+s__Prevotella_marshii 1 GCF_000146675
+s__Verbena_virus_Y 1 PRJNA29881
+s__Limnohabitans_sp_Rim28 1 GCF_000293865
+s__Escherichia_phage_TL_2011b 1 PRJNA181074
+s__Escherichia_phage_TL_2011c 1 PRJNA181075
+s__Asaia_sp_SF2_1 1 GCF_000505765
+s__beta_proteobacterium_KB13 1 GCF_000156155
+s__Gordonia_kroppenstedtii 1 GCF_000380485
+s__Mycoplasma_iowae 1 GCF_000227355
+s__Rhodopseudomonas_palustris 7 GCF_000013745 GCF_000013685 GCF_000195775 GCF_000020445 GCF_000013365 GCF_000014825 GCF_000177255
+s__Paenibacillus_riograndensis 1 GCF_000224945
+s__Caminibacter_mediatlanticus 1 GCF_000170735
+s__Streptomyces_niveus 1 GCF_000497425
+s__Broad_bean_true_mosaic_virus 1 PRJNA214691
+s__Acinetobacter_sp_ANC_3929 1 GCF_000369405
+s__Arthrobacter_sp_AK_YN10 1 GCF_000465895
+s__Salinispora_arenicola 21 GCF_000375165 GCF_000375125 GCF_000380945 GCF_000373845 GCF_000375085 GCF_000259615 GCF_000378645 GCF_000375145 GCF_000375005 GCF_000384275 GCF_000259675 GCF_000375105 GCF_000375185 GCF_000375045 GCF_000018265 GCF_000375205 GCF_000375025 GCF_000378665 GCF_000378705 GCF_000378685 GCF_000377605
+s__Shewanella_frigidimarina 1 GCF_000014705
+s__Shigella_boydii 7 GCF_000020185 GCF_000268185 GCF_000012025 GCF_000268145 GCF_000193915 GCF_000211975 GCF_000211955
+s__Tetrapisispora_blattae 1 GCA_000315915
+s__European_mountain_ash_ringspot_associated_virus 1 PRJNA39973
+s__Lactobacillus_phage_Sha1 1 PRJNA181084
+s__Agrobacterium_sp_10MFCol1_1 1 GCF_000381165
+s__Leptospira_inadai 1 GCF_000243675
+s__Bacillus_smithii 1 GCF_000238675
+s__Salmonella_phage_SSU5 1 PRJNA177521
+s__Holospora_undulata 1 GCF_000388175
+s__Tomato_leaf_curl_China_betasatellite 1 PRJNA15446
+s__Staphylococcus_phage_53_sensu_lato 11 PRJNA15260 PRJNA15265 PRJNA15266 PRJNA15270 PRJNA15273 PRJNA15275 PRJNA15277 PRJNA15278 PRJNA15279 PRJNA15280 PRJNA15281
+s__Enterobacteria_phage_HK446 1 PRJNA183141
+s__Sida_yellow_vein_disease_associated_DNA_1 1 PRJNA48075
+s__Mycoplasma_fermentans 2 GCF_000148625 GCF_000186005
+s__Gloeocapsa_sp_PCC_73106 1 GCF_000332035
+s__Thioflavicoccus_mobilis 1 GCF_000327045
+s__Vibrio_proteolyticus 1 GCF_000467125
+s__Plum_bark_necrosis_stem_pitting_associated_virus 1 PRJNA27909
+s__Nosema_ceranae 1 GCA_000182985
+s__Canine_papillomavirus_9 1 PRJNA74353
+s__Canine_papillomavirus_8 1 PRJNA73441
+s__Gordonia_aichiensis 1 GCF_000332975
+s__Acidaminococcus_sp_BV3L6 1 GCF_000468835
+s__Sida_golden_mosaic_virus 1 PRJNA14083
+s__Bartonella_vinsonii 4 GCF_000341385 GCF_000385415 GCF_000278335 GCF_000278235
+s__Classical_swine_fever_virus 1 PRJNA15457
+s__Simbu_virus 1 PRJNA173359
+s__Barley_stripe_mosaic_virus 1 PRJNA15031
+s__Acinetobacter_venetianus 3 GCF_000271425 GCF_000368585 GCF_000308235
+s__Enterobacteria_phage_JS 1 PRJNA27983
+s__Gordonia_alkanivorans 2 GCF_000503935 GCF_000225505
+s__Tuber_aestivum_endornavirus 1 PRJNA61903
+s__White_clover_cryptic_virus_2 1 PRJNA198685
+s__Burkholderia_phage_BcepB1A 1 PRJNA14476
+s__Bradyrhizobium_sp_ORS_285 1 GCF_000239755
+s__Clostridium_symbiosum 3 GCF_000189595 GCF_000466485 GCF_000189615
+s__Sigmapapillomavirus_1 1 PRJNA15171
+s__Bacillus_phage_Cherry 1 PRJNA15784
+s__Acidianus_spindle_shaped_virus_1 1 PRJNA42351
+s__Rhodococcus_phage_E3 1 PRJNA206474
+s__Aerococcus_urinae 1 GCF_000193205
+s__Enterobacteria_phage_SfV 1 PRJNA14162
+s__Mycobacterium_phage_PMC 1 PRJNA17169
+s__Acidovorax_delafieldii 1 GCF_000175235
+s__Methanotorris_formicicus 1 GCF_000243455
+s__Brachyspira_intermedia 1 GCF_000223215
+s__East_Asian_Passiflora_virus 1 PRJNA16326
+s__Sida_micrantha_mosaic_virus 1 PRJNA14343
+s__Rhizobium_etli 9 GCF_000172775 GCF_000172695 GCF_000442435 GCF_000092045 GCF_000172715 GCF_000172795 GCF_000172755 GCF_000172735 GCF_000020265
+s__Tomato_leaf_curl_betasatellite 1 PRJNA14622
+s__Cycloclasticus_pugetii 1 GCF_000384415
+s__Salmonella_phage_Fels_2 1 PRJNA32273
+s__Actinoplanes_phage_phiAsp2 1 PRJNA14378
+s__Wolbachia_endosymbiont_of_Drosophila_melanogaster 2 GCF_000475015 GCF_000008025
+s__Tomato_leaf_curl_Hanoi_virus 1 PRJNA62755
+s__Rothia_dentocariosa 2 GCF_000143585 GCF_000164695
+s__Mycobacterium_phage_Papyrus 1 PRJNA215107
+s__Pseudomonas_sp_CMAA1215 1 GCF_000474765
+s__Bacillus_toyonensis 1 GCF_000496285
+s__Candidatus_Prevotella_conceptionensis 1 GCF_000312305
+s__Clerodendrum_golden_mosaic_China_virus 1 PRJNA32175
+s__Tomato_leaf_curl_Sulawesi_virus 1 PRJNA41173
+s__Heliothis_zea_virus_1 1 PRJNA14215
+s__Primate_T_lymphotropic_virus_3 1 PRJNA14732
+s__Primate_T_lymphotropic_virus_2 1 PRJNA15221
+s__Acinetobacter_phage_AB3 1 PRJNA206500
+s__Sendai_virus 1 PRJNA15023
+s__Klebsiella_phage_0507_KN2_1 1 PRJNA219106
+s__Pepper_yellow_leaf_curl_Indonesia_virus 1 PRJNA17429
+s__Desulfovibrio_sp_6_1_46AFAA 1 GCF_000224635
+s__Oscillibacter_sp_KLE_1745 1 GCF_000469445
+s__Bacillus_infantis 1 GCF_000473245
+s__Bacillus_sp_HYC_10 1 GCF_000300535
+s__Vibrio_phage_fs2 1 PRJNA14088
+s__Vibrio_phage_fs1 1 PRJNA14227
+s__Halomonas_boliviensis 1 GCF_000236035
+s__Bovine_papular_stomatitis_virus 1 PRJNA14469
+s__Bordetella_avium 1 GCF_000070465
+s__Methylocystis_parvus 1 GCF_000283235
+s__Marinomonas_sp_MED121 1 GCF_000153025
+s__Methylobacter_sp_UW_659_2_G11 1 GCF_000375905
+s__Streptococcus_phage_C1 1 PRJNA14288
+s__Methanobacterium_sp_Maddingley_MBC34 1 GCF_000309865
+s__Helicoverpa_armigera_stunt_virus 1 PRJNA14652
+s__Saccharomyces_cerevisiae 1 GCA_000146045
+s__Carboxydothermus_hydrogenoformans 1 GCF_000012865
+s__Eragrostis_minor_streak_virus 1 PRJNA67111
+s__Rhodococcus_sp_29MFTsu3_1 1 GCF_000382105
+s__Pseudomonas_sp_GM41_2012 1 GCF_000282315
+s__Shewanella_violacea 1 GCF_000091325
+s__Malvastrum_leaf_curl_virus 1 PRJNA16325
+s__Indian_citrus_ringspot_virus 1 PRJNA14716
+s__Narcissus_yellow_stripe_virus 1 PRJNA32687
+s__Kingella_denitrificans 1 GCF_000190695
+s__Tetraselmis_viridis_virus_S20 1 PRJNA195490
+s__Periplaneta_fuliginosa_densovirus 1 PRJNA14091
+s__Sphingobium_sp_HDIP04 1 GCF_000445085
+s__Yarrowia_lipolytica 1 GCA_000002525
+s__Choristoneura_rosaceana_entomopoxvirus_L 1 PRJNA203664
+s__Schizosaccharomyces_japonicus 1 GCA_000149845
+s__Saccharibacter_floricola 1 GCF_000378165
+s__Propionibacterium_phage_PHL114L00 1 PRJNA219112
+s__Acaryochloris_sp_CCMEE_5410 1 GCF_000238775
+s__Micrococcus_luteus 4 GCF_000309825 GCF_000176875 GCF_000180435 GCF_000023205
+s__Hana_virus 1 PRJNA196418
+s__Celeribacter_baekdonensis 1 GCF_000299875
+s__Tomato_yellow_leaf_distortion_virus 1 PRJNA165747
+s__Dahlia_latent_viroid 1 PRJNA186953
+s__Desulfobacula_toluolica 1 GCF_000307105
+s__Staphylococcus_phage_phiMR11 1 PRJNA28065
+s__Aedes_albopictus_densovirus 1 PRJNA14581
+s__Oscillatoria_sp_PCC_6506 1 GCF_000180455
+s__Pseudomonas_sp_P179 1 GCF_000478485
+s__Brucella_sp_NF_2653 1 GCF_000177155
+s__Streptococcus_merionis 1 GCF_000380085
+s__Halorubrum_hochstenium 1 GCF_000337075
+s__alpha_proteobacterium_HIMB59 1 GCF_000299115
+s__Aeromonas_phage_Aeh1 1 PRJNA14312
+s__Bacillus_coagulans 5 GCF_000333935 GCF_000333915 GCF_000169195 GCF_000217835 GCF_000223155
+s__Candidatus_Liberibacter_americanus 2 GCF_000496595 GCF_000350385
+s__Staphylococcus_phage_phiN315 1 PRJNA14527
+s__Psipapillomavirus_1 1 PRJNA17549
+s__Staphylococcus_phage_SMSAP5 1 PRJNA181240
+s__Sphingobium_indicum 1 GCF_000264945
+s__Methylophaga_lonarensis 1 GCF_000349205
+s__Geobacillus_sp_C56_T3 1 GCF_000092445
+s__Nanovirus_like_particle 1 PRJNA14386
+s__Diplorickettsia_massiliensis 1 GCF_000257395
+s__Drosophila_A_virus 1 PRJNA39351
+s__Gilvimarinus_chinensis 1 GCF_000377745
+s__Mesorhizobium_opportunistum 1 GCF_000176035
+s__Pseudomonas_putida 24 GCF_000478865 GCF_000281215 GCF_000183645 GCF_000495455 GCF_000016865 GCF_000007565 GCF_000325725 GCF_000285395 GCF_000390005 GCF_000226035 GCF_000497385 GCF_000219705 GCF_000019445 GCF_000410575 GCF_000319305 GCF_000412675 GCF_000292775 GCF_000226475 GCF_000294445 GCF_000271965 GCF_000287915 GCF_000264665 GCF_000019125 GCF_000367825
+s__Singulisphaera_acidiphila 2 GCF_000242455 GCF_000255675
+s__Mythimna_loreyi_densovirus 1 PRJNA14346
+s__Weissella_cibaria 1 GCF_000193635
+s__Rhizobium_tropici 1 GCF_000330885
+s__Salmonella_phage_ViI 1 PRJNA64767
+s__Methylacidiphilum_infernorum 1 GCF_000019665
+s__Listeria_phage_vB_LmoM_AG20 1 PRJNA195527
+s__Vibrio_phage_VEJphi 1 PRJNA38367
+s__Prevotella_maculosa 2 GCF_000243015 GCF_000382385
+s__Streptomyces_sp_CNT372 1 GCF_000377145
+s__Cyanophage_NATL2A_133 1 PRJNA81185
+s__Anaplasma_phagocytophilum 6 GCF_000478445 GCF_000439755 GCF_000439775 GCF_000013125 GCF_000478425 GCF_000439795
+s__Verticillium_alfalfae 1 GCA_000150825
+s__Enterobacteria_phage_BZ13 1 PRJNA14635
+s__Avastrovirus_3 1 PRJNA14954
+s__Vibrio_alginolyticus 4 GCF_000176055 GCF_000354175 GCF_000153505 GCF_000467145
+s__Marinobacter_manganoxydans 1 GCF_000235625
+s__Erwinia_phage_FE44 1 PRJNA227003
+s__Human_cosavirus_E 1 PRJNA38493
+s__Desulfobacterium_autotrophicum 1 GCF_000020365
+s__Rhynchosai_mild_mosaic_virus 1 PRJNA66547
+s__Human_cosavirus_B 1 PRJNA38499
+s__Bacillus_sp_SG_1 1 GCF_000181495
+s__Pleurocapsa_minor 1 GCF_000317025
+s__Banana_mild_mosaic_virus 1 PRJNA14711
+s__Equid_herpesvirus_2 1 PRJNA14457
+s__Equid_herpesvirus_1 1 PRJNA14465
+s__Chlamydia_pecorum 4 GCF_000470825 GCF_000470765 GCF_000204135 GCF_000470805
+s__Hydrogenivirga_sp_128_5_R1_1 1 GCF_000171895
+s__Equid_herpesvirus_4 1 PRJNA14418
+s__Equid_herpesvirus_9 1 PRJNA33137
+s__Equid_herpesvirus_8 1 PRJNA162499
+s__Maize_rayado_fino_virus 1 PRJNA15381
+s__Aurantimonas_manganoxydans 1 GCF_000153465
+s__Aliivibrio_logei 2 GCF_000390125 GCF_000286935
+s__Pseudoalteromonas_haloplanktis 3 GCF_000238355 GCF_000026085 GCF_000212655
+s__Nitrobacter_sp_Nb_311A 1 GCF_000152905
+s__Cowpea_mosaic_virus 1 PRJNA15283
+s__Pseudomonas_phage_YuA 1 PRJNA28053
+s__Bacteroides_phage_B124_14 1 PRJNA82753
+s__Sanguibacter_sp_JC301 1 GCF_000312125
+s__Cherry_green_ring_mottle_virus 1 PRJNA14650
+s__Vibrio_crassostreae 5 GCF_000272065 GCF_000272205 GCF_000272185 GCF_000272045 GCF_000272085
+s__Bhargavaea_cecembensis 1 GCF_000348905
+s__Shewanella_piezotolerans 1 GCF_000014885
+s__Sida_leaf_curl_virus_satellite_DNA_beta 1 PRJNA19823
+s__Roseobacter_sp_CCS2 1 GCF_000169435
+s__Amorphus_coralli 1 GCF_000374525
+s__Halomonas_sp_KM_1 1 GCF_000246875
+s__Streptomyces_scabrisporus 1 GCF_000372745
+s__Teredinibacter_turnerae 6 GCF_000381665 GCF_000023025 GCF_000379165 GCF_000372325 GCF_000381645 GCF_000372925
+s__Streptococcus_criceti 1 GCF_000187975
+s__Duck_parvovirus 1 PRJNA14425
+s__Mycoplasma_crocodyli 1 GCF_000025845
+s__Enterococcus_hirae 3 GCF_000407425 GCF_000271405 GCF_000393835
+s__Atopobium_sp_oral_taxon_199 1 GCF_000411555
+s__Natrinema_sp_J7_2 1 GCF_000281695
+s__Xanthomonas_sp_SHU308 1 GCF_000364645
+s__Hosta_virus_X 1 PRJNA32693
+s__Tomato_leaf_curl_New_Delhi_betasatellite 1 PRJNA14451
+s__Streptococcus_sp_I_G2 1 GCF_000479335
+s__Gibbon_ape_leukemia_virus 1 PRJNA14657
+s__Mycobacterium_phage_Fishburne 1 PRJNA206033
+s__Alfalfa_mosaic_virus 1 PRJNA14667
+s__Pseudomonas_viridiflava 1 GCF_000307715
+s__Sinorhizobium_medicae 3 GCF_000372345 GCF_000378785 GCF_000017145
+s__Desulfovibrio_sp_U5L 1 GCF_000245055
+s__Halomonas_sp_BJGMM_B45 1 GCF_000470745
+s__Borrelia_miyamotoi 1 GCF_000445425
+s__Conexibacter_woesei 1 GCF_000025265
+s__Bovine_foamy_virus 1 PRJNA14646
+s__Chitiniphilus_shinanonensis 1 GCF_000374805
+s__Grapevine_yellow_speckle_viroid_2 1 PRJNA14764
+s__Brachyspira_pilosicoli 4 GCF_000319185 GCF_000325665 GCF_000143725 GCF_000296575
+s__Sulfobacillus_thermosulfidooxidans 1 GCF_000294425
+s__Desulfobacter_postgatei 1 GCF_000233695
+s__Desulfovibrio_sp_3_1_syn3 1 GCF_000145315
+s__Australian_bat_lyssavirus 1 PRJNA14730
+s__Centrosema_yellow_spot_virus 1 PRJNA124057
+s__Rhodobacter_phage_RcapMu 1 PRJNA76743
+s__Alkaliphilus_metalliredigens 1 GCF_000016985
+s__Grapevine_endophyte_endornavirus 1 PRJNA181245
+s__Gossypium_darwinii_symptomless_alphasatellite 1 PRJNA39593
+s__Staphylococcus_sp_HGB0015 1 GCF_000411275
+s__Grapevine_yellow_speckle_viroid_1 1 PRJNA14963
+s__Bacillus_phage_SPO1 1 PRJNA32379
+s__Abaca_bunchy_top_virus 1 PRJNA28697
+s__Campylobacter_showae 3 GCF_000175655 GCF_000313615 GCF_000344295
+s__Propionibacterium_phage_P104A 1 PRJNA177532
+s__Escherichia_phage_ADB_2 1 PRJNA183155
+s__Pseudomonas_phage_MPK6 1 PRJNA227001
+s__Brucella_phage_Tb 1 PRJNA181063
+s__Nitratireductor_aquibiodomus 1 GCF_000265055
+s__Sweet_potato_vein_clearing_virus 1 PRJNA64493
+s__Ross_s_goose_hepatitis_B_virus 2 PRJNA14380 PRJNA14403
+s__Porphyromonas_gulae 1 GCF_000378065
+s__Maribacter_sp_HTCC2170 1 GCF_000153165
+s__Ralstonia_sp_AU12_08 1 GCF_000442475
+s__Veillonella_sp_6_1_27 1 GCF_000163735
+s__Streptomyces_sp_CcalMP_8W 1 GCF_000373305
+s__Pseudoalteromonas_agarivorans 1 GCF_000363985
+s__Vibrio_parahaemolyticus 21 GCF_000454145 GCF_000500755 GCF_000182465 GCF_000196095 GCF_000454225 GCF_000154045 GCF_000500105 GCF_000454475 GCF_000454455 GCF_000454205 GCF_000182385 GCF_000315135 GCF_000182365 GCF_000454185 GCF_000477475 GCF_000328405 GCF_000454245 GCF_000195415 GCF_000454165 GCF_000182345 GCF_000454265
+s__Cotia_virus 1 PRJNA85563
+s__Anaerococcus_hydrogenalis 2 GCF_000191745 GCF_000173355
+s__Pectobacterium_phage_phiTE 1 PRJNA188533
+s__Verrucomicrobia_bacterium_SCGC_AAA168_E21 1 GCF_000264625
+s__Blueberry_scorch_virus 1 PRJNA15329
+s__Adoxophyes_honmai_nucleopolyhedrovirus 1 PRJNA14408
+s__Geobacillus_kaustophilus 2 GCF_000415905 GCF_000009785
+s__Hydrocarboniphaga_effusa 1 GCF_000271305
+s__Parietaria_mottle_virus 1 PRJNA14940
+s__Mycobacterium_phage_BPs 1 PRJNA29917
+s__Bacillus_coahuilensis 1 GCF_000171615
+s__Pseudomonas_sp_Chol1 1 GCF_000306015
+s__West_Nile_virus 1 PRJNA30293
+s__Grapevine_satellite_virus 1 PRJNA208539
+s__Caldicellulosiruptor_kronotskyensis 1 GCF_000166775
+s__Porcine_associated_stool_circular_virus 1 PRJNA175586
+s__Bordetella_sp_FB_8 1 GCF_000382185
+s__Weissella_ceti 1 GCF_000320345
+s__Erwinia_sp_Ejp617 1 GCF_000165815
+s__Aquaspirillum_serpens 1 GCF_000420525
+s__Flexibacter_litoralis 1 GCF_000265505
+s__Pseudomonas_phage_D3112 1 PRJNA14334
+s__Holdemania_sp_AP2 1 GCF_000327285
+s__Mud_crab_dicistrovirus 1 PRJNA61121
+s__Langat_virus 1 PRJNA15370
+s__Bacteroides_caccae 2 GCF_000169015 GCF_000273725
+s__Bradyrhizobium_sp_S23321 1 GCF_000284275
+s__Xenopus_laevis_endogenous_retrovirus_Xen1 1 PRJNA30173
+s__Comamonas_testosteroni 5 GCF_000168855 GCF_000241525 GCF_000093145 GCF_000178915 GCF_000241245
+s__Eudoraea_adriatica 1 GCF_000382125
+s__Acinetobacter_sp_NIPH_542 1 GCF_000369825
+s__Actinobacillus_capsulatus 1 GCF_000374285
+s__Nodamura_virus 1 PRJNA14724
+s__Janibacter_sp_HTCC2649 1 GCF_000152705
+s__Gordonia_sp_KTR9 1 GCF_000143885
+s__Vibrio_genomosp_F6 1 GCF_000272145
+s__Leucobacter_sp_UCD_THU 1 GCF_000349545
+s__Tomato_mild_yellow_leaf_curl_Aragua_virus 1 PRJNA19653
+s__Haloferax_larsenii 1 GCF_000336955
+s__Latino_virus 1 PRJNA29905
+s__Phycisphaera_mikurensis 1 GCF_000284115
+s__Eubacterium_rectale 3 GCF_000209935 GCF_000020605 GCF_000209955
+s__Escherichia_fergusonii 2 GCF_000026225 GCF_000190495
+s__Brochothrix_phage_NF5 1 PRJNA64545
+s__Gluconacetobacter_hansenii 1 GCF_000164395
+s__Helicobacter_pylori 271 GCF_000275185 GCF_000275365 GCF_000274325 GCF_000345065 GCF_000148855 GCF_000345405 GCF_000392455 GCF_000274405 GCF_000275425 GCF_000345465 GCF_000275085 GCF_000274905 GCF_000444325 GCF_000274965 GCF_000345565 GCF_000192335 GCF_000346875 GCF_000359645 GCF_000274745 GCF_000345885 GCF_000307795 GCF_000345945 GCF_000256035 GCF_000196755 GCF_000023805 GCF_000270045 GCF_000346025 GCF_000349505 GCF_000345145 GCF_000299815 GCF_000275385 GCF_000274025 GCF_000273805 GCF [...]
+s__Munia_coronavirus_HKU13 1 PRJNA32703
+s__Rhodococcus_phage_RRH1 1 PRJNA81169
+s__Pseudomonas_phage_MPK7 1 PRJNA215673
+s__Bovine_adeno_associated_virus 1 PRJNA14381
+s__Gallid_herpesvirus_2 1 PRJNA14402
+s__Silicibacter_sp_TrichCH4B 1 GCF_000161815
+s__Erysipelothrix_tonsillarum 1 GCF_000373785
+s__Bean_yellow_mosaic_Mexico_virus 1 PRJNA66545
+s__Talaromyces_stipitatus 1 GCA_000003125
+s__Arthroderma_gypseum 1 GCA_000150975
+s__Bacillus_sp_m3_13 1 GCF_000175075
+s__Aliivibrio_fischeri 4 GCF_000011805 GCF_000241785 GCF_000287175 GCF_000020845
+s__Rickettsia_honei 1 GCF_000263055
+s__Haloferax_mediterranei 2 GCF_000337295 GCF_000306765
+s__Cereal_yellow_dwarf_virus_RPV_satellite_RNA 1 PRJNA14169
+s__Bordetella_petrii 1 GCF_000067205
+s__Betapapillomavirus_5 1 PRJNA15488
+s__Betapapillomavirus_4 1 PRJNA14406
+s__Duvenhage_virus 1 PRJNA194144
+s__Rhodococcus_rhodnii 1 GCF_000389715
+s__Ageratum_Yellow_vein_China_virus_OX1 1 PRJNA202889
+s__Staphylococcus_phage_EW 1 PRJNA15272
+s__Geobacillus_sp_G11MC16 1 GCF_000173035
+s__Parvularcula_bermudensis 1 GCF_000152825
+s__Streptococcus_phage_Cp_1 1 PRJNA14584
+s__Carnobacterium_sp_17_4 1 GCF_000195575
+s__Junonia_coenia_densovirus 1 PRJNA15423
+s__Acinetobacter_sp_CIP_102136 1 GCF_000369685
+s__Bacteroides_sp_D22 1 GCF_000163675
+s__Bacteroides_sp_D20 1 GCF_000162215
+s__Bunyamwera_virus 1 PRJNA14649
+s__Phytophthora_endornavirus_1 1 PRJNA15418
+s__Mycobacterium_phage_Rizal 1 PRJNA31281
+s__Sin_Nombre_virus 1 PRJNA15005
+s__Mycobacterium_abscessus 61 GCF_000069185 GCF_000332605 GCF_000270565 GCF_000270925 GCF_000271145 GCF_000271225 GCF_000271125 GCF_000271105 GCF_000261105 GCF_000270765 GCF_000333695 GCF_000500165 GCF_000277775 GCF_000270785 GCF_000270645 GCF_000260575 GCF_000270585 GCF_000271025 GCF_000257245 GCF_000271205 GCF_000280595 GCF_000445035 GCF_000271045 GCF_000271265 GCF_000280615 GCF_000270945 GCF_000270845 GCF_000271185 GCF_000280655 GCF_000270825 GCF_000500185 GCF_000270665 GCF_000271165 [...]
+s__Plesiomonas_shigelloides 1 GCF_000392595
+s__Gremmeniella_abietina_mitochondrial_RNA_virus_S2 1 PRJNA15229
+s__Crow_polyomavirus 1 PRJNA16654
+s__Actinomyces_cardiffensis 1 GCF_000364865
+s__Plutella_xylostella_granulovirus 1 PRJNA14104
+s__Paramecium_tetraurelia 1 GCA_000165425
+s__Nonomuraea_coxensis 1 GCF_000379885
+s__Burkholderia_phage_Bcep43 1 PRJNA14411
+s__Klebsiella_phage_KP15 1 PRJNA47333
+s__Thioalkalivibrio_sp_K90mix 1 GCF_000025545
+s__Streptococcus_caballi 1 GCF_000379985
+s__Burkholderia_oklahomensis 2 GCF_000170375 GCF_000170355
+s__Sugarcane_bacilliform_IM_virus 1 PRJNA14123
+s__Mycobacterium_phage_Nigel 1 PRJNA30609
+s__Oxalobacteraceae_bacterium_IMCC9480 1 GCF_000195205
+s__Sweet_potato_leaf_curl_Japan_virus 1 PRJNA217880
+s__Microbulbifer_variabilis 1 GCF_000380565
+s__Paenibacillus_dendritiformis 1 GCF_000245555
+s__Coccidioides_posadasii 1 GCA_000151335
+s__Nitrolancetus_hollandicus 1 GCF_000297255
+s__Clostridium_methylpentosum 1 GCF_000158655
+s__Parvovirus_NIH_CQV 1 PRJNA215356
+s__alpha_proteobacterium_HIMB5 1 GCF_000299095
+s__Bacteroides_clarus 1 GCF_000195615
+s__Archaeoglobus_veneficus 1 GCF_000194625
+s__Brucella_suis 37 GCF_000371225 GCF_000371125 GCF_000371305 GCF_000365705 GCF_000480315 GCF_000366205 GCF_000007505 GCF_000209635 GCF_000371265 GCF_000480155 GCF_000366245 GCF_000480055 GCF_000480035 GCF_000236255 GCF_000371245 GCF_000365585 GCF_000371325 GCF_000160275 GCF_000157775 GCF_000157755 GCF_000292125 GCF_000292005 GCF_000366085 GCF_000371085 GCF_000371185 GCF_000371205 GCF_000018905 GCF_000371145 GCF_000366265 GCF_000292105 GCF_000480135 GCF_000365565 GCF_000223195 GCF_000371 [...]
+s__Corynebacterium_resistens 1 GCF_000177535
+s__Methanobrevibacter_sp_AbM4 1 GCF_000404165
+s__Treponema_lecithinolyticum 1 GCF_000468055
+s__Tamus_red_mosaic_virus 1 PRJNA73083
+s__Cocksfoot_mild_mosaic_virus 1 PRJNA30849
+s__Rhizobium_phage_RR1_A 1 PRJNA209209
+s__Brucella_sp_F5_99 1 GCF_000158995
+s__Lucky_bamboo_bacilliform_virus 1 PRJNA19855
+s__Clostridium_phage_phiCP34O 1 PRJNA181211
+s__Actinomadura_flavalba 1 GCF_000374305
+s__Human_papillomavirus_type_128 1 PRJNA62171
+s__Salmonella_phage_ST64T 1 PRJNA14230
+s__Dyoetapapillomavirus_1 1 PRJNA33407
+s__Macroptilium_mosaic_Puerto_Rico_virus 1 PRJNA14398
+s__Raphidiopsis_brookii 1 GCF_000175855
+s__Stenotrophomonas_maltophilia 14 GCF_000355725 GCF_000382065 GCF_000223885 GCF_000237025 GCF_000287935 GCF_000295735 GCF_000020665 GCF_000072485 GCF_000455685 GCF_000355745 GCF_000344215 GCF_000346445 GCF_000284595 GCF_000308335
+s__Alcaligenes_sp_EGD_AK7 1 GCF_000465875
+s__Haemophilus_aegyptius 1 GCF_000195005
+s__Phormidium_phage_Pf_WMP4 1 PRJNA17743
+s__Citromicrobium_bathyomarinum 1 GCF_000176355
+s__Halomonas_titanicae 1 GCF_000336575
+s__Cytophaga_hutchinsonii 1 GCF_000014145
+s__Pelagibacter_phage_HTVC010P 1 PRJNA192866
+s__Aspergillus_terreus 1 GCA_000149615
+s__Cupriavidus_sp_WS 1 GCF_000395345
+s__Marine_birnavirus 1 PRJNA16748
+s__Xestia_c_nigrum_granulovirus 1 PRJNA14092
+s__Lactobacillus_phage_J_1 1 PRJNA227005
+s__marine_gamma_proteobacterium_HTCC2080 1 GCF_000169115
+s__Angelonia_flower_break_virus 1 PRJNA16334
+s__Streptococcus_phage_EJ_1 1 PRJNA14604
+s__Pseudomonas_sp_TJI_51 1 GCF_000190455
+s__Modoc_virus 1 PRJNA15393
+s__Clostridium_sp_KLE_1755 1 GCF_000466465
+s__Coprococcus_comes 1 GCF_000155875
+s__Vesicular_stomatitis_Indiana_virus 1 PRJNA14673
+s__Gordonia_namibiensis 1 GCF_000298235
+s__Maize_Iranian_mosaic_virus 1 PRJNA32689
+s__Trichormus_azollae 1 GCF_000196515
+s__Rhizobium_giardinii 1 GCF_000379605
+s__Leptospira_sp_B5_022 1 GCF_000347035
+s__Shimwellia_blattae 2 GCF_000262305 GCF_000327265
+s__Eimeria_brunetti_RNA_virus_1 1 PRJNA14725
+s__Methylobacter_tundripaludum 1 GCF_000190755
+s__Theileria_parva 1 GCA_000165365
+s__Bat_polyomavirus 1 PRJNA32077
+s__Bacteroides_gallinarum 1 GCF_000374365
+s__Acidianus_rod_shaped_virus_1 1 PRJNA27799
+s__Youngiibacter_fragilis 1 GCF_000495435
+s__Liberibacter_phage_SC1 1 PRJNA181990
+s__Liberibacter_phage_SC2 1 PRJNA181991
+s__Acinetobacter_sp_NIPH_2171 1 GCF_000369625
+s__Cactus_virus_X 1 PRJNA14996
+s__Runella_slithyformis 1 GCF_000218895
+s__Streptococcus_phage_M102 1 PRJNA38845
+s__Fowlpox_virus 1 PRJNA14052
+s__Saccharomonospora_marina 1 GCF_000244955
+s__Wigglesworthia_glossinidia 2 GCF_000247565 GCF_000008885
+s__Shewanella_benthica 1 GCF_000172075
+s__Exiguobacterium_antarcticum 1 GCF_000299435
+s__Paenibacillus_barengoltzii 1 GCF_000403375
+s__Cohnella_laeviribosi 1 GCF_000378425
+s__Enterococcus_phage_phiFL3A 1 PRJNA42787
+s__Croceibacter_atlanticus 1 GCF_000196315
+s__Simian_virus_40 1 PRJNA14024
+s__Tomato_golden_mottle_virus 1 PRJNA14182
+s__Sea_turtle_tornovirus_1 1 PRJNA34541
+s__Possum_enterovirus_W6 1 PRJNA18519
+s__Possum_enterovirus_W1 1 PRJNA18517
+s__Thioalkalivibrio_sp_AKL6 1 GCF_000376905
+s__Thioalkalivibrio_sp_AKL7 1 GCF_000381705
+s__Neisseria_mucosa 2 GCF_000173875 GCF_000186165
+s__Enterobacteria_phage_HK633 1 PRJNA183143
+s__Thioalkalivibrio_sp_AKL3 1 GCF_000377805
+s__Tannerella_forsythia 1 GCF_000238215
+s__Enterobacteria_phage_HK630 1 PRJNA183142
+s__Staphylococcus_phage_phiMR25 1 PRJNA30061
+s__Thioalkalivibrio_sp_AKL8 1 GCF_000380525
+s__Thioalkalivibrio_sp_AKL9 1 GCF_000377825
+s__Border_disease_virus 1 PRJNA15463
+s__Cleome_golden_mosaic_virus 1 PRJNA65817
+s__Sulfurovum_sp_AR 1 GCF_000296775
+s__Staphylococcus_phage_PH15 1 PRJNA18525
+s__Leucothrix_mucor 1 GCF_000419525
+s__Tomato_leaf_curl_Pakistan_alphasatellite 1 PRJNA38463
+s__Brucella_sp_F8_99 1 GCF_000371005
+s__Bartonella_elizabethae 2 GCF_000278175 GCF_000278315
+s__Kosakonia_radicincitans 1 GCF_000280495
+s__Deferribacter_desulfuricans 1 GCF_000010985
+s__Faecalibacterium_prausnitzii 5 GCF_000166035 GCF_000209855 GCF_000154385 GCF_000210735 GCF_000162015
+s__SAR324_cluster_bacterium_JCVI_SC_AAA005 1 GCF_000224765
+s__Prevotella_pallens 1 GCF_000220255
+s__Walrus_calicivirus 1 PRJNA14874
+s__Ectocarpus_siliculosus_virus_1 1 PRJNA14114
+s__Rubus_canadensis_virus_1 1 PRJNA178460
+s__California_sea_lion_polyomavirus_1 1 PRJNA45909
+s__Streptococcus_sp_HPH0090 1 GCF_000411475
+s__Beutenbergia_cavernae 1 GCF_000023105
+s__Tomato_yellow_leaf_curl_China_alphasatellite 1 PRJNA15481
+s__Corynebacterium_sp_KPL2004 1 GCF_000477875
+s__Peptoniphilus_rhinitidis 1 GCF_000246925
+s__Parachlamydia_acanthamoebae 2 GCF_000176075 GCF_000253035
+s__Bovine_ephemeral_fever_virus 1 PRJNA14434
+s__Cotton_leaf_curl_Multan_virus_satellite_U36_1 1 PRJNA16312
+s__Desulfobacter_curvatus 1 GCF_000373985
+s__Desulfurobacterium_sp_TC5_1 1 GCF_000421485
+s__Brucella_melitensis 51 GCF_000370725 GCF_000366625 GCF_000370665 GCF_000250835 GCF_000298595 GCF_000366885 GCF_000182235 GCF_000366965 GCF_000366925 GCF_000331655 GCF_000348645 GCF_000365865 GCF_000370825 GCF_000370765 GCF_000158695 GCF_000366865 GCF_000370845 GCF_000022625 GCF_000367045 GCF_000370785 GCF_000367025 GCF_000365845 GCF_000292065 GCF_000370685 GCF_000158735 GCF_000227645 GCF_000479975 GCF_000370745 GCF_000370645 GCF_000298615 GCF_000367005 GCF_000209575 GCF_000192885 GCF_ [...]
+s__Blackcurrant_reversion_virus_satellite_RNA 1 PRJNA14821
+s__Pseudomonas_sp_CF150 1 GCF_000416175
+s__Acinetobacter_sp_TG27347 1 GCF_000301635
+s__Rhodanobacter_sp_115 1 GCF_000264335
+s__Seal_anellovirus_TFFN_USA_2006 1 PRJNA63583
+s__Acinetobacter_sp_NCTC_10304 1 GCF_000248215
+s__Bovine_hungarovirus 1 PRJNA176432
+s__Chthonomonas_calidirosea 1 GCF_000427095
+s__Pepper_chat_fruit_viroid 1 PRJNA32817
+s__Lactococcus_phage_340 1 PRJNA213081
+s__Chayote_mosaic_virus 1 PRJNA15420
+s__Bacillus_sp_AP8 1 GCF_000321185
+s__Lactococcus_phage_936_sensu_lato 7 PRJNA14087 PRJNA14096 PRJNA17737 PRJNA17739 PRJNA17757 PRJNA17759 PRJNA30597
+s__Leek_white_stripe_virus 1 PRJNA15082
+s__Campylobacter_rectus 1 GCF_000174175
+s__Pectobacterium_wasabiae 2 GCF_000024645 GCF_000291725
+s__Ageratum_yellow_vein_Sri_Lanka_virus 1 PRJNA14120
+s__Propionibacterium_humerusii 1 GCF_000204235
+s__Corynebacterium_ulcerans 4 GCF_000215645 GCF_000306825 GCF_000215665 GCF_000498915
+s__Brucella_sp_63_311 1 GCF_000370945
+s__Fujinami_sarcoma_virus 1 PRJNA14708
+s__Oryza_rufipogon_endornavirus 1 PRJNA16238
+s__Novosphingobium_aromaticivorans 1 GCF_000013325
+s__Carrot_mottle_virus 1 PRJNA72859
+s__gamma_proteobacterium_HdN1 1 GCF_000198515
+s__Persimmon_viroid 1 PRJNA28683
+s__Avian_gyrovirus_2 1 PRJNA65815
+s__Vesicular_exanthema_of_swine_virus 1 PRJNA14704
+s__Bilophila_wadsworthia 1 GCF_000185705
+s__Staphylococcus_massiliensis 2 GCF_000314555 GCF_000298075
+s__Lactobacillus_phage_Lrm1 1 PRJNA30879
+s__Tobacco_bushy_top_virus 1 PRJNA14868
+s__Apple_stem_grooving_virus 1 PRJNA15119
+s__Tobacco_leaf_curl_Japan_virus 1 PRJNA14261
+s__GB_virus_C 1 PRJNA15467
+s__Infectious_salmon_anemia_virus 1 PRJNA15020
+s__Planaria_asexual_strain_specific_virus_like_element_type_1 1 PRJNA14140
+s__Deep_sea_thermophilic_phage_D6E 1 PRJNA181996
+s__Ageratum_leaf_curl_virus 1 PRJNA14492
+s__Meiothermus_silvanus 1 GCF_000092125
+s__Klebsiella_phage_phiKO2 1 PRJNA14495
+s__Caulobacter_phage_phiCbK 1 PRJNA179418
+s__Vibrio_phage_11895_B1 1 PRJNA195495
+s__Mink_calicivirus 1 PRJNA183163
+s__Planctomyces_limnophilus 1 GCF_000092105
+s__Tomato_leaf_curl_Pakistan_virus 1 PRJNA17539
+s__St_Valerien_swine_virus 1 PRJNA38093
+s__Enterocytozoon_bieneusi 1 GCA_000209485
+s__Anaerococcus_tetradius 1 GCF_000159095
+s__Mason_Pfizer_monkey_virus 1 PRJNA14683
+s__Bean_chlorotic_mosaic_virus 1 PRJNA214690
+s__Synechococcus_phage_syn9 1 PRJNA17541
+s__Sulfolobus_turreted_icosahedral_virus_2 1 PRJNA48299
+s__Murine_leukemia_virus 2 PRJNA14907 PRJNA15204
+s__Rhodothermus_marinus 2 GCF_000024845 GCF_000224745
+s__Subdoligranulum_sp_4_3_54A2FAA 1 GCF_000238635
+s__Clostridium_sp_MSTE9 1 GCF_000277625
+s__Paraprevotella_clara 1 GCF_000233955
+s__Xylella_phage_Xfas53 1 PRJNA42595
+s__Caldivirga_maquilingensis 1 GCF_000018305
+s__Beet_ringspot_virus 1 PRJNA15287
+s__Achromobacter_piechaudii 2 GCF_000286415 GCF_000164035
+s__Ruminococcus_sp 1 GCF_000209835
+s__Zunongwangia_profunda 1 GCF_000023465
+s__Anaerococcus_vaginalis 1 GCF_000163295
+s__Prevotella_bivia 2 GCF_000177315 GCF_000262545
+s__Helicobacter_cinaedi 3 GCF_000349975 GCF_000155475 GCF_000284635
+s__Simian_retrovirus_4 1 PRJNA51791
+s__Arcanobacterium_haemolyticum 1 GCF_000092365
+s__Duck_circovirus 3 PRJNA14543 PRJNA14619 PRJNA15558
+s__Sida_yellow_vein_Vietnam_virus 1 PRJNA19783
+s__Chlamydia_phage_2 1 PRJNA14593
+s__Chlamydia_phage_3 1 PRJNA14471
+s__Chlamydia_phage_1 1 PRJNA14064
+s__Chlamydia_phage_4 1 PRJNA15781
+s__Soybean_yellow_mottle_mosaic_virus 1 PRJNA33135
+s__Beet_necrotic_yellow_vein_virus 1 PRJNA15033
+s__Paenibacillus_sp_OSY_SE 1 GCF_000283315
+s__Lactobacillus_versmoldensis 1 GCF_000260455
+s__Cucurbit_aphid_borne_yellows_virus 1 PRJNA15074
+s__Porcine_type_C_oncovirus 1 PRJNA14126
+s__Flavobacterium_rivuli 1 GCF_000378485
+s__Helicoverpa_armigera_granulovirus 1 PRJNA28275
+s__Pannonibacter_phragmitetus 1 GCF_000382365
+s__Pokeweed_mosaic_virus 1 PRJNA177901
+s__Mycobacterium_phage_Bxb1 1 PRJNA14109
+s__Komagataella_pastoris 1 GCA_000027005
+s__Apricot_pseudo_chlorotic_leaf_spot_virus 1 PRJNA15172
+s__Saccharomyces_cerevisiae_killer_virus_M1 1 PRJNA14678
+s__Magnetospirillum_sp_SO_1 1 GCF_000342045
+s__Enterobacteria_phage_fiAA91_ss 1 PRJNA226726
+s__Kineosphaera_limosa 1 GCF_000298215
+s__Sida_yellow_mosaic_Yucatan_virus 1 PRJNA18625
+s__Rotavirus_C 1 PRJNA16140
+s__Rotavirus_A 1 PRJNA32521
+s__Rotavirus_G 1 PRJNA209727
+s__Coriobacterium_glomerans 1 GCF_000195315
+s__Lactobacillus_hilgardii 1 GCF_000159315
+s__Novosphingobium_tardaugens 1 GCF_000466945
+s__Candidatus_Cloacimonas_acidaminovorans 1 GCF_000146065
+s__Pyrobaculum_neutrophilum 1 GCF_000019805
+s__Streptococcus_pseudoporcinus 2 GCF_000188035 GCF_000183465
+s__Clostridium_phage_phiCP13O 1 PRJNA181210
+s__Corynebacterium_propinquum 1 GCF_000375525
+s__Rhinovirus_B 1 PRJNA15309
+s__Rhinovirus_C 1 PRJNA27901
+s__Rhinovirus_A 1 PRJNA15330
+s__Vernonia_yellow_vein_virus 1 PRJNA16335
+s__Rehmannia_mosaic_virus 1 PRJNA18885
+s__Enterobacteria_phage_FI_sensu_lato 1 PRJNA15459
+s__Sweet_potato_chlorotic_fleck_virus 1 PRJNA15038
+s__Bacillus_phage_SP10 1 PRJNA181082
+s__Vibrio_sinaloensis 1 GCF_000189275
+s__Bacteroides_intestinalis 1 GCF_000172175
+s__Giardia_lamblia_virus 1 PRJNA15018
+s__Labrenzia_alexandrii 1 GCF_000158095
+s__Thioalkalivibrio_sp_ALMg11 1 GCF_000377905
+s__Ateline_herpesvirus_3 1 PRJNA14040
+s__SAR324_cluster_bacterium_SCGC_AAA001_C10 1 GCF_000213335
+s__Desulfovibrio_alaskensis 1 GCF_000012665
+s__Salmonella_phage_7_11 1 PRJNA72387
+s__Lunk_virus_NKS_1 1 PRJNA176605
+s__Cymbidium_ringspot_virus_satellite_RNA 1 PRJNA14989
+s__Turnip_vein_clearing_virus 1 PRJNA14685
+s__Anoxybacillus_sp_SK3_4 1 GCF_000443775
+s__Butyrivibrio_sp_NC2007 1 GCF_000421405
+s__Shewanella_woodyi 1 GCF_000019525
+s__Anaerococcus_sp_PH9 1 GCF_000307225
+s__Human_metapneumovirus 1 PRJNA15498
+s__Chryseobacterium_gleum 1 GCF_000143785
+s__Staphylococcus_phage_StauST398_1 1 PRJNA206473
+s__Staphylococcus_phage_StauST398_2 1 PRJNA206489
+s__Staphylococcus_phage_StauST398_3 1 PRJNA206472
+s__Lupine_mosaic_virus 1 PRJNA61853
+s__Glaciecola_agarilytica 1 GCF_000314935
+s__Corynebacterium_lipophiloflavum 1 GCF_000159635
+s__Fusobacterium_gonidiaformans 2 GCF_000158835 GCF_000158235
+s__Obodhiang_virus 1 PRJNA159049
+s__Lewinella_persica 1 GCF_000373105
+s__Streptomyces_phage_phiSASD1 1 PRJNA49613
+s__Acinetobacter_sp_CIP_102159 1 GCF_000368285
+s__Yersinia_phage_Berlin 1 PRJNA18481
+s__Geobacillus_sp_WCH70 1 GCF_000023385
+s__Bat_coronavirus_HKU10 1 PRJNA177902
+s__alpha_proteobacterium_SCGC_AAA280_B11 1 GCF_000371745
+s__Nocardiopsis_xinjiangensis 1 GCF_000341145
+s__Sulfolobus_acidocaldarius 3 GCF_000012285 GCF_000338775 GCF_000340315
+s__Streptomyces_sp_351MFTsu5_1 1 GCF_000383655
+s__Broad_bean_mottle_virus 1 PRJNA14833
+s__Butyrivibrio_proteoclasticus 1 GCF_000145035
+s__Acidovorax_radicis 1 GCF_000204195
+s__Thauera_aminoaromatica 1 GCF_000310185
+s__Tomato_chino_La_Paz_virus 1 PRJNA14368
+s__Lactate_dehydrogenase_elevating_virus 1 PRJNA14702
+s__Synechococcus_phage_S_CBS1 1 PRJNA76741
+s__Rhodococcus_pyridinivorans 1 GCF_000236965
+s__Human_herpesvirus_6B 1 PRJNA14422
+s__Candidatus_Pelagibacter_sp_HTCC7211 1 GCF_000155895
+s__Lactobacillus_iners 15 GCF_000160875 GCF_000191685 GCF_000149085 GCF_000179935 GCF_000149145 GCF_000177755 GCF_000204435 GCF_000179955 GCF_000179975 GCF_000149125 GCF_000149065 GCF_000149105 GCF_000185405 GCF_000191705 GCF_000179995
+s__Staphylococcus_phage_11 1 PRJNA14246
+s__Enterobacteria_phage_Phieco32 1 PRJNA28729
+s__Vibrio_phage_KSF_1phi 1 PRJNA14562
+s__Klebsiella_phage_KP36 1 PRJNA183428
+s__Klebsiella_phage_KP34 1 PRJNA42781
+s__Klebsiella_phage_KP32 1 PRJNA42779
+s__Blue_squill_virus_A 1 PRJNA179427
+s__Blattabacterium_sp_Nauphoeta_cinerea 1 GCF_000471965
+s__Klebsiella_sp_4_1_44FAA 1 GCF_000238715
+s__Candida_glabrata 1 GCA_000002545
+s__Desmodium_leaf_distortion_virus 1 PRJNA17991
+s__Leptospirillum_ferrooxidans 1 GCF_000284315
+s__Enterobacteria_phage_Qbeta 1 PRJNA15479
+s__Persephonella_marina 1 GCF_000021565
+s__Imtechella_halotolerans 1 GCF_000260835
+s__Lactobacillus_ultunensis 1 GCF_000159415
+s__Okra_leaf_curl_alphasatellite 1 PRJNA29397
+s__Streptococcus_sp_I_P16 1 GCF_000479315
+s__Bacillus_mycoides 3 GCF_000161435 GCF_000161415 GCF_000003925
+s__Kurthia_sp_JC8E 1 GCF_000285595
+s__Cyanophage_NATL1A_7 1 PRJNA81183
+s__Streptomyces_chartreusis 2 GCF_000226435 GCF_000226455
+s__Thiorhodospira_sibirica 1 GCF_000227725
+s__Erysimum_latent_virus 1 PRJNA14651
+s__Cherry_rasp_leaf_virus 1 PRJNA15131
+s__Hydrogenobaculum_sp_HO 1 GCF_000341855
+s__Malvastrum_leaf_curl_Philippines_virus 1 PRJNA203676
+s__Streptomyces_phage_VWB 1 PRJNA14485
+s__Dyoepsilonpapillomavirus_1 1 PRJNA39987
+s__Human_polyomavirus_12 1 PRJNA195931
+s__Enterococcus_phage_phiEf11 1 PRJNA42943
+s__Ectropis_obliqua_nucleopolyhedrovirus 1 PRJNA18273
+s__Sphingopyxis_alaskensis 1 GCF_000013985
+s__Azoarcus_toluclasticus 1 GCF_000378245
+s__Providencia_rettgeri 2 GCF_000314835 GCF_000158055
+s__Digitaria_streak_virus 1 PRJNA14069
+s__Moraxella_catarrhalis 12 GCF_000193025 GCF_000302495 GCF_000193065 GCF_000192985 GCF_000193085 GCF_000193005 GCF_000193045 GCF_000192965 GCF_000192905 GCF_000192925 GCF_000192945 GCF_000092265
+s__Goose_parvovirus 1 PRJNA14098
+s__Leptospira_kmetyi 1 GCF_000243735
+s__Porcine_stool_associated_circular_virus_3 1 PRJNA202890
+s__Hirame_rhabdovirus 1 PRJNA15132
+s__Kangiella_koreensis 1 GCF_000024085
+s__Diuris_virus_B 1 PRJNA178592
+s__Bacteroidetes_bacterium_oral_taxon_272 1 GCF_000442105
+s__Halorubrum_pleomorphic_virus_3 1 PRJNA157259
+s__Cowpea_severe_mosaic_virus 1 PRJNA15301
+s__Halorubrum_pleomorphic_virus_1 1 PRJNA36677
+s__Mycobacterium_phage_Crossroads 1 PRJNA215113
+s__Streptococcus_massiliensis 2 GCF_000380065 GCF_000341525
+s__Squash_leaf_curl_Philippines_virus 1 PRJNA14369
+s__Treponema_bryantii 1 GCF_000421345
+s__Bacteroides_eggerthii 3 GCF_000155815 GCF_000185605 GCF_000273465
+s__Alcelaphine_herpesvirus_1 1 PRJNA14099
+s__Lactobacillus_animalis 1 GCF_000183825
+s__Vibrio_phage_VSK 1 PRJNA14337
+s__Clostridium_sp_JC122 1 GCF_000285575
+s__Marivirga_tractuosa 1 GCF_000183425
+s__Xanthomonas_campestris 14 GCF_000277975 GCF_000277955 GCF_000277875 GCF_000277915 GCF_000277895 GCF_000070605 GCF_000007145 GCF_000012105 GCF_000159815 GCF_000263835 GCF_000221965 GCF_000233635 GCF_000277935 GCF_000321125
+s__Desulfotalea_psychrophila 1 GCF_000025945
+s__Persimmon_cryptic_virus 1 PRJNA167734
+s__Pseudomonas_luteola 1 GCF_000282775
+s__Nilaparvata_lugens_reovirus 1 PRJNA14775
+s__Mycobacterium_marinum 3 GCF_000419315 GCF_000419335 GCF_000018345
+s__Burkholderia_phage_phi52237 1 PRJNA15422
+s__Hydrogenobaculum_sp_Y04AAS1 1 GCF_000020785
+s__Rhodococcus_phage_RGL3 1 PRJNA81167
+s__Thiorhodovibrio_sp_970 1 GCF_000228725
+s__Caldisphaera_lagunensis 1 GCF_000317795
+s__Streptococcus_sp_C300 1 GCF_000187645
+s__Tall_oatgrass_mosaic_virus 1 PRJNA226728
+s__Yersinia_intermedia 1 GCF_000168035
+s__Bacteroides_dorei 5 GCF_000273035 GCF_000273075 GCF_000158335 GCF_000273055 GCF_000156075
+s__Hahella_ganghwensis 1 GCF_000376785
+s__Capnocytophaga_cynodegmi 1 GCF_000379185
+s__Thioalkalimicrobium_cyclicum 1 GCF_000214825
+s__Synechococcus_elongatus 2 GCF_000012525 GCF_000010065
+s__Owenweeksia_hongkongensis 1 GCF_000236705
+s__Burkholderia_pseudomallei 35 GCF_000439695 GCF_000170595 GCF_000259775 GCF_000294635 GCF_000346205 GCF_000169715 GCF_000193475 GCF_000170555 GCF_000193455 GCF_000170455 GCF_000259735 GCF_000152685 GCF_000494855 GCF_000152325 GCF_000259795 GCF_000152365 GCF_000170535 GCF_000259815 GCF_000347975 GCF_000260515 GCF_000015925 GCF_000170575 GCF_000170515 GCF_000445385 GCF_000170435 GCF_000182445 GCF_000012785 GCF_000170415 GCF_000170475 GCF_000182585 GCF_000182195 GCF_000152345 GCF_00025975 [...]
+s__Mastigocladopsis_repens 1 GCF_000315565
+s__Thalassobacter_arenae 1 GCF_000442275
+s__Mitsuokella_sp_oral_taxon_131 1 GCF_000469545
+s__Cucumber_leaf_spot_virus 1 PRJNA16590
+s__Bacillus_licheniformis 8 GCF_000008425 GCF_000315975 GCF_000260535 GCF_000477395 GCF_000260555 GCF_000011645 GCF_000258125 GCF_000408885
+s__Opuntia_virus_X 1 PRJNA14956
+s__Mycobacterium_phage_ArcherS7 1 PRJNA206478
+s__Perkinsus_marinus 1 GCA_000006405
+s__Erythrobacter_litoralis 1 GCF_000013005
+s__Dietzia_alimentaria 1 GCF_000226215
+s__Enterobacteria_phage_vB_EcoM_FV3 1 PRJNA181219
+s__Enterococcus_phage_phiFL1A 1 PRJNA42789
+s__Tupaia_paramyxovirus 1 PRJNA14723
+s__Rhodococcus_sp_DK17 1 GCF_000263875
+s__Desulfosporosinus_meridiei 1 GCF_000231385
+s__Maricaulis_maris 1 GCF_000014745
+s__Oropouche_virus 1 PRJNA14943
+s__Pediococcus_claussenii 1 GCF_000237995
+s__Streptococcus_sp_GMD5S 1 GCF_000298715
+s__Prevotella_ruminicola 1 GCF_000025925
+s__Sphingomonas_sp_PR090111_T3T_6A 1 GCF_000383095
+s__Nocardioides_sp_JS614 1 GCF_000015265
+s__Erectites_yellow_mosaic_virus 1 PRJNA19787
+s__Yersinia_massiliensis 1 GCF_000312485
+s__Sebokele_virus_1 1 PRJNA208541
+s__Papaya_leaf_curl_China_virus_satellite_DNA_beta 1 PRJNA19819
+s__Streptosporangium_roseum 1 GCF_000024865
+s__Acinetobacter_sp_MDS7A 1 GCF_000386005
+s__Eidolon_polyomavirus_1 1 PRJNA185194
+s__Beet_mosaic_virus 1 PRJNA14942
+s__Mungbean_yellow_mosaic_virus 1 PRJNA14555
+s__Enterobacter_phage_Enc34 1 PRJNA181238
+s__Arracacha_virus_B 1 PRJNA196183
+s__Sweet_potato_leaf_curl_China_virus 1 PRJNA225011
+s__Abelson_murine_leukemia_virus 1 PRJNA14654
+s__Acinetobacter_sp_CIP_A165 1 GCF_000367985
+s__Acinetobacter_sp_CIP_A162 1 GCF_000367905
+s__Shewanella_amazonensis 1 GCF_000015245
+s__Gordonia_sputi 1 GCF_000248055
+s__Enterobacteria_phage_UAB_Phi78 1 PRJNA191121
+s__Striped_Jack_nervous_necrosis_virus 1 PRJNA14741
+s__Sweet_potato_mild_mottle_virus 1 PRJNA15340
+s__Staphylococcus_phage_tp310_1 1 PRJNA20659
+s__Staphylococcus_phage_vB_SauM_Remus 1 PRJNA215669
+s__Torque_teno_tamarin_virus 1 PRJNA48169
+s__Pantoea_sp_aB 1 GCF_000179655
+s__Pepper_yellow_leaf_curl_China_virus 1 PRJNA188776
+s__Fig_cryptic_virus 1 PRJNA66565
+s__Mycoplasma_mycoides 4 GCF_000011445 GCF_000253075 GCF_000339035 GCF_000143865
+s__Thermosinus_carboxydivorans 1 GCF_000169155
+s__Cucurbit_yellow_stunting_disorder_virus 1 PRJNA14890
+s__Enterobacteria_phage_Phi1 1 PRJNA20789
+s__Aura_virus 1 PRJNA14830
+s__Desulfovibrio_alkalitolerans 1 GCF_000422245
+s__Candidatus_Arthromitus_sp_SFB_5 1 GCF_000252745
+s__Candidatus_Arthromitus_sp_SFB_4 1 GCF_000252725
+s__Candidatus_Arthromitus_sp_SFB_3 1 GCF_000252705
+s__Candidatus_Arthromitus_sp_SFB_2 1 GCF_000252685
+s__Candidatus_Arthromitus_sp_SFB_1 1 GCF_000252665
+s__Ralstonia_sp_PBA 1 GCF_000272025
+s__Edwardsiella_tarda 5 GCF_000341505 GCF_000020865 GCF_000163955 GCF_000348565 GCF_000146305
+s__Acinetobacter_sp_HA 1 GCF_000264725
+s__Cyanobium_sp_PCC_7001 1 GCF_000155635
+s__Vulcanisaeta_moutnovskia 1 GCF_000190315
+s__Sclerotinia_sclerotiorum_hypovirulence_associated_DNA_virus_1 1 PRJNA39985
+s__Okra_enation_leaf_curl_alphasatellite 1 PRJNA184814
+s__Clostera_anastomosis_granulovirus 1 PRJNA226250
+s__Tomato_pseudo_curly_top_virus 1 PRJNA14582
+s__Leuconostoc_inhae 1 GCF_000166735
+s__Betapapillomavirus_1 1 PRJNA15511
+s__Cyanophage_P_RSM1 1 PRJNA198436
+s__Cyanophage_P_RSM6 1 PRJNA195506
+s__Betapapillomavirus_3 1 PRJNA15455
+s__Laribacter_hongkongensis 1 GCF_000021025
+s__Betapapillomavirus_2 1 PRJNA15456
+s__Japanese_eel_endothelial_cells_infecting_virus 1 PRJNA62749
+s__Coleus_blumei_viroid_2 1 PRJNA14783
+s__Synechococcus_sp_JA_3_3Ab 1 GCF_000013205
+s__Woodchuck_hepatitis_virus 1 PRJNA14212
+s__Merremia_mosaic_Puerto_Rico_virus 1 PRJNA66549
+s__Synechococcus_sp_RCC307 1 GCF_000063525
+s__Frankia_sp_EUN1f 1 GCF_000177675
+s__Cycloclasticus_sp_PY97M 1 GCF_000444935
+s__Betapapillomavirus_6 1 PRJNA68287
+s__Mycoplasma_mobile 1 GCF_000008365
+s__Murid_herpesvirus_8 1 PRJNA182227
+s__Murid_herpesvirus_4 1 PRJNA14458
+s__Clostridium_phage_phiC2 1 PRJNA19153
+s__Murid_herpesvirus_1 1 PRJNA15181
+s__Murid_herpesvirus_2 1 PRJNA14419
+s__Pseudomonas_phage_SN 1 PRJNA33327
+s__Thalassiosira_pseudonana 1 GCA_000149405
+s__Aeromonas_sp_MDS8 1 GCF_000388005
+s__Bacillus_bataviensis 1 GCF_000307875
+s__Acholeplasma_laidlawii 1 GCF_000018785
+s__Neisseria_bacilliformis 1 GCF_000194925
+s__Choristoneura_rosaceana_alphabaculovirus 1 PRJNA214178
+s__Ipomoea_yellow_vein_virus 1 PRJNA39615
+s__Halovirus_HRTV_4 1 PRJNA206493
+s__Haemophilus_phage_Aaphi23 1 PRJNA15228
+s__Methanothermococcus_okinawensis 1 GCF_000179575
+s__Penaeus_merguiensis_densovirus 1 PRJNA15556
+s__Serratia_odorifera 1 GCF_000163595
+s__Bifidobacterium_magnum 1 GCF_000420565
+s__Oligella_urethralis 1 GCF_000372065
+s__Halomonas_stevensii 1 GCF_000275725
+s__Selenomonas_ruminantium 1 GCF_000284095
+s__Synechococcus_phage_S_IOM18 1 PRJNA209067
+s__Sweet_potato_leaf_curl_Canary_virus 1 PRJNA41623
+s__Achromobacter_arsenitoxydans 1 GCF_000236785
+s__Oat_chlorotic_stunt_virus 1 PRJNA15081
+s__Bovine_parvovirus_2 1 PRJNA14553
+s__alpha_proteobacterium_SCGC_AAA280_P20 1 GCF_000371845
+s__Cupixi_virus 1 PRJNA28327
+s__Ntaya_virus 1 PRJNA176549
+s__Amycolatopsis_balhimycina 1 GCF_000384295
+s__Pseudovibrio_sp_JE062 1 GCF_000156235
+s__Peanut_clump_virus 1 PRJNA14776
+s__Ruminococcaceae_bacterium_D16 1 GCF_000177015
+s__Caldicellulosiruptor_kristjanssonii 1 GCF_000166695
+s__Mycobacterium_avium 40 GCF_000218095 GCF_000504845 GCF_000505015 GCF_000216015 GCF_000504745 GCF_000240525 GCF_000390085 GCF_000240465 GCF_000007865 GCF_000240405 GCF_000240425 GCF_000218055 GCF_000504785 GCF_000504865 GCF_000240345 GCF_000218135 GCF_000218155 GCF_000504765 GCF_000504945 GCF_000504925 GCF_000504885 GCF_000215815 GCF_000504825 GCF_000240445 GCF_000240385 GCF_000218115 GCF_000504975 GCF_000014985 GCF_000504905 GCF_000240485 GCF_000218075 GCF_000330785 GCF_000504725 GCF_ [...]
+s__Papaya_ringspot_virus 1 PRJNA15289
+s__Listeria_ivanovii 2 GCF_000183925 GCF_000252975
+s__Marinobacter_santoriniensis 1 GCF_000347775
+s__Halomonas_lutea 1 GCF_000378505
+s__Flavobacterium_sp_MS220_5C 1 GCF_000341755
+s__Campylobacter_sp_10_1_50 1 GCF_000238755
+s__Okra_yellow_crinkle_Cameroon_alphasatellite 1 PRJNA61907
+s__Aquimarina_agarilytica 1 GCF_000255455
+s__Candidatus_Halobonum_tyrrellensis 1 GCF_000495475
+s__Alcaligenes_sp_HPC1271 1 GCF_000313875
+s__Salmonella_phage_iEPS5 1 PRJNA212949
+s__Candidatus_Haloredivivus_sp_G17 1 GCF_000236195
+s__Leptosphaeria_maculans 1 GCA_000230375
+s__Geobacter_bemidjiensis 1 GCF_000020725
+s__Rhizoctonia_solani_virus_717 1 PRJNA14807
+s__Halomicrobium_mukohataei 1 GCF_000023965
+s__Prevotella_sp_C561 1 GCF_000224595
+s__Papaya_leaf_curl_virus 1 PRJNA14213
+s__Hepatitis_A_virus 1 PRJNA15308
+s__Pseudoclavibacter_faecalis 1 GCF_000381765
+s__Honeysuckle_ringspot_virus 1 PRJNA62211
+s__Actinobaculum_urinale 1 GCF_000420445
+s__Methanothermobacter_thermautotrophicus 1 GCF_000008645
+s__Chrysanthemum_stunt_viroid 1 PRJNA14968
+s__Gluconacetobacter_europaeus 3 GCF_000285295 GCF_000285335 GCF_000227545
+s__Botryotinia_fuckeliana 1 GCA_000143535
+s__Chlorogloeopsis_fritschii 1 GCF_000317285
+s__Papaya_mosaic_virus 1 PRJNA14700
+s__Dalechampia_chlorotic_mosaic_virus 1 PRJNA176616
+s__Leptospirillum_sp_Group_IV 1 GCF_000496115
+s__Streptobacillus_moniliformis 1 GCF_000024565
+s__Porphyromonas_sp_oral_taxon_278 1 GCF_000467855
+s__Porphyromonas_sp_oral_taxon_279 1 GCF_000292995
+s__Thauera_sp_28 1 GCF_000310145
+s__Methanobacterium_phage_psiM2 1 PRJNA14160
+s__Allium_virus_X 1 PRJNA34843
+s__Pseudomonas_sp_PAMC_25886 1 GCF_000242655
+s__Methanobacterium_sp_SWAN_1 1 GCF_000214725
+s__Bacillus_sp_JC63 1 GCF_000311725
+s__Gordonia_effusa 1 GCF_000241305
+s__Tomato_leaf_curl_Joydebpur_betasatellite 1 PRJNA28273
+s__Parsnip_yellow_fleck_virus 1 PRJNA15299
+s__Tomato_chlorotic_dwarf_viroid 1 PRJNA14973
+s__Simian_virus_12 1 PRJNA16189
+s__Tomato_leaf_curl_Laos_virus 1 PRJNA14244
+s__Haemophilus_phage_HP1 1 PRJNA14078
+s__Bacillus_pseudofirmus 1 GCF_000005825
+s__Acidaminococcus_intestini 1 GCF_000230275
+s__Staphylococcus_lentus 1 GCF_000286395
+s__Flavobacteria_bacterium_MS024_3C 1 GCF_000173115
+s__Rhodococcus_sp_JVH1 1 GCF_000280725
+s__Staphylococcus_phage_37 1 PRJNA15271
+s__Terracoccus_sp_273MFTsu3_1 1 GCF_000383675
+s__Thermoplasma_volcanium 1 GCF_000011185
+s__alpha_proteobacterium_SCGC_AAA027_C06 1 GCF_000364545
+s__Murine_leukemia_related_retroviruses 1 PRJNA16631
+s__Brevundimonas_subvibrioides 1 GCF_000144605
+s__Pandoraea_sp_B_6 1 GCF_000282835
+s__Orf_virus 1 PRJNA14464
+s__Enterococcus_malodoratus 2 GCF_000393875 GCF_000407185
+s__Grapevine_chrome_mosaic_virus 1 PRJNA15285
+s__Paenibacillus_lactis 1 GCF_000230915
+s__Adoxophyes_orana_granulovirus 1 PRJNA14298
+s__Bacillus_cellulosilyticus 1 GCF_000177235
+s__Streptococcus_orisratti 1 GCF_000380105
+s__Omsk_hemorrhagic_fever_virus 1 PRJNA14995
+s__Helicobacter_macacae 1 GCF_000507845
+s__Thermomicrobium_roseum 1 GCF_000021685
+s__Prevotella_sp_MSX73 1 GCF_000287635
+s__Ostreococcus_lucimarinus_virus_OlV1 1 PRJNA61011
+s__Ostreococcus_lucimarinus_virus_OlV5 1 PRJNA195483
+s__Clostridium_scindens 1 GCF_000154505
+s__Enterococcus_gallinarum 1 GCF_000157255
+s__Coprinopsis_cinerea 1 GCA_000182895
+s__Guinea_pig_Chlamydia_phage 1 PRJNA14012
+s__Staphylococcus_phage_3A 1 PRJNA15269
+s__Bacillus_pseudomycoides 1 GCF_000161455
+s__Bacteroides_coprophilus 1 GCF_000157915
+s__Bacillus_sp_5B6 1 GCF_000259405
+s__Bacteroides_stercoris 2 GCF_000154525 GCF_000413395
+s__Synechococcus_sp_PCC_7502 1 GCF_000317085
+s__Fulvimarina_pelagi 1 GCF_000153705
+s__Klebsiella_oxytoca 11 GCF_000492815 GCF_000276705 GCF_000247915 GCF_000252915 GCF_000492955 GCF_000240325 GCF_000247835 GCF_000247875 GCF_000247895 GCF_000269585 GCF_000247855
+s__Mouse_astrovirus_M_52_USA_2008 1 PRJNA72381
+s__Heron_hepatitis_B_virus 1 PRJNA15458
+s__Pseudoalteromonas_phage_RIO_1 1 PRJNA206039
+s__Desulfitobacterium_metallireducens 1 GCF_000231405
+s__Maricaulis_sp_JL2009 1 GCF_000412185
+s__Xanthomonas_arboricola 1 GCF_000306055
+s__Leishmania_braziliensis 1 GCA_000002845
+s__Pedobacter_saltans 1 GCF_000190735
+s__Eubacterium_brachy 1 GCF_000488855
+s__Clostridium_acetobutylicum 3 GCF_000008765 GCF_000218855 GCF_000191905
+s__Grapevine_virus_E 1 PRJNA30853
+s__Mycobacterium_phage_Cali 1 PRJNA31291
+s__Grapevine_virus_A 1 PRJNA15086
+s__Grapevine_virus_B 1 PRJNA15083
+s__Bacillus_cereus_thuringiensis 174 GCF_000013065 GCF_000293525 GCF_000293685 GCF_000161235 GCF_000290715 GCF_000387405 GCF_000293745 GCF_000291115 GCF_000161715 GCF_000399605 GCF_000290895 GCF_000291235 GCF_000399405 GCF_000161335 GCF_000292705 GCF_000290975 GCF_000021225 GCF_000399385 GCF_000399005 GCF_000399065 GCF_000181615 GCF_000161055 GCF_000398985 GCF_000398945 GCF_000399165 GCF_000291535 GCF_000300475 GCF_000291075 GCF_000290655 GCF_000338755 GCF_000291415 GCF_000290675 GCF_000 [...]
+s__Sphingobium_xenophagum 2 GCF_000367345 GCF_000277525
+s__freshwater_metagenome 1 GCF_000500915
+s__Blattella_germanica_densovirus 1 PRJNA14320
+s__Porcine_astrovirus_3 1 PRJNA181247
+s__Rabies_virus 1 PRJNA15144
+s__Sandfly_fever_Sicilian_virus 1 PRJNA66185
+s__Methanocaldococcus_villosus 2 GCF_000363885 GCF_000371805
+s__Anticarsia_gemmatalis_nucleopolyhedrovirus 1 PRJNA17995
+s__Mycobacterium_phage_U2 1 PRJNA20943
+s__Varroa_destructor_virus_1 1 PRJNA15121
+s__Enterococcus_faecalis 301 GCF_000390965 GCF_000391525 GCF_000393275 GCF_000394995 GCF_000393315 GCF_000147255 GCF_000391085 GCF_000396905 GCF_000393395 GCF_000391145 GCF_000147555 GCF_000396485 GCF_000159255 GCF_000294045 GCF_000294325 GCF_000294225 GCF_000415185 GCF_000294025 GCF_000294005 GCF_000390645 GCF_000394455 GCF_000415205 GCF_000394395 GCF_000415025 GCF_000396385 GCF_000390805 GCF_000148265 GCF_000394275 GCF_000392975 GCF_000395985 GCF_000394375 GCF_000393075 GCF_000391565 G [...]
+s__Sulfolobus_tokodaii 1 GCF_000011205
+s__Spodoptera_exigua_iflavirus_1 1 PRJNA77135
+s__Mycobacterium_sp_MOTT36Y 1 GCF_000262165
+s__Geobacillus_sp_Y412MC52 1 GCF_000174795
+s__Peanut_stunt_virus 1 PRJNA15471
+s__Haemophilus_phage_SuMu 1 PRJNA181066
+s__Streptomyces_coelicoflavus 1 GCF_000241835
+s__Methylosinus_sp_LW4 1 GCF_000379125
+s__Albidiferax_ferrireducens 1 GCF_000013605
+s__gamma_proteobacterium_SCGC_AAA001_B15 1 GCF_000213375
+s__Thioalkalivibrio_sp_ALSr1 1 GCF_000381945
+s__Leek_yellow_stripe_virus 1 PRJNA15184
+s__Tobacco_leaf_curl_Pusa_virus 1 PRJNA56021
+s__Mycovirus_FusoV 1 PRJNA14829
+s__Corallococcus_coralloides 1 GCF_000255295
+s__Cucumber_vein_yellowing_virus 1 PRJNA15153
+s__Vicia_faba_endornavirus 1 PRJNA16237
+s__Bovine_adenovirus_D 1 PRJNA14486
+s__Bovine_adenovirus_E 1 PRJNA185272
+s__Bovine_adenovirus_B 2 PRJNA14515 PRJNA40311
+s__Orenia_marismortui 1 GCF_000379025
+s__Desulfovibrio_sp_A2 1 GCF_000226255
+s__Culex_flavivirus 1 PRJNA18303
+s__Ranid_herpesvirus_1 1 PRJNA17181
+s__Bacillus_phage_SPP1 1 PRJNA14586
+s__Sida_yellow_vein_Vietnam_virus_satellite_DNA_beta 1 PRJNA19825
+s__Ralstonia_sp_5_2_56FAA 1 GCF_000227255
+s__Bacillus_phage_BCD7 1 PRJNA181220
+s__Bettongia_penicillata_papillomavirus_1 1 PRJNA48601
+s__Spirosoma_luteum 1 GCF_000374065
+s__Denitrovibrio_acetiphilus 1 GCF_000025725
+s__Clostridium_bartlettii 1 GCF_000154445
+s__Synechococcus_sp_WH_8016 1 GCF_000230675
+s__Listeria_monocytogenes 53 GCF_000093125 GCF_000465735 GCF_000209755 GCF_000008285 GCF_000195395 GCF_000382925 GCF_000196035 GCF_000306905 GCF_000465815 GCF_000258905 GCF_000168615 GCF_000318055 GCF_000210795 GCF_000168495 GCF_000168655 GCF_000382945 GCF_000212455 GCF_000307615 GCF_000168475 GCF_000307045 GCF_000021185 GCF_000307005 GCF_000197755 GCF_000168395 GCF_000168635 GCF_000306985 GCF_000022925 GCF_000465755 GCF_000465775 GCF_000465795 GCF_000168555 GCF_000026705 GCF_000168415 G [...]
+s__Meyerozyma_guilliermondii 1 GCA_000149425
+s__Marinimicrobia_bacterium_SCGC_AAA076_M08 1 GCF_000402675
+s__Erwinia_tasmaniensis 1 GCF_000026185
+s__Okra_yellow_crinkle_virus 1 PRJNA17807
+s__Leeuwenhoekiella_blandensis 1 GCF_000152985
+s__Capnocytophaga_sp_oral_taxon_412 1 GCF_000271925
+s__Gluconacetobacter_xylinus 1 GCF_000182745
+s__Arthrobacter_sp_162MFSha1_1 1 GCF_000374905
+s__Broome_virus 1 PRJNA49651
+s__Pepper_leaf_curl_virus_satellite_DNA_beta 1 PRJNA28283
+s__Citrus_leaf_blotch_virus 1 PRJNA14825
+s__Novosphingobium_nitrogenifigens 1 GCF_000375445
+s__Thalassospira_lucentensis 1 GCF_000421265
+s__Erysipelotrichaceae_bacterium_6_1_45 1 GCF_000242175
+s__Clostridium_perfringens 11 GCF_000171135 GCF_000171215 GCF_000171195 GCF_000243175 GCF_000013285 GCF_000171175 GCF_000255475 GCF_000171155 GCF_000009685 GCF_000013845 GCF_000172455
+s__Burkholderia_phage_Bcep176 1 PRJNA16102
+s__Saccharomyces_cerevisiae_virus_L_A 1 PRJNA14792
+s__Cassava_vein_mosaic_virus 1 PRJNA14056
+s__Staphylococcus_phage_PT1028 1 PRJNA15262
+s__Iotapapillomavirus_1 1 PRJNA14022
+s__Bacillus_sp_WBUNB001 1 GCF_000309525
+s__Kordiimonas_gwangyangensis 1 GCF_000375545
+s__actinobacterium_SCGC_AAA024_D14 1 GCF_000372025
+s__Lactobacillus_gasseri 8 GCF_000155935 GCF_000406345 GCF_000177415 GCF_000014425 GCF_000439915 GCF_000175055 GCF_000283135 GCF_000143645
+s__Dehalobacter_sp_FTH1 1 GCF_000372005
+s__Olive_mild_mosaic_virus 1 PRJNA15159
+s__Treponema_pallidum 10 GCF_000246755 GCF_000410555 GCF_000008605 GCF_000246815 GCF_000246775 GCF_000387485 GCF_000304295 GCF_000246795 GCF_000024485 GCF_000410535
+s__Phaeocystis_globosa_virus 1 PRJNA206023
+s__Natrinema_versiforme 1 GCF_000337195
+s__Flavobacterium_psychrophilum 1 GCF_000064305
+s__Carnobacterium_sp_WN1359 1 GCF_000493735
+s__Caldicellulosiruptor_obsidiansis 1 GCF_000145215
+s__Tomato_curly_stunt_virus 1 PRJNA14267
+s__Cotton_leaf_curl_Gezira_beta 1 PRJNA20565
+s__Glaciibacter_superstes 1 GCF_000421145
+s__Pepper_leaf_curl_Yunnan_virus_YN323 1 PRJNA29413
+s__Panicum_streak_virus 1 PRJNA14076
+s__Streptomyces_sp_CNQ766 1 GCF_000377105
+s__Nariva_virus 1 PRJNA167112
+s__Mycobacterium_sp_JDM601 1 GCF_000214155
+s__Quail_picornavirus_QPV1_HUN_2010 1 PRJNA77133
+s__Lactobacillus_otakiensis 1 GCF_000415925
+s__Isoptericola_variabilis 1 GCF_000215105
+s__Podospora_anserina 1 GCA_000226545
+s__Rice_yellow_stunt_virus 1 PRJNA14793
+s__Amycolatopsis_orientalis 1 GCF_000400635
+s__Lily_symptomless_virus 1 PRJNA15015
+s__Oceanibulbus_indolifex 1 GCF_000172095
+s__Synechococcus_sp_PCC_7335 1 GCF_000155595
+s__Synechococcus_sp_PCC_7336 1 GCF_000332275
+s__Nitratireductor_indicus 1 GCF_000300515
+s__Prevotella_amnii 2 GCF_000177355 GCF_000378745
+s__Streptomyces_sp_DvalAA_83 1 GCF_000382745
+s__Cotton_leaf_curl_Allahabad_virus 1 PRJNA61851
+s__Bacteroides_vulgatus 4 GCF_000273295 GCF_000012825 GCF_000178195 GCF_000403235
+s__Staphylococcus_saprophyticus 2 GCF_000251125 GCF_000010125
+s__Turnip_ringspot_virus 1 PRJNA40327
+s__Marine_Group_III_euryarchaeote_SCGC_AAA007_O11 1 GCF_000372505
+s__Staphylococcus_phage_phi_12 1 PRJNA14247
+s__Aggregatibacter_aphrophilus 3 GCF_000231255 GCF_000226495 GCF_000022985
+s__Citricoccus_sp_CH26A 1 GCF_000224415
+s__Subterranean_clover_mottle_virus 1 PRJNA15403
+s__Bacteriovorax_sp_Seq25_V 1 GCF_000447795
+s__Ageratum_yellow_vein_China_betasatellite 1 PRJNA15515
+s__Prevotella_multiformis 1 GCF_000191065
+s__Halothermothrix_orenii 1 GCF_000020485
+s__Okra_leaf_curl_virus 1 PRJNA39605
+s__Salmonella_enterica 522 GCF_000414805 GCF_000272895 GCF_000329305 GCF_000505305 GCF_000329045 GCF_000231625 GCF_000486165 GCF_000487855 GCF_000487735 GCF_000189195 GCF_000231605 GCF_000272975 GCF_000487775 GCF_000189075 GCF_000486265 GCF_000231685 GCF_000272875 GCF_000486995 GCF_000231525 GCF_000484455 GCF_000494385 GCF_000329285 GCF_000483855 GCF_000484195 GCF_000272915 GCF_000231585 GCF_000272935 GCF_000020925 GCF_000335895 GCF_000271885 GCF_000258365 GCF_000484055 GCF_000500025 GCF [...]
+s__Zygocactus_virus_X 1 PRJNA14955
+s__Dracaena_mottle_virus 1 PRJNA16799
+s__Rice_tungro_spherical_virus 1 PRJNA15332
+s__Megamonas_rupellensis 1 GCF_000378365
+s__Streptococcus_ovis 1 GCF_000380125
+s__TGP_Carmovirus_1 1 PRJNA64491
+s__Delftia_sp_Cs1_4 1 GCF_000214395
+s__Nootka_lupine_vein_clearing_virus 1 PRJNA18853
+s__marine_actinobacterium_PHSC20C1 1 GCF_000153145
+s__Red_clover_cryptic_virus_1 1 PRJNA225924
+s__Acinetobacter_sp_P8_3_8 1 GCF_000214135
+s__Bacillus_phage_phi29 1 PRJNA30615
+s__Red_clover_cryptic_virus_2 1 PRJNA198686
+s__Pasteurella_multocida 21 GCF_000412105 GCF_000412075 GCF_000296345 GCF_000478235 GCF_000259545 GCF_000298675 GCF_000412015 GCF_000291645 GCF_000255915 GCF_000412035 GCF_000291625 GCF_000413135 GCF_000219335 GCF_000006825 GCF_000219315 GCF_000234745 GCF_000298655 GCF_000291605 GCF_000412125 GCF_000409915 GCF_000469095
+s__Butyrivibrio_crossotus 1 GCF_000156015
+s__Chilli_ringspot_virus 1 PRJNA73825
+s__Rhizobium_grahamii 1 GCF_000298315
+s__Cryptosporidium_parvum 1 GCA_000165345
+s__Lactococcus_phage_c2 1 PRJNA14029
+s__Achromobacter_xylosoxidans 3 GCF_000165835 GCF_000219745 GCF_000186185
+s__Apoi_virus 1 PRJNA15369
+s__Caulobacter_segnis 1 GCF_000092285
+s__Mycoplasma_arthritidis 1 GCF_000020065
+s__Empedobacter_brevis 1 GCF_000382425
+s__Potato_mop_top_virus 1 PRJNA14789
+s__Cronobacter_universalis 1 GCF_000319325
+s__Enterobacteria_phage_M 1 PRJNA183161
+s__Prevotella_sp_oral_taxon_317 1 GCF_000162415
+s__Candidatus_Carsonella_ruddii 2 GCF_000287275 GCF_000010365
+s__Thermodesulfobacterium_geofontis 1 GCF_000215975
+s__Streptomyces_phage_Lika 1 PRJNA206037
+s__Agromyces_italicus 1 GCF_000421545
+s__Streptococcus_phage_O1205 1 PRJNA14226
+s__Anguillid_herpesvirus_1 1 PRJNA42931
+s__Lachnospiraceae_oral_taxon_107 1 GCF_000209465
+s__Staphylococcus_phage_phiNM 1 PRJNA18293
+s__Acidilobus_saccharovorans 1 GCF_000144915
+s__Bdellovibrio_exovorus 1 GCF_000348725
+s__Pasteurella_dagmatis 1 GCF_000163475
+s__Flavobacterium_sp_SCGC_AAA160_P02 1 GCF_000383355
+s__Staphylococcus_intermedius 1 GCF_000308095
+s__Murine_adenovirus_A 2 PRJNA14519 PRJNA40319
+s__Staphylococcus_carnosus 1 GCF_000009405
+s__Murine_adenovirus_C 1 PRJNA37713
+s__Murine_adenovirus_B 1 PRJNA61855
+s__Geobacillus_caldoxylosilyticus 1 GCF_000313345
+s__Agarivorans_albus 1 GCF_000414175
+s__Lactobacillus_zeae 1 GCF_000260435
+s__Prevotella_micans 2 GCF_000243035 GCF_000373705
+s__Valsa_ceratosperma_hypovirus_1 1 PRJNA157807
+s__Slackia_sp_CM382 1 GCF_000293015
+s__Klebsiella_phage_K11 1 PRJNA62963
+s__Rickettsia_parkeri 1 GCF_000284195
+s__Pseudomonas_phage_201phi2_1 1 PRJNA30097
+s__Geobacillus_phage_GBSV1 1 PRJNA17775
+s__Acinetobacter_nectaris 1 GCF_000488215
+s__Aeromonas_veronii 6 GCF_000204115 GCF_000298035 GCF_000297995 GCF_000464515 GCF_000297975 GCF_000298015
+s__Candidatus_Solibacter_usitatus 1 GCF_000014905
+s__Acinetobacter_sp_WC_743 1 GCF_000335555
+s__Hippea_alviniae 1 GCF_000420385
+s__Rice_yellow_mottle_virus_satellite 1 PRJNA14152
+s__Vibrio_natriegens 1 GCF_000417905
+s__Thioalkalivibrio_sp_ALE28 1 GCF_000377425
+s__Nitrosococcus_oceani 2 GCF_000155655 GCF_000012805
+s__Bartonella_alsatica 1 GCF_000280015
+s__Thioalkalivibrio_sp_ALE20 1 GCF_000381405
+s__Thioalkalivibrio_sp_ALE23 1 GCF_000378545
+s__Thioalkalivibrio_sp_ALE22 1 GCF_000381445
+s__Thioalkalivibrio_sp_ALE25 1 GCF_000377285
+s__Thioalkalivibrio_sp_ALE27 1 GCF_000377485
+s__Streptococcus_canis 1 GCF_000268305
+s__Geitlerinema_sp_PCC_7407 1 GCF_000317045
+s__Enterobacteria_phage_cdtI 1 PRJNA19737
+s__Torque_teno_felis_virus 1 PRJNA48143
+s__Thermosediminibacter_oceani 1 GCF_000144645
+s__Cyclobacterium_qasimii 1 GCF_000427295
+s__Cronobacter_phage_vB_CsaM_GAP32 1 PRJNA179410
+s__Cronobacter_phage_vB_CsaM_GAP31 1 PRJNA179409
+s__Citrus_bark_cracking_viroid 1 PRJNA14757
+s__Geobacillus_sp_MAS1 1 GCF_000498995
+s__Sweet_potato_leaf_curl_Bengal_virus 1 PRJNA42745
+s__Human_papillomavirus_type_129 1 PRJNA62175
+s__Salmonella_phage_Jersey 1 PRJNA212713
+s__Raspberry_leaf_mottle_virus 1 PRJNA18275
+s__Clostridium_sp_ASF356 1 GCF_000364165
+s__Oscillatoria_formosa 1 GCF_000332155
+s__Culex_nigripalpus_nucleopolyhedrovirus 1 PRJNA14128
+s__Lactococcus_phage_TP901_1 1 PRJNA14116
+s__Tobacco_necrosis_virus_D 1 PRJNA14747
+s__Azorhizobium_caulinodans 1 GCF_000010525
+s__Tobacco_necrosis_virus_A 1 PRJNA15146
+s__Sporobolus_striate_mosaic_virus_2 1 PRJNA174776
+s__Sporobolus_striate_mosaic_virus_1 1 PRJNA174775
+s__Porcine_partetravirus 1 PRJNA215864
+s__Leuconostoc_lactis 2 GCF_000185085 GCF_000179875
+s__Bartonella_melophagi 1 GCF_000278255
+s__Acinetobacter_ursingii 4 GCF_000368825 GCF_000369885 GCF_000368845 GCF_000248135
+s__Thermoproteus_tenax_spherical_virus_1 1 PRJNA14540
+s__Mycobacterium_phage_Omega 1 PRJNA14273
+s__Amycolatopsis_nigrescens 1 GCF_000384315
+s__Catenovulum_agarivorans 1 GCF_000281085
+s__Hardenbergia_mosaic_virus 1 PRJNA65811
+s__Clostridium_phage_vB_CpeS_CP51 1 PRJNA206487
+s__Haloferax_denitrificans 1 GCF_000337795
+s__Carnation_mottle_virus 1 PRJNA14993
+s__Rheinheimera_nanhaiensis 1 GCF_000296695
+s__Canine_distemper_virus 1 PRJNA15002
+s__Capnocytophaga_sp_oral_taxon_336 1 GCF_000411575
+s__Pseudomonas_phage_phikF77 1 PRJNA36373
+s__Capnocytophaga_sp_oral_taxon_335 1 GCF_000277665
+s__Capnocytophaga_sp_oral_taxon_338 1 GCF_000192225
+s__Halovirus_HSTV_2 1 PRJNA186951
+s__Strawberry_latent_ringspot_virus 1 PRJNA15167
+s__Columbid_circovirus 1 PRJNA14437
+s__Tick_borne_encephalitis_virus 1 PRJNA15335
+s__Erwinia_phage_phiEt88 1 PRJNA64765
+s__Halorubrum_pleomorphic_virus_6 1 PRJNA157261
+s__Cryocola_sp_340MFSha3_1 1 GCF_000383315
+s__Tobacco_yellow_dwarf_virus 1 PRJNA14181
+s__Porphyromonas_gingivalis 12 GCF_000467995 GCF_000503975 GCF_000271945 GCF_000007585 GCF_000467955 GCF_000380305 GCF_000467815 GCF_000467835 GCF_000270225 GCF_000467795 GCF_000467975 GCF_000010505
+s__Tropheryma_whipplei 2 GCF_000196075 GCF_000007485
+s__Methanopyrus_kandleri 1 GCF_000007185
+s__Neisseria_shayeganii 1 GCF_000226875
+s__Sandarakinorhabdus_limnophila 1 GCF_000420765
+s__Streptococcus_sp_F0442 1 GCF_000314795
+s__Streptococcus_sp_F0441 1 GCF_000314775
+s__Magpie_robin_coronavirus_HKU18 1 PRJNA109275
+s__Pseudoalteromonas_sp_PAMC_22718 1 GCF_000263075
+s__Blattabacterium_sp_Blattella_germanica 1 GCF_000022605
+s__Micromonospora_sp_L5 1 GCF_000177655
+s__Lactobacillus_phage_phiAQ113 1 PRJNA188466
+s__Ryegrass_mosaic_virus 1 PRJNA15344
+s__Thioalkalivibrio_sp_ALE9 1 GCF_000377445
+s__Bovine_papillomavirus_7 1 PRJNA16202
+s__Bacillus_phage_AP50 1 PRJNA32599
+s__Nocardia_asteroides 1 GCF_000308355
+s__Streptococcus_iniae 3 GCF_000403625 GCF_000331915 GCF_000300915
+s__Malachra_yellow_vein_mosaic_virus_associated_satellite_DNA_beta 1 PRJNA28727
+s__Synechocystis_sp_PCC_7509 1 GCF_000332075
+s__Clostridium_sp_ATCC_BAA_442 1 GCF_000466445
+s__Haloarcula_argentinensis 1 GCF_000336895
+s__Staphylococcus_phage_55 1 PRJNA15276
+s__Haloarcula_sinaiiensis 1 GCF_000337275
+s__Streptomyces_scabiei 1 GCF_000091305
+s__Vibrio_phage_JA_1 1 PRJNA209075
+s__Flavobacterium_sp_F52 1 GCF_000278705
+s__Clostridiales_bacterium_BV3C26 1 GCF_000478985
+s__Acinetobacter_sp_NIPH_298 1 GCF_000369505
+s__Listeria_phage_A500 1 PRJNA20791
+s__Dysgonomonas_mossii 2 GCF_000376405 GCF_000213575
+s__Borrelia_turicatae 1 GCF_000012085
+s__Pseudomonas_psychrophila 1 GCF_000282975
+s__Methanobrevibacter_ruminantium 1 GCF_000024185
+s__Bacillus_pumilus 4 GCF_000299555 GCF_000017885 GCF_000225935 GCF_000172815
+s__Alteromonadales_bacterium_TW_7 1 GCF_000169055
+s__Treponema_pedis 1 GCF_000447675
+s__Human_erythrovirus_V9 1 PRJNA14224
+s__Corynebacterium_durum 1 GCF_000318135
+s__Oceanibaculum_indicum 1 GCF_000299935
+s__Microbacterium_maritypicum 1 GCF_000455825
+s__Tomato_ringspot_virus 1 PRJNA15300
+s__Parvibaculum_lavamentivorans 1 GCF_000017565
+s__Moumouvirus 1 PRJNA186430
+s__Burkholderia_phenoliruptrix 1 GCF_000300095
+s__Pariacoto_virus 1 PRJNA14785
+s__Mouse_parvovirus_5a 1 PRJNA33007
+s__Lactobacillus_phage_ATCC_8014_B1 1 PRJNA184150
+s__Moloney_murine_sarcoma_virus 1 PRJNA14721
+s__Naumovozyma_castellii 1 GCA_000237345
+s__Pennisetum_mosaic_virus 1 PRJNA15447
+s__Methylobacter_marinus 1 GCF_000383855
+s__Cupriavidus_sp_HPC_L 1 GCF_000307735
+s__Haloarcula_japonica 1 GCF_000336635
+s__Methanotorris_igneus 1 GCF_000214415
+s__Passionfruit_severe_leaf_distortion_virus 1 PRJNA38459
+s__Micromonas_pusilla_virus_12T 1 PRJNA195482
+s__Sida_mosaic_Bolivia_virus_2 1 PRJNA62475
+s__Sida_mosaic_Bolivia_virus_1 1 PRJNA62477
+s__Pseudomonad_phage_gh_1 1 PRJNA14265
+s__Yaba_monkey_tumor_virus 1 PRJNA14466
+s__Curtobacterium_sp_B8 1 GCF_000333315
+s__Human_papillomavirus_type_140 1 PRJNA167868
+s__Pyrococcus_yayanosii 1 GCF_000215995
+s__Human_papillomavirus_type_144 1 PRJNA167869
+s__Mycobacterium_phage_Chy4 1 PRJNA206477
+s__Prevotella_nigrescens 3 GCF_000220235 GCF_000507825 GCF_000336235
+s__Bougainvillea_spectabilis_chlorotic_vein_banding_virus 1 PRJNA32823
+s__Thermanaerovibrio_acidaminovorans 1 GCF_000024905
+s__Pseudomonas_sp_S9 1 GCF_000222125
+s__Sapporo_virus 3 PRJNA14952 PRJNA15040 PRJNA15048
+s__Agrobacterium_fabrum 1 GCF_000092025
+s__Lactobacillus_gigeriorum 1 GCF_000296855
+s__Neisseria_meningitidis 174 GCF_000392695 GCF_000328145 GCF_000386185 GCF_000387345 GCF_000191285 GCF_000367485 GCF_000191265 GCF_000386885 GCF_000387385 GCF_000293285 GCF_000386845 GCF_000413175 GCF_000448125 GCF_000293465 GCF_000386685 GCF_000386905 GCF_000448185 GCF_000293405 GCF_000328225 GCF_000448245 GCF_000448085 GCF_000327805 GCF_000327865 GCF_000327765 GCF_000327705 GCF_000386105 GCF_000327965 GCF_000392355 GCF_000327645 GCF_000293365 GCF_000387185 GCF_000327545 GCF_000191245 [...]
+s__Bacillus_clausii 1 GCF_000009825
+s__Serinicoccus_profundi 1 GCF_000224715
+s__Kappapapillomavirus_2 1 PRJNA14075
+s__Salinibacterium_sp_PAMC_21357 1 GCF_000247645
+s__Haloarcula_phage_SH1 1 PRJNA15535
+s__Sputnik_virophage 1 PRJNA30929
+s__Pseudomonas_phage_vB_Pae_TbilisiM32 1 PRJNA167051
+s__Acinetobacter_sp_CIP_70_18 1 GCF_000369525
+s__Thiomicrospira_arctica 1 GCF_000381085
+s__Burkholderia_terrae 1 GCF_000265115
+s__Legionella_shakespearei 1 GCF_000373765
+s__Acidovorax_sp_JS42 1 GCF_000015545
+s__Avian_adeno_associated_virus_ATCC_VR_865 1 PRJNA14456
+s__Dichelobacter_nodosus 1 GCF_000015345
+s__Oat_mosaic_virus 1 PRJNA15391
+s__Capnocytophaga_sputigena 1 GCF_000173675
+s__Bean_common_mosaic_necrosis_virus 1 PRJNA15333
+s__Yersinia_rohdei 1 GCF_000173775
+s__Vibrio_gazogenes 1 GCF_000390165
+s__Proteiniphilum_acetatigenes 1 GCF_000380985
+s__Murine_pneumotropic_virus 1 PRJNA14071
+s__Chlorobaculum_parvum 1 GCF_000020505
+s__Clostridium_hylemonae 1 GCF_000156515
+s__Sulfitobacter_sp_EE_36 1 GCF_000152605
+s__Flavobacteria_bacterium_BAL38 1 GCF_000169355
+s__His_1_virus 1 PRJNA16650
+s__Stenotrophomonas_phage_IME15 1 PRJNA179429
+s__Red_clover_vein_mosaic_virus 1 PRJNA34841
+s__Kurthia_sp_Dielmo 1 GCF_000307285
+s__Halorubrum_saccharovorum 1 GCF_000337915
+s__Beggiatoa_alba 1 GCF_000245015
+s__Pseudorhodobacter_ferrugineus 1 GCF_000420745
+s__Desulfococcus_multivorans 1 GCF_000422185
+s__Bacillus_sp_CPSM8 1 GCF_000409505
+s__Colwellia_psychrerythraea 1 GCF_000012325
+s__Bacillus_sp_BT1B_CT2 1 GCF_000186125
+s__Alternanthera_yellow_vein_virus 1 PRJNA15560
+s__Streptomyces_albulus 1 GCF_000403765
+s__Beet_mild_curly_top_virus 1 PRJNA14282
+s__Vibrio_brasiliensis 1 GCF_000189255
+s__Paracoccus_zeaxanthinifaciens 1 GCF_000420145
+s__Methylotenera_sp_1P_1 1 GCF_000384355
+s__Bartonella_sp_OS02 1 GCF_000312545
+s__Actinomyces_vaccimaxillae 1 GCF_000420425
+s__Acinetobacter_sp_ADP1 1 GCF_000046845
+s__Mogibacterium_sp_CM50 1 GCF_000293155
+s__Barley_yellow_dwarf_virus_MAV 1 PRJNA14781
+s__Streptococcus_salivarius 7 GCF_000225385 GCF_000253335 GCF_000174715 GCF_000305335 GCF_000253315 GCF_000257585 GCF_000286295
+s__Wissadula_golden_mosaic_virus 1 PRJNA30167
+s__Oribacterium_sp_oral_taxon_078 2 GCF_000160135 GCF_000469565
+s__Staphylococcus_phage_phiSauS_IPLA35 1 PRJNA32997
+s__Acinetobacter_sp_CIP_102082 1 GCF_000368365
+s__Vibrio_phage_ICP1 1 PRJNA63229
+s__Tomato_leaf_curl_Iran_virus 1 PRJNA14474
+s__Vibrio_phage_ICP2 1 PRJNA63231
+s__Ustilago_maydis 1 GCA_000328475
+s__Lactobacillus_hominis 1 GCF_000296835
+s__Halyomorpha_halys_virus 1 PRJNA225920
+s__Cylindrospermopsis_raciborskii 1 GCF_000175835
+s__candidate_division_TM7_single_cell_isolate_TM7b 1 GCF_000170655
+s__candidate_division_TM7_single_cell_isolate_TM7c 1 GCF_000170675
+s__Streptococcus_infantis 5 GCF_000215385 GCF_000187465 GCF_000223335 GCF_000223255 GCF_000260755
+s__Cotton_leaf_curl_Multan_alphasatellite 1 PRJNA169228
+s__Ljungan_virus 1 PRJNA15401
+s__Methanoplanus_limicola 1 GCF_000243255
+s__Stx2_converting_phage_86 1 PRJNA17979
+s__Burkholderia_rhizoxinica 1 GCF_000198775
+s__Virgibacillus_sp_CM_4 1 GCF_000445495
+s__Chlamydia_pneumoniae 5 GCF_000007205 GCF_000008745 GCF_000024145 GCF_000091085 GCF_000011165
+s__Kyrpidia_tusciae 1 GCF_000092905
+s__Novosphingobium_sp_AP12 1 GCF_000281975
+s__Mycobacterium_phage_Reprobate 1 PRJNA215118
+s__Mycobacterium_phage_Astraea 1 PRJNA206480
+s__Streptococcus_sp_GMD1S 1 GCF_000296875
+s__Human_parechovirus 1 PRJNA15357
+s__Sulfolobus_turreted_icosahedral_virus 1 PRJNA14401
+s__Mycoplasma_parvum 1 GCF_000477415
+s__Porphyromonas_cansulci 1 GCF_000509265
+s__Gluconobacter_thailandicus 1 GCF_000344115
+s__Torque_teno_douroucouli_virus 1 PRJNA48173
+s__Desulfosporosinus_orientis 1 GCF_000235605
+s__Cereal_yellow_dwarf_virus_RPS 1 PRJNA14691
+s__Soil_borne_wheat_mosaic_virus 1 PRJNA14661
+s__Cereal_yellow_dwarf_virus_RPV 1 PRJNA14883
+s__Caldibacillus_debilis 1 GCF_000383875
+s__Geobacter_sulfurreducens 2 GCF_000210155 GCF_000007985
+s__Alstroemeria_virus_x 1 PRJNA15687
+s__Enterococcus_caccae 2 GCF_000394055 GCF_000407145
+s__Capnocytophaga_gingivalis 1 GCF_000174755
+s__Bean_leafroll_virus 1 PRJNA14734
+s__Sugarcane_streak_Reunion_virus 1 PRJNA14303
+s__Escherichia_phage_vB_EcoP_G7C 1 PRJNA72371
+s__Enterobacter_phage_EcP1 1 PRJNA181071
+s__Mycobacterium_phage_Fredward 1 PRJNA227008
+s__Pseudoalteromonas_citrea 1 GCF_000238375
+s__Halococcus_saccharolyticus 1 GCF_000336915
+s__Pasteurella_bettyae 1 GCF_000262245
+s__Streptococcus_phage_phiNJ2 1 PRJNA179424
+s__Methylobacterium_sp_285MFTsu5_1 1 GCF_000383455
+s__Pseudomonas_phage_LUZ19 1 PRJNA28741
+s__Calditerrivibrio_nitroreducens 1 GCF_000183405
+s__Acidaminococcus_sp_D21 1 GCF_000174215
+s__Mycoplasma_orale 1 GCF_000420105
+s__Streptomyces_sp_CNY243 1 GCF_000377165
+s__Aggregatibacter_actinomycetemcomitans 20 GCF_000241025 GCF_000241005 GCF_000332915 GCF_000226715 GCF_000259915 GCF_000226795 GCF_000318155 GCF_000226835 GCF_000146265 GCF_000226755 GCF_000372365 GCF_000332895 GCF_000226775 GCF_000226855 GCF_000226735 GCF_000226815 GCF_000163615 GCF_000240985 GCF_000332935 GCF_000332955
+s__Macroptilium_yellow_spot_virus 1 PRJNA124059
+s__Streptomyces_sp_PVA_94_07 1 GCF_000495755
+s__Psychrobacter_sp_PRwf_1 1 GCF_000016885
+s__Morelia_spilota_papillomavirus_1 1 PRJNA73439
+s__Cronobacter_dublinensis 2 GCF_000319495 GCF_000319345
+s__Pipistrellus_bat_coronavirus_HKU5 1 PRJNA18865
+s__Vibrio_phage_KVP40 1 PRJNA14416
+s__Soybean_crinkle_leaf_virus 1 PRJNA14149
+s__Lactococcus_phage_BK5_T 1 PRJNA15244
+s__Helicobacter_pullorum 1 GCF_000155495
+s__Cedecea_davisae 1 GCF_000412335
+s__Marine_Group_I_thaumarchaeote_SCGC_AB_629_I23 1 GCF_000399765
+s__Listeria_welshimeri 1 GCF_000060285
+s__Peanut_stunt_virus_satellite_RNA 1 PRJNA14502
+s__Tomato_bushy_stunt_virus 1 PRJNA15147
+s__Acheta_domesticus_mini_ambidensovirus 1 PRJNA223005
+s__Nocardiopsis_alkaliphila 1 GCF_000341005
+s__Cestrum_yellow_leaf_curling_virus 1 PRJNA14470
+s__Thermoanaerobacter_thermohydrosulfuricus 1 GCF_000353265
+s__Pseudoalteromonas_sp_SM9913 1 GCF_000184065
+s__Human_coronavirus_HKU1 1 PRJNA15139
+s__Glaciecola_sp_HTCC2999 1 GCF_000155775
+s__Cowpea_mottle_virus 1 PRJNA14755
+s__Escherichia_phage_N4 1 PRJNA18511
+s__Campylobacter_hominis 1 GCF_000017585
+s__Gentian_Kobu_sho_associated_virus 1 PRJNA189210
+s__Providencia_rustigianii 1 GCF_000156395
+s__Enterobacteria_phage_EPS7 1 PRJNA29287
+s__Lactobacillus_sp_ASF360 1 GCF_000364185
+s__Malvastrum_yellow_mosaic_virus_satellite_DNA_beta 1 PRJNA18133
+s__Francisella_noatunensis 1 GCF_000262205
+s__Helicobacter_phage_phiHP33 1 PRJNA80923
+s__Malvastrum_yellow_vein_virus 1 PRJNA14252
+s__Clostridium_beijerinckii 2 GCF_000016965 GCF_000280535
+s__Amphibacillus_jilinensis 1 GCF_000306965
+s__Gryllus_bimaculatus_nudivirus 1 PRJNA19181
+s__Mycobacterium_phage_BTCU_1 1 PRJNA209077
+s__Neorickettsia_risticii 1 GCF_000022525
+s__Sulfuricella_denitrificans 1 GCF_000297055
+s__Gracilibacillus_halophilus 1 GCF_000359605
+s__Clostridium_phage_phiCP26F 1 PRJNA181251
+s__Pirital_virus 1 PRJNA14919
+s__Treponema_brennaborense 1 GCF_000212415
+s__Fig_badnavirus_1 1 PRJNA162487
+s__Brevicoryne_brassicae_picorna_like_virus 1 PRJNA19753
+s__Myxoma_virus 1 PRJNA14396
+s__Rubrivivax_gelatinosus 1 GCF_000284255
+s__Cotton_leaf_crumple_virus 1 PRJNA14541
+s__Tomato_yellow_leaf_curl_Yunnan_virus 1 PRJNA206466
+s__Canine_picornavirus 1 PRJNA89397
+s__actinobacterium_SCGC_AAA027_J17 1 GCF_000383815
+s__Leptospira_biflexa 2 GCF_000017685 GCF_000017605
+s__Neisseria_sp_GT4A_CT1 1 GCF_000227275
+s__Clostridium_sp_M62_1 1 GCF_000159055
+s__Acinetobacter_sp_NIPH_899 1 GCF_000368385
+s__Actinomyces_naeslundii 1 GCF_000285995
+s__Bacteroides_pyogenes 1 GCF_000466505
+s__Hyphomicrobium_nitrativorans 1 GCF_000503895
+s__Coprococcus_sp_ART55_1 1 GCF_000210595
+s__Propionibacterium_propionicum 1 GCF_000277715
+s__Coprococcus_sp_HPP0074 1 GCF_000411335
+s__Haloarcula_hispanica_pleomorphic_virus_1 1 PRJNA43589
+s__Cynomolgus_macaque_cytomegalovirus_strain_Ottawa 1 PRJNA76697
+s__Flavobacterium_frigoris 1 GCF_000252125
+s__Yaniella_halotolerans 1 GCF_000420805
+s__Spiribacter_sp_UAH_SP71 1 GCF_000485905
+s__Burkholderia_phage_Bcep1 1 PRJNA14409
+s__Tomato_leaf_curl_New_Delhi_virus 1 PRJNA14243
+s__Bacillus_phage_Gamma 1 PRJNA15783
+s__Bacteroides_coprosuis 1 GCF_000212915
+s__Staphylococcus_sp_E463 1 GCF_000316945
+s__Sweet_clover_necrotic_mosaic_virus 1 PRJNA14809
+s__Archaeoglobus_profundus 1 GCF_000025285
+s__Rhodanobacter_spathiphylli 1 GCF_000264295
+s__Sulfolobus_spindle_shaped_virus_6 1 PRJNA42355
+s__Dicliptera_yellow_mottle_virus 1 PRJNA14185
+s__Mupapillomavirus_2 1 PRJNA15486
+s__Mupapillomavirus_1 1 PRJNA15491
+s__Human_coronavirus_NL63 1 PRJNA14960
+s__Vibrio_phage_VPMS1 1 PRJNA212709
+s__Alicyclobacillus_pohliae 1 GCF_000376225
+s__Staphylococcus_simulans 2 GCF_000477455 GCF_000314755
+s__Honeysuckle_yellow_vein_betasatellite 1 PRJNA14620
+s__Blastomonas_sp_AAP53 1 GCF_000331245
+s__Thiomicrospira_halophila 1 GCF_000384235
+s__Streptomyces_turgidiscabies 1 GCF_000331005
+s__Lachnospiraceae_bacterium_8_1_57FAA 1 GCF_000185545
+s__Actinomyces_graevenitzii 2 GCF_000239695 GCF_000466185
+s__TTV_like_mini_virus 1 PRJNA193982
+s__Human_papillomavirus 1 PRJNA215652
+s__Equine_arteritis_virus 1 PRJNA15383
+s__Rhodococcus_jostii 1 GCF_000014565
+s__Synechococcus_phage_S_RIM2 1 PRJNA195488
+s__Verminephrobacter_aporrectodeae 1 GCF_000193225
+s__Streptococcus_dysgalactiae 8 GCF_000493775 GCF_000188715 GCF_000214575 GCF_000188315 GCF_000010705 GCF_000317855 GCF_000221105 GCF_000307185
+s__Enterococcus_casseliflavus 7 GCF_000191365 GCF_000393915 GCF_000407405 GCF_000157215 GCF_000273565 GCF_000157295 GCF_000414945
+s__Synechococcus_phage_S_RIM8 1 PRJNA192853
+s__Thauera_sp_63 1 GCF_000310165
+s__Duck_adenovirus_A 2 PRJNA14520 PRJNA40313
+s__Rhizobium_sp_JGI_0001005_K05 1 GCF_000376185
+s__Psychrobacter_lutiphocae 1 GCF_000382145
+s__Coprothermobacter_proteolyticus 1 GCF_000020945
+s__Pantoea_sp_SL1_M5 1 GCF_000220605
+s__Cyanophage_PSS2 1 PRJNA39613
+s__Southern_rice_black_streaked_dwarf_virus 1 PRJNA60383
+s__Bidens_mottle_virus 1 PRJNA50559
+s__Arhodomonas_aquaeolei 1 GCF_000374645
+s__Lactococcus_phage_jm2 1 PRJNA213074
+s__Lactococcus_phage_jm3 1 PRJNA213075
+s__Pediococcus_pentosaceus 2 GCF_000285875 GCF_000014505
+s__Sida_yellow_blotch_virus 1 PRJNA189216
+s__Pseudomonas_sp_GM24 1 GCF_000282235
+s__Pseudomonas_sp_GM25 1 GCF_000282255
+s__Pseudomonas_sp_GM21 1 GCF_000282215
+s__Human_immunodeficiency_virus_2 1 PRJNA14991
+s__Human_immunodeficiency_virus_1 1 PRJNA15476
+s__Desulfitobacterium_hafniense 5 GCF_000379505 GCF_000378805 GCF_000010045 GCF_000021925 GCF_000238035
+s__Eggplant_mosaic_virus 1 PRJNA14639
+s__Marseillevirus 1 PRJNA43573
+s__Propionibacterium_phage_PHL111M01 1 PRJNA219111
+s__Staphylococcus_phage_77 1 PRJNA14352
+s__Homalodisca_coagulata_virus_1 1 PRJNA16797
+s__Clostridium_celatum 1 GCF_000320405
+s__Cassava_virus_C 1 PRJNA39977
+s__Bebaru_virus 1 PRJNA88121
+s__Encephalitozoon_cuniculi 1 GCA_000091225
+s__Vibrio_cyclitrophicus 21 GCF_000256425 GCF_000256135 GCF_000256165 GCF_000256265 GCF_000256345 GCF_000256285 GCF_000256115 GCF_000256185 GCF_000256405 GCF_000247005 GCF_000256305 GCF_000256095 GCF_000256205 GCF_000256605 GCF_000473545 GCF_000256445 GCF_000256245 GCF_000256325 GCF_000256365 GCF_000256465 GCF_000256385
+s__Streptococcus_phage_Sfi11 1 PRJNA14054
+s__Diuris_virus_A 1 PRJNA178591
+s__Enterobacteria_phage_lambda 1 PRJNA14204
+s__Streptococcus_phage_Sfi19 1 PRJNA14045
+s__Candida_albicans 1 GCA_000182965
+s__Arthrobacter_sp_PAO19 1 GCF_000414345
+s__Blastococcus_saxobsidens 1 GCF_000284015
+s__Staphylococcus_aureus 436 GCF_000361605 GCF_000363125 GCF_000360545 GCF_000330825 GCF_000361665 GCF_000362745 GCF_000260015 GCF_000360525 GCF_000175475 GCF_000361965 GCF_000361905 GCF_000362185 GCF_000262955 GCF_000215405 GCF_000360245 GCF_000359785 GCF_000360485 GCF_000361765 GCF_000239615 GCF_000248935 GCF_000363225 GCF_000360425 GCF_000362205 GCF_000361845 GCF_000360305 GCF_000276625 GCF_000507725 GCF_000239655 GCF_000364085 GCF_000360085 GCF_000248535 GCF_000361365 GCF_000248675 G [...]
+s__Thermogladius_cellulolyticus 1 GCF_000264495
+s__Spirosoma_spitsbergense 1 GCF_000374085
+s__Andean_potato_mild_mosaic_virus 1 PRJNA192605
+s__Aspergillus_niger 1 GCA_000002855
+s__Algoriphagus_machipongonensis 1 GCF_000166275
+s__Haliscomenobacter_hydrossis 1 GCF_000212735
+s__Clostridium_phage_phi8074_B1 1 PRJNA184148
+s__Geodermatophilus_obscurus 1 GCF_000025345
+s__Mycobacterium_hassiacum 2 GCF_000300375 GCF_000379865
+s__Invertebrate_iridovirus_22 1 PRJNA213479
+s__actinobacterium_SCGC_AAA023_D18 1 GCF_000378925
+s__Alistipes_sp_HGB5 1 GCF_000183485
+s__Lymantria_dispar_multiple_nucleopolyhedrovirus 1 PRJNA14390
+s__Enterococcus_sp_GMD4E 1 GCF_000296935
+s__Sweet_potato_golden_vein_associated_virus 1 PRJNA65275
+s__Anaeroglobus_geminatus 1 GCF_000239275
+s__Vibrio_phage_henriette_12B8 1 PRJNA198435
+s__Shewanella_sediminis 1 GCF_000018025
+s__Enterobacteria_phage_St_1 1 PRJNA38669
+s__Vibrio_phage_VP882 1 PRJNA18851
+s__Burkholderia_sp_TJI49 1 GCF_000191945
+s__Haloterrigena_turkmenica 1 GCF_000025325
+s__Enterobacteria_phage_IME10 1 PRJNA181235
+s__Triatoma_virus 1 PRJNA14802
+s__Salmonella_phage_FelixO1 1 PRJNA14323
+s__Corynebacterium_sp_KPL1821 1 GCF_000478115
+s__Abalone_shriveling_syndrome_associated_virus 1 PRJNA33141
+s__Rice_ragged_stunt_virus 1 PRJNA14794
+s__Mimosa_yellow_leaf_curl_virus 1 PRJNA19781
+s__Luminiphilus_syltensis 1 GCF_000158175
+s__Vibrio_phage_kappa 1 PRJNA28503
+s__Pseudomonas_avellanae 1 GCF_000302915
+s__Vibrio_orientalis 2 GCF_000222645 GCF_000176235
+s__Brachybacterium_paraconglomeratum 1 GCF_000233655
+s__Listeria_seeligeri 2 GCF_000183965 GCF_000027145
+s__Eyach_virus 1 PRJNA14786
+s__Geobacillus_thermoglucosidasius 2 GCF_000178395 GCF_000258725
+s__Rickettsia_endosymbiont_of_Ixodes_scapularis 1 GCF_000160735
+s__Salinivibrio_phage_CW02 1 PRJNA181992
+s__Rhizobium_sp_Pop5 1 GCF_000295895
+s__Streptococcus_infantarius 2 GCF_000154985 GCF_000246835
+s__Small_anellovirus 2 PRJNA15252 PRJNA15253
+s__Kushneria_aurantia 1 GCF_000382245
+s__Citrus_leprosis_virus_C 1 PRJNA17095
+s__Mycobacterium_phage_Predator 1 PRJNA30611
+s__Synechococcus_phage_Syn5 1 PRJNA19763
+s__Listonella_phage_phiHSIC 1 PRJNA15173
+s__Escherichia_phage_bV_EcoS_AKFV33 1 PRJNA167572
+s__Novosphingobium_lindaniclasticum 1 GCF_000445125
+s__Lactobacillus_farciminis 1 GCF_000184535
+s__Salmonella_phage_SPN9CC 1 PRJNA167665
+s__Epizootic_hemorrhagic_disease_virus 1 PRJNA41081
+s__Liberibacter_crescens 1 GCF_000325745
+s__Brugmansia_mosaic_virus 1 PRJNA186429
+s__Helicobacter_canis 1 GCF_000507865
+s__Prochlorococcus_phage_P_SSM3 1 PRJNA209210
+s__Prochlorococcus_phage_P_SSM2 1 PRJNA15135
+s__Prochlorococcus_phage_P_SSM4 1 PRJNA15136
+s__Prochlorococcus_phage_P_SSM7 1 PRJNA64717
+s__Bacteroides_thetaiotaomicron 2 GCF_000403155 GCF_000011065
+s__Rickettsia_australis 1 GCF_000284155
+s__Arthrobacter_chlorophenolicus 1 GCF_000022025
+s__Bifidobacterium_adolescentis 2 GCF_000154085 GCF_000010425
+s__Cowpea_severe_leaf_curl_associated_DNA_beta 1 PRJNA15157
+s__Treponema_paraluiscuniculi 1 GCF_000217655
+s__Pseudomonas_phage_Bf7 1 PRJNA82647
+s__Avian_myelocytomatosis_virus 1 PRJNA14909
+s__Enterobacteria_phage_YYZ_2008 1 PRJNA32231
+s__Caldicellulosiruptor_bescii 1 GCF_000022325
+s__Bean_necrotic_mosaic_virus 2 PRJNA168523 PRJNA168596
+s__Lucerne_transient_streak_virus_satellite_RNA 1 PRJNA14501
+s__Treponema_primitia 2 GCF_000214375 GCF_000297095
+s__Myceliophthora_thermophila 1 GCA_000226095
+s__Ureibacillus_thermosphaericus 1 GCF_000284835
+s__Eubacteriaceae_bacterium_ACC19a 1 GCF_000238115
+s__Gluconacetobacter_oboediens 1 GCF_000227565
+s__Burkholderia_phage_KL3 1 PRJNA64565
+s__Pseudomonas_phage_PaP2 1 PRJNA14377
+s__Pseudomonas_phage_PaP3 1 PRJNA14322
+s__Pseudomonas_phage_PaP1 1 PRJNA184153
+s__Hordeum_mosaic_virus 1 PRJNA15064
+s__Sutterella_wadsworthensis 3 GCF_000297775 GCF_000411515 GCF_000186505
+s__Spirochaeta_africana 1 GCF_000242595
+s__Synechococcus_sp_CB0205 1 GCF_000179255
+s__Tomato_dwarf_leaf_virus 1 PRJNA81031
+s__Pasteurella_phage_F108 1 PRJNA17113
+s__Feline_calicivirus 1 PRJNA14877
+s__Brucella_pinnipedialis 4 GCF_000158675 GCF_000157815 GCF_000221005 GCF_000157795
+s__Bradyrhizobium_sp 1 GCF_000239795
+s__Methyloversatilis_sp_NVD 1 GCF_000372885
+s__Erwinia_amylovora 11 GCF_000367665 GCF_000367545 GCF_000027205 GCF_000240705 GCF_000367605 GCF_000367585 GCF_000367645 GCF_000367625 GCF_000091565 GCF_000367685 GCF_000367565
+s__Highlands_J_virus 1 PRJNA37281
+s__Geobacter_uraniireducens 1 GCF_000016745
+s__Leptospira_sp_Fiocruz_LV4135 1 GCF_000346675
+s__Salipiger_mucosus 1 GCF_000442255
+s__Shewanella_sp_HN_41 1 GCF_000217915
+s__Fusobacterium_necrophorum 4 GCF_000292975 GCF_000158295 GCF_000242215 GCF_000262225
+s__Escherichia_Stx1_converting_phage 1 PRJNA14293
+s__Gallid_herpesvirus_3 1 PRJNA14103
+s__Escherichia_sp_TW10509 1 GCF_000208545
+s__Gallid_herpesvirus_1 1 PRJNA14566
+s__Blueberry_necrotic_ring_blotch_virus 1 PRJNA74579
+s__Streptomyces_ipomoeae 1 GCF_000317595
+s__Acinetobacter_sp_NIPH_284 1 GCF_000369425
+s__Mycoplasma_pulmonis 1 GCF_000195875
+s__Halomonas_anticariensis 1 GCF_000409775
+s__Halorhabdus_tiamatea 1 GCF_000215915
+s__Porcine_parvovirus_4 1 PRJNA60137
+s__Desulfurivibrio_alkaliphilus 1 GCF_000092205
+s__Gyrovirus_4 1 PRJNA172459
+s__Rhodococcus_sp_P14 1 GCF_000256505
+s__Pan_troglodytes_schweinfurthii_polyomavirus_2 1 PRJNA183905
+s__Yam_mosaic_virus 1 PRJNA14884
+s__Johnsongrass_chlorotic_stripe_mosaic_virus 1 PRJNA14904
+s__Butyrivibrio_sp_AD3002 1 GCF_000420905
+s__Neisseria_sicca 4 GCF_000193735 GCF_000174655 GCF_000193755 GCF_000260655
+s__Prevotella_sp_F0091 1 GCF_000467895
+s__Salmonella_phage_ST64B 1 PRJNA14228
+s__Clostridium_sp_SY8519 1 GCF_000270305
+s__Lachnobacterium_bovis 1 GCF_000421025
+s__Gordonia_araii 1 GCF_000241265
+s__Turnip_yellows_virus 1 PRJNA15072
+s__Sulfolobus_solfataricus 3 GCF_000024745 GCF_000175555 GCF_000007005
+s__Natrialba_asiatica 1 GCF_000337555
+s__Mycobacterium_phage_Solon 1 PRJNA31287
+s__Ruminococcus_albus 2 GCF_000178155 GCF_000179635
+s__East_African_cassava_mosaic_Kenya_virus 1 PRJNA32329
+s__Perlucidibaca_piscinae 1 GCF_000420045
+s__Enterobacteria_phage_I2_2 1 PRJNA14572
+s__Ovine_lentivirus 1 PRJNA14668
+s__Leptotrichia_sp_oral_taxon_225 1 GCF_000469525
+s__Novosphingobium_sp_B_7 1 GCF_000410615
+s__Bavariicoccus_seileri 1 GCF_000421665
+s__Abalone_herpesvirus_Victoria_AUS_2009 1 PRJNA177933
+s__Bacteroides_sp_4_3_47FAA 1 GCF_000158515
+s__Fervidobacterium_nodosum 1 GCF_000017545
+s__Facklamia_hominis 2 GCF_000301035 GCF_000413455
+s__Aspergillus_oryzae 1 GCA_000184455
+s__Gordonia_rhizosphera 1 GCF_000298195
+s__Pichinde_virus 1 PRJNA15008
+s__Sphingopyxis_sp_MC1 1 GCF_000371385
+s__Asticcacaulis_excentricus 1 GCF_000175215
+s__Golden_Gate_virus 1 PRJNA173354
+s__Rhodobacter_phage_RC1 1 PRJNA195479
+s__Sphingobacterium_sp_IITKGP_BTPF85 1 GCF_000447275
+s__Fusarium_graminearum_dsRNA_mycovirus_1 1 PRJNA15154
+s__Mucispirillum_schaedleri 1 GCF_000487995
+s__Pseudomonas_phage_PAJU2 1 PRJNA32249
+s__Capraria_yellow_spot_Yucatan_virus 1 PRJNA214689
+s__Wheat_dwarf_virus 2 PRJNA15478 PRJNA30035
+s__Clostridium_phage_phi3626 1 PRJNA14166
+s__Human_herpesvirus_6A 1 PRJNA14462
+s__Cellulophaga_phage_phi18_1 1 PRJNA212959
+s__Spiroplasma_phage_4 1 PRJNA14161
+s__Cellulophaga_phage_phi18_3 1 PRJNA212960
+s__Thogoto_virus 1 PRJNA15043
+s__Duganella_zoogloeoides 1 GCF_000383895
+s__Erysipelotrichaceae_bacterium_21_3 1 GCF_000242195
+s__Mycobacterium_phage_AnnaL29 1 PRJNA215671
+s__Lindernia_anagallis_yellow_vein_virus_satellite_DNA_beta 1 PRJNA19831
+s__Bat_coronavirus_1A 1 PRJNA29247
+s__Staphylococcus_phage_phi7401PVL 1 PRJNA188545
+s__Mumps_virus 1 PRJNA15059
+s__Alloprevotella_tannerae 1 GCF_000159995
+s__Cellulomonas_flavigena 1 GCF_000092865
+s__Thermoanaerobacter_sp_X513 1 GCF_000148425
+s__Burkholderia_phage_phiE255 1 PRJNA19165
+s__Cocksfoot_streak_virus 1 PRJNA15399
+s__Thermoanaerobacter_sp_X514 1 GCF_000019065
+s__Geobacter_lovleyi 1 GCF_000020385
+s__Kiloniella_laminariae 1 GCF_000374005
+s__Enterovirus_G 1 PRJNA15396
+s__Arthrobacter_gangotriensis 1 GCF_000348945
+s__Stretch_Lagoon_orbivirus 1 PRJNA39971
+s__Enterobacter_sp_SST3 1 GCF_000286655
+s__Millerozyma_farinosa 1 GCA_000315895
+s__Sulfurimonas_autotrophica 1 GCF_000147355
+s__Coleus_blumei_viroid_3 1 PRJNA14784
+s__Nitrococcus_mobilis 1 GCF_000153205
+s__Coleus_blumei_viroid_1 1 PRJNA14782
+s__Coleus_blumei_viroid_6 1 PRJNA38541
+s__Coleus_blumei_viroid_5 1 PRJNA34737
+s__Streptomyces_phage_SV1 1 PRJNA177523
+s__Leptospira_sp_Fiocruz_LV3954 1 GCF_000306435
+s__Desulfomicrobium_baculatum 1 GCF_000023225
+s__Streptococcus_entericus 1 GCF_000380025
+s__Finch_polyomavirus 1 PRJNA16655
+s__Avian_leukosis_virus 1 PRJNA14633
+s__Pseudomonas_geniculata 1 GCF_000258575
+s__Pirellula_staleyi 1 GCF_000025185
+s__Tomato_leaf_curl_Seychelles_virus 1 PRJNA18869
+s__Bartonella_bacilliformis 2 GCF_000015445 GCF_000311905
+s__Cucumber_mosaic_virus 1 PRJNA15470
+s__Macacine_herpesvirus_3 1 PRJNA14468
+s__Macacine_herpesvirus_1 1 PRJNA14489
+s__Pseudomonas_sp_G5_2012 1 GCF_000408945
+s__Turkey_adenovirus_B 1 PRJNA53557
+s__Macacine_herpesvirus_5 1 PRJNA14423
+s__Macacine_herpesvirus_4 1 PRJNA14467
+s__Peach_mosaic_virus 1 PRJNA32727
+s__Lactobacillus_phage_Lb338_1 1 PRJNA36611
+s__Thermus_phage_TMA 1 PRJNA72385
+s__Enterobacteria_phage_RB43 1 PRJNA15417
+s__Enterobacteria_phage_RB49 1 PRJNA14301
+s__Acinetobacter_sp_CIP_56_2 1 GCF_000368445
+s__Mycobacterium_sp_360MFTsu5_1 1 GCF_000383495
+s__Mycoplasma_suis 2 GCF_000203215 GCF_000179035
+s__Ochrobactrum_anthropi 2 GCF_000251205 GCF_000017405
+s__Parabacteroides_sp_D25 1 GCF_000307475
+s__Aequorivita_sublithincola 1 GCF_000265385
+s__Prosthecochloris_aestuarii 1 GCF_000020625
+s__marine_gamma_proteobacterium_HTCC2148 1 GCF_000156295
+s__Lettuce_big_vein_associated_virus 1 PRJNA32725
+s__marine_gamma_proteobacterium_HTCC2143 1 GCF_000169075
+s__Microbacterium_sp_UCD_TDU 1 GCF_000340625
+s__Sphingobium_chinhatense 1 GCF_000421925
+s__alpha_proteobacterium_HIMB114 1 GCF_000163555
+s__Pseudomonas_phage_JBD88a 1 PRJNA188544
+s__Pseudomonas_entomophila 1 GCF_000026105
+s__Echinicola_pacifica 1 GCF_000373245
+s__Oceanicola_batsensis 1 GCF_000152725
+s__Bitter_gourd_leaf_curl_betasatellite 1 PRJNA16245
+s__Mycoplasma_hyopneumoniae 5 GCF_000427215 GCF_000008225 GCF_000400855 GCF_000008405 GCF_000008205
+s__Ruegeria_sp_TM1040 1 GCF_000014065
+s__Acidovorax_sp_MR_S7 1 GCF_000400995
+s__Burkholderia_sp_WSM4176 1 GCF_000372945
+s__Emiliania_huxleyi_virus_86 1 PRJNA15618
+s__Wheat_streak_mosaic_virus 1 PRJNA15354
+s__Soybean_yellow_common_mosaic_virus 1 PRJNA73551
+s__Pseudomonas_poae 1 GCF_000336465
+s__Marine_Group_III_euryarchaeote_SCGC_AAA288_E19 1 GCF_000382725
+s__Enterobacteria_phage_Bp7 1 PRJNA181212
+s__Coraliomargarita_akajimensis 1 GCF_000025905
+s__Afipia_felis 1 GCF_000314735
+s__Siegesbeckia_yellow_vein_Guangxi_virus 1 PRJNA17595
+s__Thermaerobacter_marianensis 1 GCF_000184705
+s__Midway_virus 1 PRJNA38097
+s__Streptococcus_lutetiensis 1 GCF_000441535
+s__Corynebacterium_lubricantis 1 GCF_000379425
+s__Megasphaera_sp_BL7 1 GCF_000417525
+s__Mycobacterium_phage_Goku 1 PRJNA215672
+s__Olsenella_uli 1 GCF_000143845
+s__Oligotropha_carboxidovorans 3 GCF_000021365 GCF_000218585 GCF_000218565
+s__Sweet_potato_leaf_curl_South_Carolina_virus 1 PRJNA65201
+s__Pseudomonas_fulva 1 GCF_000213805
+s__Ophiostoma_mitovirus_4 1 PRJNA14842
+s__Chlamydophila_sp_08_1274_3 1 GCF_000471025
+s__Corynebacterium_pseudotuberculosis 15 GCF_000241855 GCF_000227175 GCF_000259155 GCF_000143705 GCF_000144935 GCF_000255935 GCF_000152065 GCF_000265545 GCF_000144675 GCF_000233735 GCF_000227605 GCF_000221625 GCF_000263755 GCF_000258385 GCF_000248375
+s__Lactobacillus_pentosus 1 GCF_000271445
+s__Methylobacillus_flagellatus 1 GCF_000013705
+s__Enterobacteria_phage_vB_EcoM_ACG_C40 1 PRJNA179415
+s__Enterococcus_haemoperoxidus 2 GCF_000393995 GCF_000407165
+s__Uukuniemi_virus 1 PRJNA14902
+s__Corynebacterium_matruchotii 2 GCF_000175375 GCF_000158635
+s__Pelosinus_sp_HCF1 1 GCF_000317005
+s__Saccharomonospora_xinjiangensis 1 GCF_000258175
+s__Chroococcidiopsis_thermalis 1 GCF_000317125
+s__Ngaingan_virus 1 PRJNA46715
+s__Streptococcus_sobrinus 46 GCF_000228445 GCF_000228485 GCF_000227825 GCF_000228365 GCF_000228225 GCF_000227965 GCF_000228125 GCF_000228425 GCF_000228165 GCF_000228345 GCF_000228385 GCF_000227885 GCF_000228245 GCF_000227985 GCF_000228305 GCF_000228545 GCF_000228625 GCF_000228025 GCF_000227865 GCF_000227905 GCF_000228505 GCF_000228325 GCF_000228405 GCF_000228645 GCF_000227785 GCF_000228605 GCF_000228205 GCF_000227845 GCF_000228085 GCF_000228465 GCF_000228145 GCF_000467915 GCF_000228665 G [...]
+s__Hop_trefoil_cryptic_virus_2 1 PRJNA198687
+s__Chilli_veinal_mottle_virus 1 PRJNA15225
+s__Tolypocladium_cylindrosporum_virus_1 1 PRJNA61451
+s__Raspberry_latent_virus 1 PRJNA56055
+s__Micromonospora_sp_ATCC_39149 1 GCF_000158815
+s__Desulfovibrio_sp_FW1012B 1 GCF_000177215
+s__Brucella_sp_UK40_99 1 GCF_000371065
+s__Equine_infectious_anemia_virus 1 PRJNA14684
+s__Apple_fruit_crinkle_viroid 1 PRJNA14964
+s__Telosma_mosaic_virus 1 PRJNA20621
+s__Megasphaera_elsdenii 1 GCF_000283495
+s__Mirafiori_lettuce_big_vein_virus 1 PRJNA14886
+s__Leishmania_RNA_virus_1_1 1 PRJNA14666
+s__Leishmania_RNA_virus_1_4 1 PRJNA14761
+s__Circovirus_like_genome_CB_B 1 PRJNA39629
+s__Circovirus_like_genome_CB_A 1 PRJNA39627
+s__Tomato_leaf_curl_Hainan_virus 1 PRJNA39931
+s__Tistrella_mobilis 1 GCF_000264455
+s__Shigella_phage_SfII 1 PRJNA213070
+s__Shigella_phage_SfIV 1 PRJNA227000
+s__Chaetoceros_salsugineum_DNA_virus 1 PRJNA15497
+s__Chelatococcus_sp_GW1 1 GCF_000283095
+s__Brevibacillus_brevis 3 GCF_000010165 GCF_000296715 GCF_000346255
+s__Japanese_encephalitis_virus 1 PRJNA15310
+s__Chlamydophila_caviae 1 GCF_000007605
+s__Halarchaeum_acidiphilum 2 GCF_000474235 GCF_000400975
+s__Mesotoga_sp_PhosAc3 1 GCF_000367705
+s__Rivularia_sp_PCC_7116 1 GCF_000316665
+s__Human_bocavirus 1 PRJNA15895
+s__Tomato_mottle_mosaic_virus 1 PRJNA217881
+s__Mycobacterium_yongonense 1 GCF_000418535
+s__Treponema_medium 1 GCF_000413035
+s__Halovirus_HF1 1 PRJNA14294
+s__Nitrobacter_winogradskyi 1 GCF_000012725
+s__Streptomyces_auratus 1 GCF_000280865
+s__Paenibacillus_phage_phiIBB_Pl23 1 PRJNA213072
+s__Streptomyces_venezuelae 1 GCF_000253235
+s__Aeromonas_salmonicida 3 GCF_000234845 GCF_000447435 GCF_000196395
+s__Pseudoalteromonas_luteoviolacea 2 GCF_000333235 GCF_000495575
+s__alpha_proteobacterium_SCGC_AAA027_L15 1 GCF_000371865
+s__Prunus_necrotic_ringspot_virus 1 PRJNA14866
+s__Mycobacterium_sp_H4Y 1 GCF_000364405
+s__Caldithrix_abyssi 1 GCF_000241815
+s__Erwinia_billingiae 1 GCF_000196615
+s__Amycolatopsis_mediterranei 4 GCF_000196835 GCF_000282715 GCF_000454025 GCF_000220945
+s__Burdock_mottle_virus 1 PRJNA212310
+s__Bhendi_yellow_vein_Delhi_virus 1 PRJNA33677
+s__Tomato_yellow_leaf_curl_Mali_virus_associated_DNA_beta 1 PRJNA15995
+s__Candida_dubliniensis 1 GCA_000026945
+s__Ugandan_cassava_brown_streak_virus 1 PRJNA61097
+s__Intrasporangium_calvum 1 GCF_000184685
+s__endosymbiont_of_Bathymodiolus_sp 1 GCF_000297135
+s__Peach_chlorotic_mottle_virus 1 PRJNA20977
+s__Rhododendron_virus_A 1 PRJNA51905
+s__Enterovirus_J 2 PRJNA29255 PRJNA42941
+s__Enterovirus_H 1 PRJNA15371
+s__Enterovirus_B 1 PRJNA15321
+s__Enterovirus_C 1 PRJNA15288
+s__Haloplasma_contractile 1 GCF_000215935
+s__Enterovirus_F 1 PRJNA203090
+s__Odoribacter_laneus 1 GCF_000243215
+s__Enterovirus_D 1 PRJNA15297
+s__Enterovirus_E 1 PRJNA15351
+s__Tobacco_leaf_curl_Kochi_virus 1 PRJNA14400
+s__Dermacoccus_sp_Ellin185 1 GCF_000152185
+s__Tomato_leaf_curl_Kumasi_virus 1 PRJNA30837
+s__Leucobacter_chromiiresistens 1 GCF_000231305
+s__Streptomyces_sp_e14 1 GCF_000162775
+s__Sulfitobacter_phage_EE36phi1 1 PRJNA38079
+s__Thalassospira_xiamenensis 1 GCF_000300235
+s__Mycoplasma_yeatsii 1 GCF_000380285
+s__Eupatorium_yellow_vein_mosaic_virus 1 PRJNA14171
+s__Tomato_black_ring_virus 1 PRJNA14871
+s__Pieris_rapae_granulovirus 1 PRJNA45911
+s__Ornithobacterium_rhinotracheale 1 GCF_000265465
+s__Bacillus_subtilis 21 GCF_000349795 GCF_000293765 GCF_000245035 GCF_000385985 GCF_000209795 GCF_000186745 GCF_000230755 GCF_000338735 GCF_000227465 GCF_000177595 GCF_000344745 GCF_000340295 GCF_000321395 GCF_000227485 GCF_000183765 GCF_000155375 GCF_000341775 GCF_000332645 GCF_000497485 GCF_000146565 GCF_000245295
+s__Bacillus_endophyticus 1 GCF_000283255
+s__Lactobacillus_delbrueckii 10 GCF_000182835 GCF_000192165 GCF_000056065 GCF_000179375 GCF_000284695 GCF_000284715 GCF_000014405 GCF_000409675 GCF_000191165 GCF_000387565
+s__Kocuria_atrinae 1 GCF_000286355
+s__Neorickettsia_sennetsu 1 GCF_000013165
+s__Actinomyces_neuii 1 GCF_000296485
+s__Mycobacterium_phage_Peaches 1 PRJNA42939
+s__Desulfotignum_phosphitoxidans 1 GCF_000350545
+s__Sphingobium_baderi 1 GCF_000445145
+s__Salinispora_pacifica 20 GCF_000383575 GCF_000378845 GCF_000374705 GCF_000375265 GCF_000374685 GCF_000374745 GCF_000383995 GCF_000374785 GCF_000375285 GCF_000375225 GCF_000373825 GCF_000374665 GCF_000384095 GCF_000374765 GCF_000374725 GCF_000377025 GCF_000375305 GCF_000379065 GCF_000378825 GCF_000375245
+s__Gemella_bergeri 1 GCF_000469465
+s__Dietzia_sp_UCD_THP 1 GCF_000349585
+s__Vernonia_yellow_vein_betasatellite 1 PRJNA41303
+s__Wesselsbron_virus 1 PRJNA38295
+s__Schizophyllum_commune 1 GCA_000143185
+s__French_bean_leaf_curl_betasatellite_Kanpur 1 PRJNA169556
+s__Shigella_sonnei 7 GCF_000092525 GCF_000268045 GCF_000268225 GCF_000268005 GCF_000283715 GCF_000281815 GCF_000188795
+s__Klebsiella_phage_JD001 1 PRJNA188547
+s__Spiribacter_salinus 1 GCF_000319575
+s__Burkholderia_phage_BcepF1 1 PRJNA18857
+s__Tomato_torrado_virus 1 PRJNA18831
+s__Mesta_yellow_vein_mosaic_virus 1 PRJNA18967
+s__Aneurinibacillus_aneurinilyticus 1 GCF_000466385
+s__Alteromonas_phage_vB_AmaP_AD45_P 1 PRJNA209073
+s__Kangiella_aquimarina 1 GCF_000374105
+s__Sweet_potato_leaf_curl_virus 1 PRJNA15461
+s__Enterobacteria_phage_ID2_Moscow_ID_2001 1 PRJNA16591
+s__Rahnella_sp_Y9602 1 GCF_000187705
+s__Methanothermobacter_marburgensis 1 GCF_000145295
+s__Yersinia_phage_L_413C 1 PRJNA14280
+s__Boolarra_virus 1 PRJNA14850
+s__Tomato_leaf_curl_Bangladesh_betasatellite 1 PRJNA56017
+s__Groundnut_rosette_virus 1 PRJNA14762
+s__Stx2_converting_phage_II 1 PRJNA14310
+s__Streptomyces_canus 1 GCF_000383615
+s__White_eye_coronavirus_HKU16 1 PRJNA109273
+s__Dietzia_cinnamea 1 GCF_000186325
+s__Smithella_sp_ME_1 1 GCF_000495415
+s__Little_cherry_virus_1 1 PRJNA15346
+s__Little_cherry_virus_2 1 PRJNA15062
+s__Ovine_adenovirus_D 1 PRJNA14198
+s__Actinoplanes_sp_N902_109 1 GCF_000389965
+s__Ovine_adenovirus_A 2 PRJNA14497 PRJNA40309
+s__Enterobacteria_phage_P2 1 PRJNA14035
+s__Candidatus_Endolissoclinum_faulkneri 1 GCF_000319385
+s__Enterobacteria_phage_P1 1 PRJNA14493
+s__Verrucomicrobia_bacterium_SCGC_AAA164_A21 1 GCF_000264585
+s__Enterobacteria_phage_P4 1 PRJNA14414
+s__Pan_troglodytes_verus_polyomavirus_4 1 PRJNA183907
+s__Pan_troglodytes_verus_polyomavirus_5 1 PRJNA183908
+s__Parvimonas_sp_oral_taxon_110 1 GCF_000214475
+s__Propionibacterium_sp_434_HC2 1 GCF_000214535
+s__Human_bocavirus_2 1 PRJNA33891
+s__Human_bocavirus_3 1 PRJNA37291
+s__Human_bocavirus_4 1 PRJNA38243
+s__Tomato_yellow_mottle_virus 1 PRJNA184815
+s__Geobacillus_thermodenitrificans 1 GCF_000015745
+s__Tomato_leaf_curl_Sudan_virus 1 PRJNA14372
+s__Enterobacteria_phage_M13 1 PRJNA14549
+s__Zika_virus 1 PRJNA36615
+s__Escherichia_phage_KBNP135 1 PRJNA177528
+s__Thermus_oshimai 2 GCF_000373145 GCF_000309885
+s__Mycobacterium_phage_Che8 1 PRJNA14394
+s__Salmonella_phage_Fels_1 1 PRJNA29267
+s__Solenopsis_invicta_virus_3 1 PRJNA36613
+s__Clostridium_kluyveri 2 GCF_000010265 GCF_000016505
+s__Solenopsis_invicta_virus_1 1 PRJNA15042
+s__Mycobacterium_phage_Orion 1 PRJNA17151
+s__Aerococcus_viridans 2 GCF_000262085 GCF_000178435
+s__Cryptosporidium_hominis 1 GCA_000006425
+s__Vibrio_caribbenthicus 1 GCF_000165125
+s__Methanosphaerula_palustris 1 GCF_000021965
+s__Oat_necrotic_mottle_virus 1 PRJNA14899
+s__Geobacillus_sp_JF8 1 GCF_000445995
+s__Muricauda_ruestringensis 1 GCF_000224085
+s__Dugbe_virus 1 PRJNA14851
+s__Synechococcus_phage_S_CRM01 1 PRJNA67251
+s__Aguacate_virus 1 PRJNA66333
+s__Flavobacterium_phage_11b 1 PRJNA14565
+s__Vitreoscilla_stercoraria 1 GCF_000382305
+s__Pseudomonas_phage_Lu11 1 PRJNA167656
+s__Cetobacterium_somerae 1 GCF_000479045
+s__Chaerephon_polyomavirus_1 1 PRJNA185191
+s__Anopheles_gambiae_densonucleosis_virus 1 PRJNA32101
+s__Novosphingobium_pentaromativorans 1 GCF_000235975
+s__Grapevine_deformation_virus 1 PRJNA167162
+s__Thioalkalivibrio_nitratireducens 1 GCF_000321415
+s__Eubacterium_plexicaudatum 1 GCF_000364225
+s__Garlic_virus_X 1 PRJNA14987
+s__Influenza_A_virus 4 PRJNA14892 PRJNA15617 PRJNA15620 PRJNA15622
+s__Mycoplasma_haemofelis 2 GCF_000200735 GCF_000186985
+s__Leptolyngbya_sp_Heron_Island_J 1 GCF_000482245
+s__Pseudoalteromonas_rubra 1 GCF_000238295
+s__Paenibacillus_alvei 3 GCF_000442555 GCF_000442535 GCF_000293805
+s__Paenibacillus_terrigena 1 GCF_000374845
+s__Staphylococcus_lugdunensis 5 GCF_000185485 GCF_000270465 GCF_000316075 GCF_000025085 GCF_000247225
+s__Garlic_virus_E 1 PRJNA14834
+s__Garlic_virus_A 1 PRJNA14735
+s__Burkholderia_graminis 1 GCF_000172415
+s__Brevundimonas_abyssalis 1 GCF_000466985
+s__Propionibacterium_phage_PHL113M01 1 PRJNA219107
+s__Frankia_sp_CcI6 1 GCF_000503735
+s__Sphingobium_sp_SYK_6 1 GCF_000283515
+s__Frankia_sp_CcI3 1 GCF_000013345
+s__Sida_yellow_mottle_virus 1 PRJNA74527
+s__Mobilicoccus_pelagius 1 GCF_000247995
+s__Ophiostoma_mitovirus_3a 1 PRJNA14839
+s__Chrysanthemum_chlorotic_mottle_viroid 1 PRJNA14170
+s__Cyanobacterium_stanieri 1 GCF_000317655
+s__Cardiospermum_yellow_leaf_curl_betasatellite 1 PRJNA28647
+s__Caulobacter_phage_CcrColossus 1 PRJNA179419
+s__Potato_leafroll_virus 1 PRJNA15068
+s__Potato_latent_virus 1 PRJNA32629
+s__Acinetobacter_sp_CIP_51_11 1 GCF_000369665
+s__Canine_calicivirus 1 PRJNA14875
+s__Seoul_virus 1 PRJNA15027
+s__Xanthomonas_phage_phiL7 1 PRJNA38267
+s__Histophilus_somni 2 GCF_000019405 GCF_000011785
+s__Acidiphilium_sp_PM 1 GCF_000219295
+s__zeta_proteobacterium_SCGC_AB_133_G06 1 GCF_000379325
+s__Pseudomonas_sp_Ag1 1 GCF_000278565
+s__Flexal_virus 1 PRJNA29903
+s__Niabella_soli 1 GCF_000243115
+s__Vibrio_phage_pVp_1 1 PRJNA181224
+s__Respiratory_syncytial_virus 1 PRJNA15004
+s__Salmonella_phage_g341c 1 PRJNA39795
+s__Bacillus_sp_L1_2012 1 GCF_000334155
+s__Vibrio_phage_VGJphi 1 PRJNA14279
+s__Leptospira_sp_P2653 1 GCF_000346955
+s__Oceanimonas_sp_GK1 1 GCF_000243075
+s__Mycobacterium_phage_PLot 1 PRJNA17167
+s__Enterobacteria_phage_RB69 1 PRJNA15141
+s__Burkholderia_multivorans 7 GCF_000010545 GCF_000182275 GCF_000018505 GCF_000182295 GCF_000286575 GCF_000182255 GCF_000286555
+s__Lachnospiraceae_bacterium_1_1_57FAA 1 GCF_000218445
+s__Plantago_mottle_virus 1 PRJNA32683
+s__Squash_mild_leaf_curl_virus 1 PRJNA14407
+s__Enterobacteria_phage_SPC35 1 PRJNA64605
+s__Leuconostoc_phage_phiLN04 1 PRJNA195530
+s__endosymbiont_of_Riftia_pachyptila 1 GCF_000224455
+s__Kribbella_flavida 1 GCF_000024345
+s__Flavobacterium_sp_ACAM_123 1 GCF_000264055
+s__Actinomadura_madurae 1 GCF_000468475
+s__Haloquadratum_sp_J07HQX50 1 GCF_000416005
+s__Candidatus_Arthromitus_sp_SFB_mouse_SU 1 GCF_000252785
+s__Sunflower_chlorotic_mottle_virus 1 PRJNA47931
+s__Dickeya_paradisiaca 1 GCF_000400505
+s__Coleus_blumei_viroid 1 PRJNA14826
+s__Candidatus_Nitrospira_defluvii 1 GCF_000196815
+s__Phage_phiJL001 1 PRJNA16076
+s__Bradyrhizobium_diazoefficiens 1 GCF_000011365
+s__Salmon_pancreas_disease_virus 2 PRJNA15187 PRJNA15395
+s__Staphylococcus_sp_JGI_0001002_I23 1 GCF_000376205
+s__Cyanophage_9515_10a 1 PRJNA81181
+s__Mycobacterium_parascrofulaceum 1 GCF_000164135
+s__Iodobacteriophage_phiPLPE 1 PRJNA30965
+s__Thalassospira_profundimaris 1 GCF_000300275
+s__Helicobacter_suis 2 GCF_000187625 GCF_000187605
+s__Campylobacter_phage_CP81 1 PRJNA80911
+s__Endozoicomonas_elysicola 1 GCF_000373945
+s__Geobacter_sp_M21 1 GCF_000023645
+s__Desulfotomaculum_ruminis 1 GCF_000215085
+s__Dermabacter_sp_HFH0086 1 GCF_000413375
+s__Enterobacteria_phage_Mu 1 PRJNA14105
+s__Rhizobium_leguminosarum 17 GCF_000271785 GCF_000373285 GCF_000385155 GCF_000375705 GCF_000023185 GCF_000021345 GCF_000009265 GCF_000271845 GCF_000373425 GCF_000372105 GCF_000372205 GCF_000373325 GCF_000371905 GCF_000379005 GCF_000372305 GCF_000271825 GCF_000271805
+s__Cyclovirus_NGchicken15_NGA_2009 1 PRJNA61953
+s__Jatropha_yellow_mosaic_India_virus 1 PRJNA32075
+s__Thioalkalimicrobium_aerophilum 1 GCF_000227665
+s__Rhynchosia_yellow_mosaic_India_virus 1 PRJNA61861
+s__Salmonella_phage_SETP3 1 PRJNA19157
+s__Thiocapsa_marina 1 GCF_000223985
+s__Fluoribacter_dumoffii 2 GCF_000236145 GCF_000236165
+s__Salmonella_phage_SETP7 1 PRJNA226725
+s__Human_papillomavirus_type_136 1 PRJNA167866
+s__Flavobacterium_saliperosum 1 GCF_000498515
+s__Corynebacterium_ammoniagenes 1 GCF_000164115
+s__Vibrio_phage_VfO3K6 1 PRJNA14093
+s__Ruminococcus_callidus 1 GCF_000468015
+s__Persimmon_virus_A 1 PRJNA172457
+s__Actinokineospora_enzanensis 1 GCF_000374445
+s__Roseobacter_sp_MED193 1 GCF_000152965
+s__Nemesia_ring_necrosis_virus 1 PRJNA32681
+s__Halovirus_HHTV_1 1 PRJNA206495
+s__Halovirus_HHTV_2 1 PRJNA206494
+s__Nerine_virus_X 1 PRJNA16257
+s__Synechococcus_phage_S_CBS2 1 PRJNA66393
+s__Marinobacter_lipolyticus 2 GCF_000397065 GCF_000372805
+s__Choristoneura_occidentalis_granulovirus 1 PRJNA17097
+s__Thermococcus_sp_AM4 1 GCF_000151205
+s__Streptococcus_pasteurianus 1 GCF_000270165
+s__Peptoniphilus_duerdenii 1 GCF_000146345
+s__Sulfolobus_virus_Ragged_Hills 1 PRJNA14354
+s__Mycoplasma_genitalium 6 GCF_000292405 GCF_000292505 GCF_000167595 GCF_000292445 GCF_000292485 GCF_000027325
+s__Abutilon_mosaic_Brazil_virus 1 PRJNA81009
+s__Pandoravirus_dulcis 1 PRJNA213019
+s__Leuconostoc_carnosum 2 GCF_000260375 GCF_000300135
+s__Citrobacter_sp_KTE30 1 GCF_000398825
+s__Citrobacter_sp_KTE32 1 GCF_000398865
+s__Lettuce_mosaic_virus 1 PRJNA15342
+s__Methanosarcina_mazei 2 GCF_000007065 GCF_000341715
+s__Mycobacterium_indicus_pranii 1 GCF_000298095
+s__Corynebacterium_sp_KPL1989 1 GCF_000477955
+s__Dictyoglomus_turgidum 1 GCF_000021645
+s__Acinetobacter_phage_ZZ1 1 PRJNA169230
+s__Corynebacterium_sp_KPL1986 1 GCF_000477975
+s__Selenomonas_sp_oral_taxon_892 1 GCF_000468035
+s__Pseudomonas_sp_GM60 1 GCF_000282415
+s__Sanguibacter_keddieii 1 GCF_000024925
+s__Pseudomonas_sp_GM67 1 GCF_000282435
+s__Parana_virus 1 PRJNA29907
+s__Panicum_mosaic_virus 1 PRJNA14979
+s__Ralstonia_phage_RSM3 1 PRJNA32325
+s__Blackberry_virus_E 1 PRJNA68409
+s__Satsuma_dwarf_virus 1 PRJNA15409
+s__Pedobacter_heparinus 1 GCF_000023825
+s__Torque_teno_virus_27 1 PRJNA48147
+s__Mycobacterium_phage_Konstantine 1 PRJNA32015
+s__Clostridium_sp_BL8 1 GCF_000447315
+s__Collinsella_stercoris 1 GCF_000156215
+s__Dethiosulfovibrio_peptidovorans 1 GCF_000172975
+s__Arthrobacter_sp_FB24 1 GCF_000196235
+s__Corynebacterium_sp_HFH0082 1 GCF_000411235
+s__Blackberry_virus_Y 1 PRJNA18125
+s__Mycobacterium_sp_012931 1 GCF_000419295
+s__Clostridiales_genomosp_BVAB3 1 GCF_000025225
+s__Helicobacter_phage_KHP40 1 PRJNA184159
+s__Bacillus_nealsonii 1 GCF_000401235
+s__Arthrobacter_globiformis 1 GCF_000238915
+s__Aspergillus_nidulans 1 GCA_000149205
+s__Beet_severe_curly_top_virus 1 PRJNA14367
+s__Paraprevotella_xylaniphila 1 GCF_000205165
+s__Mycoplasma_conjunctivae 1 GCF_000026765
+s__Prevotella_bryantii 1 GCF_000179055
+s__Sri_Lankan_cassava_mosaic_virus 1 PRJNA15130
+s__Enterobacterial_phage_mEp234 1 PRJNA183153
+s__Ramie_mosaic_virus 1 PRJNA29985
+s__Horsegram_yellow_mosaic_virus 1 PRJNA14356
+s__Squash_leaf_curl_China_virus 1 PRJNA15591
+s__Actinomycetospora_chiangmaiensis 1 GCF_000379625
+s__Y73_sarcoma_virus 1 PRJNA16745
+s__Bacillus_amyloliquefaciens 19 GCF_000242855 GCF_000195515 GCF_000015785 GCF_000330805 GCF_000469015 GCF_000493375 GCF_000204275 GCF_000341875 GCF_000299615 GCF_000319475 GCF_000221645 GCF_000283695 GCF_000196735 GCF_000284395 GCF_000455565 GCF_000494835 GCF_000455585 GCF_000262385 GCF_000465655
+s__Leptothrix_cholodnii 1 GCF_000019785
+s__Gordonia_hirsuta 1 GCF_000333015
+s__Pseudomonas_phage_119X 1 PRJNA16385
+s__Streptomyce_phage_TG1 1 PRJNA177524
+s__Prochlorococcus_phage_MED4_184 1 PRJNA195504
+s__Waddlia_chondrophila 1 GCF_000092785
+s__Nitrosococcus_watsonii 1 GCF_000143085
+s__Mycobacterium_phage_Giles 1 PRJNA27907
+s__Streptococcus_sanguinis 22 GCF_000212835 GCF_000194945 GCF_000195125 GCF_000212815 GCF_000192275 GCF_000195045 GCF_000220275 GCF_000192245 GCF_000188275 GCF_000191105 GCF_000212855 GCF_000507745 GCF_000192205 GCF_000195025 GCF_000192185 GCF_000204475 GCF_000191085 GCF_000014205 GCF_000212795 GCF_000220315 GCF_000191125 GCF_000194965
+s__Salmonella_phage_FSL_SP_004 1 PRJNA212714
+s__Synechococcus_phage_S_SSM4 1 PRJNA195515
+s__Vibrio_harveyi 4 GCF_000182685 GCF_000347555 GCF_000275705 GCF_000259935
+s__Cronobacter_phage_vB_CsaM_GAP161 1 PRJNA179412
+s__Methanoregula_formicica 1 GCF_000327485
+s__Streptococcus_anginosus 8 GCF_000257765 GCF_000184365 GCF_000373605 GCF_000214555 GCF_000463465 GCF_000186545 GCF_000463505 GCF_000287595
+s__Pseudoalteromonas_sp_BSi20311 1 GCF_000239875
+s__Actinomyces_sp_oral_taxon_448 1 GCF_000220835
+s__Euproctis_pseudoconspersa_nucleopolyhedrovirus 1 PRJNA37827
+s__Tomato_mottle_leaf_curl_Zulia_virus 1 PRJNA62741
+s__Pepper_ringspot_virus 1 PRJNA14777
+s__Bacteroides_sp_3_1_19 1 GCF_000163655
+s__Maize_necrotic_streak_virus 1 PRJNA16323
+s__Moritella_marina 2 GCF_000381865 GCF_000291685
+s__Salmonella_phage_SKML_39 1 PRJNA184160
+s__Elusimicrobium_minutum 1 GCF_000020145
+s__Vibrio_rotiferianus 1 GCF_000195225
+s__Pan_troglodytes_verus_polyomavirus_3 1 PRJNA183906
+s__Vibriophage_VP4 1 PRJNA15449
+s__Vibrio_furnissii 2 GCF_000184325 GCF_000176175
+s__Malvastrum_leaf_curl_Philippines_betasatellite 1 PRJNA214366
+s__Frankia_sp_Iso899 1 GCF_000421445
+s__Bartonella_sp_R4_2010 1 GCF_000312525
+s__Streptomyces_fulvissimus 1 GCF_000385945
+s__Tamana_bat_virus 1 PRJNA15398
+s__Pear_latent_virus 1 PRJNA14879
+s__Faba_bean_necrotic_stunt_virus 1 PRJNA39929
+s__Sweet_potato_feathery_mottle_virus 1 PRJNA15347
+s__Hydrogenobacter_thermophilus 2 GCF_000010785 GCF_000164905
+s__Bartonella_sp_DB5_6 1 GCF_000278115
+s__Enterobacter_sp_MGH_8 1 GCF_000474805
+s__Tomato_leaf_curl_Kerala_virus 1 PRJNA30935
+s__Spiroplasma_kunkelii_virus_SkV1_CR2_3x 1 PRJNA27891
+s__Plantago_asiatica_mosaic_virus 1 PRJNA15073
+s__Gremmeniella_abietina_RNA_virus_MS2 1 PRJNA15232
+s__Pectobacterium_atrosepticum 1 GCF_000011605
+s__Groundnut_rosette_virus_satellite_RNA 1 PRJNA14429
+s__Acidianus_hospitalis 1 GCF_000213215
+s__Bifidobacterium_animalis 14 GCF_000021425 GCF_000277325 GCF_000092765 GCF_000240765 GCF_000025245 GCF_000172535 GCF_000022705 GCF_000471945 GCF_000414215 GCF_000022965 GCF_000220885 GCF_000224965 GCF_000277345 GCF_000260715
+s__Paenibacillus_sp_HGF5 1 GCF_000204455
+s__Glaciecola_lipolytica 1 GCF_000314975
+s__Melon_chlorotic_mosaic_virus 1 PRJNA51415
+s__Kosmotoga_olearia 1 GCF_000023325
+s__Dechloromonas_aromatica 1 GCF_000012425
+s__Listeria_phage_LP_037 1 PRJNA212948
+s__Lactobacillus_acidophilus 4 GCF_000191545 GCF_000389675 GCF_000159715 GCF_000011985
+s__Vibrio_vulnificus 6 GCF_000299635 GCF_000342305 GCF_000186585 GCF_000009745 GCF_000039765 GCF_000303175
+s__Actinomyces_urogenitalis 1 GCF_000159035
+s__Frankia_sp_BCU110501 1 GCF_000373365
+s__Ostreococcus_tauri_virus_2 1 PRJNA61087
+s__Ostreococcus_tauri_virus_1 1 PRJNA40907
+s__Obuda_pepper_virus 1 PRJNA14817
+s__Synechococcus_sp_CC9311 1 GCF_000014585
+s__Barley_yellow_dwarf_virus_GAV 1 PRJNA15035
+s__Tomato_severe_rugose_virus 1 PRJNA19973
+s__Slow_bee_paralysis_virus 1 PRJNA48587
+s__Brachybacterium_squillarum 1 GCF_000225825
+s__Glaciecola_chathamensis 1 GCF_000314955
+s__Arthroderma_otae 1 GCA_000151145
+s__Flavobacterium_johnsoniae 1 GCF_000016645
+s__Clostridium_termitidis 1 GCF_000350485
+s__Clostridium_saccharobutylicum 1 GCF_000473995
+s__Trichoplusia_ni_ascovirus_2c 1 PRJNA18003
+s__Corynebacterium_pseudogenitalium 1 GCF_000156615
+s__Brevundimonas_diminuta 2 GCF_000204035 GCF_000318405
+s__Raoultella_ornithinolytica 1 GCF_000367425
+s__Bartonella_australis 1 GCF_000341355
+s__Botryotinia_fuckeliana_totivirus_1 1 PRJNA19133
+s__Vibrio_ichthyoenteri 1 GCF_000222605
+s__Simian_T_cell_lymphotropic_virus_6 1 PRJNA32697
+s__Thermus_igniterrae 1 GCF_000376265
+s__Mesta_yellow_vein_mosaic_Bahraich_virus 1 PRJNA30083
+s__Mycobacterium_phage_Pukovnik 1 PRJNA30521
+s__Cupriavidus_sp_UYPR2_512 1 GCF_000379565
+s__Blechum_interveinal_chlorosis_virus 1 PRJNA178635
+s__Sulfurimonas_gotlandica 2 GCF_000156095 GCF_000242915
+s__Pseudomonas_pseudoalcaligenes 2 GCF_000297075 GCF_000262065
+s__Peptoniphilus_sp_JC140 1 GCF_000321025
+s__Enterococcus_gilvus 2 GCF_000407545 GCF_000394615
+s__Parascardovia_denticolens 3 GCF_000191785 GCF_000269845 GCF_000163835
+s__Serinicoccus_marinus 1 GCF_000421245
+s__Bacillus_sp_17376 1 GCF_000498695
+s__Staphylococcus_phage_phiNM3 1 PRJNA18329
+s__Pantoea_agglomerans 3 GCF_000241285 GCF_000475055 GCF_000330765
+s__Caloramator_australicus 1 GCF_000297115
+s__Succinivibrionaceae_bacterium_WG_1 1 GCF_000222855
+s__Okra_leaf_curl_Cameroon_virus 1 PRJNA60747
+s__Agrobacterium_sp_ATCC_31749 1 GCF_000214615
+s__Sparrow_coronavirus_HKU17 1 PRJNA17048
+s__Thioalkalivibrio_thiocyanoxidans 2 GCF_000227685 GCF_000385215
+s__Bradyrhizobiaceae_bacterium_SG_6C 1 GCF_000219645
+s__Lactobacillus_phage_LF1 1 PRJNA181083
+s__Mycobacterium_kansasii 1 GCF_000157895
+s__SAR406_cluster_bacterium_SCGC_AB_629_J13 1 GCF_000375825
+s__Bacillus_azotoformans 1 GCF_000307855
+s__Clitoria_yellow_mottle_virus 1 PRJNA80771
+s__Urochloa_streak_virus 1 PRJNA30033
+s__Cyanobium_gracile 1 GCF_000316515
+s__Bartonella_doshiae 1 GCF_000278155
+s__Arabis_mosaic_virus_small_satellite_RNA 1 PRJNA14021
+s__Streptomyces_sp_CNY228 1 GCF_000377545
+s__Orangutan_polyomavirus 1 PRJNA41471
+s__Solenopsis_invicta_virus_2 1 PRJNA19773
+s__Xanthomonas_sp_M97 1 GCF_000401255
+s__Petunia_vein_clearing_virus 1 PRJNA14031
+s__Natrialba_aegyptia 1 GCF_000337535
+s__Enterobacteria_phage_phiX174_sensu_lato 1 PRJNA14015
+s__Bacteroides_sp_3_2_5 1 GCF_000159855
+s__Cyclovirus_PKgoat21_PAK_2009 1 PRJNA61947
+s__Roseibium_sp_TrichSKD4 1 GCF_000148725
+s__Coprobacillus_sp_8_2_54BFAA 1 GCF_000244855
+s__Bhendi_yellow_vein_mosaic_betasatellite 1 PRJNA61777
+s__Erysipelotrichaceae_bacterium_2_2_44A 1 GCF_000225685
+s__Tomato_leaf_curl_Karnataka_virus_associated_DNA_beta 1 PRJNA17999
+s__Parvimonas_sp_oral_taxon_393 1 GCF_000223315
+s__Thermoproteus_uzoniensis 1 GCF_000193375
+s__Narcissus_degeneration_virus 1 PRJNA18729
+s__Arcobacter_nitrofigilis 1 GCF_000092245
+s__Tomato_yellow_leaf_curl_Saudi_virus 1 PRJNA217879
+s__Eubacterium_siraeum 4 GCF_000210635 GCF_000209915 GCF_000382085 GCF_000154325
+s__Leptospirillum_ferriphilum 1 GCF_000299235
+s__Helicobacter_bilis 2 GCF_000158435 GCF_000364285
+s__Oscillatoria_nigro_viridis 1 GCF_000317475
+s__Tomato_leaf_curl_virus_satellite 1 PRJNA14428
+s__Dolosigranulum_pigrum 1 GCF_000245815
+s__Carrot_red_leaf_luteovirus_associated_RNA 1 PRJNA14820
+s__Listeria_phage_A006 1 PRJNA20801
+s__Rheinheimera_sp_A13L 1 GCF_000217935
+s__Propionibacterium_phage_PAS50 1 PRJNA66339
+s__Actinomyces_sp_oral_taxon_877 1 GCF_000466305
+s__Bacillus_phage_phiNIT1 1 PRJNA213017
+s__Tomato_leaf_curl_Cotabato_virus 1 PRJNA28989
+s__Sphingobacterium_sp_21 1 GCF_000192845
+s__Ndumu_virus 1 PRJNA88115
+s__Phyllobacterium_sp_YR531 1 GCF_000282595
+s__Candidatus_Baumannia_cicadellinicola 1 GCF_000013185
+s__Enterobacteria_phage_If1 1 PRJNA14039
+s__Mycoreovirus_1 1 PRJNA29913
+s__Staphylococcus_phage_2638A 1 PRJNA15267
+s__Novispirillum_itersonii 1 GCF_000381985
+s__Sulfurospirillum_deleyianum 1 GCF_000024885
+s__Marinobacter_sp_BSs20148 1 GCF_000283275
+s__Pseudomonas_phage_MP29 1 PRJNA32999
+s__Gordonia_sp_NB4_1Y 1 GCF_000347295
+s__Pseudomonas_phage_MP22 1 PRJNA20961
+s__St_Croix_River_virus 1 PRJNA14941
+s__Arthroderma_benhamiae 1 GCA_000151125
+s__Cellulomonas_fimi 1 GCF_000212695
+s__Roseobacter_sp_AzwK_3b 1 GCF_000170875
+s__Dorea_formicigenerans 2 GCF_000225745 GCF_000169235
+s__Raven_circovirus 1 PRJNA17773
+s__Citrus_exocortis_viroid 1 PRJNA14637
+s__Rose_spring_dwarf_associated_virus 1 PRJNA30051
+s__Clostridium_spiroforme 1 GCF_000154805
+s__Azospirillum_brasilense 1 GCF_000237365
+s__Salinisphaera_shabanensis 1 GCF_000215955
+s__Streptococcus_phage_858 1 PRJNA28829
+s__Streptococcus_sp_SK643 1 GCF_000259505
+s__Pseudomonas_phage_phi8 1 PRJNA14731
+s__Haemophilus_phage_HP2 1 PRJNA14231
+s__Haemophilus_sp_oral_taxon_851 1 GCF_000242295
+s__Clostridium_sporogenes 2 GCF_000240115 GCF_000155085
+s__Pseudomonas_phage_phi6 1 PRJNA14788
+s__Trichoplusia_ni_single_nucleopolyhedrovirus 1 PRJNA15635
+s__TYLCAxV_Sic1_IT_Sic2_2_04 1 PRJNA30523
+s__Jatropha_mosaic_Nigerian_virus 1 PRJNA178634
+s__Acinetobacter_gyllenbergii 2 GCF_000488195 GCF_000413855
+s__Colwellia_piezophila 1 GCF_000378625
+s__Isosphaera_pallida 1 GCF_000186345
+s__Pseudomonas_mendocina 5 GCF_000287395 GCF_000465575 GCF_000016565 GCF_000204295 GCF_000295795
+s__Bifidobacterium_bifidum 8 GCF_000155395 GCF_000265095 GCF_000300215 GCF_000164965 GCF_000299595 GCF_000165905 GCF_000466525 GCF_000273525
+s__Magnaporthe_oryzae 1 GCA_000002495
+s__Nanoarchaeum_equitans 1 GCF_000008085
+s__Cotton_leaf_curl_Bangalore_virus 1 PRJNA15575
+s__Cactus_mild_mottle_virus 1 PRJNA33485
+s__Enterococcus_mundtii 4 GCF_000504125 GCF_000233395 GCF_000393815 GCF_000407465
+s__Escherichia_sp_1_1_43 1 GCF_000159895
+s__Entebbe_bat_virus 1 PRJNA18515
+s__Pseudomonas_chloritidismutans 1 GCF_000495915
+s__Zalophus_californianus_papillomavirus_1 1 PRJNA65277
+s__Psychromonas_sp_CNPT3 1 GCF_000153405
+s__Croton_yellow_vein_mosaic_betasatellite 1 PRJNA18249
+s__Verrucomicrobiae_bacterium_DG1235 1 GCF_000155695
+s__Grapevine_Algerian_latent_virus 1 PRJNA32675
+s__Prevotella_oulorum 1 GCF_000224615
+s__Streptomyces_sp_Mg1 1 GCF_000154885
+s__Moraxella_macacae 1 GCF_000320365
+s__Methylophilus_methylotrophus 1 GCF_000378225
+s__Paenibacillus_sp_Y412MC10 1 GCF_000024685
+s__Prevotella_salivae 1 GCF_000185845
+s__Verbesina_encelioides_leaf_curl_alphasatellite 1 PRJNA67961
+s__Spinach_latent_virus 1 PRJNA14810
+s__Hydrangea_chlorotic_mottle_virus 1 PRJNA38689
+s__Guar_leaf_curl_alphasatellite 1 PRJNA193981
+s__Fusarium_poae_virus_1 1 PRJNA14827
+s__Coriobacteriaceae_bacterium_BV3Ac1 1 GCF_000468855
+s__Candidatus_Uzinura_diaspidicola 1 GCF_000331975
+s__Sphingobium_lactosutens 1 GCF_000445105
+s__Okra_mottle_virus 1 PRJNA31095
+s__Eragrostis_streak_virus 1 PRJNA28825
+s__Rhizobium_sp_CF122 1 GCF_000282035
+s__Banana_streak_UM_virus 1 PRJNA66615
+s__Magnetococcus_marinus 1 GCF_000014865
+s__Propionibacterium_acnes 85 GCF_000145095 GCF_000231215 GCF_000144325 GCF_000496915 GCF_000144305 GCF_000145575 GCF_000147145 GCF_000144105 GCF_000144385 GCF_000144245 GCF_000177395 GCF_000221125 GCF_000144875 GCF_000342585 GCF_000008345 GCF_000144505 GCF_000217615 GCF_000144025 GCF_000145075 GCF_000145155 GCF_000144345 GCF_000144185 GCF_000144045 GCF_000178075 GCF_000240035 GCF_000144565 GCF_000144585 GCF_000252385 GCF_000178055 GCF_000194825 GCF_000194905 GCF_000145495 GCF_000144225 [...]
+s__Grapevine_rupestris_stem_pitting_associated_virus 1 PRJNA15249
+s__Fretibacterium_fastidiosum 1 GCF_000210715
+s__Methanosarcina_barkeri 1 GCF_000195895
+s__Streptomyces_gancidicus 1 GCF_000342345
+s__Tomato_rugose_yellow_leaf_curl_virus 1 PRJNA189211
+s__Lactobacillus_fermentum 8 GCF_000162395 GCF_000010145 GCF_000159215 GCF_000466785 GCF_000496435 GCF_000210515 GCF_000477515 GCF_000397165
+s__Burkholderia_phage_KS5 1 PRJNA64563
+s__Thermotoga_lettingae 1 GCF_000017865
+s__Acinetobacter_radioresistens 8 GCF_000162115 GCF_000301795 GCF_000248115 GCF_000368885 GCF_000286595 GCF_000368905 GCF_000175675 GCF_000308075
+s__Megasphaera_genomosp_type_1 1 GCF_000177555
+s__Prevotella_dentalis 2 GCF_000242335 GCF_000220215
+s__Rhodococcus_equi 3 GCF_000473915 GCF_000164155 GCF_000196695
+s__Burkholderia_phage_KS9 1 PRJNA39771
+s__Sulfurospirillum_barnesii 1 GCF_000265295
+s__Hippeastrum_latent_virus 1 PRJNA32685
+s__Singularimonas_variicoloris 1 GCF_000382285
+s__Natrialba_magadii 2 GCF_000025625 GCF_000337875
+s__Exiguobacterium_sibiricum 1 GCF_000019905
+s__Pseudomonas_sp_GM48 1 GCF_000282335
+s__Pseudomonas_sp_GM49 1 GCF_000282355
+s__Acinetobacter_sp_ANC_3789 1 GCF_000368265
+s__Corynebacterium_kroppenstedtii 1 GCF_000023145
+s__Goatpox_virus 1 PRJNA14197
+s__Haloterrigena_limicola 1 GCF_000337475
+s__Eubacterium_hallii 1 GCF_000173975
+s__Mycobacterium_phage_Angelica 1 PRJNA51667
+s__Bartonella_grahamii 1 GCF_000022725
+s__Janthinobacterium_lividum 1 GCF_000242815
+s__Wild_potato_mosaic_virus 1 PRJNA15404
+s__Desulfurococcus_kamchatkensis 1 GCF_000020905
+s__Ageratum_latent_virus 1 PRJNA216153
+s__Pedosphaera_parvula 1 GCF_000172555
+s__Acinetobacter_sp_CIP_101934 1 GCF_000369585
+s__Cypovirus_1 1 PRJNA14714
+s__Enterobacteria_phage_933W_sensu_lato 3 PRJNA14043 PRJNA14167 PRJNA14480
+s__Cypovirus_5 1 PRJNA29601
+s__Escherichia_phage_phiKT 1 PRJNA181222
+s__Astrovirus_VA4 1 PRJNA178562
+s__Gemella_haemolysans 2 GCF_000173915 GCF_000204355
+s__Mycobacterium_ulcerans 1 GCF_000013925
+s__Lumpy_skin_disease_virus 1 PRJNA14122
+s__Brachyspira_murdochii 1 GCF_000092845
+s__Enterobacteria_phage_CC31 1 PRJNA60119
+s__Rotavirus_F 1 PRJNA210412
+s__Psychrobacter_arcticus 1 GCF_000012305
+s__Gentian_mosaic_virus 1 PRJNA31113
+s__Astrovirus_VA2 1 PRJNA176435
+s__Murine_osteosarcoma_virus 1 PRJNA14655
+s__Astrovirus_VA3 1 PRJNA178564
+s__Kelp_fly_virus 1 PRJNA16201
+s__Lactobacillus_buchneri 3 GCF_000159195 GCF_000298115 GCF_000211375
+s__Mitsuokella_multacida 1 GCF_000155955
+s__Rotavirus_H 1 PRJNA16144
+s__Arthrobacter_crystallopoietes 1 GCF_000328305
+s__Upsilonpapillomavirus_2 1 PRJNA17117
+s__Enterobacter_hormaechei 3 GCF_000328905 GCF_000213995 GCF_000328885
+s__Gracilibacillus_lacisalsi 1 GCF_000377765
+s__Groundnut_ringspot_and_Tomato_chlorotic_spot_virus_reassortant 1 PRJNA66459
+s__Poplar_mosaic_virus 1 PRJNA15056
+s__Slackia_heliotrinireducens 1 GCF_000023885
+s__Haloarcula_hispanica_icosahedral_virus_2 1 PRJNA109269
+s__Squash_leaf_curl_Yunnan_virus 1 PRJNA15194
+s__Datura_leaf_distortion_virus 1 PRJNA176617
+s__Methylotenera_versatilis 2 GCF_000093025 GCF_000384375
+s__Deltapapillomavirus_2 1 PRJNA14073
+s__Nile_crocodilepox_virus 1 PRJNA16798
+s__Alcanivorax_sp_DG881 1 GCF_000155615
+s__Sida_yellow_vein_virus 1 PRJNA14264
+s__Piscine_myocarditis_virus_AL_V_708 1 PRJNA67963
+s__Planococcus_citri_densovirus 1 PRJNA14223
+s__Azospirillum_amazonense 1 GCF_000225995
+s__Menangle_virus 1 PRJNA16205
+s__Thioalkalivibrio_versutus 1 GCF_000374265
+s__Methanohalophilus_mahii 1 GCF_000025865
+s__Xanthomonas_fuscans 2 GCF_000175135 GCF_000175155
+s__Methylocystis_sp_SC2 1 GCF_000304315
+s__Lactobacillus_malefermentans 1 GCF_000260775
+s__Elephantid_herpesvirus_1 1 PRJNA192609
+s__Mouse_mammary_tumor_virus 1 PRJNA14435
+s__Gluconobacter_oxydans 3 GCF_000011685 GCF_000263255 GCF_000311765
+s__Corynebacterium_callunae 1 GCF_000344785
+s__Enterococcus_sp_GMD1E 1 GCF_000296975
+s__Streptococcus_didelphis 1 GCF_000380005
+s__Pseudomonas_phage_B3 1 PRJNA14542
+s__Leuconostoc_phage_P793 1 PRJNA195531
+s__Saimiriine_herpesvirus_2 1 PRJNA14417
+s__Saimiriine_herpesvirus_3 1 PRJNA78947
+s__Corynebacterium_bovis 1 GCF_000183325
+s__Hyperthermophilic_Archaeal_Virus_1 1 PRJNA50363
+s__Garlic_virus_C 1 PRJNA14736
+s__Hyperthermophilic_Archaeal_Virus_2 1 PRJNA50361
+s__Kineococcus_radiotolerans 1 GCF_000017305
+s__Alishewanella_aestuarii 1 GCF_000280055
+s__Whitewater_Arroyo_virus 1 PRJNA29833
+s__Thermococcus_onnurineus 1 GCF_000018365
+s__Natrialba_taiwanensis 1 GCF_000337595
+s__Norwalk_virus 1 PRJNA15520
+s__Staphylococcus_phage_42E 1 PRJNA15268
+s__Baboon_orthoreovirus 1 PRJNA71165
+s__Grapevine_rootstock_stem_lesion_associated_virus 1 PRJNA14880
+s__Pseudomonas_sp_GM80 1 GCF_000282515
+s__Flavobacterium_sp_CF136 1 GCF_000282055
+s__Streptomyces_sp_SM8 1 GCF_000299175
+s__Clostridium_sp_Maddingley_MBC34_26 1 GCF_000309845
+s__Pseudomonas_sp_GM84 1 GCF_000282535
+s__Corynebacterium_tuberculostearicum 1 GCF_000175635
+s__Paenisporosarcina_sp_TG20 1 GCF_000286315
+s__Weeksella_virosa 1 GCF_000189415
+s__Daphne_mosaic_virus 1 PRJNA16794
+s__Plasmodium_vivax 1 GCA_000002415
+s__Candidatus_Burkholderia_kirkii 1 GCF_000234195
+s__Enterococcus_phage_EFRM31 1 PRJNA64607
+s__Candidatus_Zinderia_insecticola 1 GCF_000147015
+s__Saguaro_cactus_virus 1 PRJNA14981
+s__Peptoniphilus_sp_BV3AC2 1 GCF_000478945
+s__Chloroflexus_aggregans 1 GCF_000021945
+s__Alistipes_finegoldii 1 GCF_000265365
+s__Acidobacterium_capsulatum 1 GCF_000022565
+s__Atopobium_parvulum 1 GCF_000024225
+s__Pothos_latent_virus 1 PRJNA15185
+s__Ceratocystis_polonica_partitivirus 1 PRJNA29847
+s__Vibrio_phage_Vf12 1 PRJNA14385
+s__Haemophilus_paraphrohaemolyticus 1 GCF_000260675
+s__Aeropyrum_pernix 1 GCF_000011125
+s__Hepatitis_E_virus 1 PRJNA15435
+s__Enterobacteria_phage_HK97 1 PRJNA14592
+s__Coniothyrium_minitans_RNA_virus 1 PRJNA16142
+s__Avian_orthoreovirus 1 PRJNA62875
+s__Prevotella_paludivivens 1 GCF_000373185
+s__Hydrangea_ringspot_virus 1 PRJNA15151
+s__Porphyromonas_somerae 1 GCF_000372405
+s__Propionibacterium_phage_P100_1 1 PRJNA177536
+s__Chlorobium_phaeovibrioides 1 GCF_000016085
+s__Erwinia_phage_ENT90 1 PRJNA184166
+s__Enterobacteria_phage_T7 1 PRJNA14460
+s__Sulfurimonas_sp_AST_10 1 GCF_000445475
+s__Enterobacteria_phage_T3 1 PRJNA14336
+s__Desulfitobacterium_sp_PCE1 1 GCF_000384015
+s__Mycobacterium_phage_Chah 1 PRJNA32021
+s__Staphylococcus_epidermidis 82 GCF_000304575 GCF_000276285 GCF_000276325 GCF_000418085 GCF_000417945 GCF_000308395 GCF_000205325 GCF_000418125 GCF_000247065 GCF_000245635 GCF_000276065 GCF_000247165 GCF_000418185 GCF_000276185 GCF_000275985 GCF_000276505 GCF_000390365 GCF_000160235 GCF_000257965 GCF_000247045 GCF_000247025 GCF_000247185 GCF_000011925 GCF_000177115 GCF_000247125 GCF_000276145 GCF_000314715 GCF_000276125 GCF_000276085 GCF_000276025 GCF_000276485 GCF_000257945 GCF_0002471 [...]
+s__Weissella_koreensis 2 GCF_000277645 GCF_000219805
+s__Magnaporthe_oryzae_virus_2 1 PRJNA28297
+s__Magnaporthe_oryzae_virus_1 1 PRJNA15041
+s__Pseudomonas_sp_UW4 1 GCF_000316175
+s__Sulfurihydrogenibium_yellowstonense 1 GCF_000173615
+s__Daphne_virus_S 1 PRJNA16749
+s__Malvastrum_leaf_curl_Guangdong_virus 1 PRJNA17593
+s__Siegesbeckia_yellow_vein_virus_associated_DNA_beta 1 PRJNA17269
+s__Basella_rugose_mosaic_virus 1 PRJNA20619
+s__Nostoc_punctiforme 1 GCF_000020025
+s__Nitrosomonas_europaea 1 GCF_000009145
+s__LuIII_virus 1 PRJNA14278
+s__Symbiobacterium_thermophilum 1 GCF_000009905
+s__Micromonospora_lupini 1 GCF_000297395
+s__actinobacterium_SCGC_AAA023_J06 1 GCF_000372265
+s__Desulfosporosinus_sp_OT 1 GCF_000224515
+s__Malvastrum_yellow_vein_Yunnan_virus_satellite_DNA_beta 1 PRJNA14567
+s__Acidovorax_sp_CF316 1 GCF_000276605
+s__Dictyostelium_discoideum 1 GCA_000004695
+s__Pyrobaculum_sp_1860 1 GCF_000234805
+s__Acinetobacter_sp_CIP_102637 1 GCF_000368425
+s__BK_polyomavirus 1 PRJNA14074
+s__Barley_yellow_dwarf_virus_PAS 1 PRJNA14698
+s__Rickettsia_akari 1 GCF_000018205
+s__Pseudomonas_phage_PRR1 1 PRJNA17481
+s__Ageratum_yellow_leaf_curl_betasatellite 1 PRJNA14439
+s__Corynebacterium_timonense 1 GCF_000312345
+s__Sphingomonas_sp_PAMC_26605 1 GCF_000241485
+s__Mycobacterium_phage_L5 1 PRJNA14459
+s__Blastopirellula_marina 1 GCF_000153105
+s__Pseudoalteromonas_sp_BSi20439 1 GCF_000241165
+s__Collinsella_sp_GD3 1 GCF_000333815
+s__Prochlorococcus_phage_Syn1 1 PRJNA64713
+s__Human_gyrovirus_type_1 1 PRJNA67891
+s__Southern_cowpea_mosaic_virus 1 PRJNA15331
+s__Mycobacterium_phage_Phaux 1 PRJNA206025
+s__Bhendi_yellow_vein_India_virus 1 PRJNA61555
+s__Thiovulum_sp_ES 1 GCF_000276965
+s__Blautia_hansenii 1 GCF_000156675
+s__Variola_virus 1 PRJNA15197
+s__Porphyromonas_asaccharolytica 2 GCF_000212375 GCF_000183605
+s__Sphingomonas_sp_Mn802worker 1 GCF_000382485
+s__Exiguobacterium_sp_MH3 1 GCF_000496635
+s__Bacteroides_sp_2_1_7 1 GCF_000157035
+s__Syntrophobotulus_glycolicus 1 GCF_000190635
+s__Salinicoccus_carnicancri 1 GCF_000330705
+s__Oceanicola_sp_S124 1 GCF_000220565
+s__Psychrobacter_phage_pOW20_A 1 PRJNA195475
+s__Halothiobacillus_neapolitanus 1 GCF_000024765
+s__Allofustis_seminis 1 GCF_000374325
+s__Sugarcane_streak_mosaic_virus 1 PRJNA47861
+s__Acinetobacter_sp_NIPH_973 1 GCF_000368065
+s__Citrobacter_sp_30_2 1 GCF_000158355
+s__Escherichia_coli 1472 GCF_000457635 GCF_000408425 GCF_000459975 GCF_000175755 GCF_000303895 GCF_000458215 GCF_000350945 GCF_000352525 GCF_000264195 GCF_000335215 GCF_000358335 GCF_000352005 GCF_000356105 GCF_000462545 GCF_000461195 GCF_000459135 GCF_000458295 GCF_000461955 GCF_000462645 GCF_000267805 GCF_000407765 GCF_000172055 GCF_000320195 GCF_000007445 GCF_000356505 GCF_000340255 GCF_000264135 GCF_000462465 GCF_000181775 GCF_000320075 GCF_000462665 GCF_000303295 GCF_000357865 GCF_0 [...]
+s__Desulfotomaculum_carboxydivorans 1 GCF_000214435
+s__Coconut_foliar_decay_virus 1 PRJNA14067
+s__Melaka_orthoreovirus 1 PRJNA191884
+s__Beet_yellows_virus 1 PRJNA15328
+s__Thermobifida_fusca 2 GCF_000401915 GCF_000012405
+s__Tomato_bushy_stunt_virus_satellite_RNA 1 PRJNA14430
+s__Acinetobacter_sp_NIPH_1859 1 GCF_000369765
+s__Acinetobacter_soli 2 GCF_000368705 GCF_000368725
+s__Acinetobacter_sp_NIPH_817 1 GCF_000368405
+s__Neisseria_sp_oral_taxon_020 1 GCF_000318235
+s__Methylophaga_thiooxydans 1 GCF_000156355
+s__Mycobacterium_intracellulare 5 GCF_000309055 GCF_000277145 GCF_000277125 GCF_000276825 GCF_000172115
+s__Cyanobacterium_aponinum 1 GCF_000317675
+s__Fibrisoma_limi 1 GCF_000296815
+s__Thunberg_fritillary_virus 1 PRJNA15483
+s__Human_parvovirus_4 1 PRJNA15414
+s__Mungbean_yellow_mosaic_India_virus_associated_betasatellite_India_Faizabad_Cow_Pea_2012 1 PRJNA177773
+s__Enterococcus_pallens 2 GCF_000407485 GCF_000393975
+s__Dehalobacter_sp_DCA 1 GCF_000305775
+s__Haemophilus_influenzae 21 GCF_000169775 GCF_000169835 GCF_000210875 GCF_000012185 GCF_000169815 GCF_000027305 GCF_000173315 GCF_000016485 GCF_000165575 GCF_000169855 GCF_000175455 GCF_000175435 GCF_000197875 GCF_000200475 GCF_000169735 GCF_000465255 GCF_000016465 GCF_000173335 GCF_000169795 GCF_000165525 GCF_000169755
+s__Staphylococcus_phage_phiSLT 1 PRJNA14137
+s__Vibrio_cholerae 183 GCF_000237705 GCF_000168915 GCF_000318075 GCF_000299495 GCF_000221485 GCF_000305115 GCF_000221345 GCF_000305075 GCF_000174295 GCF_000303105 GCF_000303125 GCF_000176435 GCF_000152425 GCF_000237445 GCF_000348225 GCF_000348365 GCF_000176455 GCF_000305625 GCF_000153785 GCF_000348385 GCF_000302775 GCF_000330905 GCF_000154005 GCF_000234435 GCF_000220765 GCF_000221425 GCF_000302875 GCF_000279435 GCF_000221385 GCF_000387725 GCF_000387625 GCF_000234395 GCF_000303085 GCF_000 [...]
+s__Dehalobacter_sp_CF 1 GCF_000305815
+s__Yersinia_phage_phiYeO3_12 1 PRJNA14591
+s__Tomato_marchitez_virus 1 PRJNA30365
+s__Candidatus_Pelagibacter_ubique 7 GCF_000153525 GCF_000012345 GCF_000419545 GCF_000372905 GCF_000504225 GCF_000472605 GCF_000384455
+s__Aeromonas_caviae 1 GCF_000208825
+s__Mycoplasma_anatis 1 GCF_000221305
+s__Trypanosoma_brucei 1 GCA_000002445
+s__Bamboo_mosaic_virus 1 PRJNA14728
+s__Chloroflexus_sp_Y_400_fl 1 GCF_000022185
+s__Lactobacillus_phage_PL_1 1 PRJNA227007
+s__Enterorhabdus_caecimuris 1 GCF_000403355
+s__Prevotella_multisaccharivorax 1 GCF_000218235
+s__Halorubrum_terrestre 1 GCF_000337435
+s__Enterococcus_avium 2 GCF_000407245 GCF_000406965
+s__Herbaspirillum_sp_CF444 1 GCF_000282135
+s__Staphylococcus_phage_YMC_09_04_R1988 1 PRJNA227002
+s__Capnocytophaga_canimorsus 1 GCF_000220625
+s__Halococcus_morrhuae 1 GCF_000336695
+s__Thermotoga_petrophila 1 GCF_000016785
+s__Rodent_hepacivirus 1 PRJNA198869
+s__Gordonia_terrae 2 GCF_000390025 GCF_000248035
+s__Colobus_monkey_papillomavirus 1 PRJNA68289
+s__Gordonia_malaquae 1 GCF_000344135
+s__Yokenella_regensburgei 1 GCF_000239335
+s__European_catfish_virus 1 PRJNA167164
+s__Gayfeather_mild_mottle_virus 1 PRJNA34755
+s__Leptospira_alexanderi 1 GCF_000243815
+s__Abutilon_Brazil_virus 1 PRJNA48591
+s__Rahnella_aquatilis 2 GCF_000255535 GCF_000241955
+s__Propionibacterium_sp_CC003_HC2 1 GCF_000221085
+s__Opitutaceae_bacterium_TAV1 1 GCF_000243495
+s__Velvet_bean_severe_mosaic_virus 1 PRJNA41175
+s__Sulfolobus_islandicus_rod_shaped_virus_1 1 PRJNA14514
+s__Sulfolobus_islandicus_rod_shaped_virus_2 1 PRJNA15191
+s__Methanohalobium_evestigatum 1 GCF_000196655
+s__Milk_vetch_dwarf_virus 1 PRJNA14173
+s__Flavobacteriaceae_bacterium_S85 1 GCF_000220525
+s__Propionibacterium_phage_P101A 1 PRJNA177531
+s__Rhizobium_sp_CF142 1 GCF_000281145
+s__Freesia_sneak_virus 1 PRJNA196748
+s__Microcoleus_sp_PCC_7113 1 GCF_000317515
+s__Cycas_necrotic_stunt_virus 1 PRJNA15397
+s__Psychroflexus_torquis 1 GCF_000153485
+s__Shewanella_baltica 9 GCF_000015845 GCF_000231345 GCF_000018765 GCF_000147735 GCF_000179535 GCF_000178875 GCF_000017325 GCF_000215895 GCF_000021665
+s__Plum_pox_virus 1 PRJNA15298
+s__Glaciecola_psychrophila 2 GCF_000347635 GCF_000315075
+s__Yersinia_pseudotuberculosis 4 GCF_000019465 GCF_000047365 GCF_000016945 GCF_000020085
+s__Nitritalea_halalkaliphila 1 GCF_000265075
+s__Agrobacterium_tumefaciens 6 GCF_000233975 GCF_000219665 GCF_000349865 GCF_000236125 GCF_000421945 GCF_000016265
+s__Thioalkalivibrio_sp_ALE6 1 GCF_000364565
+s__Acaricomes_phytoseiuli 1 GCF_000376245
+s__Acidaminococcus_fermentans 1 GCF_000025305
+s__Mycobacterium_vanbaalenii 1 GCF_000015305
+s__Streptococcus_henryi 1 GCF_000376985
+s__Actinobaculum_massiliense 1 GCF_000315465
+s__Hamster_polyomavirus 1 PRJNA14461
+s__Bat_hepatitis_virus 1 PRJNA195535
+s__Canine_bocavirus 1 PRJNA193977
+s__Soybean_chlorotic_spot_virus 1 PRJNA173351
+s__Alcanivorax_pacificus 1 GCF_000299335
+s__Pseudoflavonifractor_capillosus 1 GCF_000169255
+s__Cafeteria_roenbergensis_virus_BV_PW1 1 PRJNA59783
+s__Moraxella_boevrei 1 GCF_000379845
+s__Streptococcus_suis 21 GCF_000167375 GCF_000018185 GCF_000231325 GCF_000471985 GCF_000231905 GCF_000390245 GCF_000231885 GCF_000294495 GCF_000091905 GCF_000344765 GCF_000026725 GCF_000231925 GCF_000231865 GCF_000233575 GCF_000494895 GCF_000014325 GCF_000168355 GCF_000204625 GCF_000014305 GCF_000186405 GCF_000026745
+s__Ruminococcus_gnavus 2 GCF_000169475 GCF_000507805
+s__Corynebacterium_variabile 1 GCF_000179395
+s__Oenococcus_oeni 3 GCF_000372485 GCF_000168955 GCF_000014385
+s__Pediococcus_lolii 1 GCF_000319265
+s__Propionibacterium_sp_KPL2008 1 GCF_000477755
+s__Propionibacterium_sp_KPL2009 1 GCF_000477655
+s__Propionibacterium_sp_KPL2005 1 GCF_000477675
+s__Propionibacterium_sp_KPL2003 1 GCF_000477775
+s__Propionibacterium_sp_KPL2000 1 GCF_000477795
+s__Bacteroides_sp_HPS0048 1 GCF_000382465
+s__Vibrio_sp_AND4 1 GCF_000171815
+s__Streptococcus_sp_M334 1 GCF_000187745
+s__Pea_stem_necrosis_virus 1 PRJNA14894
+s__Eubacterium_ventriosum 1 GCF_000153885
+s__Trichechus_manatus_latirostris_papillomavirus_2 1 PRJNA84405
+s__Prevotella_melaninogenica 2 GCF_000163035 GCF_000144405
+s__Synechococcus_phage_S_CBS4 1 PRJNA82651
+s__Sida_yellow_mosaic_virus 1 PRJNA15496
+s__Rubrobacter_xylanophilus 1 GCF_000014185
+s__Streptomyces_griseus 2 GCF_000010605 GCF_000177175
+s__Enterobacteria_phage_HK578 1 PRJNA183138
+s__Massilia_niastensis 1 GCF_000382345
+s__Shigella_phage_pSf_1 1 PRJNA206484
+s__Dactylococcopsis_salina 1 GCF_000317615
+s__Saccharibacillus_kuerlensis 1 GCF_000378145
+s__Streptococcus_agalactiae 257 GCF_000289455 GCF_000289275 GCF_000310505 GCF_000311645 GCF_000310585 GCF_000289575 GCF_000288655 GCF_000186445 GCF_000289995 GCF_000288695 GCF_000311005 GCF_000288955 GCF_000289695 GCF_000290115 GCF_000311705 GCF_000290215 GCF_000007265 GCF_000310985 GCF_000311245 GCF_000289955 GCF_000311145 GCF_000310825 GCF_000311365 GCF_000311165 GCF_000322625 GCF_000323105 GCF_000322985 GCF_000322845 GCF_000322525 GCF_000289875 GCF_000289015 GCF_000288335 GCF_00031052 [...]
+s__Synechococcus_phage_S_CBS3 1 PRJNA66397
+s__Lachnospiraceae_bacterium_5_1_63FAA 1 GCF_000185525
+s__Vibrio_sp_HENC_02 1 GCF_000305735
+s__Vibrio_sp_HENC_03 1 GCF_000305755
+s__Vibrio_sp_HENC_01 1 GCF_000305715
+s__Enterobacter_sp_MGH_26 1 GCF_000492975
+s__Enterobacter_sp_MGH_24 1 GCF_000493015
+s__Enterobacter_sp_MGH_25 1 GCF_000492995
+s__Sphingomonas_sp_ATCC_31555 1 GCF_000282895
+s__Enterobacter_sp_MGH_23 1 GCF_000493035
+s__Mamastrovirus_13 1 PRJNA15095
+s__Mamastrovirus_10 1 PRJNA14897
+s__Staphylococcus_phage_CNPH82 1 PRJNA18523
+s__candidate_division_TG3_bacterium_ACht1 1 GCF_000474745
+s__Lamium_leaf_distortion_virus 1 PRJNA29877
+s__Lyngbya_aestuarii 1 GCF_000478195
+s__Haloferax_sp_BAB2207 1 GCF_000328285
+s__Sida_golden_mosaic_Honduras_virus 1 PRJNA14263
+s__Streptomyces_himastatinicus 1 GCF_000158915
+s__Ralstonia_phage_RSA1 1 PRJNA19481
+s__Saccharothrix_espanaensis 1 GCF_000328705
+s__Burkholderia_xenovorans 1 GCF_000013645
+s__Turkey_gallivirus 1 PRJNA172458
+s__Deinococcus_gobiensis 1 GCF_000252445
+s__Ludwigia_leaf_distortion_betasatellite 1 PRJNA29233
+s__Verrucomicrobia_bacterium_SCGC_AAA300_O17 1 GCF_000382685
+s__Jatropha_leaf_curl_virus 1 PRJNA31277
+s__Haladaptatus_paucihalophilus 2 GCF_000187225 GCF_000376445
+s__Enterobacteria_phage_vB_EcoP_ACG_C91 1 PRJNA179413
+s__Schlumbergera_virus_X 1 PRJNA33189
+s__Ustilaginoidea_virens_RNA_virus 1 PRJNA213142
+s__Alkaliphilus_oremlandii 1 GCF_000018325
+s__Clostridium_sordellii 2 GCF_000444095 GCF_000444075
+s__Enterococcus_phage_EFAP_1 1 PRJNA36375
+s__Alistipes_putredinis 1 GCF_000154465
+s__Australian_grapevine_viroid 1 PRJNA14976
+s__Halomicrobium_katesii 1 GCF_000379085
+s__Mirabilis_mosaic_virus 1 PRJNA14393
+s__Macrobrachium_rosenbergii_nodavirus 1 PRJNA15129
+s__Sugarcane_streak_Egypt_virus 1 PRJNA14365
+s__Aroa_virus 1 PRJNA18847
+s__Thermus_aquaticus 1 GCF_000173055
+s__Aeromicrobium_sp_JC14 2 GCF_000285435 GCF_000312105
+s__Sulfolobus_spindle_shaped_virus_7 1 PRJNA42357
+s__Vibrio_sp_RC341 1 GCF_000176215
+s__Congregibacter_litoralis 1 GCF_000153125
+s__Sulfolobus_spindle_shaped_virus_2 1 PRJNA14317
+s__Ruegeria_pomeroyi 1 GCF_000011965
+s__Sulfolobus_spindle_shaped_virus_1 1 PRJNA14014
+s__Pelagibaca_bermudensis 1 GCF_000153725
+s__Metallosphaera_sedula 1 GCF_000016605
+s__Streptomyces_sp_LaPpAH_95 1 GCF_000375725
+s__Paramecium_bursaria_Chlorella_virus_NY2A 1 PRJNA20989
+s__Carnation_etched_ring_virus 1 PRJNA14494
+s__Circovirus_like_genome_RW_D 1 PRJNA39623
+s__Burkholderia_cenocepacia 8 GCF_000236215 GCF_000009485 GCF_000333135 GCF_000019505 GCF_000014085 GCF_000203955 GCF_000333155 GCF_000152565
+s__Methanobacterium_sp_AL_21 1 GCF_000191585
+s__Malvastrum_yellow_vein_Yunnan_virus 1 PRJNA15231
+s__Algicola_sagamiensis 1 GCF_000374485
+s__Tsukamurella_sp_1534 1 GCF_000312385
+s__Frog_adenovirus_A 1 PRJNA14488
+s__Bovine_polyomavirus 1 PRJNA14017
+s__Cellulophaga_phage_phi10_1 1 PRJNA212964
+s__Lactobacillus_phage_phiJL_1 1 PRJNA15156
+s__Mesta_yellow_vein_mosaic_virus_associated_DNA_beta 1 PRJNA21015
+s__Blattabacterium_sp_Panesthia_angustipennis_spadica 1 GCF_000348805
+s__Paenisporosarcina_sp_TG_14 1 GCF_000297555
+s__Groundnut_bud_necrosis_virus 1 PRJNA14766
+s__Penicillium_stoloniferum_virus_S 1 PRJNA14950
+s__Massilia_timonae 1 GCF_000315425
+s__Penicillium_stoloniferum_virus_F 1 PRJNA15533
+s__Saccharomonospora_halophila 1 GCF_000383775
+s__Brucella_abortus 136 GCF_000479955 GCF_000366405 GCF_000478665 GCF_000366545 GCF_000369925 GCF_000245835 GCF_000370245 GCF_000370345 GCF_000245915 GCF_000370505 GCF_000370145 GCF_000245875 GCF_000370085 GCF_000370365 GCF_000413615 GCF_000157675 GCF_000480235 GCF_000366345 GCF_000366445 GCF_000366525 GCF_000472245 GCF_000370325 GCF_000366765 GCF_000366325 GCF_000182625 GCF_000366605 GCF_000366665 GCF_000298635 GCF_000370445 GCF_000413755 GCF_000369965 GCF_000370285 GCF_000480115 GCF_00 [...]
+s__Tomato_yellow_leaf_curl_Thailand_betasatellite 1 PRJNA14450
+s__Aichivirus_B 1 PRJNA14948
+s__Aichivirus_C 1 PRJNA82751
+s__WU_Polyomavirus 1 PRJNA19765
+s__Phaeodactylum_tricornutum 1 GCA_000150955
+s__Lactobacillus_parafarraginis 1 GCF_000238835
+s__Blackberry_chlorotic_ringspot_virus 1 PRJNA32707
+s__Gemmatimonas_aurantiaca 1 GCF_000010305
+s__Staphylococcus_phage_phi2958PVL 1 PRJNA32173
+s__Desulfotomaculum_reducens 1 GCF_000016165
+s__Tomato_severe_leaf_curl_virus 1 PRJNA14482
+s__Mycobacterium_phage_Jabbawokkie 1 PRJNA215115
+s__Cellulophaga_phage_phi19_1 1 PRJNA212942
+s__Cellulophaga_phage_phi19_3 1 PRJNA212945
+s__Escherichia_phage_D108 1 PRJNA42515
+s__Propionibacterium_phage_P14_4 1 PRJNA177530
+s__Methanocella_paludicola 1 GCF_000011005
+s__Pseudoalteromonas_atlantica 1 GCF_000014225
+s__Rosellinia_necatrix_partitivirus_2 1 PRJNA188731
+s__Nse_virus 1 PRJNA196420
+s__Calla_lily_latent_virus 1 PRJNA202315
+s__Verrucomicrobia_bacterium_SCGC_AAA164_L15 1 GCF_000285795
+s__Thioalkalivibrio_sulfidophilus 1 GCF_000021985
+s__Oceanobacillus_kimchii 1 GCF_000340475
+s__Rhodospirillum_rubrum 2 GCF_000225955 GCF_000013085
+s__Pasteurella_pneumotropica 1 GCF_000379905
+s__Spinach_curly_top_virus 1 PRJNA14373
+s__Cyanothece_sp_PCC_7424 1 GCF_000021825
+s__Cyanothece_sp_PCC_7425 1 GCF_000022045
+s__Borrelia_bissettii 1 GCF_000222305
+s__Fluviicola_taffensis 1 GCF_000194605
+s__Propionibacterium_sp_5_U_42AFAA 1 GCF_000233555
+s__Streptomyces_sp_PsTaAH_124 1 GCF_000373685
+s__Candidatus_Liberibacter_solanacearum 1 GCF_000183665
+s__Pseudomonas_phage_LBL3 1 PRJNA31053
+s__Terriglobus_saanensis 1 GCF_000179915
+s__Sulfurihydrogenibium_azorense 1 GCF_000021545
+s__Leptospira_kirschneri 25 GCF_000246335 GCF_000246155 GCF_000243655 GCF_000306395 GCF_000306175 GCF_000347215 GCF_000243695 GCF_000244515 GCF_000346895 GCF_000306595 GCF_000347015 GCF_000246675 GCF_000306555 GCF_000246175 GCF_000342725 GCF_000343555 GCF_000243615 GCF_000243855 GCF_000306355 GCF_000306515 GCF_000243915 GCF_000246355 GCF_000347235 GCF_000246295 GCF_000243875
+s__Pseudonocardia_dioxanivorans 1 GCF_000196675
+s__Alternaria_alternata_virus_1 1 PRJNA30367
+s__Tulare_apple_mosaic_virus 1 PRJNA14814
+s__Sphingomonas_sp_PAMC_26621 1 GCF_000251145
+s__Alternanthera_mosaic_virus 1 PRJNA16333
+s__Rickettsia_sibirica 2 GCF_000246715 GCF_000247625
+s__Infectious_hematopoietic_necrosis_virus 1 PRJNA14677
+s__Megasphaera_micronuciformis 1 GCF_000165735
+s__Corynebacterium_aurimucosum 2 GCF_000022905 GCF_000174695
+s__Streptococcus_phage_DT1 1 PRJNA15124
+s__Marvinbryantia_formatexigens 1 GCF_000173815
+s__Papaya_leaf_distortion_mosaic_virus 1 PRJNA15405
+s__SAR324_cluster_bacterium_SCGC_AB_629_J17 1 GCF_000375785
+s__Cotton_leaf_curl_Gezira_virus 1 PRJNA14095
+s__Leptospira_yanagawae 1 GCF_000332475
+s__Bradyrhizobium_sp_DFCI_1 1 GCF_000465325
+s__Ruminococcus_lactaris 2 GCF_000507785 GCF_000155205
+s__Caulobacter_phage_CcrKarma 1 PRJNA179420
+s__Pseudomonas_agarici 1 GCF_000280785
+s__Corynebacterium_nuruki 1 GCF_000213935
+s__Uliginosibacterium_gangwonense 1 GCF_000373965
+s__Dehalobacter_sp_E1 1 GCF_000309295
+s__Clostridium_saccharolyticum 2 GCF_000210535 GCF_000144625
+s__Rhynchosia_golden_mosaic_virus 1 PRJNA14258
+s__Methanococcus_voltae 1 GCF_000006175
+s__Citrus_vein_enation_virus 1 PRJNA209366
+s__Cymbidium_mosaic_virus 1 PRJNA15490
+s__Grapevine_fleck_virus 1 PRJNA15188
+s__HMO_Astrovirus_A 1 PRJNA41413
+s__Spodoptera_frugiperda_multiple_nucleopolyhedrovirus 1 PRJNA18827
+s__Verrucomicrobia_bacterium_SCGC_AAA164_O14 1 GCF_000264605
+s__Torque_teno_virus_19 1 PRJNA48155
+s__Torque_teno_virus_16 1 PRJNA48181
+s__Torque_teno_virus_15 1 PRJNA48191
+s__Torque_teno_virus_14 1 PRJNA48153
+s__Torque_teno_virus_12 1 PRJNA48149
+s__Colorado_tick_fever_virus 1 PRJNA14857
+s__Ranid_herpesvirus_2 1 PRJNA17183
+s__Collinsella_intestinalis 1 GCF_000156175
+s__Acidiphilium_cryptum 1 GCF_000016725
+s__Maritimibacter_alkaliphilus 1 GCF_000152805
+s__Mycobacterium_phage_Wheeler 1 PRJNA215110
+s__Amapari_virus 1 PRJNA28321
+s__Gluconobacter_frateurii 1 GCF_000284875
+s__Candidatus_Blochmannia_floridanus 1 GCF_000043285
+s__Escherichia_phage_KBNP21 1 PRJNA177527
+s__Streptomyces_globisporus 1 GCF_000261345
+s__Arthrobacter_phenanthrenivorans 1 GCF_000189535
+s__Enterococcus_durans 4 GCF_000406985 GCF_000315405 GCF_000350465 GCF_000407265
+s__Pseudoalteromonas_sp_BSi20495 1 GCF_000241185
+s__Roseovarius_sp_TM1035 1 GCF_000170775
+s__Pyramidobacter_piscolens 1 GCF_000177335
+s__Bat_sapovirus_TLC58_HK 1 PRJNA167111
+s__Avibacterium_paragallinarum 1 GCF_000348525
+s__Clostridium_thermocellum 6 GCF_000175715 GCF_000173015 GCF_000015865 GCF_000184925 GCF_000255575 GCF_000255615
+s__Herpetosiphon_aurantiacus 1 GCF_000018565
+s__Eggerthella_sp_1_3_56FAA 1 GCF_000185625
+s__Gremmeniella_abietina_type_B_RNA_virus_XL 1 PRJNA16657
+s__Lone_Star_virus 1 PRJNA203651
+s__Bacillus_sp_NRRL_B_14911 1 GCF_000153365
+s__Halomonas_sp_HAL1 1 GCF_000235725
+s__Herminiimonas_arsenicoxydans 1 GCF_000026125
+s__Watermelon_chlorotic_stunt_virus 1 PRJNA14176
+s__Hyphomicrobium_zavarzinii 1 GCF_000383415
+s__Sida_yellow_mosaic_virus_China_associated_DNA_beta 1 PRJNA15514
+s__Burkholderia_sp_JPY251 1 GCF_000372985
+s__Mesoplasma_florum 2 GCF_000479355 GCF_000008305
+s__Plautia_stali_symbiont 1 GCF_000180175
+s__Laccaria_bicolor 1 GCA_000143565
+s__Escherichia_phage_2_JES_2013 1 PRJNA219124
+s__Planctomyces_maris 1 GCF_000181475
+s__Burkholderia_lata 1 GCF_000012945
+s__Coxiella_burnetii 8 GCF_000007765 GCF_000019885 GCF_000168875 GCF_000019865 GCF_000169495 GCF_000017105 GCF_000018745 GCF_000300315
+s__Jonquetella_sp_BV3C21 1 GCF_000468895
+s__Pseudoalteromonas_phage_H105_1 1 PRJNA64761
+s__Thiomonas_intermedia 1 GCF_000092605
+s__African_green_monkey_simian_foamy_virus 1 PRJNA30095
+s__Ferroplasma_acidarmanus 1 GCF_000152265
+s__Planococcus_antarcticus 1 GCF_000264415
+s__Streptomyces_sp_BoleA5 1 GCF_000373665
+s__Hollyhock_leaf_crumple_virus_satellite_DNA 1 PRJNA14208
+s__Squash_yellow_mild_mottle_virus 1 PRJNA14186
+s__Wallal_virus 1 PRJNA222995
+s__Marinobacter_adhaerens 1 GCF_000166295
+s__Providencia_sneebia 1 GCF_000314895
+s__Marine_RNA_virus_JP_B 1 PRJNA20651
+s__Marine_RNA_virus_JP_A 1 PRJNA20649
+s__Cellvibrio_gilvus 1 GCF_000218545
+s__Microcystis_aeruginosa 12 GCF_000312245 GCF_000330925 GCF_000010625 GCF_000312285 GCF_000312205 GCF_000312265 GCF_000312725 GCF_000312185 GCF_000312225 GCF_000412595 GCF_000312165 GCF_000307995
+s__Chromobacterium_violaceum 1 GCF_000007705
+s__American_hop_latent_virus 1 PRJNA163147
+s__Variovorax_sp_CF313 1 GCF_000282635
+s__Dendrolimus_punctatus_densovirus 1 PRJNA14546
+s__East_African_cassava_mosaic_virus 1 PRJNA15177
+s__Saccharomonospora_cyanea 1 GCF_000244975
+s__Escherichia_sp_TW09308 1 GCF_000208565
+s__Burkholderiales_bacterium_JOSHI_001 1 GCF_000244995
+s__Enterobacter_sp_Ag1 1 GCF_000277545
+s__Cotton_leaf_curl_Multan_betasatellite 1 PRJNA15780
+s__Streptococcus_peroris 1 GCF_000187585
+s__Rudaea_cellulosilytica 1 GCF_000378125
+s__Chronic_bee_paralysis_virus 1 PRJNA29839
+s__Acheta_domestica_densovirus 1 PRJNA15222
+s__Arthrobacter_sp_M2012083 1 GCF_000281065
+s__Mycobacterium_phage_Angel 1 PRJNA38461
+s__Onion_yellow_dwarf_virus 1 PRJNA15407
+s__Paenibacillus_sp_A9 1 GCF_000346635
+s__Moniliophthora_perniciosa 1 GCA_000183025
+s__Burkholderia_sp_H160 1 GCF_000173575
+s__Cypovirus_15 1 PRJNA14102
+s__Cutthroat_trout_virus 1 PRJNA66895
+s__Lacinutrix_sp_5H_3_7_4 1 GCF_000211855
+s__Donggang_virus 1 PRJNA115527
+s__Candidatus_Blochmannia_pennsylvanicus 1 GCF_000011745
+s__Lactobacillus_rhamnosus 13 GCF_000160175 GCF_000235785 GCF_000026525 GCF_000195375 GCF_000418495 GCF_000418475 GCF_000466865 GCF_000311965 GCF_000226235 GCF_000173255 GCF_000235865 GCF_000233755 GCF_000311945
+s__Campylobacter_ureolyticus 2 GCF_000374605 GCF_000413435
+s__Neptuniibacter_caesariensis 1 GCF_000153345
+s__Succinispira_mobilis 1 GCF_000384135
+s__Acinetobacter_bouvetii 2 GCF_000373725 GCF_000368865
+s__Vibrio_sp_EJY3 1 GCF_000241385
+s__Blotched_snakehead_virus 1 PRJNA14921
+s__Mycobacterium_leprae 2 GCF_000195855 GCF_000026685
+s__Cleome_leaf_crumple_virus 1 PRJNA81005
+s__Helicobasidium_mompa_endornavirus_1 1 PRJNA41437
+s__Bacillus_phage_GIL16c 1 PRJNA15164
+s__Brucella_sp_F96_2 1 GCF_000371025
+s__Rhodobacteraceae_bacterium_HTCC2150 1 GCF_000169395
+s__Kazachstania_africana 1 GCA_000304475
+s__Halomonas_phage_phiHAP_1 1 PRJNA28763
+s__Rhodospirillum_photometricum 1 GCF_000284415
+s__Citrus_bent_leaf_viroid 3 PRJNA14903 PRJNA14969 PRJNA14972
+s__Yersinia_aldovae 1 GCF_000173735
+s__Octadecabacter_arcticus 1 GCF_000155735
+s__Listeria_phage_P40 1 PRJNA32073
+s__Papaya_lethal_yellowing_virus 1 PRJNA173050
+s__Brevibacterium_massiliense 1 GCF_000285915
+s__Mycobacterium_phage_Tweety 1 PRJNA20787
+s__Dendrolimus_punctatus_tetravirus 1 PRJNA15120
+s__Pseudomonas_phage_phi15 1 PRJNA63435
+s__Verrucomicrobium_sp_3C 1 GCF_000379365
+s__Leifsonia_sp_109 1 GCF_000380665
+s__Paenibacillus_sp_oral_taxon_786 1 GCF_000159955
+s__Entamoeba_dispar 1 GCA_000209125
+s__Staphylococcus_phage_80alpha 1 PRJNA19749
+s__Cassia_yellow_blotch_virus 1 PRJNA15419
+s__Aquareovirus_A 1 PRJNA16158
+s__Plesiocystis_pacifica 1 GCF_000170895
+s__Propionibacterium_granulosum 2 GCF_000464495 GCF_000463665
+s__Canarypox_virus 1 PRJNA14340
+s__Psychroflexus_gondwanensis 1 GCF_000355905
+s__Aciduliprofundum_boonei 2 GCF_000151085 GCF_000025665
+s__Thioalkalivibrio_sp_AKL10 1 GCF_000381845
+s__Thioalkalivibrio_sp_AKL11 1 GCF_000377845
+s__Thioalkalivibrio_sp_AKL12 1 GCF_000377925
+s__Thioalkalivibrio_sp_AKL17 1 GCF_000377885
+s__Prevotella_sp_oral_taxon_299 1 GCF_000163055
+s__Ruminococcus_obeum 2 GCF_000210015 GCF_000153905
+s__Mycoplasma_sp_G5847 1 GCF_000327395
+s__Sulfolobales_Mexican_fusellovirus_1 1 PRJNA195533
+s__Drosophila_C_virus 1 PRJNA14682
+s__Streptococcus_downei 1 GCF_000180055
+s__Mycobacterium_phage_Faith1 1 PRJNA67415
+s__Aliivibrio_salmonicida 1 GCF_000196495
+s__Avian_paramyxovirus_6 1 PRJNA14719
+s__Avian_paramyxovirus_4 1 PRJNA181250
+s__Promicromonospora_sukumoe 1 GCF_000385135
+s__Caldicellulosiruptor_saccharolyticus 1 GCF_000016545
+s__Mycobacterium_phage_First 1 PRJNA195529
+s__Sphingomonas_echinoides 1 GCF_000241465
+s__Beet_mild_yellowing_virus 1 PRJNA15079
+s__Prevotella_sp_oral_taxon_472 1 GCF_000163495
+s__Prevotella_sp_oral_taxon_473 1 GCF_000318095
+s__Bacteroides_sp_2_1_22 1 GCF_000162155
+s__Black_raspberry_virus_F 1 PRJNA20975
+s__Grapevine_Bulgarian_latent_virus 1 PRJNA66553
+s__Ageratum_yellow_vein_Taiwan_virus 1 PRJNA14249
+s__Marine_group_II_euryarchaeote_SCGC_AAA288_C18 1 GCF_000382765
+s__Pseudomonas_psychrotolerans 1 GCF_000236825
+s__Microbulbifer_agarilyticus 1 GCF_000220505
+s__Psychroflexus_tropicus 1 GCF_000378765
+s__Vibrio_phage_VBM1 1 PRJNA195494
+s__Rodent_pegivirus 1 PRJNA198868
+s__Yunnan_orbivirus 1 PRJNA16242
+s__Helicobacter_fennelliae 1 GCF_000509365
+s__Propionibacterium_phage_PHL112N00 1 PRJNA219110
+s__Alternanthera_yellow_vein_betasatellite 1 PRJNA19833
+s__Brevibacillus_laterosporus 4 GCF_000237005 GCF_000472325 GCF_000219535 GCF_000374385
+s__Acinetobacter_sp_NIPH_2036 1 GCF_000413935
+s__Nitrosomonas_eutropha 1 GCF_000014765
+s__Lily_virus_X 1 PRJNA15494
+s__Mycobacterium_phage_Leo 1 PRJNA209361
+s__Zaire_ebolavirus 1 PRJNA14703
+s__Halorubrum_pleomorphic_virus_2 1 PRJNA157257
+s__Tomato_leaf_curl_Madagascar_virus 1 PRJNA15211
+s__Glypta_fumiferanae_ichnovirus 1 PRJNA18767
+s__Lactobacillus_vaginalis 1 GCF_000159435
+s__Bacillus_phage_Bastille 1 PRJNA177550
+s__Gluconacetobacter_diazotrophicus 2 GCF_000021325 GCF_000067045
+s__Thioalkalivibrio_sp_ALJ20 1 GCF_000378585
+s__Thioalkalivibrio_sp_ALJ21 1 GCF_000378605
+s__Sedimentibacter_sp_B4 1 GCF_000309315
+s__Thioalkalivibrio_sp_ALJ24 1 GCF_000377785
+s__Pseudomonas_phage_F8 1 PRJNA16388
+s__Shigella_phage_Sf6 1 PRJNA14498
+s__Sulfitobacter_sp_NAS_14_1 1 GCF_000152645
+s__Chlamydophila_abortus 2 GCF_000026025 GCF_000213905
+s__Pseudomonas_resinovorans 1 GCF_000412695
+s__Human_coronavirus_229E 1 PRJNA14913
+s__Slackia_piriformis 1 GCF_000296445
+s__Crassocephalum_yellow_vein_virus 1 PRJNA18659
+s__Strawberry_latent_ringspot_virus_satellite_RNA 1 PRJNA15155
+s__Citrus_viroid_VI 1 PRJNA42701
+s__CAS_virus 1 PRJNA173353
+s__Staphylococcus_xylosus 1 GCF_000338275
+s__Pseudomonas_denitrificans 1 GCF_000349845
+s__Porphyromonas_crevioricanis 1 GCF_000509245
+s__Bacillus_phage_phi105 1 PRJNA14217
+s__Streptomyces_phage_phiBT1 1 PRJNA14276
+s__Yersinia_phage_phi80_18 1 PRJNA184145
+s__Pseudomonas_sp_R81 1 GCF_000257625
+s__Chicken_anemia_virus 1 PRJNA15484
+s__Escherichia_phage_HK639 1 PRJNA76729
+s__Mycobacterium_orygis 1 GCF_000353205
+s__Streptomyces_hygroscopicus 2 GCF_000340845 GCF_000245355
+s__Mycobacterium_phage_PegLeg 1 PRJNA206038
+s__Barfin_flounder_nervous_necrosis_virus 1 PRJNA41605
+s__Rhodococcus_phage_RER2 1 PRJNA81173
+s__Mycobacterium_phage_Ramsey 1 PRJNA32019
+s__Simian_adenovirus_20 1 PRJNA192869
+s__Candidatus_Poribacteria_sp_WGA_4E 1 GCF_000372285
+s__Halorubrum_tebenquichense 1 GCF_000337415
+s__Crenarchaeota_archaeon_SCGC_AAA471_O08 1 GCF_000398765
+s__Cymbidium_ringspot_virus 1 PRJNA15066
+s__Pseudomonas_phage_phiIBB_PF7A 1 PRJNA64561
+s__Chryseobacterium_taeanense 1 GCF_000304615
+s__Cupriavidus_basilensis 2 GCF_000282815 GCF_000243095
+s__Methanolinea_tarda 1 GCF_000235685
+s__Pseudoxanthomonas_suwonensis 1 GCF_000185965
+s__Dialister_micraerophilus 2 GCF_000183445 GCF_000194985
+s__Actinomyces_sp_ICM39 1 GCF_000282935
+s__Mycoplasma_cynos 1 GCF_000328725
+s__Cherry_leaf_roll_virus 1 PRJNA66187
+s__Singapore_grouper_iridovirus 1 PRJNA14544
+s__Halovirus_HRTV_7 1 PRJNA206491
+s__Halovirus_HRTV_5 1 PRJNA206492
+s__Tomato_yellow_leaf_curl_Sardinia_virus 1 PRJNA14484
+s__Bradyrhizobium_sp_ORS_375 1 GCF_000239775
+s__Saprospira_grandis 2 GCF_000275825 GCF_000250635
+s__Halovirus_HRTV_8 1 PRJNA206490
+s__Brucella_microti 1 GCF_000022745
+s__Tomato_chlorosis_virus 1 PRJNA15587
+s__Ramlibacter_tataouinensis 1 GCF_000215705
+s__Tomato_leaf_curl_Togo_virus 1 PRJNA34813
+s__Nevskia_ramosa 1 GCF_000420645
+s__Clostridium_arbusti 1 GCF_000246895
+s__Shinella_zoogloeoides 1 GCF_000496935
+s__Cyanophage_PP 1 PRJNA227004
+s__Human_picobirnavirus 1 PRJNA15248
+s__Desulfovibrio_piger 1 GCF_000156375
+s__Rhinolophus_bat_coronavirus_HKU2 1 PRJNA27911
+s__Pseudomonas_sp_UK4 1 GCF_000174915
+s__Syntrophobacter_fumaroxidans 1 GCF_000014965
+s__Turicella_otitidis 2 GCF_000296405 GCF_000297795
+s__Megasphaera_sp_NM10 1 GCF_000417505
+s__Nocardiopsis_prasina 1 GCF_000341265
+s__Candidatus_Sulcia_muelleri 5 GCF_000017525 GCF_000022945 GCF_000025785 GCF_000168155 GCF_000147035
+s__Leuconostoc_gelidum 2 GCF_000298875 GCF_000166715
+s__Sporomusa_ovata 1 GCF_000445445
+s__Cenarchaeum_symbiosum 1 GCF_000200715
+s__Pseudoalteromonas_phage_PM2 1 PRJNA14237
+s__Rhopalosiphum_padi_virus 1 PRJNA14648
+s__Enterobacteria_phage_ST104 1 PRJNA14499
+s__Mariprofundus_ferrooxydans 2 GCF_000379405 GCF_000153765
+s__Sida_leaf_curl_virus_associated_DNA_beta 1 PRJNA16226
+s__Amasya_cherry_disease_associated_mycovirus 1 PRJNA15010
+s__Polaromonas_sp_CF318 1 GCF_000282655
+s__Ruminococcus_torques 2 GCF_000153925 GCF_000210035
+s__Treponema_socranskii 3 GCF_000413015 GCF_000468115 GCF_000464455
+s__Mycobacterium_phage_Che9c 1 PRJNA14271
+s__Mycobacterium_phage_Che9d 1 PRJNA14339
+s__Coccidioides_immitis 1 GCA_000149335
+s__Staphylothermus_marinus 1 GCF_000015945
+s__Streptomyces_collinus 1 GCF_000444875
+s__Brevibacillus_borstelensis 1 GCF_000353565
+s__Peanut_witches_broom_phytoplasma 1 GCF_000364425
+s__Bacteroides_uniformis 4 GCF_000403175 GCF_000273785 GCF_000154205 GCF_000273275
+s__Cucurbit_chlorotic_yellows_virus 1 PRJNA170929
+s__Streptomyces_sp_AA4 1 GCF_000158875
+s__Cabbage_leaf_curl_virus 1 PRJNA14187
+s__Strawberry_mottle_virus 1 PRJNA14740
+s__Candidatus_Saccharimonas_aalborgensis 1 GCF_000392435
+s__Salmonella_phage_ST160 1 PRJNA61857
+s__Amycolatopsis_decaplanina 1 GCF_000342005
+s__Synechocystis_sp_PCC_6803 3 GCF_000340785 GCF_000270265 GCF_000009725
+s__Fangia_hongkongensis 1 GCF_000379445
+s__Cellulomonas_sp_JC225 1 GCF_000312005
+s__UR2_sarcoma_virus 1 PRJNA15322
+s__Anoxybacillus_kamchatkensis 1 GCF_000283415
+s__Tobacco_leaf_curl_Zimbabwe_virus 1 PRJNA14119
+s__Saccharum_streak_virus 1 PRJNA41611
+s__Cotton_leafroll_dwarf_virus 1 PRJNA53497
+s__Hymenobacter_norwichensis 1 GCF_000420705
+s__Starling_circovirus 1 PRJNA16796
+s__Simian_adenovirus_B 1 PRJNA64487
+s__Simian_adenovirus_C 1 PRJNA200956
+s__Simian_adenovirus_A 1 PRJNA14491
+s__Caladenia_virus_A 1 PRJNA174779
+s__Enterococcus_asini 2 GCF_000393955 GCF_000407365
+s__Lymphocystis_disease_virus_1 1 PRJNA14081
+s__Agrobacterium_vitis 1 GCF_000016285
+s__Yam_mild_mosaic_virus 1 PRJNA179432
+s__Staphylococcus_phage_K 1 PRJNA14479
+s__Paludibacter_propionicigenes 1 GCF_000183135
+s__Micromonas_sp_RCC1109_virus_MpV1 1 PRJNA61013
+s__Rosellinia_necatrix_megabirnavirus_1 1 PRJNA41609
+s__Cotton_leaf_curl_Gezira_betasatellite 1 PRJNA15166
+s__Fig_fleck_associated_virus 1 PRJNA64495
+s__Cauliflower_mosaic_virus 1 PRJNA14574
+s__Microbacterium_sp_TS_1 1 GCF_000509385
+s__Nupapillomavirus_1 1 PRJNA15485
+s__Helicobacter_acinonychis 1 GCF_000009305
+s__Erythrobacter_sp_NAP1 1 GCF_000152865
+s__Vaccinia_virus 1 PRJNA15241
+s__Murine_norovirus 1 PRJNA17577
+s__Kennedya_yellow_mosaic_virus 1 PRJNA14644
+s__Chromobacterium_sp_C_61 1 GCF_000285415
+s__Blautia_hydrogenotrophica 1 GCF_000157975
+s__Methylophilus_sp_1 1 GCF_000374225
+s__Streptomyces_sp_CNS615 1 GCF_000365385
+s__Potato_yellow_mosaic_virus 1 PRJNA14065
+s__Acinetobacter_sp_CIP_53_82 1 GCF_000369465
+s__Pantoea_sp_GM01 1 GCF_000282675
+s__Chaetomium_globosum 1 GCA_000143365
+s__Brucella_sp_F5_06 1 GCF_000370985
+s__Geobacillus_sp_A8 1 GCF_000447395
+s__Synechococcus_phage_S_SM1 1 PRJNA64701
+s__Synechococcus_phage_S_SM2 1 PRJNA64695
+s__Salmonella_phage_Vi_II_E1 1 PRJNA29079
+s__Spirochaeta_bajacaliforniensis 1 GCF_000378205
+s__Sphingobium_japonicum 1 GCF_000091125
+s__Clostridium_hiranonis 1 GCF_000156055
+s__Streptomyces_phage_mu1_6 1 PRJNA16706
+s__East_African_cassava_mosaic_Malawi_virus 1 PRJNA226083
+s__Marinimicrobia_bacterium_JGI_0000059_L03 1 GCF_000365325
+s__Chickpea_chlorotic_dwarf_virus 2 PRJNA28581 PRJNA30715
+s__Methanocaldococcus_vulcanius 1 GCF_000024625
+s__Burkholderiales_bacterium_1_1_47 1 GCF_000144975
+s__Roseburia_hominis 1 GCF_000225345
+s__Rhopapillomavirus_1 1 PRJNA14545
+s__Actinoplanes_friuliensis 1 GCF_000494755
+s__Caldicellulosiruptor_hydrothermalis 1 GCF_000166355
+s__Shewanella_pealeana 1 GCF_000018285
+s__Nocardia_brasiliensis 1 GCF_000250675
+s__Paenibacillus_phage_PG1 1 PRJNA209208
+s__Rabbit_vesivirus 1 PRJNA18289
+s__Pelargonium_necrotic_spot_virus 1 PRJNA15214
+s__Tomato_leaf_curl_Mali_virus 1 PRJNA14349
+s__Coprobacillus_sp_D6 1 GCF_000269565
+s__Coprobacillus_sp_D7 1 GCF_000158555
+s__Tomato_leaf_curl_Palampur_virus 1 PRJNA30181
+s__Barley_yellow_dwarf_virus 1 PRJNA208540
+s__Sphingomonas_phage_PAU 1 PRJNA181225
+s__Feline_immunodeficiency_virus 1 PRJNA15029
+s__Bacteroides_pectinophilus 1 GCF_000155855
+s__Halorubrum_aidingense 1 GCF_000336995
+s__Corchorus_yellow_vein_virus 1 PRJNA14563
+s__Leptotrichia_hofstadii 1 GCF_000162955
+s__Rhodococcus_ruber 2 GCF_000341965 GCF_000347955
+s__Halastavi_arva_RNA_virus 1 PRJNA77939
+s__Escherichia_hermannii 1 GCF_000248015
+s__Barfin_flounder_virus_BF93Hok 1 PRJNA30741
+s__Amycolatopsis_alba 1 GCF_000384215
+s__Candidatus_Midichloria_mitochondrii 1 GCF_000219355
+s__Lawsonia_intracellularis 2 GCF_000331715 GCF_000055945
+s__Mycobacterium_xenopi 1 GCF_000257745
+s__Bartonella_quintana 2 GCF_000294715 GCF_000046685
+s__Pseudomonas_phage_PA11 1 PRJNA16386
+s__Cupriavidus_necator 2 GCF_000009285 GCF_000219215
+s__Desulfovibrio_vulgaris 4 GCF_000166115 GCF_000021385 GCF_000195755 GCF_000015485
+s__actinobacterium_SCGC_AAA278_I18 1 GCF_000378865
+s__Cronobacter_sakazakii 7 GCF_000263215 GCF_000319615 GCF_000017665 GCF_000339015 GCF_000319595 GCF_000214745 GCF_000316155
+s__Bat_coronavirus_1B 1 PRJNA29249
+s__Pseudogulbenkiania_ferrooxidans 2 GCF_000174355 GCF_000462205
+s__Bovine_respiratory_coronavirus_bovine_US_OH_440_TC_1996 1 PRJNA39333
+s__Bifidobacterium_catenulatum 1 GCF_000173455
+s__Melanoplus_sanguinipes_entomopoxvirus 1 PRJNA14042
+s__Borrelia_duttonii 1 GCF_000019685
+s__Bizionia_argentinensis 1 GCF_000224335
+s__Tomato_leaf_curl_Guangxi_virus 1 PRJNA17607
+s__Cronobacter_condimenti 1 GCF_000319285
+s__Paenibacillus_mucilaginosus 3 GCF_000250655 GCF_000258535 GCF_000218915
+s__Alishewanella_jeotgali 1 GCF_000245735
+s__Hyphomicrobium_denitrificans 2 GCF_000230975 GCF_000143145
+s__Ophiostoma_mitovirus_6 1 PRJNA14844
+s__Ophiostoma_mitovirus_5 1 PRJNA14843
+s__Mossman_virus 1 PRJNA14915
+s__Gossypium_mustilinum_symptomless_alphasatellite 1 PRJNA39591
+s__Leptospira_noguchii 9 GCF_000244775 GCF_000216255 GCF_000350585 GCF_000350605 GCF_000346655 GCF_000306195 GCF_000243535 GCF_000243575 GCF_000306255
+s__Pelargonium_vein_banding_virus 1 PRJNA40631
+s__Eubacterium_eligens 1 GCF_000146185
+s__Botrytis_virus_X 1 PRJNA14947
+s__Botrytis_virus_F 1 PRJNA14707
+s__Legionella_anisa 1 GCF_000333755
+s__Clostridium_sp_7_3_54FAA 1 GCF_000233515
+s__Xanthomonas_fragariae 1 GCF_000376745
+s__Hepatitis_GB_virus_B 1 PRJNA15364
+s__Archaeal_BJ1_virus 1 PRJNA18503
+s__Citrus_sudden_death_associated_virus 1 PRJNA15170
+s__Brome_mosaic_virus 1 PRJNA15052
+s__Corynebacterium_sp_KPL1814 1 GCF_000478175
+s__Corynebacterium_sp_KPL1817 1 GCF_000478155
+s__Methanococcus_vannielii 1 GCF_000017165
+s__Clostridium_phage_phiCPV4 1 PRJNA169231
+s__Emticicia_oligotrophica 1 GCF_000263195
+s__Corynebacterium_sp_KPL1818 1 GCF_000478135
+s__Sweet_potato_chlorotic_stunt_virus 1 PRJNA14848
+s__Cryptobacterium_curtum 1 GCF_000023845
+s__Cell_fusing_agent_virus 1 PRJNA15326
+s__Salmonella_phage_SPN3UB 1 PRJNA181984
+s__Dolichos_yellow_mosaic_virus 1 PRJNA14344
+s__Eubacterium_sp_AS15 1 GCF_000287695
+s__Haloferax_alexandrinus 1 GCF_000336735
+s__Pseudomonas_taiwanensis 1 GCF_000500605
+s__Ruegeria_mobilis 1 GCF_000376545
+s__Aeromonas_media 1 GCF_000287215
+s__Saccharomonospora_saliphila 1 GCF_000383795
+s__Ajellomyces_dermatitidis 1 GCA_000003855
+s__Streptomyces_bingchenggensis 1 GCF_000092385
+s__Thalassomonas_phage_BA3 1 PRJNA27903
+s__Swinepox_virus 1 PRJNA14155
+s__Micromonospora_aurantiaca 1 GCF_000145235
+s__Bartonella_rattaustraliani 1 GCF_000312565
+s__Bombyx_mori_nucleopolyhedrovirus 2 PRJNA14089 PRJNA37971
+s__Rosellinia_necatrix_virus_1 1 PRJNA16156
+s__Desulfotomaculum_gibsoniae 1 GCF_000233715
+s__Salmonella_phage_FSL_SP_088 1 PRJNA212711
+s__Uncinocarpus_reesii 1 GCA_000003515
+s__Porcine_adenovirus_C 2 PRJNA14521 PRJNA40317
+s__Streptococcus_pyogenes 45 GCF_000499145 GCF_000483505 GCF_000013545 GCF_000012165 GCF_000013525 GCF_000011665 GCF_000499265 GCF_000013485 GCF_000483605 GCF_000263315 GCF_000454125 GCF_000250925 GCF_000018125 GCF_000483585 GCF_000275625 GCF_000007285 GCF_000011285 GCF_000290575 GCF_000444035 GCF_000483565 GCF_000483525 GCF_000011765 GCF_000006785 GCF_000230295 GCF_000444015 GCF_000290595 GCF_000468795 GCF_000250905 GCF_000483645 GCF_000483625 GCF_000422045 GCF_000499245 GCF_000499165 G [...]
+s__Rhizobium_sp_42MFCr_1 1 GCF_000377185
+s__Serratia_marcescens 5 GCF_000465615 GCF_000330865 GCF_000292365 GCF_000264275 GCF_000342205
+s__Porcine_bocavirus_3 1 PRJNA73547
+s__Okra_leaf_curl_India_virus 1 PRJNA61559
+s__Tomato_mosaic_virus 1 PRJNA14926
+s__Paenibacillus_vortex 1 GCF_000193415
+s__Halomonas_elongata 1 GCF_000196875
+s__Temperate_phage_phiNIH1_1 1 PRJNA14145
+s__Mobala_virus 1 PRJNA16582
+s__Eubacterium_ramulus 1 GCF_000469345
+s__Canna_yellow_streak_virus 1 PRJNA40629
+s__Streptomyces_griseoflavus 1 GCF_000158975
+s__Klebsiella_sp_1_1_55 1 GCF_000163075
+s__Snakehead_retrovirus 1 PRJNA14701
+s__Streptococcus_phage_SM1 1 PRJNA14295
+s__Neisseria_elongata 1 GCF_000176755
+s__Chaetoceros_lorenzianus_DNA_Virus 1 PRJNA63565
+s__Spirulina_subsalsa 1 GCF_000314005
+s__Thermobispora_bispora 1 GCF_000092645
+s__Salisaeta_longa 1 GCF_000419585
+s__Simian_sapelovirus 1 PRJNA14946
+s__Thetapapillomavirus_1 1 PRJNA14195
+s__Marinobacter_nanhaiticus 1 GCF_000364845
+s__Pseudomonas_phage_DMS3 1 PRJNA18521
+s__Aeromonas_phage_65 1 PRJNA64543
+s__Capnocytophaga_sp_CM59 1 GCF_000293175
+s__Marinobacter_hydrocarbonoclasticus 2 GCF_000015365 GCF_000284615
+s__Lactobacillus_rossiae 1 GCF_000277855
+s__Lodderomyces_elongisporus 1 GCA_000149685
+s__Dickeya_solani 1 GCF_000400565
+s__Myxococcus_phage_Mx8 1 PRJNA14391
+s__Soybean_dwarf_virus 1 PRJNA14715
+s__Streptococcus_mutans 145 GCF_000339395 GCF_000091645 GCF_000229045 GCF_000007465 GCF_000229645 GCF_000339835 GCF_000229745 GCF_000229465 GCF_000229025 GCF_000339935 GCF_000228865 GCF_000229325 GCF_000284575 GCF_000339795 GCF_000339195 GCF_000230025 GCF_000229165 GCF_000339355 GCF_000230045 GCF_000339155 GCF_000339695 GCF_000340055 GCF_000229905 GCF_000347795 GCF_000339475 GCF_000229725 GCF_000339575 GCF_000229425 GCF_000230005 GCF_000347835 GCF_000339815 GCF_000496535 GCF_000229065 GC [...]
+s__Streptococcus_phage_SMP 1 PRJNA18529
+s__Pseudomonas_phage_D3 1 PRJNA14500
+s__Actinomyces_sp_ph3 1 GCF_000308055
+s__Gordonia_sihwensis 1 GCF_000333035
+s__Serratia_sp_ATCC_39006 1 GCF_000463345
+s__Bacillus_sp_37MA 1 GCF_000372765
+s__Mycobacterium_phage_Phrux 1 PRJNA206029
+s__Candidatus_Regiella_insecticola 2 GCF_000284655 GCF_000143625
+s__Myzus_persicae_densovirus 1 PRJNA14299
+s__Weissella_paramesenteroides 1 GCF_000160575
+s__Deinococcus_apachensis 1 GCF_000381345
+s__Sporichthya_polymorpha 1 GCF_000384115
+s__Borrelia_hermsii 1 GCF_000012065
+s__Nocardiopsis_halotolerans 1 GCF_000341065
+s__Enterobacteria_phage_mEpX2 1 PRJNA183150
+s__Enterobacteria_phage_mEpX1 1 PRJNA183149
+s__Candidatus_Arthromitus_sp_SFB_co 1 GCF_000252765
+s__Aeromonas_phage_phiO18P 1 PRJNA19769
+s__Selenomonas_sp_CM52 1 GCF_000292955
+s__Streptococcus_phage_phi3396 1 PRJNA18859
+s__Penicillium_chrysogenum 1 GCA_000226395
+s__Rickettsia_prowazekii 12 GCF_000277265 GCF_000367405 GCF_000277225 GCF_000277245 GCF_000277165 GCF_000363905 GCF_000385495 GCF_000277185 GCF_000277205 GCF_000385475 GCF_000195735 GCF_000022785
+s__Loktanella_vestfoldensis 2 GCF_000152785 GCF_000382265
+s__Rhodonellum_psychrophilum 2 GCF_000473765 GCF_000381545
+s__Bacillus_phage_B4 1 PRJNA177520
+s__Lactobacillus_equicursoris 1 GCF_000312645
+s__Candidatus_Odyssella_thessalonicensis 1 GCF_000190415
+s__Streptococcus_ictaluri 1 GCF_000188015
+s__Omikronpapillomavirus_1 1 PRJNA15186
+s__Neisseria_polysaccharea 1 GCF_000176735
+s__Vibrio_phage_SIO_2 1 PRJNA80921
+s__Cherry_virus_A 1 PRJNA15080
+s__Raptor_adenovirus_A 1 PRJNA66343
+s__Chandipura_virus 1 PRJNA194137
+s__Archaeoglobus_sulfaticallidus 1 GCF_000385565
+s__Rhodobacter_phage_RcapNL 1 PRJNA192926
+s__Phlox_Virus_B 1 PRJNA27905
+s__Staphylococcus_phage_JD007 1 PRJNA183162
+s__Segniliparus_rugosus 1 GCF_000185725
+s__Tomato_yellow_leaf_curl_China_betasatellite 1 PRJNA181248
+s__Hop_stunt_viroid 1 PRJNA14720
+s__Malvastrum_yellow_mosaic_Cameroon_alphasatellite 1 PRJNA61909
+s__Cronobacter_turicensis 2 GCF_000319515 GCF_000027065
+s__Rickettsia_rhipicephali 1 GCF_000284075
+s__Leptospira_weilii 9 GCF_000243595 GCF_000217475 GCF_000332415 GCF_000246655 GCF_000243995 GCF_000244355 GCF_000244815 GCF_000246635 GCF_000216315
+s__Acetobacterium_woodii 1 GCF_000247605
+s__Lactobacillus_gastricus 1 GCF_000247775
+s__Saccharomonospora_viridis 1 GCF_000023865
+s__Campylobacter_gracilis 1 GCF_000175875
+s__Common_moorhen_coronavirus_HKU21 1 PRJNA109281
+s__Synechococcus_sp_WH_8102 1 GCF_000195975
+s__Mycoreovirus_3 1 PRJNA16143
+s__Methanoplanus_petrolearius 1 GCF_000147875
+s__Strawberry_vein_banding_virus 1 PRJNA15207
+s__Synechococcus_sp_WH_8109 1 GCF_000161795
+s__Escherichia_phage_phAPEC8 1 PRJNA185315
+s__Desulfotomaculum_kuznetsovii 1 GCF_000214705
+s__Dehalogenimonas_lykanthroporepellens 1 GCF_000143165
+s__Thermococcus_sp_CL1 1 GCF_000265525
+s__Thermaerobacter_subterraneus 1 GCF_000183545
+s__Enterococcus_dispar 2 GCF_000406945 GCF_000407585
+s__Synechococcus_sp_RS9917 1 GCF_000153065
+s__Synechococcus_sp_RS9916 1 GCF_000153825
+s__gamma_proteobacterium_HTCC2207 1 GCF_000153445
+s__Nitrosomonas_sp_AL212 1 GCF_000175095
+s__Sunflower_mild_mosaic_virus 1 PRJNA198478
+s__Lachnospiraceae_bacterium_3_1_46FAA 1 GCF_000209405
+s__Williamsia_sp_D3 1 GCF_000506245
+s__Bacillus_macauensis 1 GCF_000269865
+s__Thioalkalivibrio_sp_ALRh 1 GCF_000381425
+s__Hyperthermus_butylicus 1 GCF_000015145
+s__Thermodesulfobacterium_thermophilum 1 GCF_000421605
+s__Bilophila_sp_4_1_30 1 GCF_000224655
+s__Tannerella_sp_6_1_58FAA_CT1 1 GCF_000238695
+s__African_horse_sickness_virus 1 PRJNA14937
+s__Reticuloendotheliosis_virus 1 PRJNA15145
+s__Glaciecola_polaris 1 GCF_000315055
+s__Porcine_reproductive_and_respiratory_syndrome_virus 1 PRJNA15437
+s__Johnsongrass_mosaic_virus 1 PRJNA15349
+s__Haloterrigena_salina 1 GCF_000337495
+s__Granulicella_mallensis 1 GCF_000178955
+s__Chikungunya_virus 1 PRJNA14998
+s__Porcine_rubulavirus 1 PRJNA20055
+s__Burkholderia_sp_KJ006 1 GCF_000262695
+s__Tanapox_virus 2 PRJNA14595 PRJNA20981
+s__SAR324_cluster_bacterium_SCGC_AB_629_O05 1 GCF_000375805
+s__Leuconostoc_mesenteroides 4 GCF_000160595 GCF_000447945 GCF_000234825 GCF_000014445
+s__Selenomonas_sp_oral_taxon_138 1 GCF_000318175
+s__Balneola_vulgaris 1 GCF_000375465
+s__Selenomonas_sp_oral_taxon_137 1 GCF_000183625
+s__Eremococcus_coleocola 1 GCF_000183205
+s__Thauera_selenatis 1 GCF_000284915
+s__Cryptococcus_neoformans 2 GCA_000091045 GCA_000149385
+s__Neisseria_weaveri 2 GCF_000224255 GCF_000224275
+s__Roseiflexus_castenholzii 1 GCF_000017805
+s__Streptococcus_sp_M143 1 GCF_000162495
+s__Streptomyces_phage_Sujidade 1 PRJNA206036
+s__Erythrobacter_sp_SD_21 1 GCF_000181515
+s__Ammonifex_degensii 1 GCF_000024605
+s__Plasmodium_chabaudi 1 GCA_000003075
+s__Staphylococcus_pseudintermedius 3 GCF_000189495 GCF_000390045 GCF_000185885
+s__Leptothrix_ochracea 1 GCF_000262525
+s__Pepper_mottle_virus 1 PRJNA15312
+s__Gordonia_bronchialis 1 GCF_000024785
+s__Campylobacter_sp_03_427 1 GCF_000495505
+s__Allobaculum_stercoricanis 1 GCF_000384195
+s__Mycoplasma_leachii 2 GCF_000183365 GCF_000253095
+s__Beet_curly_top_virus 1 PRJNA14366
+s__Lamprocystis_purpurea 1 GCF_000379525
+s__Moorella_thermoacetica 1 GCF_000013105
+s__Opitutus_terrae 1 GCF_000019965
+s__Candidatus_Koribacter_versatilis 1 GCF_000014005
+s__Rangifer_tarandus_papillomavirus_2 1 PRJNA214364
+s__Pseudoplusia_includens_densovirus 1 PRJNA181249
+s__Malvastrum_leaf_curl_betasatellite 2 PRJNA16301 PRJNA16320
+s__Staphylococcus_sp_MDS7B 1 GCF_000387985
+s__Clostridium_sporosphaeroides 1 GCF_000383295
+s__Methanocorpusculum_labreanum 1 GCF_000015765
+s__Prevotella_bergensis 1 GCF_000160535
+s__Nocardiopsis_kunsanensis 1 GCF_000340965
+s__Peanut_mottle_virus 1 PRJNA15352
+s__Taterapox_virus 1 PRJNA17483
+s__Scotophilus_bat_coronavirus_512 1 PRJNA20135
+s__Arthrobacter_arilaitensis 1 GCF_000197735
+s__Thioalkalivibrio_sp_ALM2T 1 GCF_000381505
+s__Hyphomonas_neptunium 1 GCF_000013025
+s__Yam_bean_mosaic_virus 1 PRJNA78927
+s__Acidovorax_sp_KKS102 1 GCF_000302535
+s__Melon_necrotic_spot_virus 1 PRJNA15502
+s__Sweet_potato_leaf_curl_Shanghai_virus 1 PRJNA217878
+s__Streptococcus_phage_SP_QS1 1 PRJNA213015
+s__Propionibacterium_sp_KPL1838 1 GCF_000477735
+s__Mahella_australiensis 1 GCF_000213255
+s__Pectobacterium_phage_My1 1 PRJNA177525
+s__Spring_beauty_latent_virus 1 PRJNA15009
+s__Listeria_phage_LP_110 1 PRJNA212944
+s__Betacoronavirus_Erinaceus_VMC_DEU_2012 1 PRJNA226084
+s__Rheinheimera_perlucida 1 GCF_000382165
+s__Faba_bean_necrotic_yellows_virus 1 PRJNA14427
+s__Pumpkin_yellow_mosaic_virus 1 PRJNA30159
+s__Bacillus_sp_105MF 1 GCF_000374885
+s__Nitratireductor_pacificus 1 GCF_000300335
+s__Staphylococcus_phage_tp310_2 1 PRJNA20661
+s__Staphylococcus_phage_tp310_3 1 PRJNA20663
+s__Streptomyces_purpureus 1 GCF_000384175
+s__Comamonas_sp_B_9 1 GCF_000410635
+s__Tomato_planta_macho_viroid 1 PRJNA15000
+s__Cyanothece_sp_CCY0110 1 GCF_000169335
+s__Parabacteroides_johnsonii 2 GCF_000156495 GCF_000307375
+s__Borrelia_crocidurae 1 GCF_000259345
+s__Eggerthella_sp_HGA1 1 GCF_000191845
+s__Xenorhabdus_bovienii 1 GCF_000027225
+s__Simonsiella_muelleri 1 GCF_000163775
+s__Marinimicrobia_bacterium_SCGC_AAA160_C11 1 GCF_000402795
+s__Candidatus_Hamiltonella_defensa 1 GCF_000021705
+s__Luna_virus 1 PRJNA76617
+s__Erwinia_phage_vB_EamP_S6 1 PRJNA181230
+s__Treponema_maltophilum 1 GCF_000413055
+s__Oenococcus_kitaharae 1 GCF_000241055
+s__Velvet_tobacco_mottle_virus_Satellite_RNA 1 PRJNA14194
+s__Ralstonia_phage_RSS20 1 PRJNA213020
+s__Eubacteriaceae_bacterium_CM5 1 GCF_000238135
+s__Erwinia_phage_phiEa100 1 PRJNA184154
+s__Eubacteriaceae_bacterium_CM2 1 GCF_000238095
+s__Planococcus_donghaensis 1 GCF_000189395
+s__Mycobacterium_phage_Fruitloop 1 PRJNA32013
+s__Treponema_caldaria 1 GCF_000219725
+s__Lactobacillus_phage_A2 1 PRJNA14602
+s__Banana_streak_IM_virus 1 PRJNA66619
+s__Epiphyas_postvittana_nucleopolyhedrovirus 1 PRJNA14127
+s__Campylobacter_phage_CP21 1 PRJNA181241
+s__Aeromonas_phage_Aes508 1 PRJNA181986
+s__Solobacterium_moorei 1 GCF_000186945
+s__Clostridium_difficile 209 GCF_000452345 GCF_000449885 GCF_000085225 GCF_000448825 GCF_000448925 GCF_000154685 GCF_000450885 GCF_000451905 GCF_000451965 GCF_000450945 GCF_000449325 GCF_000027105 GCF_000451865 GCF_000451885 GCF_000449025 GCF_000448765 GCF_000450865 GCF_000450645 GCF_000450625 GCF_000450385 GCF_000155065 GCF_000164175 GCF_000451385 GCF_000450145 GCF_000449625 GCF_000450405 GCF_000449565 GCF_000451765 GCF_000009205 GCF_000449465 GCF_000450425 GCF_000451485 GCF_000473585 G [...]
+s__Pepper_yellow_dwarf_virus_New_Mexico 1 PRJNA31127
+s__Tomato_leaf_curl_Karnataka_virus 1 PRJNA14192
+s__Aminobacterium_colombiense 1 GCF_000025885
+s__Herbaspirillum_sp_JC206 1 GCF_000312045
+s__Hydrogenobaculum_sp_SN 1 GCF_000348765
+s__Deftia_phage_phiW_14 1 PRJNA42945
+s__Xenococcus_sp_PCC_7305 1 GCF_000332055
+s__Marinimicrobia_bacterium_SCGC_AAA298_D23 1 GCF_000402655
+s__Xanthomonas_vasicola 6 GCF_000277995 GCF_000278035 GCF_000278075 GCF_000278055 GCF_000278015 GCF_000159795
+s__Polaribacter_irgensii 1 GCF_000153225
+s__Succinimonas_amylolytica 1 GCF_000378405
+s__Spodoptera_exigua_multiple_nucleopolyhedrovirus 1 PRJNA14134
+s__Clostridium_ultunense 1 GCF_000344075
+s__Furcraea_necrotic_streak_virus 1 PRJNA192610
+s__Natrialba_hulunbeirensis 1 GCF_000337575
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/metaphlan2.git