[med-svn] [Git][med-team/metaphlan2][upstream] New upstream version 3.1.0

Andreas Tille (@tille) gitlab at salsa.debian.org
Tue Aug 2 14:10:32 BST 2022



Andreas Tille pushed to branch upstream at Debian Med / metaphlan2


Commits:
94e55251 by Andreas Tille at 2022-08-02T15:07:25+02:00
New upstream version 3.1.0
- - - - -


15 changed files:

- README.md
- bioconda_recipe/meta.yaml
- changeset.txt
- metaphlan/metaphlan.py
- metaphlan/strainphlan.py
- metaphlan/utils/add_metadata_tree.py
- metaphlan/utils/external_exec.py
- metaphlan/utils/extract_markers.py
- metaphlan/utils/metaphlan2krona.py
- metaphlan/utils/parallelisation.py
- metaphlan/utils/plot_tree_graphlan.py
- metaphlan/utils/sample2markers.py
- metaphlan/utils/strain_transmission.py
- metaphlan/utils/util_fun.py
- setup.py


Changes:

=====================================
README.md
=====================================
@@ -1,20 +1,16 @@
 # MetaPhlAn: Metagenomic Phylogenetic Analysis
 [![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/metaphlan/README.html) [![PyPI - Downloads](https://img.shields.io/pypi/dm/metaphlan?label=MetaPhlAn%20on%20PyPi)](https://pypi.org/project/MetaPhlAn/) [![MetaPhlAn on DockerHub](https://img.shields.io/docker/pulls/biobakery/metaphlan?label=MetaPhlAn%20on%20DockerHub)](https://hub.docker.com/r/biobakery/metaphlan) [![Build MetaPhlAn package](https://github.com/biobakery/MetaPhlAn/workflows/Build%20MetaPhlAn%20package/badge.svg?branch=3.0)](https://github.com/biobakery/MetaPhlAn/actions?query=workflow%3A%22Build+MetaPhlAn+package%22)
-## What's new in version 3
-* New MetaPhlAn marker genes extracted with a newer version of ChocoPhlAn based on UniRef
-* Estimation of metagenome composed by unknown microbes with parameter `--unknown_estimation`
-* Automatic retrieval and installation of the latest MetaPhlAn database  with parameter `--index latest`
-* Virus profiling with `--add_viruses`
-* Calculation of metagenome size for improved estimation of reads mapped to a given clade
-* Inclusion of NCBI taxonomy ID in the ouput file
-* CAMI (Taxonomic) Profiling Output Format included
-* Removal of reads with low MAPQ values
+## What's new in version 3.1
+* 433 low-quality species were removed from the MetaPhlAn 3.1 marker database and 2,680 species were added (for a new total of 15,766; a 17% increase).
+* Marker genes for a subset of existing bioBakery 3 species were also revised.
+* Most existing bioBakery 3 species pangenomes were updated with revised or expanded gene content.
+* MetaPhlAn 3.1 software has been updated to work with revised marker database.
 -------------
 
 ## Description
 MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.
 
-MetaPhlAn relies on ~1.1M unique clade-specific marker genes (the latest marker information file `mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2` can be found  [here](https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAAlyQITZuUCtBUJxpxhIroIa/mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2?dl=1)) identified from ~100,000 reference genomes (~99,500 bacterial and archaeal and ~500 eukaryotic), allowing:
+MetaPhlAn relies on ~1.1M unique clade-specific marker genes (the latest marker information file `mpa_v31_CHOCOPhlAn_201901_marker_info.txt.bz2` can be found  [here](http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v31_CHOCOPhlAn_201901_marker_info.txt.bz2)) identified from ~100,000 reference genomes (~99,500 bacterial and archaeal and ~500 eukaryotic), allowing:
 
 * unambiguous taxonomic assignments;
 * accurate estimation of organismal relative abundance;


=====================================
bioconda_recipe/meta.yaml
=====================================
@@ -1,5 +1,5 @@
 {% set name = "metaphlan" %}
-{% set version = "3.0" %}
+{% set version = "3.1" %}
 
 package:
   name: {{ name }}
@@ -40,6 +40,14 @@ requirements:
     - pysam
     - raxml >=8.2.10
     - samtools >=1.9
+    - r-base
+    - r-essentials
+    - r-optparse
+    - r-rbiom
+    - r-ape
+    - r-compositions
+    - r-biocmanager
+    - bioconductor-microbiome
 
 test:
   commands:


=====================================
changeset.txt
=====================================
@@ -1,3 +1,9 @@
+=== Version 3.1
+* 433 low-quality species were removed from the MetaPhlAn 3.1 marker database and 2,680 species were added (for a new total of 15,766; a 17% increase).
+* Marker genes for a subset of existing bioBakery 3 species were also revised.
+* Most existing bioBakery 3 species pangenomes were updated with revised or expanded gene content.
+* MetaPhlAn 3.1 software has been updated to work with revised marker database.
+
 === Version 3.0
 * New MetaPhlAn marker genes extracted with a newer version of ChocoPhlAn based on UniRef
 * Estimation of metagenome composed by unknown microbes with parameter `--unknown_estimation`


=====================================
metaphlan/metaphlan.py
=====================================
@@ -4,8 +4,8 @@ __author__ = ('Francesco Beghini (francesco.beghini at unitn.it),'
               'Duy Tin Truong, '
               'Francesco Asnicar (f.asnicar at unitn.it), '
               'Aitor Blanco Miguez (aitor.blancomiguez at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
 
 import sys
 try:
@@ -1053,6 +1053,13 @@ def main():
                              .format(pars['bowtie2db']))
             sys.exit(1)
 
+        # check for an incomplete build
+        if bow and not (abs(os.path.getsize(".".join([str(pars['bowtie2db']), "1.bt2"])) - os.path.getsize(".".join([str(pars['bowtie2db']), "rev.1.bt2"]))) <= 1000):
+            sys.stderr.write("Partial MetaPhlAn BowTie2 database found at {}. "
+                             "Please remove and rebuild the database.\nExiting..."
+                             .format(pars['bowtie2db']))
+            sys.exit(1)
+
         if bow:
             run_bowtie2(pars['inp'], pars['bowtie2out'], pars['bowtie2db'],
                                 pars['bt2_ps'], pars['nproc'], file_format=pars['input_type'],
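
Note on the hunk above: the new check treats the BowTie2 index as partial when the sizes of the `.1.bt2` and `.rev.1.bt2` files differ by more than 1000 bytes. A minimal standalone sketch of that comparison follows. The helper name `looks_like_partial_build`, the `tolerance` parameter, and the toy file sizes are illustrative assumptions, not part of the upstream code.

```python
import os
import tempfile


def looks_like_partial_build(index_prefix, tolerance=1000):
    """Return True when the forward and reverse index files differ in size
    by more than `tolerance` bytes (hypothetical helper, not upstream API)."""
    forward = index_prefix + ".1.bt2"
    reverse = index_prefix + ".rev.1.bt2"
    size_gap = abs(os.path.getsize(forward) - os.path.getsize(reverse))
    return size_gap > tolerance


def write_dummy(path, n_bytes):
    """Create a placeholder file of n_bytes zero bytes (toy data only)."""
    with open(path, "wb") as handle:
        handle.write(b"\0" * n_bytes)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Toy index whose two files are nearly the same size.
    complete_prefix = os.path.join(tmp_dir, "complete_index")
    write_dummy(complete_prefix + ".1.bt2", 5000)
    write_dummy(complete_prefix + ".rev.1.bt2", 4900)
    complete_result = looks_like_partial_build(complete_prefix)

    # Toy index whose reverse file was cut short.
    partial_prefix = os.path.join(tmp_dir, "partial_index")
    write_dummy(partial_prefix + ".1.bt2", 5000)
    write_dummy(partial_prefix + ".rev.1.bt2", 100)
    partial_result = looks_like_partial_build(partial_prefix)

print(complete_result, partial_result)
```

The size comparison is a heuristic: it presumably catches a build that was interrupted before the reverse index was fully written, but it does not validate the index contents.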


=====================================
metaphlan/strainphlan.py
=====================================
@@ -4,8 +4,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
               'Francesco Asnicar (f.asnicar at unitn.it), '
               'Moreno Zolfo (moreno.zolfo at unitn.it), '
               'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
 
 
 import sys
@@ -29,7 +29,7 @@ from Bio.Seq import Seq
 metaphlan_script_install_folder = os.path.dirname(os.path.abspath(__file__))
 DEFAULT_DB_FOLDER = os.path.join(metaphlan_script_install_folder, "metaphlan_databases")
 DEFAULT_DB_FOLDER = os.environ.get('METAPHLAN_DB_DIR', DEFAULT_DB_FOLDER)
-DEFAULT_DB_NAME =  "mpa_v30_CHOCOPhlAn_201901.pkl"
+DEFAULT_DB_NAME =  "mpa_v31_CHOCOPhlAn_201901.pkl"
 DEFAULT_DATABASE = os.path.join(DEFAULT_DB_FOLDER, DEFAULT_DB_NAME)
 PHYLOPHLAN_MODES = ['accurate', 'fast']
 


=====================================
metaphlan/utils/add_metadata_tree.py
=====================================
@@ -1,8 +1,8 @@
 #!/usr/bin/env python
 __author__ = ('Duy Tin Truong (duytin.truong at unitn.it), '
               'Aitor Blanco Miguez (aitor.blancomiguez at unitn.it)')
-__version__ = '3.0'
-__date__    = '21 Feb 2020'
+__version__ = '3.1.0'
+__date__    = '25 Jul 2021'
 
 import argparse as ap
 import pandas
@@ -102,4 +102,4 @@ def main():
             ofile.write(line)
 
 if __name__ == '__main__':
-    main()
\ No newline at end of file
+    main()


=====================================
metaphlan/utils/external_exec.py
=====================================
@@ -3,10 +3,10 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
               'Francesco Asnicar (f.asnicar at unitn.it), '
               'Moreno Zolfo (moreno.zolfo at unitn.it), '
               'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0.8'
-__date__ = '7 May 2021'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
 
-import os, sys, re, shutil
+import os, sys, re, shutil, tempfile
 import subprocess as sb
 try:
     from .util_fun import info, error
@@ -245,7 +245,9 @@ Generates a FASTA file with the markers form a MetaPhlAn database
 :param output_dir: the output directory
 """
 def generate_markers_fasta(database, output_dir):
-    db_markers_faa = output_dir+"db_markers.fna"
+    
+    file_out, db_markers_faa = tempfile.mkstemp(dir=output_dir,prefix="db_markers_temp_",suffix=".fna")
+    os.close(file_out)
     bowtie_database, _ = os.path.splitext(database)
     params = {
         "program_name" : "bowtie2-inspect",


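Note on the hunk above: the fixed path `output_dir+"db_markers.fna"` is replaced by a uniquely named file created with `tempfile.mkstemp`. The sketch below, with a hypothetical helper `reserve_markers_fasta`, shows the likely effect under that assumption: two callers sharing one output directory receive different paths instead of overwriting each other's file.

```python
import os
import tempfile


def reserve_markers_fasta(output_dir):
    """Create an empty, uniquely named .fna file in output_dir and return
    its path (hypothetical helper mirroring the mkstemp call in the hunk)."""
    handle, path = tempfile.mkstemp(
        dir=output_dir, prefix="db_markers_temp_", suffix=".fna"
    )
    # mkstemp returns an open OS-level file descriptor; close it so that
    # another program can write to the path later.
    os.close(handle)
    return path


with tempfile.TemporaryDirectory() as out_dir:
    first = reserve_markers_fasta(out_dir)
    second = reserve_markers_fasta(out_dir)
    paths_differ = first != second
    both_exist = os.path.exists(first) and os.path.exists(second)
    names_ok = all(
        os.path.basename(p).startswith("db_markers_temp_") and p.endswith(".fna")
        for p in (first, second)
    )

print(paths_differ, both_exist, names_ok)
```

Because `mkstemp` builds the path with `os.path.join`, this form also no longer depends on `output_dir` ending in a path separator, which the old string concatenation required.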
=====================================
metaphlan/utils/extract_markers.py
=====================================
@@ -4,8 +4,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
               'Francesco Asnicar (f.asnicar at unitn.it), '
               'Moreno Zolfo (moreno.zolfo at unitn.it), '
               'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
 
 import sys
 try:
@@ -31,7 +31,7 @@ except ImportError:
 metaphlan_script_install_folder = os.path.dirname(os.path.abspath(__file__))
 DEFAULT_DB_FOLDER = os.path.join(metaphlan_script_install_folder, "../metaphlan_databases")
 DEFAULT_DB_FOLDER = os.environ.get('METAPHLAN_DB_DIR', DEFAULT_DB_FOLDER)
-DEFAULT_DB_NAME =  "mpa_v30_CHOCOPhlAn_201901.pkl"
+DEFAULT_DB_NAME =  "mpa_v31_CHOCOPhlAn_201901.pkl"
 DEFAULT_DATABASE = os.path.join(DEFAULT_DB_FOLDER, DEFAULT_DB_NAME)
 
 """


=====================================
metaphlan/utils/metaphlan2krona.py
=====================================
@@ -39,7 +39,7 @@ def main():
 
             x_cells = x.split('\t')
             lineage = '\t'.join(x_cells[0:(len(x_cells) -1)])
-            abundance = float(x_cells[-1].rstrip('\n')) 
+            abundance = float(x_cells[-2].rstrip('\n')) 
 
             metaPhLan_FH.write('%s\n'%(str(abundance) + '\t' + lineage))
 

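Note on the hunk above: the abundance is now read from the second-to-last tab-separated column rather than the last one. The sketch below assumes a four-column row (clade name, taxonomy ID, relative abundance, and one trailing column that may be empty). That layout and the sample row are assumptions made for illustration; the diff itself does not state the column layout.

```python
def abundance_from_row(row):
    """Parse the second-to-last tab-separated field as a float
    (hypothetical helper mirroring the x_cells[-2] change)."""
    cells = row.rstrip("\n").split("\t")
    return float(cells[-2])


# Made-up row with an empty trailing column.
sample_row = "k__Bacteria|p__Firmicutes\t2|1239\t42.5\t\n"
new_value = abundance_from_row(sample_row)

# Under the same assumed layout, reading the last column would fail,
# because the trailing field is not a number.
try:
    float(sample_row.rstrip("\n").split("\t")[-1])
    old_indexing_failed = False
except ValueError:
    old_indexing_failed = True

print(new_value, old_indexing_failed)
```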

=====================================
metaphlan/utils/parallelisation.py
=====================================
@@ -3,8 +3,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
               'Francesco Asnicar (f.asnicar at unitn.it), '
               'Moreno Zolfo (moreno.zolfo at unitn.it), '
               'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0'
-__date__ = '21 Feb 2020'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
 
 try:
     from .util_fun import error


=====================================
metaphlan/utils/plot_tree_graphlan.py
=====================================
@@ -1,8 +1,8 @@
 #!/usr/bin/env python
 __author__ = ('Duy Tin Truong (duytin.truong at unitn.it), '
               'Aitor Blanco Miguez (aitor.blancomiguez at unitn.it)')
-__version__ = '3.0'
-__date__    = '21 Feb 2020'
+__version__ = '3.1.0'
+__date__    = '25 Jul 2022'
 
 import argparse as ap
 import dendropy


=====================================
metaphlan/utils/sample2markers.py
=====================================
@@ -4,8 +4,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
               'Francesco Asnicar (f.asnicar at unitn.it), '
               'Moreno Zolfo (moreno.zolfo at unitn.it), '
               'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
 
 import sys
 try:


=====================================
metaphlan/utils/strain_transmission.py
=====================================
@@ -1,7 +1,7 @@
 __author__ = ('Aitor Blanco (aitor.blancomiguez at unitn.it), '
              'Mireia Valles-Colomer (mireia.vallescolomer at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
 
 import os, time, sys
 import argparse as ap


=====================================
metaphlan/utils/util_fun.py
=====================================
@@ -3,8 +3,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
               'Francesco Asnicar (f.asnicar at unitn.it), '
               'Moreno Zolfo (moreno.zolfo at unitn.it), '
               'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0'
-__date__ = '21 Feb 2020'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
 
 
 import os, sys, re, pickletools, pickle, time, bz2, gzip
@@ -134,4 +134,4 @@ def is_number(s):
         int(s)
         return True
     except ValueError:
-        return False
\ No newline at end of file
+        return False


=====================================
setup.py
=====================================
@@ -13,7 +13,7 @@ if sys.version_info[0] < 3:
 
 setuptools.setup(
     name='MetaPhlAn',
-    version='3.0.14',
+    version='3.1.0',
     author='Francesco Beghini',
     author_email='francesco.beghini at unitn.it',
     url='http://github.com/biobakery/MetaPhlAn/',



View it on GitLab: https://salsa.debian.org/med-team/metaphlan2/-/commit/94e552517e1a0c952346013af04990c154743b55
