[med-svn] [Git][med-team/metaphlan2][upstream] New upstream version 3.1.0
Andreas Tille (@tille)
gitlab at salsa.debian.org
Tue Aug 2 14:10:32 BST 2022
Andreas Tille pushed to branch upstream at Debian Med / metaphlan2
Commits:
94e55251 by Andreas Tille at 2022-08-02T15:07:25+02:00
New upstream version 3.1.0
- - - - -
15 changed files:
- README.md
- bioconda_recipe/meta.yaml
- changeset.txt
- metaphlan/metaphlan.py
- metaphlan/strainphlan.py
- metaphlan/utils/add_metadata_tree.py
- metaphlan/utils/external_exec.py
- metaphlan/utils/extract_markers.py
- metaphlan/utils/metaphlan2krona.py
- metaphlan/utils/parallelisation.py
- metaphlan/utils/plot_tree_graphlan.py
- metaphlan/utils/sample2markers.py
- metaphlan/utils/strain_transmission.py
- metaphlan/utils/util_fun.py
- setup.py
Changes:
=====================================
README.md
=====================================
@@ -1,20 +1,16 @@
# MetaPhlAn: Metagenomic Phylogenetic Analysis
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/metaphlan/README.html) [![PyPI - Downloads](https://img.shields.io/pypi/dm/metaphlan?label=MetaPhlAn%20on%20PyPi)](https://pypi.org/project/MetaPhlAn/) [![MetaPhlAn on DockerHub](https://img.shields.io/docker/pulls/biobakery/metaphlan?label=MetaPhlAn%20on%20DockerHub)](https://hub.docker.com/r/biobakery/metaphlan) [![Build MetaPhlAn package](https://github.com/biobakery/MetaPhlAn/workflows/Build%20MetaPhlAn%20package/badge.svg?branch=3.0)](https://github.com/biobakery/MetaPhlAn/actions?query=workflow%3A%22Build+MetaPhlAn+package%22)
-## What's new in version 3
-* New MetaPhlAn marker genes extracted with a newer version of ChocoPhlAn based on UniRef
-* Estimation of metagenome composed by unknown microbes with parameter `--unknown_estimation`
-* Automatic retrieval and installation of the latest MetaPhlAn database with parameter `--index latest`
-* Virus profiling with `--add_viruses`
-* Calculation of metagenome size for improved estimation of reads mapped to a given clade
-* Inclusion of NCBI taxonomy ID in the ouput file
-* CAMI (Taxonomic) Profiling Output Format included
-* Removal of reads with low MAPQ values
+## What's new in version 3.1
+* 433 low-quality species were removed from the MetaPhlAn 3.1 marker database and 2,680 species were added (for a new total of 15,766; a 17% increase).
+* Marker genes for a subset of existing bioBakery 3 species were also revised.
+* Most existing bioBakery 3 species pangenomes were updated with revised or expanded gene content.
+* MetaPhlAn 3.1 software has been updated to work with revised marker database.
-------------
## Description
MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.
-MetaPhlAn relies on ~1.1M unique clade-specific marker genes (the latest marker information file `mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2` can be found [here](https://www.dropbox.com/sh/7qze7m7g9fe2xjg/AAAlyQITZuUCtBUJxpxhIroIa/mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2?dl=1)) identified from ~100,000 reference genomes (~99,500 bacterial and archaeal and ~500 eukaryotic), allowing:
+MetaPhlAn relies on ~1.1M unique clade-specific marker genes (the latest marker information file `mpa_v31_CHOCOPhlAn_201901_marker_info.txt.bz2` can be found [here](http://cmprod1.cibio.unitn.it/biobakery3/metaphlan_databases/mpa_v31_CHOCOPhlAn_201901_marker_info.txt.bz2)) identified from ~100,000 reference genomes (~99,500 bacterial and archaeal and ~500 eukaryotic), allowing:
* unambiguous taxonomic assignments;
* accurate estimation of organismal relative abundance;
=====================================
bioconda_recipe/meta.yaml
=====================================
@@ -1,5 +1,5 @@
{% set name = "metaphlan" %}
-{% set version = "3.0" %}
+{% set version = "3.1" %}
package:
name: {{ name }}
@@ -40,6 +40,14 @@ requirements:
- pysam
- raxml >=8.2.10
- samtools >=1.9
+ - r-base
+ - r-essentials
+ - r-optparse
+ - r-rbiom
+ - r-ape
+ - r-compositions
+ - r-biocmanager
+ - bioconductor-microbiome
test:
commands:
=====================================
changeset.txt
=====================================
@@ -1,3 +1,9 @@
+=== Version 3.1
+* 433 low-quality species were removed from the MetaPhlAn 3.1 marker database and 2,680 species were added (for a new total of 15,766; a 17% increase).
+* Marker genes for a subset of existing bioBakery 3 species were also revised.
+* Most existing bioBakery 3 species pangenomes were updated with revised or expanded gene content.
+* MetaPhlAn 3.1 software has been updated to work with revised marker database.
+
=== Version 3.0
* New MetaPhlAn marker genes extracted with a newer version of ChocoPhlAn based on UniRef
* Estimation of metagenome composed by unknown microbes with parameter `--unknown_estimation`
=====================================
metaphlan/metaphlan.py
=====================================
@@ -4,8 +4,8 @@ __author__ = ('Francesco Beghini (francesco.beghini at unitn.it),'
'Duy Tin Truong, '
'Francesco Asnicar (f.asnicar at unitn.it), '
'Aitor Blanco Miguez (aitor.blancomiguez at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
import sys
try:
@@ -1053,6 +1053,13 @@ def main():
.format(pars['bowtie2db']))
sys.exit(1)
+ # check for an incomplete build
+ if bow and not (abs(os.path.getsize(".".join([str(pars['bowtie2db']), "1.bt2"])) - os.path.getsize(".".join([str(pars['bowtie2db']), "rev.1.bt2"]))) <= 1000):
+ sys.stderr.write("Partial MetaPhlAn BowTie2 database found at {}. "
+ "Please remove and rebuild the database.\nExiting..."
+ .format(pars['bowtie2db']))
+ sys.exit(1)
+
if bow:
run_bowtie2(pars['inp'], pars['bowtie2out'], pars['bowtie2db'],
pars['bt2_ps'], pars['nproc'], file_format=pars['input_type'],
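
Note on the hunk above: the new check treats a Bowtie2 database as partially built when the sizes of `<prefix>.1.bt2` and `<prefix>.rev.1.bt2` differ by more than 1000 bytes. A minimal standalone sketch of that heuristic follows. The function name, the `write_bytes` helper and the `demo_db` prefix are hypothetical, and, unlike the upstream code, the sketch reports a missing file as incomplete rather than letting `os.path.getsize` raise.

```python
import os
import tempfile


def bowtie2_index_looks_complete(prefix, tolerance=1000):
    """Heuristic: the forward and reverse index files should be of similar size."""
    forward = prefix + ".1.bt2"
    reverse = prefix + ".rev.1.bt2"
    # Assumption of this sketch: a missing file counts as incomplete.
    # The upstream hunk would raise OSError from os.path.getsize instead.
    if not (os.path.isfile(forward) and os.path.isfile(reverse)):
        return False
    size_gap = abs(os.path.getsize(forward) - os.path.getsize(reverse))
    return size_gap <= tolerance


def write_bytes(path, count):
    """Create a dummy file of `count` bytes (demo helper only)."""
    with open(path, "wb") as handle:
        handle.write(b"\0" * count)


with tempfile.TemporaryDirectory() as workdir:
    prefix = os.path.join(workdir, "demo_db")  # hypothetical database prefix
    write_bytes(prefix + ".1.bt2", 5000)
    write_bytes(prefix + ".rev.1.bt2", 5200)
    complete = bowtie2_index_looks_complete(prefix)   # sizes differ by 200 bytes
    write_bytes(prefix + ".rev.1.bt2", 100)
    truncated = bowtie2_index_looks_complete(prefix)  # sizes differ by 4900 bytes
```

The comparison is only a heuristic: it can catch an interrupted build that left one file truncated, but it does not validate the contents of either file.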
=====================================
metaphlan/strainphlan.py
=====================================
@@ -4,8 +4,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
'Francesco Asnicar (f.asnicar at unitn.it), '
'Moreno Zolfo (moreno.zolfo at unitn.it), '
'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
import sys
@@ -29,7 +29,7 @@ from Bio.Seq import Seq
metaphlan_script_install_folder = os.path.dirname(os.path.abspath(__file__))
DEFAULT_DB_FOLDER = os.path.join(metaphlan_script_install_folder, "metaphlan_databases")
DEFAULT_DB_FOLDER = os.environ.get('METAPHLAN_DB_DIR', DEFAULT_DB_FOLDER)
-DEFAULT_DB_NAME = "mpa_v30_CHOCOPhlAn_201901.pkl"
+DEFAULT_DB_NAME = "mpa_v31_CHOCOPhlAn_201901.pkl"
DEFAULT_DATABASE = os.path.join(DEFAULT_DB_FOLDER, DEFAULT_DB_NAME)
PHYLOPHLAN_MODES = ['accurate', 'fast']
=====================================
metaphlan/utils/add_metadata_tree.py
=====================================
@@ -1,8 +1,8 @@
#!/usr/bin/env python
__author__ = ('Duy Tin Truong (duytin.truong at unitn.it), '
'Aitor Blanco Miguez (aitor.blancomiguez at unitn.it)')
-__version__ = '3.0'
-__date__ = '21 Feb 2020'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2021'
import argparse as ap
import pandas
@@ -102,4 +102,4 @@ def main():
ofile.write(line)
if __name__ == '__main__':
- main()
\ No newline at end of file
+ main()
=====================================
metaphlan/utils/external_exec.py
=====================================
@@ -3,10 +3,10 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
'Francesco Asnicar (f.asnicar at unitn.it), '
'Moreno Zolfo (moreno.zolfo at unitn.it), '
'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0.8'
-__date__ = '7 May 2021'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
-import os, sys, re, shutil
+import os, sys, re, shutil, tempfile
import subprocess as sb
try:
from .util_fun import info, error
@@ -245,7 +245,9 @@ Generates a FASTA file with the markers form a MetaPhlAn database
:param output_dir: the output directory
"""
def generate_markers_fasta(database, output_dir):
- db_markers_faa = output_dir+"db_markers.fna"
+
+ file_out, db_markers_faa = tempfile.mkstemp(dir=output_dir,prefix="db_markers_temp_",suffix=".fna")
+ os.close(file_out)
bowtie_database, _ = os.path.splitext(database)
params = {
"program_name" : "bowtie2-inspect",
=====================================
metaphlan/utils/extract_markers.py
=====================================
@@ -4,8 +4,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
'Francesco Asnicar (f.asnicar at unitn.it), '
'Moreno Zolfo (moreno.zolfo at unitn.it), '
'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
import sys
try:
@@ -31,7 +31,7 @@ except ImportError:
metaphlan_script_install_folder = os.path.dirname(os.path.abspath(__file__))
DEFAULT_DB_FOLDER = os.path.join(metaphlan_script_install_folder, "../metaphlan_databases")
DEFAULT_DB_FOLDER = os.environ.get('METAPHLAN_DB_DIR', DEFAULT_DB_FOLDER)
-DEFAULT_DB_NAME = "mpa_v30_CHOCOPhlAn_201901.pkl"
+DEFAULT_DB_NAME = "mpa_v31_CHOCOPhlAn_201901.pkl"
DEFAULT_DATABASE = os.path.join(DEFAULT_DB_FOLDER, DEFAULT_DB_NAME)
"""
=====================================
metaphlan/utils/metaphlan2krona.py
=====================================
@@ -39,7 +39,7 @@ def main():
x_cells = x.split('\t')
lineage = '\t'.join(x_cells[0:(len(x_cells) -1)])
- abundance = float(x_cells[-1].rstrip('\n'))
+ abundance = float(x_cells[-2].rstrip('\n'))
metaPhLan_FH.write('%s\n'%(str(abundance) + '\t' + lineage))
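
Note on the hunk above: the commit does not state why the abundance is now read from the second-to-last field. One plausible reading, offered here as an assumption, is that MetaPhlAn 3 profiles carry a trailing `additional_species` column after `relative_abundance`, so the last field is no longer numeric. The sketch below uses a made-up row and a hypothetical `parse_abundance` helper to show the difference between the two indices.

```python
def parse_abundance(row, index):
    """Split a tab-separated profile row and parse the field at `index` as a float."""
    cells = row.split('\t')
    return float(cells[index].rstrip('\n'))


# Made-up row with an assumed layout:
# clade_name, NCBI_tax_id, relative_abundance, additional_species (empty here).
sample_row = "k__Bacteria|p__Firmicutes\t2|1239\t12.5\t\n"

# With the trailing column present, the abundance sits at index -2.
abundance = parse_abundance(sample_row, -2)

# Index -1 now points at the empty trailing field, which float() rejects.
try:
    parse_abundance(sample_row, -1)
    last_column_parses = True
except ValueError:
    last_column_parses = False
```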
=====================================
metaphlan/utils/parallelisation.py
=====================================
@@ -3,8 +3,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
'Francesco Asnicar (f.asnicar at unitn.it), '
'Moreno Zolfo (moreno.zolfo at unitn.it), '
'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0'
-__date__ = '21 Feb 2020'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
try:
from .util_fun import error
=====================================
metaphlan/utils/plot_tree_graphlan.py
=====================================
@@ -1,8 +1,8 @@
#!/usr/bin/env python
__author__ = ('Duy Tin Truong (duytin.truong at unitn.it), '
'Aitor Blanco Miguez (aitor.blancomiguez at unitn.it)')
-__version__ = '3.0'
-__date__ = '21 Feb 2020'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
import argparse as ap
import dendropy
=====================================
metaphlan/utils/sample2markers.py
=====================================
@@ -4,8 +4,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
'Francesco Asnicar (f.asnicar at unitn.it), '
'Moreno Zolfo (moreno.zolfo at unitn.it), '
'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
import sys
try:
=====================================
metaphlan/utils/strain_transmission.py
=====================================
@@ -1,7 +1,7 @@
__author__ = ('Aitor Blanco (aitor.blancomiguez at unitn.it), '
'Mireia Valles-Colomer (mireia.vallescolomer at unitn.it)')
-__version__ = '3.0.14'
-__date__ = '19 Jan 2022'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
import os, time, sys
import argparse as ap
=====================================
metaphlan/utils/util_fun.py
=====================================
@@ -3,8 +3,8 @@ __author__ = ('Aitor Blanco Miguez (aitor.blancomiguez at unitn.it), '
'Francesco Asnicar (f.asnicar at unitn.it), '
'Moreno Zolfo (moreno.zolfo at unitn.it), '
'Francesco Beghini (francesco.beghini at unitn.it)')
-__version__ = '3.0'
-__date__ = '21 Feb 2020'
+__version__ = '3.1.0'
+__date__ = '25 Jul 2022'
import os, sys, re, pickletools, pickle, time, bz2, gzip
@@ -134,4 +134,4 @@ def is_number(s):
int(s)
return True
except ValueError:
- return False
\ No newline at end of file
+ return False
=====================================
setup.py
=====================================
@@ -13,7 +13,7 @@ if sys.version_info[0] < 3:
setuptools.setup(
name='MetaPhlAn',
- version='3.0.14',
+ version='3.1.0',
author='Francesco Beghini',
author_email='francesco.beghini at unitn.it',
url='http://github.com/biobakery/MetaPhlAn/',
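
Note on the `generate_markers_fasta` change in metaphlan/utils/external_exec.py: the fixed name `output_dir+"db_markers.fna"` is replaced by a path obtained from `tempfile.mkstemp`. The commit gives no rationale; a likely effect, stated here as an assumption, is that concurrent runs sharing an output directory no longer write to the same file. The sketch below mirrors the two new lines in a hypothetical helper named `reserve_marker_fasta`.

```python
import os
import tempfile


def reserve_marker_fasta(output_dir):
    """Create a uniquely named, empty .fna file in output_dir and return its path."""
    # mkstemp returns an open OS-level file descriptor together with the path.
    file_out, path = tempfile.mkstemp(dir=output_dir,
                                      prefix="db_markers_temp_",
                                      suffix=".fna")
    # Close the descriptor so that another program can write the file by path.
    os.close(file_out)
    return path


with tempfile.TemporaryDirectory() as workdir:
    first = reserve_marker_fasta(workdir)
    second = reserve_marker_fasta(workdir)
    names_differ = first != second
    both_exist = os.path.isfile(first) and os.path.isfile(second)
    first_name = os.path.basename(first)
```

Because `mkstemp` creates the file itself, the caller stays responsible for deleting it once it is no longer needed.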
View it on GitLab: https://salsa.debian.org/med-team/metaphlan2/-/commit/94e552517e1a0c952346013af04990c154743b55