[med-svn] [phylophlan] 01/10: Inject first try to package phylophlan

Andreas Tille tille at debian.org
Fri Dec 1 15:01:02 UTC 2017


This is an automated email from the git hooks/post-receive script.

tille pushed a commit to branch master
in repository phylophlan.

commit 03288cae443bc8f259445142983cb9966d4c475e
Author: Andreas Tille <tille at debian.org>
Date:   Mon May 23 16:02:05 2016 +0000

    Inject first try to package phylophlan
---
 debian/bin/phylophlan              |  4 ++
 debian/changelog                   |  5 +++
 debian/clean                       |  1 +
 debian/compat                      |  1 +
 debian/control                     | 66 +++++++++++++++++++++++++++
 debian/copyright                   | 31 +++++++++++++
 debian/install                     |  4 ++
 debian/manpages                    |  1 +
 debian/patches/datadir.patch       | 33 ++++++++++++++
 debian/patches/debian_tools.patch  | 15 +++++++
 debian/patches/fasttree_name.patch | 17 +++++++
 debian/patches/series              |  4 ++
 debian/patches/use_vsearch.patch   | 92 ++++++++++++++++++++++++++++++++++++++
 debian/phylophlan.1                | 65 +++++++++++++++++++++++++++
 debian/postinst                    | 19 ++++++++
 debian/postrm                      | 19 ++++++++
 debian/rules                       |  9 ++++
 debian/source/format               |  1 +
 debian/upstream/metadata           | 11 +++++
 debian/watch                       |  3 ++
 20 files changed, 401 insertions(+)
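
Note on debian/patches/datadir.patch: the patch rewrites each data path by hand to point below /var/lib/phylophlan/data. As a rough sketch only (not what the patch does), the same paths could be derived from a single base directory. The file names are copied from the patch; the helper name data_paths and the dictionary layout are hypothetical.

```python
import os

# Base directory as used in datadir.patch.
DEFAULT_DATA_DIR = "/var/lib/phylophlan/data"


def data_paths(base=DEFAULT_DATA_DIR):
    """Build the data file paths below one base directory.

    Hypothetical helper: phylophlan.py defines these as module-level
    variables, not as a dictionary.
    """
    def join(*parts):
        return os.path.join(base, *parts)

    return {
        "ppa_fna": join("ppa.seeds.faa"),
        "ppa_fna_40": join("ppa.seeds.40.faa"),
        "ppa_aln": join("ppafull.aln.faa"),
        "ppa_up2prots": join("ppafull.up2prots.txt"),
        "ppa_ors2prots": join("ppafull.orgs2prots.txt"),
        "ppa_tax": join("ppafull.tax.txt"),
        "ppa_alns": (join("ppaalns", "list.txt"),
                     join("ppaalns", "ppa.aln.tar.bz2")),
        # A trailing empty component keeps the trailing slash.
        "ppa_alns_fol": join("ppaalns", ""),
        "ppa_xml": join("ppafull.xml"),
        "ppa_wdb": join("ppa.wdb"),
    }
```

With one base directory, relocating the data later would mean changing a single constant instead of ten lines.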

diff --git a/debian/bin/phylophlan b/debian/bin/phylophlan
new file mode 100755
index 0000000..c5cf2fa
--- /dev/null
+++ b/debian/bin/phylophlan
@@ -0,0 +1,4 @@
+#!/bin/bash
+EXE=$(basename "$0")
+export PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}/usr/share/${EXE}/taxcuration"
+python "/usr/share/${EXE}/${EXE}.py" "$@"
diff --git a/debian/changelog b/debian/changelog
new file mode 100644
index 0000000..89c718a
--- /dev/null
+++ b/debian/changelog
@@ -0,0 +1,5 @@
+phylophlan (1.1.0-1) UNRELEASED; urgency=medium
+
+  * Initial release (Closes: #<bug>)
+
+ -- Andreas Tille <tille at debian.org>  Mon, 23 May 2016 16:09:13 +0200
diff --git a/debian/clean b/debian/clean
new file mode 100644
index 0000000..c17e466
--- /dev/null
+++ b/debian/clean
@@ -0,0 +1 @@
+taxcuration/*.pyc
diff --git a/debian/compat b/debian/compat
new file mode 100644
index 0000000..ec63514
--- /dev/null
+++ b/debian/compat
@@ -0,0 +1 @@
+9
diff --git a/debian/control b/debian/control
new file mode 100644
index 0000000..6bf2529
--- /dev/null
+++ b/debian/control
@@ -0,0 +1,66 @@
+Source: phylophlan
+Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
+Uploaders: Andreas Tille <tille at debian.org>
+Section: science
+Priority: optional
+Build-Depends: debhelper (>= 9),
+               python-all,
+               dh-python
+Standards-Version: 3.9.8
+Vcs-Browser: https://anonscm.debian.org/viewvc/debian-med/trunk/packages/phylophlan/trunk/
+Vcs-Svn: svn://anonscm.debian.org/debian-med/trunk/packages/phylophlan/trunk/
+Homepage: https://bitbucket.org/nsegata/phylophlan/wiki/Home
+
+Package: phylophlan
+Architecture: all
+Depends: ${python:Depends},
+         ${misc:Depends},
+         fasttree,
+         vsearch,
+         muscle,
+         ncbi-blast+,
+         python-biopython
+Description: microbial Tree of Life using 400 universal proteins
+ PhyloPhlAn is a computational pipeline for reconstructing highly
+ accurate and resolved phylogenetic trees based on whole-genome sequence
+ information. The pipeline is scalable to thousands of genomes and uses
+ the most conserved 400 proteins for extracting the phylogenetic signal.
+ PhyloPhlAn also implements taxonomic curation, estimation, and insertion
+ operations.
+ .
+ The main features of PhyloPhlAn are:
+  * completely automatic, as the user needs only to provide the
+    (unannotated) protein sequences of the input genomes (as multifasta
+    files of peptides - not nucleotides)
+  * very high topological accuracy and resolution because of the use of
+    up to 400 previously identified most conserved proteins
+  * the possibility of integrating new genomes in the already
+    reconstructed most comprehensive tree of life (3,171 microbial
+    genomes)
+  * taxonomy estimation for the newly inserted genomes
+  * taxonomic curation for the produced phylogenetic trees
+
+Package: phylophlan-examples
+Architecture: all
+Depends: ${misc:Depends}
+Description: microbial Tree of Life using 400 universal proteins (example data)
+ PhyloPhlAn is a computational pipeline for reconstructing highly
+ accurate and resolved phylogenetic trees based on whole-genome sequence
+ information. The pipeline is scalable to thousands of genomes and uses
+ the most conserved 400 proteins for extracting the phylogenetic signal.
+ PhyloPhlAn also implements taxonomic curation, estimation, and insertion
+ operations.
+ .
+ The main features of PhyloPhlAn are:
+  * completely automatic, as the user needs only to provide the
+    (unannotated) protein sequences of the input genomes (as multifasta
+    files of peptides - not nucleotides)
+  * very high topological accuracy and resolution because of the use of
+    up to 400 previously identified most conserved proteins
+  * the possibility of integrating new genomes in the already
+    reconstructed most comprehensive tree of life (3,171 microbial
+    genomes)
+  * taxonomy estimation for the newly inserted genomes
+  * taxonomic curation for the produced phylogenetic trees
+ .
+ This package contains some example data.
diff --git a/debian/copyright b/debian/copyright
new file mode 100644
index 0000000..27e2f39
--- /dev/null
+++ b/debian/copyright
@@ -0,0 +1,31 @@
+Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
+Upstream-Name: PhyloPhlAn
+Source: https://bitbucket.org/nsegata/phylophlan/downloads
+
+Files: *
+Copyright: 2012-2016 Nicola Segata and Curtis Huttenhower
+License: expat
+
+Files: debian/*
+Copyright: © 2016 maintainername <maintainer at e.mail>
+License: expat
+
+License: expat
+ Permission is hereby granted, free of charge, to any person obtaining a
+ copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+ .
+ The above copyright notice and this permission notice shall be included
+ in all copies or substantial portions of the Software.
+ .
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/debian/install b/debian/install
new file mode 100644
index 0000000..685fad2
--- /dev/null
+++ b/debian/install
@@ -0,0 +1,4 @@
+phylophlan.py	usr/share/phylophlan
+taxcuration	usr/share/phylophlan
+data		var/lib/phylophlan
+debian/bin	usr
diff --git a/debian/manpages b/debian/manpages
new file mode 100644
index 0000000..0f65186
--- /dev/null
+++ b/debian/manpages
@@ -0,0 +1 @@
+debian/*.1
diff --git a/debian/patches/datadir.patch b/debian/patches/datadir.patch
new file mode 100644
index 0000000..c4e0e62
--- /dev/null
+++ b/debian/patches/datadir.patch
@@ -0,0 +1,33 @@
+Author: Andreas Tille <tille at debian.org>
+Last-Update: Mon, 23 May 2016 16:09:13 +0200
+Description: Move data to user writable dir
+
+--- a/phylophlan.py
++++ b/phylophlan.py
+@@ -31,16 +31,16 @@ import time
+ 
+ 
+ download = ""
+-ppa_fna = "data/ppa.seeds.faa"
+-ppa_fna_40 = "data/ppa.seeds.40.faa"
+-ppa_aln = "data/ppafull.aln.faa"
+-ppa_up2prots = "data/ppafull.up2prots.txt"
+-ppa_ors2prots = "data/ppafull.orgs2prots.txt"
+-ppa_tax = "data/ppafull.tax.txt"
+-ppa_alns = ("data/ppaalns/list.txt","data/ppaalns/ppa.aln.tar.bz2")
+-ppa_alns_fol = "data/ppaalns/"
+-ppa_xml = "data/ppafull.xml"
+-ppa_wdb = "data/ppa.wdb"
++ppa_fna = "/var/lib/phylophlan/data/ppa.seeds.faa"
++ppa_fna_40 = "/var/lib/phylophlan/data/ppa.seeds.40.faa"
++ppa_aln = "/var/lib/phylophlan/data/ppafull.aln.faa"
++ppa_up2prots = "/var/lib/phylophlan/data/ppafull.up2prots.txt"
++ppa_ors2prots = "/var/lib/phylophlan/data/ppafull.orgs2prots.txt"
++ppa_tax = "/var/lib/phylophlan/data/ppafull.tax.txt"
++ppa_alns = ("/var/lib/phylophlan/data/ppaalns/list.txt","/var/lib/phylophlan/data/ppaalns/ppa.aln.tar.bz2")
++ppa_alns_fol = "/var/lib/phylophlan/data/ppaalns/"
++ppa_xml = "/var/lib/phylophlan/data/ppafull.xml"
++ppa_wdb = "/var/lib/phylophlan/data/ppa.wdb"
+ up2prots = "up2prots.txt"
+ ors2prots = "orgs2prots.txt"
+ aln_tot = "aln.fna"
diff --git a/debian/patches/debian_tools.patch b/debian/patches/debian_tools.patch
new file mode 100644
index 0000000..1817a55
--- /dev/null
+++ b/debian/patches/debian_tools.patch
@@ -0,0 +1,15 @@
+Author: Andreas Tille <tille at debian.org>
+Last-Update: Mon, 23 May 2016 16:09:13 +0200
+Description: Use Debian packaged tools with correct names
+
+--- a/phylophlan.py
++++ b/phylophlan.py
+@@ -66,7 +66,7 @@ def error(s):
+ 
+ 
+ def dep_checks():
+-    for prog in ["FastTree", "usearch", "muscle", "tblastn"]:
++    for prog in ["fasttree", "vsearch", "muscle", "tblastn"]:
+         try:
+             with open(os.devnull, 'w') as devnull:
+                 sb.call([prog], stdout=devnull, stderr=devnull)
diff --git a/debian/patches/fasttree_name.patch b/debian/patches/fasttree_name.patch
new file mode 100644
index 0000000..09c5a7c
--- /dev/null
+++ b/debian/patches/fasttree_name.patch
@@ -0,0 +1,17 @@
+Author: Andreas Tille <tille at debian.org>
+Last-Update: Mon, 23 May 2016 16:09:13 +0200
+Description: Debian's executable has lower case spelling
+
+--- a/phylophlan.py
++++ b/phylophlan.py
+@@ -671,8 +671,8 @@ def fasttree( proj, integrate ):
+     if os.path.exists( outt ):
+         info("Final tree already built ("+outt+")!\n")
+         return
+-    info("Start building the tree with FastTree ... \n")
+-    cmd = [ "FastTree", "-quiet",
++    info("Start building the tree with fasttree ... \n")
++    cmd = [ "fasttree", "-quiet",
+             #"-fastest","-noml"
+             #"-gamma",
+             "-bionj","-slownni",
diff --git a/debian/patches/series b/debian/patches/series
new file mode 100644
index 0000000..12c55fa
--- /dev/null
+++ b/debian/patches/series
@@ -0,0 +1,4 @@
+debian_tools.patch
+datadir.patch
+use_vsearch.patch
+fasttree_name.patch
diff --git a/debian/patches/use_vsearch.patch b/debian/patches/use_vsearch.patch
new file mode 100644
index 0000000..447acd0
--- /dev/null
+++ b/debian/patches/use_vsearch.patch
@@ -0,0 +1,92 @@
+Author: Andreas Tille <tille at debian.org>
+Last-Update: Mon, 23 May 2016 16:09:13 +0200
+Description: Debian cannot package usearch, which is non-free, but it
+ has the free replacement vsearch, which is under active development
+ and performs better than usearch
+
+--- a/phylophlan.py
++++ b/phylophlan.py
+@@ -153,8 +153,9 @@ def init():
+                 info("Done!\n")
+ 
+     if not os.path.exists( ppa_wdb ):
+-        info("Generating "+ppa_wdb+" (usearch indexed DB)... ")
+-        sb.call( ["usearch","-quiet",
++        info("Generating "+ppa_wdb+" (vsearch indexed DB)... ")
++        print("vsearch -quiet --makewdb %s --output %s" % (ppa_fna,ppa_wdb))
++        sb.call( ["vsearch","-quiet",
+                   "--makewdb",ppa_fna,
+                   "--output",ppa_wdb])
+         info("Done!\n")
+@@ -265,29 +266,29 @@ def exe_usearch(x):
+         screen_usearch_wdb( x[5] )
+         info( x[5] + " generated!\n" )
+     except OSError:
+-        error( "OSError: fatal error running usearch." )
++        error( "OSError: fatal error running vsearch." )
+         return
+     except ValueError:
+-        error( "ValueError: fatal error running usearch." )
++        error( "ValueError: fatal error running vsearch." )
+         return
+     except KeyboardInterrupt:
+-        error( "KeyboardInterrupt: usearch process interrupted." )
++        error( "KeyboardInterrupt: vsearch process interrupted." )
+         return
+ 
+ 
+ def faa2ppafaa( inps, nproc, proj ):
+     inp_fol = "input/"+proj+"/"
+-    dat_fol = "data/"+proj+"/usearch/"
++    dat_fol = "data/"+proj+"/vsearch/"
+     pool = mp.Pool( nproc )
+     mmap = [(inp_fol+i+'.faa', dat_fol+i+'.b6o') for i in inps if not os.path.exists(dat_fol+i+'.b6o')]
+ 
+     if not os.path.isdir(dat_fol): os.mkdir(dat_fol) # create the tmp directory if does not exists
+ 
+     if not mmap:
+-        info("All usearch runs already performed!\n")
++        info("All vsearch runs already performed!\n")
+     else:
+         info("Looking for PhyloPhlAn proteins in input faa files\n")
+-        us_cmd = [ ["usearch","-quiet",
++        us_cmd = [ ["vsearch","-quiet",
+                     "-wdb",ppa_wdb,
+                     "-blast6out",o,
+                     "-query",i,
+@@ -295,7 +296,7 @@ def faa2ppafaa( inps, nproc, proj ):
+         pool.map_async( exe_usearch, us_cmd )
+         pool.close()
+         pool.join()
+-        info("All usearch runs performed!\n")
++        info("All vsearch runs performed!\n")
+ 
+     if os.path.exists(dat_fol+up2prots):
+         return
+@@ -418,7 +419,7 @@ def blast(inps, nproc, proj, blast_full=
+ 
+ def gens2prots(inps, proj ):
+     inp_fol = "input/"+proj+"/"
+-    dat_fol = "data/"+proj+"/usearch/"
++    dat_fol = "data/"+proj+"/vsearch/"
+ 
+     if not os.path.isdir(dat_fol): os.mkdir(dat_fol) # create the tmp directory if does not exists
+ 
+@@ -523,7 +524,7 @@ def exe_muscle(x):
+         error( "ValueError: fatal error running muscle." )
+         raise e
+     except KeyboardInterrupt, e:
+-        error( "KeyboardInterrupt: usearch process muscle." )
++        error( "KeyboardInterrupt: muscle process interrupted." )
+         raise e
+     except Exception, e:
+         error( e )
+@@ -881,7 +882,7 @@ def tax_curation_test( proj, tax,
+ 
+ def merge_usearch_blast(inps, proj):
+     dat_fol = 'data/'+proj+'/'
+-    usearch_fol = 'data/'+proj+'/usearch/'
++    usearch_fol = 'data/'+proj+'/vsearch/'
+     tblastn_fol = 'data/'+proj+'/tblastn/'
+     usearch_files = []
+     tblastn_files = []
diff --git a/debian/phylophlan.1 b/debian/phylophlan.1
new file mode 100644
index 0000000..7c0d551
--- /dev/null
+++ b/debian/phylophlan.1
@@ -0,0 +1,65 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.46.4.
+.TH PHYLOPHLAN "1" "May 2016" "phylophlan 1.1.0" "User Commands"
+.SH NAME
+phylophlan \- computational pipeline for reconstructing highly accurate and resolved phylogenetic trees
+.SH SYNOPSIS
+.B phylophlan
+[\-h] [\-i] [\-u] [\-t] [\-\-tax_test TAX_TEST] [\-c]
+[\-\-cleanall] [\-\-nproc N] [\-\-blast_full] [\-v]
+[PROJECT NAME]
+.SH DESCRIPTION
+PhyloPhlAn is a computational pipeline for reconstructing highly accurate and resolved
+phylogenetic trees based on whole\-genome sequence information. The pipeline is scalable
+to thousands of genomes and uses the most conserved 400 proteins for extracting the
+phylogenetic signal.
+PhyloPhlAn also implements taxonomic curation, estimation, and insertion operations.
+.SH OPTIONS
+.SS "positional arguments:"
+.TP
+PROJECT NAME
+The basename of the project corresponding to the name of the input data folder inside
+input/. The input data consist of a collection of multifasta files (extension .faa)
+containing the proteins in each genome.
+If the project already exists, the already executed steps are not re\-run.
+The results will be stored in a folder with the project basename in output/
+Multiple projects can be generated and they safely coexist.
+.SS "optional arguments:"
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+show this help message and exit
+.TP
+\fB\-i\fR, \fB\-\-integrate\fR
+Integrate user genomes into the PhyloPhlAn tree
+.TP
+\fB\-u\fR, \fB\-\-user_tree\fR
+Build a phylogenetic tree using user genomes only
+.TP
+\fB\-t\fR, \fB\-\-taxonomic_analysis\fR
+Check taxonomic inconsistencies and refine/correct taxonomic labels
+.TP
+\fB\-\-tax_test\fR TAX_TEST
+nerrors:type:taxl:tmin:tex:name (alpha version, experimental!)
+.TP
+\fB\-c\fR, \fB\-\-clean\fR
+Clean the final and partial data produced for the specified project.
+(use \fB\-\-cleanall\fR for removing general installation and database files)
+.TP
+\fB\-\-cleanall\fR
+Remove all installation and database files, leaving untouched the initial compressed data
+that is automatically extracted and formatted at the first pipeline run.
+Projects are not removed (specify a project and use \fB\-c\fR for removing projects).
+.TP
+\fB\-\-nproc\fR N
+The number of CPUs to use for parallelizing the blasting
+[default 1, i.e. no parallelism]
+.TP
+\fB\-\-blast_full\fR
+If specified, tells blast to use the full dataset of universal proteins
+[default False, i.e. the small dataset of universal proteins is used]
+.TP
+\fB\-v\fR, \fB\-\-version\fR
+Print the current PhyloPhlAn version and exit
+.SH AUTHOR
+Nicola Segata (nsegata at hsph.harvard.edu) and Curtis Huttenhower (chuttenh at hsph.harvard.edu)
+.P
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
diff --git a/debian/postinst b/debian/postinst
new file mode 100755
index 0000000..54cadfc
--- /dev/null
+++ b/debian/postinst
@@ -0,0 +1,19 @@
+#!/bin/sh -e
+
+case "$1" in
+  configure)
+    # enable users to unpack data files
+    find /var/lib/phylophlan -type d -exec chmod a+w \{\} \;
+  ;;
+  abort-upgrade|abort-remove|abort-deconfigure)
+    echo "$1"
+  ;;
+  *)
+    echo "postinst called with unknown argument \`$1'" >&2
+    exit 0
+  ;;
+esac
+
+#DEBHELPER#
+
+exit 0
diff --git a/debian/postrm b/debian/postrm
new file mode 100755
index 0000000..16a023d
--- /dev/null
+++ b/debian/postrm
@@ -0,0 +1,19 @@
+#!/bin/sh -e
+
+case "$1" in
+  remove|purge)
+    # Remove potentially unpackaged data
+    rm -rf /var/lib/phylophlan
+  ;;
+  upgrade|failed-upgrade|abort-install|abort-upgrade|disappear)
+    echo "$1"
+  ;;
+  *)
+    echo "postrm called with unknown argument \`$1'" >&2
+    exit 0
+  ;;
+esac
+
+#DEBHELPER#
+
+exit 0
diff --git a/debian/rules b/debian/rules
new file mode 100755
index 0000000..8fdd6bc
--- /dev/null
+++ b/debian/rules
@@ -0,0 +1,9 @@
+#!/usr/bin/make -f
+
+# DH_VERBOSE := 1
+export LC_ALL=C.UTF-8
+
+
+%:
+	dh $@  --with python2
+
diff --git a/debian/source/format b/debian/source/format
new file mode 100644
index 0000000..163aaf8
--- /dev/null
+++ b/debian/source/format
@@ -0,0 +1 @@
+3.0 (quilt)
diff --git a/debian/upstream/metadata b/debian/upstream/metadata
new file mode 100644
index 0000000..66ea9bb
--- /dev/null
+++ b/debian/upstream/metadata
@@ -0,0 +1,11 @@
+Reference:
+  Author: Nicola Segata and Daniela Börnigen and Xochitl C. Morgan and Curtis Huttenhower
+  Title: "PhyloPhlAn is a new method for improved phylogenetic and taxonomic placement of microbes"
+  Journal: Nature Communications
+  Year: 2013
+  Volume: 4
+  Pages: 2304
+  DOI: 10.1038/ncomms3304
+  PMID: 23942190
+  URL: http://www.nature.com/ncomms/2013/130814/ncomms3304/full/ncomms3304.html
+  eprint: http://www.nature.com/ncomms/2013/130814/ncomms3304/pdf/ncomms3304.pdf
diff --git a/debian/watch b/debian/watch
new file mode 100644
index 0000000..2567a1a
--- /dev/null
+++ b/debian/watch
@@ -0,0 +1,3 @@
+version=3
+
+https://bitbucket.org/nsegata/phylophlan/downloads .*/(\d\S*)\.tar\.gz

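Note on debian/patches/debian_tools.patch: dep_checks() probes each external program by trying to run it. A minimal alternative sketch, assuming Python 3 (the packaged script targets Python 2), looks the names up on PATH with shutil.which instead. The tool names are the ones from the patch; the function name missing_tools and its injectable which parameter are hypothetical.

```python
import shutil

# Program names as spelled in debian_tools.patch.
REQUIRED_TOOLS = ["fasttree", "vsearch", "muscle", "tblastn"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    Hypothetical helper, not part of phylophlan.py. The lookup
    function is a parameter so it can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing programs: " + ", ".join(missing))
    else:
        print("All required programs found.")
```

A PATH lookup avoids starting each program just to see whether it exists, at the cost of not detecting a binary that is present but fails to start.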
-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/phylophlan.git


