[med-svn] [Git][med-team/parsnp][master] 8 commits: d/watch: Fix watch regex
Nilesh Patra (@nilesh)
gitlab at salsa.debian.org
Sun May 30 23:34:32 BST 2021
Nilesh Patra pushed to branch master at Debian Med / parsnp
Commits:
d87c9ba9 by Nilesh Patra at 2021-05-31T02:41:56+05:30
d/watch: Fix watch regex
- - - - -
7c256597 by Nilesh Patra at 2021-05-31T02:42:55+05:30
New upstream version 1.5.6+dfsg
- - - - -
892e8331 by Nilesh Patra at 2021-05-31T02:42:56+05:30
Update upstream source from tag 'upstream/1.5.6+dfsg'
Update to upstream version '1.5.6+dfsg'
with Debian dir 3b2c010492eced522dbb8ee77120b9b4d3105df6
- - - - -
d193f938 by Nilesh Patra at 2021-05-31T02:43:39+05:30
Refresh patches
- - - - -
45271299 by Nilesh Patra at 2021-05-30T22:11:39+00:00
Add d/createmanpages script
- - - - -
cd92783a by Nilesh Patra at 2021-05-30T22:11:51+00:00
Update manpage
- - - - -
b481fd04 by Nilesh Patra at 2021-05-31T04:01:00+05:30
d/p/blhc-fix.patch: Propagate -D_FORTIFY_SOURCE=2 to fix blhc
- - - - -
e046fee0 by Nilesh Patra at 2021-05-31T04:01:54+05:30
Interim changelog entry
- - - - -
9 changed files:
- README.md
- debian/changelog
- + debian/creatmanpages
- debian/parsnp.1
- + debian/patches/blhc-fix.patch
- debian/patches/proper_calls_to_tools.patch
- debian/patches/series
- debian/watch
- parsnp
Changes:
=====================================
README.md
=====================================
@@ -6,7 +6,11 @@ Parsnp is a command-line-tool for efficient microbial core genome alignment and
# Installation
-
+## From conda
+Parsnp is available on the [Bioconda](https://bioconda.github.io/user/install.html#set-up-channels) channel. This is the recommended method of installation and can be installed via
+```
+conda install parsnp
+```
## From source
=====================================
debian/changelog
=====================================
@@ -1,3 +1,15 @@
+parsnp (1.5.6+dfsg-1) UNRELEASED; urgency=medium
+
+ * Team Upload.
+ * d/watch: Fix watch regex
+ * New upstream version 1.5.6+dfsg
+ * Refresh patches
+ * Add d/createmanpages script
+ * Update manpage to latest upstream
+ * d/p/blhc-fix.patch: Propagate -D_FORTIFY_SOURCE=2 to fix blhc
+
+ -- Nilesh Patra <nilesh at debian.org> Mon, 31 May 2021 04:01:09 +0530
+
parsnp (1.5.4+dfsg-1) unstable; urgency=medium
* Team upload.
=====================================
debian/creatmanpages
=====================================
@@ -0,0 +1,55 @@
+#!/bin/sh
+
+set -e
+
+if [ ! -x /usr/bin/help2man ]; then
+ echo "E: Missing /usr/bin/help2man, please install it from the cognate package."
+ exit 1
+fi
+
+if [ ! -n "$NAME" ]; then
+ NAME=`grep "^Description:" debian/control | sed 's/^Description: *//' | head -n1`
+fi
+
+if [ ! -n "$VERSION" ]; then
+ VERSION=`dpkg-parsechangelog | awk '/^Version:/ {print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'`
+fi
+
+if [ ! -n "$PROGNAME" ]; then
+ PROGNAME=`grep "^Package:" debian/control | sed 's/^Package: *//' | head -n1`
+fi
+
+MANDIR=debian
+HELPOPTION="--help"
+
+echo "PROGNAME: '$PROGNAME'"
+echo "NAME: '$NAME'"
+echo "VERSION: '$VERSION'"
+echo "MANDIR: '$MANDIR'"
+echo "HELPOPTION: '$HELPOPTION'"
+
+mkdir -p $MANDIR
+
+AUTHOR=".SH AUTHOR\n \
+This manpage was written by $DEBFULLNAME for the Debian distribution and\n \
+can be used for any other usage of the program.\
+"
+
+# If program name is different from package name or title should be
+# different from package short description change this here
+progname=${PROGNAME}
+help2man --no-info --no-discard-stderr --help-option="$HELPOPTION" \
+ --name="$NAME" \
+ --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+echo "$MANDIR/*.1" > debian/manpages
+
+cat <<EOT
+Please enhance the help2man output in '$MANDIR/${progname}.1'.
+To inspect it, try 'nroff -man $MANDIR/${progname}.1'.
+If very unhappy, try passing the HELPOPTION as an environment variable.
+The following web page might be helpful in doing so:
+ http://liw.fi/manpages/
+EOT
+
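For readers unfamiliar with the VERSION pipeline in the script above, its three sed expressions can be mirrored in Python. This is only a sketch: the helper name `upstream_version` is hypothetical and is not part of the package, and the sample version strings are illustrative.

```python
import re


def upstream_version(debian_version: str) -> str:
    """Mirror the script's three sed expressions (hypothetical helper)."""
    # s/^[0-9]*://  -> drop a leading epoch such as "1:"
    v = re.sub(r"^[0-9]*:", "", debian_version)
    # s/-.*//       -> drop everything from the first hyphen (the Debian revision)
    v = re.sub(r"-.*", "", v, count=1)
    # s/[+~]dfsg$// -> drop a trailing "+dfsg" or "~dfsg" repack suffix
    v = re.sub(r"[+~]dfsg$", "", v)
    return v


if __name__ == "__main__":
    print(upstream_version("1.5.6+dfsg-1"))
```

Applied to the changelog version in this push, `1.5.6+dfsg-1`, the three steps leave `1.5.6`.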
=====================================
debian/parsnp.1
=====================================
@@ -1,80 +1,125 @@
-.TH PARSNP "1" "July 2016" "parsnp 1.2" "User Commands"
+.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.16.
+.TH PARSNP "1" "May 2021" "parsnp 1.5.4" "User Commands"
.SH NAME
parsnp \- rapid core genome multi-alignment
-.SH SYNOPSIS
-.B parsnp
-[options] [\-g|\-r|\-q] \fB\-d\fR <genome_dir> \fB\-p\fR <threads>
.SH DESCRIPTION
-Parsnp was designed to align the core genome of hundreds to thousands of
-bacterial genomes within a few minutes to few hours. Input can be both
-draft assemblies and finished genomes, and output includes variant (SNP)
-calls, core genome phylogeny and multi-alignments. Parsnp leverages
-contextual information provided by multi-alignments surrounding SNP
-sites for filtration/cleaning, in addition to existing tools for
-recombination detection/filtration and phylogenetic reconstruction.
-.SH OPTIONS
-.SS input/output
-.HP
-\fB\-c\fR = <flag>: (c)urated genome directory, use all genomes in dir and ignore MUMi? (default = NO)
-.HP
-\fB\-d\fR = <path>: (d)irectory containing genomes/contigs/scaffolds
-.HP
-\fB\-r\fR = <path>: (r)eference genome (set to ! to pick random one from genome dir)
-.HP
-\fB\-g\fR = <string>: Gen(b)ank file(s) (gbk), comma separated list (default = None)
-.HP
-\fB\-o\fR = <string>: output directory? default [./P_CURRDATE_CURRTIME]
-.HP
-\fB\-q\fR = <path>: (optional) specify (assembled) query genome to use, in addition to genomes found in genome dir (default = NONE)
-.SS MUMi
-.HP
-\fB\-U\fR = <float>: max MUMi distance value for MUMi distribution
-.HP
-\fB\-M\fR = <flag>: calculate MUMi and exit? overrides all other choices! (default: NO)
-.HP
-\fB\-i\fR = <float>: max MUM(i) distance (default: autocutoff based on distribution of MUMi values)
-.SS MUM search
-.HP
-\fB\-a\fR = <int>: min (a)NCHOR length (default = 1.1*Log(S))
-.HP
-\fB\-C\fR = <int>: maximal cluster D value? (default=100)
-.HP
-\fB\-z\fR = <path>: min LCB si(z)e? (default = 25)
-.SS LCB alignment
-.HP
-\fB\-D\fR = <float>: maximal diagonal difference? Either percentage (e.g. 0.2) or bp (e.g. 100bp) (default = 0.12)
-.HP
-\fB\-e\fR = <flag> greedily extend LCBs? experimental! (default = NO)
-.HP
-\fB\-n\fR = <string>: alignment program (default: libMUSCLE)
-.HP
-\fB\-u\fR = <flag>: output unaligned regions? .unaligned (default: NO)
-.SS Recombination filtration
-.HP
-\fB\-x\fR = <flag>: enable filtering of SNPs located in PhiPack identified regions of recombination? (default: NO)
-.SS Misc
-.HP
-\fB\-h\fR = <flag>: (h)elp: print this message and exit
-.HP
-\fB\-p\fR = <int>: number of threads to use? (default= 1)
-.HP
-\fB\-P\fR = <int>: max partition size? limits memory usage (default= 15000000)
-.HP
-\fB\-v\fR = <flag>: (v)erbose output? (default = NO)
-.HP
-\fB\-V\fR = <flag>: output (V)ersion and exit
-.SH EXAMPLES
-Parsnp quick start for three example scenarios:
-.SS With reference & genbank file:
+|\-\-Parsnp 1.5.6\-\-|
+For detailed documentation please see \fB\-\-\fR> http://harvest.readthedocs.org/en/latest
+usage: parsnp [\-h] [\-c] \fB\-d\fR SEQUENCES [SEQUENCES ...] [\-r REFERENCE]
.IP
-parsnp \fB\-g\fR <reference_genbank_file1,reference_genbank_file2,..> \fB\-d\fR <genome_dir> \fB\-p\fR <threads>
-.SS With reference but without genbank file:
+[\-g GENBANK [GENBANK ...]] [\-o OUTPUT_DIR] [\-q QUERY]
+[\-U MAX_MUMI_DISTR_DIST | \fB\-mmd\fR MAX_MUMI_DISTANCE] [\-F] [\-M]
+[\-\-use\-ani] [\-\-min\-ani MIN_ANI] [\-\-use\-mash]
+[\-\-max\-mash\-dist MAX_MASH_DIST] [\-a MIN_ANCHOR_LENGTH]
+[\-m MUM_LENGTH] [\-C MAX_CLUSTER_D] [\-z MIN_CLUSTER_SIZE]
+[\-D MAX_DIAG_DIFF] [\-n {mafft,muscle,fsa,prank}] [\-u]
+[\-\-use\-fasttree] [\-\-vcf] [\-p THREADS] [\-P MAX_PARTITION_SIZE]
+[\-v] [\-x] [\-i INIFILE] [\-e] [\-V]
.IP
-parsnp \fB\-r\fR <reference_genome> \fB\-d\fR <genome_dir> \fB\-p\fR <threads>
-.SS Autorecruit reference to a draft assembly:
+Parsnp quick start for three example scenarios:
+1) With reference & genbank file:
+python Parsnp.py \fB\-g\fR <reference_genbank_file1 reference_genbank_file2 ...> \fB\-d\fR <seq_file1 seq_file2 ...> \fB\-p\fR <threads>
.IP
-parsnp \fB\-q\fR <draft_assembly> \fB\-d\fR <genome_db> \fB\-p\fR <threads>
-.SH SEE ALSO
-For detailed documentation please see \fB\-\-\fR> http://harvest.readthedocs.org/en/latest
+2) With reference but without genbank file:
+python Parsnp.py \fB\-r\fR <reference_genome> \fB\-d\fR <seq_file1 seq_file2 ...> \fB\-p\fR <threads>
+.IP
+3) Autorecruit reference to a draft assembly:
+python Parsnp.py \fB\-q\fR <draft_assembly> \fB\-d\fR <seq_file1 seq_file2 ...> \fB\-p\fR <threads>
+.SS "optional arguments:"
+.TP
+\fB\-h\fR, \fB\-\-help\fR
+show this help message and exit
+.SS "Input/Output:"
+.TP
+\fB\-c\fR, \fB\-\-curated\fR
+(c)urated genome directory, use all genomes in dir and ignore MUMi?
+.TP
+\fB\-d\fR SEQUENCES [SEQUENCES ...], \fB\-\-sequences\fR SEQUENCES [SEQUENCES ...]
+A list of files containing genomes/contigs/scaffolds
+.TP
+\fB\-r\fR REFERENCE, \fB\-\-reference\fR REFERENCE
+(r)eference genome (set to ! to pick random one from sequence dir)
+.TP
+\fB\-g\fR GENBANK [GENBANK ...], \fB\-\-genbank\fR GENBANK [GENBANK ...]
+A list of Genbank file(s) (gbk)
+.HP
+\fB\-o\fR OUTPUT_DIR, \fB\-\-output\-dir\fR OUTPUT_DIR
+.TP
+\fB\-q\fR QUERY, \fB\-\-query\fR QUERY
+Specify (assembled) query genome to use, in addition to genomes found in genome dir
+.SS "MUMi:"
+.TP
+\fB\-U\fR MAX_MUMI_DISTR_DIST, \fB\-\-max\-mumi\-distr\-dist\fR MAX_MUMI_DISTR_DIST, \fB\-\-MUMi\fR MAX_MUMI_DISTR_DIST
+Max MUMi distance value for MUMi distribution
+.TP
+\fB\-mmd\fR MAX_MUMI_DISTANCE, \fB\-\-max\-mumi\-distance\fR MAX_MUMI_DISTANCE
+Max MUMi distance (default: autocutoff based on distribution of MUMi values)
+.TP
+\fB\-F\fR, \fB\-\-fastmum\fR
+Fast MUMi calculation
+.TP
+\fB\-M\fR, \fB\-\-mumi_only\fR, \fB\-\-onlymumi\fR
+Calculate MUMi and exit? overrides all other choices!
+.TP
+\fB\-\-use\-ani\fR
+Use ani for genome recruitment
+.TP
+\fB\-\-min\-ani\fR MIN_ANI
+Min ANI value to allow for genome recruitment.
+.TP
+\fB\-\-use\-mash\fR
+Use mash for genome recruitment
+.TP
+\fB\-\-max\-mash\-dist\fR MAX_MASH_DIST
+Max mash distance.
+.SS "MUM search:"
+.TP
+\fB\-a\fR MIN_ANCHOR_LENGTH, \fB\-\-min\-anchor\-length\fR MIN_ANCHOR_LENGTH, \fB\-\-anchorlength\fR MIN_ANCHOR_LENGTH
+Min (a)NCHOR length (default = 1.1*(Log(S)))
+.TP
+\fB\-m\fR MUM_LENGTH, \fB\-\-mum\-length\fR MUM_LENGTH, \fB\-\-mumlength\fR MUM_LENGTH
+Mum length
+.TP
+\fB\-C\fR MAX_CLUSTER_D, \fB\-\-max\-cluster\-d\fR MAX_CLUSTER_D, \fB\-\-clusterD\fR MAX_CLUSTER_D
+Maximal cluster D value
+.TP
+\fB\-z\fR MIN_CLUSTER_SIZE, \fB\-\-min\-cluster\-size\fR MIN_CLUSTER_SIZE, \fB\-\-minclustersize\fR MIN_CLUSTER_SIZE
+Minimum cluster size
+.SS "LCB alignment:"
+.TP
+\fB\-D\fR MAX_DIAG_DIFF, \fB\-\-max\-diagonal\-difference\fR MAX_DIAG_DIFF, \fB\-\-DiagonalDiff\fR MAX_DIAG_DIFF
+Maximal diagonal difference. Either percentage (e.g. 0.2) or bp (e.g. 100bp)
+.TP
+\fB\-n\fR {mafft,muscle,fsa,prank}, \fB\-\-alignment\-program\fR {mafft,muscle,fsa,prank}, \fB\-\-alignmentprog\fR {mafft,muscle,fsa,prank}
+Alignment program to use
+.TP
+\fB\-u\fR, \fB\-\-unaligned\fR
+Ouput unaligned regions
+.SS "Misc:"
+.TP
+\fB\-\-use\-fasttree\fR
+Use fasttree instead of RaxML
+.TP
+\fB\-\-vcf\fR
+Generate VCF file.
+.TP
+\fB\-p\fR THREADS, \fB\-\-threads\fR THREADS
+Number of threads to use
+.TP
+\fB\-P\fR MAX_PARTITION_SIZE, \fB\-\-max\-partition\-size\fR MAX_PARTITION_SIZE
+Max partition size (limits memory usage)
+.TP
+\fB\-v\fR, \fB\-\-verbose\fR
+Verbose output
+.HP
+\fB\-x\fR, \fB\-\-xtrafast\fR
+.HP
+\fB\-i\fR INIFILE, \fB\-\-inifile\fR INIFILE, \fB\-\-ini\-file\fR INIFILE
+.HP
+\fB\-e\fR, \fB\-\-extend\fR
+.TP
+\fB\-V\fR, \fB\-\-version\fR
+show program's version number and exit
.SH AUTHOR
-This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
+ This manpage was written by Nilesh Patra for the Debian distribution and
+ can be used for any other usage of the program.
=====================================
debian/patches/blhc-fix.patch
=====================================
@@ -0,0 +1,11 @@
+Description: Append -D_FORTIFY_SOURCE=2 to parsnp CXX flags to fix blhc
+Author: Nilesh Patra <nilesh at debian.org>
+Last-Update: 2021-05-31
+--- a/src/Makefile.am
++++ b/src/Makefile.am
+@@ -1,4 +1,4 @@
+-parsnp_core_CXXFLAGS = -fopenmp -O2 -funroll-all-loops -fomit-frame-pointer -ftree-vectorize -std=gnu++0x
++parsnp_core_CXXFLAGS = -fopenmp -O2 -funroll-all-loops -fomit-frame-pointer -ftree-vectorize -std=gnu++0x -D_FORTIFY_SOURCE=2
+ LIBS = -fopenmp -lstdc++ -lpthread -L$(libmuscle)/lib -lMUSCLE
+ bin_PROGRAMS = parsnp_core
+ parsnp_core_SOURCES = MuscleInterface.cpp MuscleInterface.h parsnp.cpp parsnp.hh LCB.cpp LCB.hh LCR.cpp LCR.hh TMum.cpp TMum.hh Converter.cpp Converter.hh ./ext/iniFile.cpp ./ext/iniFile.h
=====================================
debian/patches/proper_calls_to_tools.patch
=====================================
@@ -13,7 +13,7 @@ Description: Fix name of phipack executable
run_command(command,1)
os.chdir(currdir)
-@@ -595,7 +595,7 @@
+@@ -600,7 +600,7 @@
# Check for dependencies
missing = False
@@ -22,7 +22,7 @@ Description: Fix name of phipack executable
if shutil.which(exe) is None:
missing = True
logger.critical("{} not in system path!".format(exe))
-@@ -892,7 +892,7 @@
+@@ -898,7 +898,7 @@
if xtrafast or 1:
extend = False
@@ -31,7 +31,7 @@ Description: Fix name of phipack executable
inifiled = inifiled.replace("$REF", ref)
inifiled = inifiled.replace("$EXTEND", "%d"%(extend))
inifiled = inifiled.replace("$ANCHORS", str(anchor))
-@@ -949,10 +949,10 @@
+@@ -955,10 +955,10 @@
logger.info("Recruiting genomes...")
if use_parsnp_mumi:
if not inifile_exists:
@@ -44,7 +44,7 @@ Description: Fix name of phipack executable
run_command(command)
try:
mumif = open(os.path.join(outputDir, "all.mumi"),'r')
-@@ -1155,14 +1155,14 @@
+@@ -1161,14 +1161,14 @@
if command == "" and xtrafast and 0:
command = "%s/parsnpA_fast %sparsnpAligner.ini"%(PARSNP_DIR,outputDir+os.sep)
elif command == "":
@@ -62,7 +62,7 @@ Description: Fix name of phipack executable
run_command(command)
if not os.path.exists(os.path.join(outputDir, "parsnpAligner.xmfa")):
-@@ -1366,7 +1366,7 @@
+@@ -1375,7 +1375,7 @@
break
if not use_fasttree:
with tempfile.TemporaryDirectory() as raxml_output_dir:
=====================================
debian/patches/series
=====================================
@@ -4,3 +4,4 @@ proper_calls_to_tools.patch
drop_m64.patch
fix_build_with_as-needed.patch
non-versioned-libs.patch
+blhc-fix.patch
=====================================
debian/watch
=====================================
@@ -1,4 +1,4 @@
version=4
opts="repacksuffix=+dfsg,dversionmangle=s/\+dfsg//g,repack,compression=xz" \
- https://github.com/marbl/parsnp/releases .*/archive/v(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)
+ https://github.com/marbl/parsnp/releases .*/archive/.*/v(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)
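The effect of the extra `.*/` in the new watch pattern can be checked with a small Python sketch. The two link paths below are assumptions chosen for illustration (a tarball directly under `archive/` versus one under `archive/refs/tags/`); they are not copied from the release page. The sketch also assumes the pattern must match the whole link, and the helper name `version` is hypothetical.

```python
import re

# Patterns copied from the old and new debian/watch lines.
OLD = r".*/archive/v(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)"
NEW = r".*/archive/.*/v(\d[\d.-]+)\.(?:tar(?:\.gz|\.bz2)?|tgz)"

# Assumed example links, for illustration only.
old_style = "/marbl/parsnp/archive/v1.5.4.tar.gz"
new_style = "/marbl/parsnp/archive/refs/tags/v1.5.6.tar.gz"


def version(pattern, href):
    """Return the captured version if the whole link matches, else None."""
    m = re.fullmatch(pattern, href)
    return m.group(1) if m else None


if __name__ == "__main__":
    for name, pattern in (("old", OLD), ("new", NEW)):
        for href in (old_style, new_style):
            print(name, href, version(pattern, href))
```

Under these assumptions the old pattern only accepts links with the tarball directly under `archive/`, while the new one requires at least one intermediate path component before the `v<version>` file name.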
=====================================
parsnp
=====================================
@@ -14,7 +14,7 @@ import signal
import inspect
from multiprocessing import *
-__version__ = "1.5.4"
+__version__ = "1.5.6"
reroot_tree = True #use --midpoint-reroot
try:
@@ -475,6 +475,10 @@ def parse_args():
"--use-fasttree",
action = "store_true",
help = "Use fasttree instead of RaxML")
+ misc_args.add_argument(
+ "--vcf",
+ action = "store_true",
+ help = "Generate VCF file.")
misc_args.add_argument(
"-p",
"--threads",
@@ -583,6 +587,7 @@ if __name__ == "__main__":
genbank_ref = ""
reflen = 0
use_gingr = ""
+ generate_vcf = args.vcf
filtreps = False
repfile = ""
@@ -759,9 +764,10 @@ if __name__ == "__main__":
logger.critical("No seqs provided, yet required. exit!")
sys.exit(0) # TODO Should this exit value be 0?
elif not ref and query:
- logger.warning("No reference genome specified, going to autopick from %s as closest to %s\n"%(seqdir, query))
+ logger.warning("No reference genome specified, going to autopick from input as closest to %s\n"%(query))
autopick_ref = True
ref = query
+ print("Ref", ref)
logger.info("""
{}
@@ -1357,6 +1363,8 @@ Please verify recruited genomes are all strain of interest""")
run_command("harvesttools -q -b %s,REP,\"Intragenomic repeats > 100bp\" -o %s/parsnp.ggr -i %s/parsnp.ggr"%(repfile,outputDir,outputDir))
run_command("harvesttools -q -i %s/parsnp.ggr -S "%(outputDir)+outputDir+os.sep+"parsnp.snps.mblocks")
+ if generate_vcf:
+ run_command("harvesttools -q -i %s/parsnp.ggr -V "%(outputDir)+outputDir+os.sep+"parsnp.vcf")
logger.info("Reconstructing core genome phylogeny...")
with open(os.path.join(outputDir, "parsnp.snps.mblocks")) as mblocks:
@@ -1477,6 +1485,8 @@ Please verify recruited genomes are all strain of interest""")
if not VERBOSE and os.path.exists("%s/all_mumi.ini"%(outputDir)):
os.remove("%s/all_mumi.ini"%(outputDir))
+ if not VERBOSE and os.path.exists("%s/tmp"%(outputDir)):
+ shutil.rmtree("%s/tmp"%(outputDir))
if os.path.exists("%s/parsnp.snps.mblocks"%(outputDir)):
os.remove("%s/parsnp.snps.mblocks"%(outputDir))
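The `--vcf` change above follows a common argparse pattern: a `store_true` flag that gates one extra external command. A minimal sketch of that pattern follows. The helper `export_commands` is hypothetical and only builds the command strings rather than running them; the harvesttools options are the ones shown in the diff.

```python
import argparse
import os


def parse_args(argv):
    """Parse only the flag added in this push (the real parser has many more)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--vcf", action="store_true", help="Generate VCF file.")
    return parser.parse_args(argv)


def export_commands(output_dir, generate_vcf):
    """Build the harvesttools command lines without executing them (hypothetical)."""
    ggr = "%s/parsnp.ggr" % output_dir
    commands = [
        "harvesttools -q -i %s -S %s"
        % (ggr, os.path.join(output_dir, "parsnp.snps.mblocks"))
    ]
    if generate_vcf:
        # Only added when --vcf was given, as in the diff above.
        commands.append(
            "harvesttools -q -i %s -V %s"
            % (ggr, os.path.join(output_dir, "parsnp.vcf"))
        )
    return commands


if __name__ == "__main__":
    args = parse_args(["--vcf"])
    for cmd in export_commands("out", args.vcf):
        print(cmd)
```

Without `--vcf` the flag defaults to `False`, so only the existing `-S` export is produced.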
View it on GitLab: https://salsa.debian.org/med-team/parsnp/-/compare/7bf28601bd95efc84dfc48fd0e910799032b984a...e046fee0ce35b095be110877a22fa365939e7d86