[med-svn] [Git][med-team/segemehl][master] 7 commits: Adapt build to new version

Andreas Tille gitlab at salsa.debian.org
Fri Oct 5 11:02:08 BST 2018


Andreas Tille pushed to branch master at Debian Med / segemehl


Commits:
ad517e8b by Andreas Tille at 2018-10-05T07:49:23Z
Adapt build to new version

- - - - -
3b747e9a by Andreas Tille at 2018-10-05T07:52:57Z
Refresh hardening patch

- - - - -
c7639729 by Andreas Tille at 2018-10-05T07:56:19Z
Do not specify rpath

- - - - -
66d161c1 by Andreas Tille at 2018-10-05T08:31:30Z
cme fix dpkg-control

- - - - -
6c2bbee6 by Andreas Tille at 2018-10-05T08:53:05Z
Enhance description

- - - - -
05cb4b1f by Andreas Tille at 2018-10-05T09:51:30Z
Rewritten manpages

- - - - -
bba0c690 by Andreas Tille at 2018-10-05T10:01:45Z
Fix spelling

- - - - -


16 changed files:

- debian/changelog
- − debian/clean
- debian/compat
- debian/control
- + debian/createmanpages
- + debian/haarz.1
- debian/install
- debian/manpages
- − debian/mans/lack.x.1
- − debian/mans/testrealign.x.1
- debian/patches/hardening.patch
- + debian/patches/rpath.patch
- debian/patches/series
- + debian/patches/spelling.patch
- debian/rules
- debian/mans/segemehl.x.1 → debian/segemehl.1


Changes:

=====================================
debian/changelog
=====================================
@@ -1,5 +1,5 @@
-segemehl (0.2.0+dfsg-1) UNRELEASED; urgency=medium
+segemehl (0.3-1) UNRELEASED; urgency=medium
 
   * Initial release (Closes: #<bug>)
 
- -- Andreas Tille <tille at debian.org>  Mon, 20 Jun 2016 12:09:31 +0200
+ -- Andreas Tille <tille at debian.org>  Fri, 05 Oct 2018 09:54:55 +0200


=====================================
debian/clean deleted
=====================================
@@ -1 +0,0 @@
-segemehl/*.x


=====================================
debian/compat
=====================================
@@ -1 +1 @@
-9
+11


=====================================
debian/control
=====================================
@@ -3,12 +3,14 @@ Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.
 Uploaders: Andreas Tille <tille at debian.org>
 Section: science
 Priority: optional
-Build-Depends: debhelper (>= 9),
+Build-Depends: debhelper (>= 11),
+               pkg-config,
+               libhts-dev,
                libncurses-dev,
                zlib1g-dev
-Standards-Version: 3.9.8
-Vcs-Browser: https://anonscm.debian.org/cgit/debian-med/segemehl.git
-Vcs-Git: https://anonscm.debian.org/git/debian-med/segemehl.git
+Standards-Version: 4.2.1
+Vcs-Browser: https://salsa.debian.org/med-team/segemehl
+Vcs-Git: https://salsa.debian.org/med-team/segemehl.git
 Homepage: http://www.bioinf.uni-leipzig.de/Software/segemehl/
 
 Package: segemehl
@@ -16,12 +18,22 @@ Architecture: any
 Depends: ${shlibs:Depends},
          ${misc:Depends}
 Description: short read mapping with gaps
- segemehl is a software to map short sequencer reads to reference
- genomes. Unlike other methods, segemehl is able to detect not only
- mismatches but also insertions and deletions. Furthermore, segemehl
- is not limited to a specific read length and is able to mapprimer-
- or polyadenylation contaminated reads correctly. segemehl implements
- a matching strategy based on enhanced suffix arrays (ESA). Segemehl
- now supports the SAM format, reads gziped queries to save both disk
- and memory space and allows bisulfite sequencing mapping and split
- read mapping.
+ Segemehl is a software to map short sequencer reads to reference
+ genomes. Segemehl implements a matching strategy based on enhanced
+ suffix arrays (ESA). Segemehl accepts fasta and fastq queries (gzip’ed
+ and bgzip'ed). In addition to the alignment of reads from standard DNA-
+ and RNA-seq protocols, it also allows the mapping of bisulfite converted
+ reads (Lister and Cokus) and implements a split read mapping strategy.
+ The output of segemehl is a SAM or BAM formatted alignment file. In the
+ case of split-read mapping, additional BED files are written to the
+ disc. These BED files may be summarized with the postprocessing tool
+ haarz. In the case of the alignment of bisulfite converted reads, raw
+ methylation rates may also be called with haarz.
+ .
+ In brief, for each suffix of a read, segemehl aims to find the
+ best-scoring seed. Seeds might contain insertions, deletions, and
+ mismatches (differences). The number of differences allowed within a
+ single seed is user-controlled and is crucial for the runtime of the
+ program.  Subsequently, seeds that undercut the user-defined E-value are
+ passed on to an exact semi-global alignment procedure. Finally, reads
+ with a minimum accuracy of percent are reported to the user.


=====================================
debian/createmanpages
=====================================
@@ -0,0 +1,33 @@
+#!/bin/sh
+MANDIR=debian
+mkdir -p $MANDIR
+
+VERSION=`dpkg-parsechangelog | awk '/^Version:/ {print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'`
+NAME=`grep "^Description:" debian/control | sed 's/^Description: *//' | head -n1`
+PROGNAME=`grep "^Package:" debian/control | sed 's/^Package: *//'`
+
+AUTHOR=".SH AUTHOR\nThis manpage was written by $DEBFULLNAME for the Debian distribution and
+can be used for any other usage of the program.
+"
+
+# If program name is different from package name or title should be
+# different from package short description change this here
+progname=${PROGNAME}
+help2man --no-info --no-discard-stderr --help-option=" " \
+         --name="$NAME" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+progname=haarz
+help2man --no-info --no-discard-stderr --help-option=" " \
+         --name="Heuristic mapping of short sequences" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+echo "$MANDIR/*.1" > debian/manpages
+
+cat <<EOT
+Please enhance the help2man output.
+The following web page might be helpful in doing so:
+    http://liw.fi/manpages/
+EOT
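
For reference, the VERSION pipeline in debian/createmanpages can be tried on its own. This is a minimal sketch, assuming the Debian version string is passed in as an argument rather than read from dpkg-parsechangelog. The function name upstream_version is hypothetical; the three sed expressions are copied from the script above.

```shell
#!/bin/sh
# Hypothetical helper: reduce a Debian version string to the upstream version
# using the same sed expressions as debian/createmanpages.
upstream_version() {
    # 1st expression: drop a leading epoch such as "1:"
    # 2nd expression: drop everything from the first hyphen (the Debian revision)
    # 3rd expression: drop a trailing "+dfsg" or "~dfsg" repack suffix
    printf '%s\n' "$1" | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'
}
```

With the changelog entries in this commit, `upstream_version 0.3-1` yields `0.3` and `upstream_version 0.2.0+dfsg-1` yields `0.2.0`. Because the second expression cuts at the first hyphen, an upstream version that itself contains a hyphen would be truncated.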


=====================================
debian/haarz.1
=====================================
@@ -0,0 +1,29 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.7.
+.TH HAARZ "1" "October 2018" "haarz 0.3" "User Commands"
+.SH NAME
+haarz \- Heuristic mapping of short sequences
+.SH DESCRIPTION
+The program haarz belongs to the segemehl package.
+.SH SYNOPSIS
+.B haarz
+<program>
+.SH OPTIONS
+.SS available programs
+.TP
+callmethyl
+generate methylation vcf from bam
+.TP
+methylstring
+get SAM file with methylation string annotation
+.TP
+split
+summarize and annotate segemehl split info
+.SH BUGS
+Please report bugs to steve at bioinf.uni\-leipzig.de
+.SH REFERENCES
+.IP
+2008 Bioinformatik Leipzig
+.IP
+2018 Leibniz Institute on Aging (FLI)
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.


=====================================
debian/install
=====================================
@@ -1 +1,2 @@
-segemehl/*.x	usr/bin
+segemehl	usr/bin
+haarz		usr/bin


=====================================
debian/manpages
=====================================
@@ -1 +1 @@
-debian/mans/*.1
+debian/*.1


=====================================
debian/mans/lack.x.1 deleted
=====================================
@@ -1,62 +0,0 @@
-.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.4.
-.TH LACK.X "1" "June 2016" "lack.x 0.2.0" "User Commands"
-.SH NAME
-lack.x \- Remapping of unmapped reads
-.SH SYNOPSIS
-.B lack.x
-[\-s] \fB\-d\fR <file> [<file> ...] \fB\-q\fR <file> [<file> ...] [\-o <string>] \fB\-r\fR <file> [\-u <file>] [\-t <n>] [\-A <n>] [\-W <n>] [\-U <n>] [\-Z <n>] [\-M <n>]
-.SH DESCRIPTION
-This program belongs to the segemehl package (see segemehl(1))
-.P
-Segemehl is a software to map short sequencer reads to reference
-genomes. Unlike other methods, segemehl is able to detect not only
-mismatches but also insertions and deletions. Furthermore, segemehl
-is not limited to a specific read length and is able to mapprimer-
-or polyadenylation contaminated reads correctly. segemehl implements
-a matching strategy based on enhanced suffix arrays (ESA). Segemehl
-now supports the SAM format, reads gziped queries to save both disk
-and memory space and allows bisulfite sequencing mapping and split
-read mapping.
-.SH OPTIONS
-.TP
-\fB\-d\fR, \fB\-\-database\fR <file> [<file> ...]
-list of path/filename(s) of database sequence(s)
-.TP
-\fB\-q\fR, \fB\-\-query\fR <file> [<file> ...]
-path/filename of alignment file
-.TP
-\fB\-o\fR, \fB\-\-outfile\fR <string>
-outputfile (default:none)
-.TP
-\fB\-r\fR, \fB\-\-remapfilename\fR <file>
-filename for reads to be remapped (default:none)
-.TP
-\fB\-u\fR, \fB\-\-nomatchfilename\fR <file>
-filename for unmatched reads (default:none)
-.TP
-\fB\-t\fR, \fB\-\-threads\fR <n>
-start <n> threads for remapping (default:1)
-.TP
-\fB\-s\fR, \fB\-\-silent\fR
-shut up!
-.TP
-\fB\-A\fR, \fB\-\-accuracy\fR <n>
-min percentage of matches per read in semi\-global alignment (default:90)
-.TP
-\fB\-W\fR, \fB\-\-minsplicecover\fR <n>
-min coverage for spliced transcripts (default:80)
-.TP
-\fB\-U\fR, \fB\-\-minfragscore\fR <n>
-min score of a spliced fragment (default:5)
-.TP
-\fB\-Z\fR, \fB\-\-minfraglen\fR <n>
-min length of a spliced fragment (default:5)
-.TP
-\fB\-M\fR, \fB\-\-maxdist\fR <n>
-max number of distant sites to consider, 0 to disable (default:100)
-.SH BUGS
-Please report bugs to christian at bioinf.uni\-leipzig.de
-.SH AUTHOR
-This software was written by Christian Otto and others at Bioinformatik Leipzig
-.P
-This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.


=====================================
debian/mans/testrealign.x.1 deleted
=====================================
@@ -1,56 +0,0 @@
-.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.4.
-.TH TESTREALIGN.X "1" "June 2016" "testrealign.x 0.2.0" "User Commands"
-.SH NAME
-testrealign.x \- Heuristic mapping of short sequences
-.SH SYNOPSIS
-.B testrealign.x
-[\-Evn] \fB\-d\fR <file> [<file> ...] \fB\-q\fR <file> [<file> ...] [\-t <n>] [\-U <file>] [\-T <file>] [\-o <file>] [\-M <n>]
-.SH DESCRIPTION
-This program belongs to the segemehl package (see segemehl(1))
-.P
-Segemehl is a software to map short sequencer reads to reference
-genomes. Unlike other methods, segemehl is able to detect not only
-mismatches but also insertions and deletions. Furthermore, segemehl
-is not limited to a specific read length and is able to mapprimer-
-or polyadenylation contaminated reads correctly. segemehl implements
-a matching strategy based on enhanced suffix arrays (ESA). Segemehl
-now supports the SAM format, reads gziped queries to save both disk
-and memory space and allows bisulfite sequencing mapping and split
-read mapping.
-.SH OPTIONS
-.TP
-\fB\-d\fR, \fB\-\-database\fR <file> [<file> ...]
-list of path/filename(s) of database sequence(s)
-.TP
-\fB\-q\fR, \fB\-\-query\fR <file> [<file> ...]
-path/filename of alignment file
-.TP
-\fB\-E\fR, \fB\-\-expand\fR
-expand
-.TP
-\fB\-v\fR, \fB\-\-verbose\fR
-verbose
-.TP
-\fB\-n\fR, \fB\-\-norealign\fR
-do not realign
-.TP
-\fB\-t\fR, \fB\-\-threads\fR <n>
-start <n> threads for realigning (default:1)
-.TP
-\fB\-U\fR, \fB\-\-splitfile\fR <file>
-path/filename of the split bedfile (default:"splicesites.bed")
-.TP
-\fB\-T\fR, \fB\-\-transfile\fR <file>
-path/filename of bed files containing trans\-split (default:"transrealigned.bed")
-.TP
-\fB\-o\fR, \fB\-\-outfile\fR <file>
-path/filename of output sam file (default:none)
-.TP
-\fB\-M\fR, \fB\-\-maxdist\fR <n>
-max number of distant sites to consider, 0 to disable (default:100)
-.SH BUGS
-Please report bugs to steve at bioinf.uni\-leipzig.de
-.SH AUTHOR
-This software was written by Christian Otto and others at Bioinformatik Leipzig
-.P
-This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.


=====================================
debian/patches/hardening.patch
=====================================
@@ -1,16 +1,28 @@
 Author: Andreas Tille <tille at debian.org>
-Last-Update: Mon, 20 Jun 2016 12:09:31 +0200
+Last-Update: Fri, 05 Oct 2018 09:54:55 +0200
 Description: Propagate hardening options
 
---- a/segemehl/Makefile
-+++ b/segemehl/Makefile
-@@ -1,7 +1,7 @@
-   CC=gcc
-   LD=${CC} 
--  CFLAGS= -Wall -pedantic -std=c99 -g -O3 -DFIXINSMALL -DFIXINBACKSPLICE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DDBGNFO -DSHOWALIGN -DDBGLEVEL=0 -DPROGNFO -Isrc -Ilibs -Ilibs/sufarray -Lsrc
--  LDFLAGS= -lm -lpthread -lz -lncurses 
-+  CFLAGS+= -Wall -pedantic -std=c99 -g -O3 -DFIXINSMALL -DFIXINBACKSPLICE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DDBGNFO -DSHOWALIGN -DDBGLEVEL=0 -DPROGNFO -Isrc -Ilibs -Ilibs/sufarray -Lsrc
-+  LDFLAGS+= -lm -lpthread -lz -lncurses 
-   CTAGS=ctags > tags
-   LIBS=-lob -lm -lpthread 
+--- a/Makefile
++++ b/Makefile
+@@ -1,10 +1,10 @@
+ CC?=gcc
+ LD=${CC}
+-CFLAGS=  -Wall -pedantic -std=c99 -g -O3 -DSORTEDUNMAPPED -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DDBGNFO -DSHOWALIGN -DDBGLEVEL=0 -DPROGNFO -Ilibs -Ilibs/sufarray -Isamtools
++CFLAGS +=  -Wall -pedantic -std=c99 -g -O3 -DSORTEDUNMAPPED -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DDBGNFO -DSHOWALIGN -DDBGLEVEL=0 -DPROGNFO -Ilibs -Ilibs/sufarray -Isamtools
+ CFLAGS += `pkg-config --cflags htslib`
+ INC := -I include
+ CTAGS = ctags > tags
+-LIB = -lm -lpthread -lz -lncurses -L libs -lform -lmenu -L/usr/local/lib/
++LIB += -lm -lpthread -lz -lncurses -L libs -lform -lmenu -L/usr/local/lib/
+ LIB += `pkg-config --libs htslib`
+ LIB += "-Wl,-rpath,`pkg-config --variable=libdir htslib`"
  
+@@ -30,7 +30,7 @@ LIBOBJECTS :=  $(patsubst $(LIBDIR)/%,$(
+ 
+ $(PRGTARGETS): $(OBJECTS)
+ 	@echo "Linking $@";	
+-	$(LD) $(LIBOBJECTS) $(BUILDDIR)/$@.o -o $(TARGETDIR)/$@$(TARGETEXT) $(LIB)
++	$(LD) $(LIBOBJECTS) $(BUILDDIR)/$@.o -o $(TARGETDIR)/$@$(TARGETEXT) $(LIB) $(LDFLAGS)
+ 
+ 
+ $(BUILDDIR)/%.o: $(LIBDIR)/%.c


=====================================
debian/patches/rpath.patch
=====================================
@@ -0,0 +1,14 @@
+Author: Andreas Tille <tille at debian.org>
+Last-Update: Fri, 05 Oct 2018 09:54:55 +0200
+Description: Do not specify rpath
+
+--- a/Makefile
++++ b/Makefile
+@@ -6,7 +6,6 @@ INC := -I include
+ CTAGS = ctags > tags
+ LIB += -lm -lpthread -lz -lncurses -L libs -lform -lmenu -L/usr/local/lib/
+ LIB += `pkg-config --libs htslib`
+-LIB += "-Wl,-rpath,`pkg-config --variable=libdir htslib`"
+ 
+ 
+ PRGTARGETS := segemehl haarz


=====================================
debian/patches/series
=====================================
@@ -1 +1,3 @@
 hardening.patch
+rpath.patch
+spelling.patch


=====================================
debian/patches/spelling.patch
=====================================
@@ -0,0 +1,86 @@
+Author: Andreas Tille <tille at debian.org>
+Last-Update: Fri, 05 Oct 2018 09:54:55 +0200
+Description: Fix spelling
+
+--- a/libs/biofiles.c
++++ b/libs/biofiles.c
+@@ -1586,7 +1586,7 @@ bl_fastxAddMate(void *space,
+ 
+     if(bl_fastaCheckMateID(f, n, descr, descrlen) == 0) {  
+     NFO("The fasta/fastq IDs in both mate files do not match.\n", NULL); 
+-    NFO("The first mismatch occured at fastq number %u\n", n);
++    NFO("The first mismatch occurred at fastq number %u\n", n);
+     NFO("Exiting.\n", NULL);
+     exit(EXIT_FAILURE);
+     }
+@@ -3448,7 +3448,7 @@ bl_annotationRead (void *space, char *fn
+   } else if(!strcmp(suf, ".gff") || !strcmp(suf, ".gff3")) {
+    annot = bl_GFFread(NULL, fn); 
+   } else {
+-    NFO("please provide a bed or gff file with the approriate extension.\n", NULL);
++    NFO("please provide a bed or gff file with the appropriate extension.\n", NULL);
+     exit(EXIT_FAILURE);
+   }
+   return annot;
+--- a/libs/gzidx.c
++++ b/libs/gzidx.c
+@@ -554,7 +554,7 @@ int bl_bgzFillStream(FILE *fp, unsigned
+   n = fread(&input[strm->avail_in], 1, CHUNK-strm->avail_in, fp);
+   if (ferror(fp)) {
+     fprintf(stderr, "error reading bgz file.\n");
+-    perror("The following error occured:");
++    perror("The following error occurred:");
+     exit(EXIT_FAILURE);
+   }
+ 
+--- a/libs/haarz.c
++++ b/libs/haarz.c
+@@ -413,7 +413,7 @@ int main(int argc,char** argv) {
+   prg = manopt_getopts(&prgset, MIN(argc,2), argv);
+ 
+   if(prg->noofvalues == 1) { 
+-    manopt_help(&prgset, "programm needs to be selected\n");
++    manopt_help(&prgset, "program needs to be selected\n");
+   }
+ 
+   manopt_initoptionset(&optset, argv[0], NULL, 
+@@ -848,7 +848,7 @@ int main(int argc,char** argv) {
+     FREEMEMORY(NULL, unflagged);
+ 
+   } else {
+-    manopt_help(&prgset, "unkown program selected\n");
++    manopt_help(&prgset, "unknown program selected\n");
+   }
+ 
+   manopt_destructoptionset(&optset);
+--- a/libs/manopt.c
++++ b/libs/manopt.c
+@@ -1039,7 +1039,7 @@ manopt_checkconstraint(manopt_optionset*
+     case MANOPT_BLOCKSEPARATOR:
+       break;
+     default:
+-      manopt_help(optset, "unkown option %s type\n", argset->args[arg].flagname);
++      manopt_help(optset, "unknown option %s type\n", argset->args[arg].flagname);
+       break;
+   }
+ 
+--- a/libs/multicharseq.c
++++ b/libs/multicharseq.c
+@@ -423,7 +423,7 @@ initMultiCharSeqAlignment(
+   a->refstart = MAX(sub_start, (Lint)pos-loff);
+   
+   if(a->refstart > sub_end) {
+-    fprintf(stderr, "refstart > substart: skiping MultiCharSeqAlignment\n");
++    fprintf(stderr, "refstart > substart: skipping MultiCharSeqAlignment\n");
+     return 0;
+   }
+ 
+@@ -480,7 +480,7 @@ initMultiCharSeqAlignmentOpt(
+ 
+   //this should not happen  
+   if(a->refstart > sub_end) {
+-    fprintf(stderr, "refstart > substart: skiping MultiCharSeqAlignment\n");
++    fprintf(stderr, "refstart > substart: skipping MultiCharSeqAlignment\n");
+     return 0;
+   }
+ 


=====================================
debian/rules
=====================================
@@ -3,4 +3,13 @@
 export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 
 %:
-	dh $@ --sourcedirectory=segemehl
+	dh $@
+
+override_dh_auto_build:
+	dh_auto_build -- all
+
+override_dh_install:
+	mv segemehl.x	segemehl
+	mv haarz.x	haarz
+	find . -name "*.x"
+	dh_install
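
The rename step in override_dh_install can also be sketched as a generic loop. This is an illustration only and not part of the commit: the function name strip_x_suffix and its directory argument are hypothetical, and the packaging itself uses the two explicit mv calls shown above.

```shell
#!/bin/sh
# Hypothetical helper: rename every "*.x" file in a directory to the same
# name without the ".x" suffix (e.g. segemehl.x -> segemehl).
strip_x_suffix() {
    dir=$1
    for f in "$dir"/*.x; do
        # If nothing matches, the glob stays literal; skip it in that case.
        [ -e "$f" ] || continue
        mv "$f" "${f%.x}"
    done
}
```

A loop like this would pick up any additional `.x` binaries a future upstream release might build, at the cost of being less explicit about which programs end up in the package.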


=====================================
debian/mans/segemehl.x.1 → debian/segemehl.1
=====================================
@@ -1,27 +1,37 @@
-.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.4.
-.TH SEGEMEHL.X "1" "June 2016" "segemehl.x 0.2.0" "User Commands"
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.7.
+.TH SEGEMEHL "1" "October 2018" "segemehl 0.3" "User Commands"
 .SH NAME
-segemehl.x \- Heuristic mapping of short sequences
+segemehl \- Heuristic mapping of short sequences
 .SH SYNOPSIS
-.B segemehl.x
-[\-sbcKVTYCO] \fB\-d\fR <file> [<file>] [\-q <file>] [\-p <file>] [\-i <file>] [\-j <file>] [\-x <file>] [\-y <file>] [\-B <string>] [\-F <n>] [\-m <n>] [\-t <n>] [\-o <string>] [\-u <file>] [\-D <n>] [\-J <n>]
-[\-E <double>] [\-w <double>] [\-M <n>] [\-r <n>] [\-S] [\-\-nohead] [\-e <n>] [\-n <n>] [\-X <n>] [\-A <n>] [\-W <n>] [\-U <n>] [\-Z <n>] [\-l <f>] [\-H] [\-\-showalign] [\-P <string>] [\-Q <string>]
-[\-R <n>] [\-I <n>]
+.B segemehl
+[\-besVOc] \fB\-d\fR <file> [<file>] [\-q <file>] [\-p <file>] [\-i <file>] [\-j <file>] [\-x <file>] [\-y <file>] [\-G <file>] [\-g <string>] [\-t <n>] [\-o <string>] [\-u <file>] [\-B <string>] [\-F <n>]
+[\-S [<basename>]] [\-A <n>] [\-D <n>] [\-E <double>] [\-H] [\-m <n>] [\-Z <n>] [\-W <n>] [\-U <n>] [\-l <f>] [\-w <double>] [\-X <n>] [\-J <n>] [\-I <n>] [\-M <n>] [\-n <n>] [\-r <n>] [\-\-skipidcheck]
+[\-\-showalign] [\-\-nohead]
 .SH DESCRIPTION
 Segemehl is a software to map short sequencer reads to reference
-genomes. Unlike other methods, segemehl is able to detect not only
-mismatches but also insertions and deletions. Furthermore, segemehl
-is not limited to a specific read length and is able to mapprimer-
-or polyadenylation contaminated reads correctly. segemehl implements
-a matching strategy based on enhanced suffix arrays (ESA). Segemehl
-now supports the SAM format, reads gziped queries to save both disk
-and memory space and allows bisulfite sequencing mapping and split
-read mapping.
+genomes. Segemehl implements a matching strategy based on enhanced
+suffix arrays (ESA). Segemehl accepts fasta and fastq queries (gzip’ed
+and bgzip'ed). In addition to the alignment of reads from standard DNA-
+and RNA-seq protocols, it also allows the mapping of bisulfite converted
+reads (Lister and Cokus) and implements a split read mapping strategy.
+The output of segemehl is a SAM or BAM formatted alignment file. In the
+case of split-read mapping, additional BED files are written to the
+disc. These BED files may be summarized with the postprocessing tool
+haarz. In the case of the alignment of bisulfite converted reads, raw
+methylation rates may also be called with haarz.
+.P
+In brief, for each suffix of a read, segemehl aims to find the
+best-scoring seed. Seeds might contain insertions, deletions, and
+mismatches (differences). The number of differences allowed within a
+single seed is user-controlled and is crucial for the runtime of the
+program.  Subsequently, seeds that undercut the user-defined E-value are
+passed on to an exact semi-global alignment procedure. Finally, reads
+with a minimum accuracy of percent are reported to the user.
 .SH OPTIONS
-.SS Input options
+.SS INPUT
 .TP
 \fB\-d\fR, \fB\-\-database\fR <file> [<file>]
-list of path/filename(s) of database sequence(s)
+list of path/filename(s) of fasta database sequence(s)
 .TP
 \fB\-q\fR, \fB\-\-query\fR <file>
 path/filename of query sequences (default:none)
@@ -41,78 +51,61 @@ generate db index and store to disk (default:none)
 \fB\-y\fR, \fB\-\-generate2\fR <file>
 generate second db index and store to disk (default:none)
 .TP
-\fB\-B\fR, \fB\-\-filebins\fR <string>
-file bins with basename <string> for easier data handling (default:none)
-.TP
-\fB\-F\fR, \fB\-\-bisulfite\fR <n>
-bisulfite mapping with methylC\-seq/Lister et al. (=1) or bs\-seq/Cokus et al. protocol (=2) (default:0)
-.SS General options
+\fB\-G\fR, \fB\-\-readgroupfile\fR <file>
+filename to read @RG header (default:none)
 .TP
-\fB\-m\fR, \fB\-\-minsize\fR <n>
-minimum size of queries (default:12)
-.TP
-\fB\-s\fR, \fB\-\-silent\fR
-shut up!
-.TP
-\fB\-b\fR, \fB\-\-brief\fR
-brief output
-.TP
-\fB\-c\fR, \fB\-\-checkidx\fR
-check index
+\fB\-g\fR, \fB\-\-readgroupid\fR <string>
+read group id (default:none)
 .TP
 \fB\-t\fR, \fB\-\-threads\fR <n>
 start <n> threads (default:1)
+.SS OUTPUT
 .TP
 \fB\-o\fR, \fB\-\-outfile\fR <string>
 outputfile (default:none)
 .TP
+\fB\-b\fR, \fB\-\-bamabafixoida\fR
+generate a bam output (\fB\-o\fR <filename> required)
+.TP
 \fB\-u\fR, \fB\-\-nomatchfilename\fR <file>
 filename for unmatched reads (default:none)
-.SS Options for SEEDPARAMS
-.TP
-\fB\-D\fR, \fB\-\-differences\fR <n>
-search seeds initially with <n> differences (default:1)
-.TP
-\fB\-J\fR, \fB\-\-jump\fR <n>
-search seeds with jump size <n> (0=automatic) (default:0)
 .TP
-\fB\-E\fR, \fB\-\-evalue\fR <double>
-max evalue (default:5.000000)
+\fB\-e\fR, \fB\-\-briefcigar\fR
+brief cigar string (M vs X and =)
 .TP
-\fB\-w\fR, \fB\-\-maxsplitevalue\fR <double>
-max evalue for splits (default:50.000000)
+\fB\-s\fR, \fB\-\-progressbar\fR
+show a progress bar
 .TP
-\fB\-M\fR, \fB\-\-maxinterval\fR <n>
-maximum width of a suffix array interval, i.e. a query seed will be omitted if it matches more than <n> times (default:100)
+\fB\-B\fR, \fB\-\-filebins\fR <string>
+file bins with basename <string> for easier data handling (default:none)
 .TP
-\fB\-r\fR, \fB\-\-maxout\fR <n>
-maximum number of alignments that will be reported. If set to zero, all alignments will be reported (default:0)
+\fB\-V\fR, \fB\-\-MEOP\fR
+output MEOP field for easier variance calling in SAM (XE:Z:)
+.SS ALIGNMENT
 .TP
-\fB\-S\fR, \fB\-\-splits\fR
-detect split/spliced reads (default:none)
+\fB\-F\fR, \fB\-\-bisulfite\fR <n>
+bisulfite aln with methylC\-seq/Lister et al. (=1) or bs\-seq/Cokus et al. protocol (=2) (default:0)
 .TP
-\fB\-K\fR, \fB\-\-SEGEMEHL\fR
-output SEGEMEHL format (needs to be selected for brief)
+\fB\-S\fR, \fB\-\-splits\fR [<basename>]
+detect split/spliced reads. (default:none)
 .TP
-\fB\-V\fR, \fB\-\-MEOP\fR
-output MEOP field for easier variance calling in SAM (XE:Z:)
+\fB\-A\fR, \fB\-\-accuracy\fR <n>
+min percentage of matches per read in semi\-global alignment (default:90)
 .TP
-\fB\-\-nohead\fR
-do not output header
-.SS Options for SEEDEXTENSIONPARAMS
+\fB\-D\fR, \fB\-\-differences\fR <n>
+search seeds initially with <n> differences (default:1)
 .TP
-\fB\-e\fR, \fB\-\-extensionscore\fR <n>
-score of a match during extension (default:2)
+\fB\-E\fR, \fB\-\-evalue\fR <double>
+max evalue (default:5.000000)
 .TP
-\fB\-n\fR, \fB\-\-extensionpenalty\fR <n>
-penalty for a mismatch during extension (default:4)
+\fB\-H\fR, \fB\-\-hitstrategy\fR
+report only best scoring hits (=1) or all (=0) (default:1)
 .TP
-\fB\-X\fR, \fB\-\-dropoff\fR <n>
-dropoff parameter for extension (default:8)
-.SS Options for ALIGNPARAMS
+\fB\-m\fR, \fB\-\-minsize\fR <n>
+minimum length of queries (default:12)
 .TP
-\fB\-A\fR, \fB\-\-accuracy\fR <n>
-min percentage of matches per read in semi\-global alignment (default:90)
+\fB\-Z\fR, \fB\-\-minfraglen\fR <n>
+min length of a spliced fragment (default:20)
 .TP
 \fB\-W\fR, \fB\-\-minsplicecover\fR <n>
 min coverage for spliced transcripts (default:80)
@@ -120,44 +113,53 @@ min coverage for spliced transcripts (default:80)
 \fB\-U\fR, \fB\-\-minfragscore\fR <n>
 min score of a spliced fragment (default:18)
 .TP
-\fB\-Z\fR, \fB\-\-minfraglen\fR <n>
-min length of a spliced fragment (default:20)
-.TP
 \fB\-l\fR, \fB\-\-splicescorescale\fR <f>
-report spliced alignment with score s only if <f>*s is larger than next best spliced alignment (default:1.000000)
+report spliced alignment with score s only if <f>*s is larger than next best spliced alignment (default:0.900000)
 .TP
-\fB\-H\fR, \fB\-\-hitstrategy\fR
-report only best scoring hits (=1) or all (=0) (default:1)
+\fB\-w\fR, \fB\-\-maxsplitevalue\fR <double>
+max evalue for splits (default:50.000000)
+.SS SPECIAL
 .TP
-\fB\-\-showalign\fR
-show alignments
+\fB\-X\fR, \fB\-\-dropoff\fR <n>
+dropoff parameter for extension (default:8)
 .TP
-\fB\-P\fR, \fB\-\-prime5\fR <string>
-add 5' adapter (default:none)
+\fB\-J\fR, \fB\-\-jump\fR <n>
+search seeds with jump size <n> (0=automatic) (default:0)
 .TP
-\fB\-Q\fR, \fB\-\-prime3\fR <string>
-add 3' adapter (default:none)
+\fB\-O\fR, \fB\-\-order\fR
+sorts the output by chromsome and position (might take a while!)
 .TP
-\fB\-R\fR, \fB\-\-clipacc\fR <n>
-clipping accuracy (default:70)
+\fB\-I\fR, \fB\-\-maxpairinsertsize\fR <n>
+maximum size of the inserts (paired end) in case of multiple hits (default:200000)
 .TP
-\fB\-T\fR, \fB\-\-polyA\fR
-clip polyA tail
+\fB\-M\fR, \fB\-\-maxinterval\fR <n>
+maximum width of a suffix array interval, i.e. a query seed will be omitted if it matches more than <n> times (default:100)
 .TP
-\fB\-Y\fR, \fB\-\-autoclip\fR
-autoclip unknown 3prime adapter
+\fB\-c\fR, \fB\-\-checkidx\fR
+check index
 .TP
-\fB\-C\fR, \fB\-\-hardclip\fR
-enable hard clipping
+\fB\-n\fR, \fB\-\-extensionpenalty\fR <n>
+penalty for a mismatch during extension (default:4)
 .TP
-\fB\-O\fR, \fB\-\-order\fR
-sorts the output by chromsome and position (might take a while!)
+\fB\-r\fR, \fB\-\-maxout\fR <n>
+maximum number of alignments that will be reported. If set to zero, all alignments will be reported (default:0)
+.TP
+\fB\-\-skipidcheck\fR
+do not check whether the fastq ids of mates / paired ends match. Instead, the first mate (\fB\-q\fR) will be used for output only.
+.TP
+\fB\-\-showalign\fR
+show alignments
 .TP
-\fB\-I\fR, \fB\-\-maxinsertsize\fR <n>
-maximum size of the inserts (paired end) (default:5000)
+\fB\-\-nohead\fR
+do not output header
 .SH BUGS
 Please report bugs to steve at bioinf.uni\-leipzig.de
+.SH SEE ALSO
+http://www.bioinf.uni-leipzig.de/Software/segemehl/
+.SH REFERENCES
+.IP
+2008 Bioinformatik Leipzig
+.IP
+2018 Leibniz Institute on Aging (FLI)
 .SH AUTHOR
-This software was written by Christian Otto and others at Bioinformatik Leipzig
-.P
 This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.



View it on GitLab: https://salsa.debian.org/med-team/segemehl/compare/ec3bb6023740da34a386476c2181be528a28526a...bba0c690b6bc9e911fb54474ff4acdaba4f004d4
