[med-svn] [gmap] 01/03: d/copyright: - refresh copyright, add not mentioned before files - add Comment that code was influenced by another software updated man pages d/rules: do not generate manpages with help2man d/docs: install NOTICE
Alex Mestiashvili
malex-guest at moszumanska.debian.org
Thu Jul 3 15:25:25 UTC 2014
This is an automated email from the git hooks/post-receive script.
malex-guest pushed a commit to branch master
in repository gmap.
commit b66b9052501c02544b38e8059b630a42c8a48d4d
Author: Alexandre Mestiashvili <alex at biotec.tu-dresden.de>
Date: Thu Jul 3 15:33:10 2014 +0200
d/copyright:
- refresh copyright, add not mentioned before files
- add Comment that code was influenced by another software
updated man pages
d/rules: do not generate manpages with help2man
d/docs: install NOTICE
---
debian/changelog | 1 +
debian/copyright | 31 +++
debian/docs | 1 +
debian/gmap.1 | 300 ++++++++++++---------
debian/gmap_build.1 | 93 +++++++
debian/gsnap.1 | 545 +++++++++++++++++++++-----------------
debian/patches/install-data-local | 2 +
debian/rules | 14 -
8 files changed, 601 insertions(+), 386 deletions(-)
diff --git a/debian/changelog b/debian/changelog
index db2f2e9..459ccef 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -11,6 +11,7 @@ gmap (2014-06-10-1) UNRELEASED; urgency=medium
- use as bindir /usr/lib/gmap
- generate map pages with help of help2man
* d/install: removed as all files are installed in /usr/lib/gmap
+ * d/copyright: - refresh copyright, add not mentioned before files - add Comment that code was influenced by another software updated man pages d/rules: do not generate manpages with help2man d/docs: install NOTICE
-- Alexandre Mestiashvili <alex at biotec.tu-dresden.de> Mon, 30 Jun 2014 13:04:45 +0200
diff --git a/debian/copyright b/debian/copyright
index e00feee..3397e8f 100644
--- a/debian/copyright
+++ b/debian/copyright
@@ -37,15 +37,46 @@ License: other
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
GENENTECH, INC. HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT,
UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
+Comment:
+ Some inspiration and code was taken from
+ https://github.com/lemire/FastPFor,
+ covered by Apache-2.0 License
Files: src/getopt*
Copyright: 2002 Free Software Foundation, Inc.
License: LGPL-2.1+
See `/usr/share/common-licenses/LGPL'.
+Files: src/saca-k*
+Copyright: 2012 Ge Nong <issng at mail.sysu.edu.cn>
+License: MIT
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+ .
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+ .
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ THE SOFTWARE.
+
+Files: src/fastlog.h
+Copyright: 2012 Paul Mineiro <paul at mineiro.com>
+License: BSD
+ See `/usr/share/common-licenses/BSD'.
+
Files: debian/*
Copyright: 2011 Shaun Jackman <sjackman at debian.org>
2012 Andreas Tille <tille at debian.org>
+ 2014 Alex Mestiashvili <alex at biotec.tu-dresden.de>
License: ISC
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
diff --git a/debian/docs b/debian/docs
index e845566..1f24c64 100644
--- a/debian/docs
+++ b/debian/docs
@@ -1 +1,2 @@
README
+NOTICE
diff --git a/debian/gmap.1 b/debian/gmap.1
index b8c6080..2a859c7 100644
--- a/debian/gmap.1
+++ b/debian/gmap.1
@@ -1,151 +1,156 @@
-.TH GMAP "1" "Jun 2012" "GMAP 2012-06-12" "User Commands"
+.TH GMAP "1" "July 2014" "GMAP 2014-06-10" "User Commands"
.SH NAME
gmap \- Genomic Mapping and Alignment Program
.SH SYNOPSIS
.B gmap
-\fB-d\fR\fIDB\fR|\fB-g\fR\fIFASTA\fR [\fIOPTION\fR]... [\fIQUERY\fR]...
+[\fI\,OPTIONS\/\fR...] \fI\,<FASTA files\/\fR...\fI\,>, or\/\fR cat <FASTA files...> | gmap [OPTIONS...]
.SH DESCRIPTION
Align the sequences QUERY to the reference, specified with
\fB-d\fR or \fB-g\fR.
-With no QUERY, read standard input.
.SH OPTIONS
-.SS Input options
+.SS Input options (must include \fB\-d\fR or \fB\-g\fR)
.TP
-\fB\-D\fR, \fB\-\-dir\fR=\fIdirectory\fR
+\fB\-D\fR, \fB\-\-dir\fR=\fI\,directory\/\fR
Genome directory
.TP
-\fB\-d\fR, \fB\-\-db\fR=\fISTRING\fR
-Genome database. If argument is '?' (with
+\fB\-d\fR, \fB\-\-db\fR=\fI\,STRING\/\fR
+Genome database. If argument is '?' (with
the quotes), this command lists available databases.
.TP
-\fB-k\fR, \fB--kmer\fR=\fIINT\fR
+\fB\-k\fR, \fB\-\-kmer\fR=\fI\,INT\/\fR
kmer size to use in genome database (allowed values: 16 or less).
If not specified, the program will find the highest available
kmer size in the genome database
.TP
-\fB--basesize\fR=\fIINT\fR
-Base size to use in genome database. If not specified, the program
-will find the highest available base size in the genome database
-within selected k-mer size
-.TP
-\fB--sampling\fR=\fIINT\fR
+\fB\-\-sampling\fR=\fI\,INT\/\fR
Sampling to use in genome database. If not specified, the program
will find the smallest available sampling value in the genome database
-within selected basesize and k-mer size
+within selected k\-mer size
.TP
\fB\-G\fR, \fB\-\-genomefull\fR
Use full genome (all ASCII chars allowed;
built explicitly during setup), not
compressed version
.TP
-\fB\-g\fR, \fB\-\-gseg\fR=\fIfilename\fR
-User-supplied genomic segment
+\fB\-g\fR, \fB\-\-gseg\fR=\fI\,filename\/\fR
+User\-supplied genomic segment
.TP
-\fB-1\fR, \fB--selfalign\fR
+\fB\-1\fR, \fB\-\-selfalign\fR
Align one sequence against itself in FASTA format via stdin
(Useful for getting protein translation of a nucleotide sequence)
.TP
-\fB-2\fR, \fB--pairalign\fR
+\fB\-2\fR, \fB\-\-pairalign\fR
Align two sequences in FASTA format via stdin, first one being
genomic and second one being cDNA
.TP
-\fB--cmdline\fR=\fISTRING\fR,\fISTRING\fR
+\fB\-\-cmdline\fR=\fI\,STRING\/\fR,STRING
Align these two sequences provided on the command line,
first one being genomic and second one being cDNA
.TP
-\fB\-q\fR, \fB\-\-part\fR=\fIINT\fR/\fIINT\fR
-Process only the i-th out of every n sequences
+\fB\-q\fR, \fB\-\-part\fR=\fI\,INT\/\fR/INT
+Process only the i\-th out of every n sequences
e.g., 0/100 or 99/100 (useful for distributing jobs
to a computer farm).
.TP
-\fB\-\-input\-buffer\fR=\fIINT\fR
+\fB\-\-input\-buffer\-size\fR=\fI\,INT\/\fR
Size of input buffer (program reads this many sequences
at a time for efficiency) (default 1000)
.SS
+.SS
Computation options
.TP
-\fB\-B\fR, \fB\-\-batch\fR=\fIINT\fR
- Mode Offsets Positions Genome
- 0 allocate mmap mmap
- 1 allocate mmap & preload mmap
- 2 allocate mmap & preload mmap & preload (default)
- 3 allocate allocate mmap & preload
- 4 allocate allocate allocate
- 5 expand allocate allocate
+\fB\-B\fR, \fB\-\-batch\fR=\fI\,INT\/\fR
+Batch mode (default = 2)
+ Mode Offsets Positions Genome
+ 0 see note mmap mmap
+ 1 see note mmap & preload mmap
+ (default) 2 see note mmap & preload mmap & preload
+ 3 see note allocate mmap & preload
+ 4 see note allocate allocate
+ 5 expand allocate allocate
-Note: For a single sequence, all data structures use mmap.
-If mmap not available and allocate not chosen, then will use fileio
-(very slow)
+Note: For a single sequence, all data structures use mmap
+If mmap not available and allocate not chosen, then will use fileio (very slow)
+.TP
+Note about \fB\-\-batch\fR and offsets: Expansion of offsets can be controlled
+independently by the \fB\-\-expand\-offsets\fR flag. The \fB\-\-batch\fR=\fI\,5\/\fR option is equivalent
+to \fB\-\-batch\fR=\fI\,4\/\fR plus \fB\-\-expand\-offsets\fR=\fI\,1\/\fR
+.TP
+\fB\-\-expand\-offsets\fR=\fI\,INT\/\fR
+Whether to expand the genomic offsets index
+Values: 0 (no, default), or 1 (yes).
+Expansion gives faster alignment, but requires more memory
.TP
-\fB--nosplicing\fR
+\fB\-\-nosplicing\fR
Turns off splicing (useful for aligning genomic sequences
onto a genome)
.TP
-\fB--min-intronlength\fR=\fIINT\fR
+\fB\-\-min\-intronlength\fR=\fI\,INT\/\fR
Min length for one internal intron (default 9). Below this size,
a genomic gap will be considered a deletion rather than an intron.
.TP
-\fB-K\fR, \fB--intronlength\fR=\fIINT\fR
+\fB\-K\fR, \fB\-\-intronlength\fR=\fI\,INT\/\fR
Max length for one internal intron (default 1000000)
.TP
-\fB-w\fR, \fB--localsplicedist\fR=\fIINT\fR
-Max length for known splice sites at ends of sequence (default 200000)
+\fB\-w\fR, \fB\-\-localsplicedist\fR=\fI\,INT\/\fR
+Max length for known splice sites at ends of sequence
+(default 2,000,000)
.TP
-\fB\-L\fR, \fB\-\-totallength\fR=\fIINT\fR
+\fB\-L\fR, \fB\-\-totallength\fR=\fI\,INT\/\fR
Max total intron length (default 2400000)
.TP
-\fB\-x\fR, \fB\-\-chimera-margin\fR=\fIINT\fR
+\fB\-x\fR, \fB\-\-chimera\-margin\fR=\fI\,INT\/\fR
Amount of unaligned sequence that triggers
-search for the remaining sequence (default 40).
+search for the remaining sequence (default 30).
Enables alignment of chimeric reads, and may help
-with some non-chimeric reads. To turn off, set to
-a large value (greater than the query length).
+with some non\-chimeric reads. To turn off, set to
+zero.
.TP
-\fB\-t\fR, \fB\-\-nthreads\fR=\fIINT\fR
-Number of worker threads
+\fB\-\-no\-chimeras\fR
+Turns off finding of chimeras. Same effect as \fB\-\-chimera\-margin\fR=\fI\,0\/\fR
.TP
-\fB\-C\fR, \fB\-\-chrsubsetfile\fR=\fIfilename\fR
-User\-supplied chromosome subset file
+\fB\-t\fR, \fB\-\-nthreads\fR=\fI\,INT\/\fR
+Number of worker threads
.TP
-\fB\-c\fR, \fB\-\-chrsubset\fR=\fIstring\fR
-Chromosome subset to search
+\fB\-c\fR, \fB\-\-chrsubset\fR=\fI\,string\/\fR
+Limit search to given chromosome
.TP
-\fB\-z\fR, \fB\-\-direction\fR=\fISTRING\fR
+\fB\-z\fR, \fB\-\-direction\fR=\fI\,STRING\/\fR
cDNA direction (sense_force, antisense_force,
-sense_filter, antisense_filter, or auto (default))
+sense_filter, antisense_filter, or auto (default))
.TP
-\fB\-H\fR, \fB\-\-trimendexons\fR=\fIINT\fR
+\fB\-H\fR, \fB\-\-trimendexons\fR=\fI\,INT\/\fR
Trim end exons with fewer than given number of matches
(in nt, default 12)
.TP
-\fB--cross-species\fR
-For cross-species alignments, use a more sensitive search for
-canonical splicing
-.TP
-\fB--canonical-mode\fR=\fIINT\fR
-Reward for canonical and semi-canonical introns
+\fB\-\-canonical\-mode\fR=\fI\,INT\/\fR
+Reward for canonical and semi\-canonical introns
0=low reward, 1=high reward (default), 2=low reward for
high\-identity sequences and high reward otherwise
.TP
-\fB--allow-close-indels\fR=\fIINT\fR
+\fB\-\-cross\-species\fR
+Use a more sensitive search for canonical splicing, which helps especially
+for cross\-species alignments and other difficult cases
+.TP
+\fB\-\-allow\-close\-indels\fR=\fI\,INT\/\fR
Allow an insertion and deletion close to each other
-(0=no, 1=yes (default), 2=only for high-quality alignments)
+(0=no, 1=yes (default), 2=only for high\-quality alignments)
.TP
-\fB--microexon-spliceprob\fR=\fIFLOAT\fR
+\fB\-\-microexon\-spliceprob\fR=\fI\,FLOAT\/\fR
Allow microexons only if one of the splice site probabilities is
greater than this value (default 0.90)
.TP
-\fB--cmetdir\fR=\fISTRING\fR
+\fB\-\-cmetdir\fR=\fI\,STRING\/\fR
Directory for methylcytosine index files (created using cmetindex)
-(default is location of genome index files specified using \-D, \-V, and \-d)
+(default is location of genome index files specified using \fB\-D\fR, \fB\-V\fR, and \fB\-d\fR)
.TP
-\fB--atoidir\fR=\fISTRING\fR
-Directory for A-to-I RNA editing index files (created using atoiindex)
-(default is location of genome index files specified using \-D, \-V, and \-d)
+\fB\-\-atoidir\fR=\fI\,STRING\/\fR
+Directory for A\-to\-I RNA editing index files (created using atoiindex)
+(default is location of genome index files specified using \fB\-D\fR, \fB\-V\fR, and \fB\-d\fR)
.TP
-\fB--mode\fR=\fISTRING\fR
-Alignment mode: standard (default), cmet-stranded, cmet-nonstranded,
-atoi-stranded, or atoi-nonstranded. Non-standard modes requires you
+\fB\-\-mode\fR=\fI\,STRING\/\fR
+Alignment mode: standard (default), cmet\-stranded, cmet\-nonstranded,
+atoi\-stranded, or atoi\-nonstranded. Non\-standard modes requires you
to have previously run the cmetindex or atoiindex programs on the genome
.TP
\fB\-p\fR, \fB\-\-prunelevel\fR
@@ -163,13 +168,13 @@ Show alignments
\fB\-3\fR, \fB\-\-continuous\fR
Show alignment in three continuous lines
.TP
-\fB\-4\fR, \fB\-\-continuous-by-exon\fR
+\fB\-4\fR, \fB\-\-continuous\-by\-exon\fR
Show alignment in three lines per exon
.TP
\fB\-Z\fR, \fB\-\-compress\fR
Print output in compressed format
.TP
-\fB\-E\fR, \fB\-\-exons\fR=\fISTRING\fR
+\fB\-E\fR, \fB\-\-exons\fR=\fI\,STRING\/\fR
Print exons ("cdna" or "genomic")
.TP
\fB\-P\fR, \fB\-\-protein_dna\fR
@@ -178,34 +183,32 @@ Print protein sequence (cDNA)
\fB\-Q\fR, \fB\-\-protein_gen\fR
Print protein sequence (genomic)
.TP
-\fB\-f\fR, \fB\-\-format\fR=\fIINT\fR
-Other format for output (also note the \-A and \-S options and other
-options listed under Output types):
- psl (or 1)= PSL (BLAT) format,
- gff3_gene (or 2)= GFF3 gene format,
- gff3_match_cdna (or 3)= GFF3 cDNA_match format,
+\fB\-f\fR, \fB\-\-format\fR=\fI\,INT\/\fR
+Other format for output (also note the \fB\-A\fR and \fB\-S\fR options
+and other options listed under Output types):
+ psl (or 1) = PSL (BLAT) format,
+ gff3_gene (or 2) = GFF3 gene format,
+ gff3_match_cdna (or 3) = GFF3 cDNA_match format,
gff3_match_est (or 4) = GFF3 EST_match format,
splicesites (or 6) = splicesites output (for GSNAP splicing file),
introns = introns output (for GSNAP splicing file),
map_exons (or 7) = IIT FASTA exon map format,
- map_genes (or 8) = IIT FASTA map format,
+ map_ranges (or 8) = IIT FASTA range map format,
coords (or 9) = coords in table format,
sampe = SAM format (setting paired_read bit in flag),
samse = SAM format (without setting paired_read bit)
.SS
Output options
.TP
-\fB\-n\fR, \fB\-\-npaths\fR=\fIINT\fR
-Maximum number of paths to show. If set to 0,
-prints two paths if chimera detected, else one.
+\fB\-n\fR, \fB\-\-npaths\fR=\fI\,INT\/\fR
+Maximum number of paths to show (default 5). If set to 1, GMAP
+will not report chimeric alignments, since those imply
+two paths. If you want a single alignment plus chimeric
+alignments, then set this to be 0.
.TP
-\fB--quiet-if-excessive\fR
-If more than maximum number of paths are found, then nothing is
-printed.
-.TP
-\fB--suboptimal-score\fR=\fIINT\fR
+\fB\-\-suboptimal\-score\fR=\fI\,INT\/\fR
Report only paths whose score is within this value of the
-best path. By default, if this option is not provided,
+best path. By default, if this option is not provided,
the program prints all paths found.
.TP
\fB\-O\fR, \fB\-\-ordered\fR
@@ -215,7 +218,7 @@ only if there is more than one worker thread)
\fB\-5\fR, \fB\-\-md5\fR
Print MD5 checksum for each query sequence
.TP
-\fB\-o\fR, \fB\-\-chimera-overlap\fR
+\fB\-o\fR, \fB\-\-chimera\-overlap\fR
Overlap to show, if any, at chimera breakpoint
.TP
\fB\-\-failsonly\fR
@@ -224,27 +227,37 @@ Print only failed alignments, those with no results
\fB\-\-nofails\fR
Exclude printing of failed alignments
.TP
-\fB\-\-fails\-as\-input\fR
-Print completely failed alignments as input FASTA or FASTQ format
+\fB\-V\fR, \fB\-\-snpsdir\fR=\fI\,STRING\/\fR
+Directory for SNPs index files (created using snpindex) (default is
+location of genome index files specified using \fB\-D\fR and \fB\-d\fR)
.TP
-\fB\-V\fR, \fB\-\-usesnps\fR=\fISTRING\fR
+\fB\-v\fR, \fB\-\-use\-snps\fR=\fI\,STRING\/\fR
Use database containing known SNPs (in <STRING>.iit, built
-previously using snpindex) for reporting output
+previously using snpindex) for tolerance to SNPs
+.TP
+\fB\-\-split\-output\fR=\fI\,STRING\/\fR
+Basename for multiple\-file output, separately for nomapping,
+uniq, mult, (and chimera, if \fB\-\-chimera\-margin\fR is selected)
+.TP
+\fB\-\-failed\-input\fR=\fI\,STRING\/\fR
+Print completely failed alignments as input FASTA or FASTQ format
+to the given file. If the \fB\-\-split\-output\fR flag is also given, this file
+is generated in addition to the output in the .nomapping file.
.TP
-\fB\-\-split-output\fR=\fISTRING\fR
-Basename for multiple-file output, separately for nomapping,
-uniq, mult, (and chimera, if \-\-chimera\-margin is selected)
+\fB\-\-append\-output\fR
+When \fB\-\-split\-output\fR or \fB\-\-failed\-input\fR is given, this flag will append output
+to the existing files. Otherwise, the default is to create new files.
.TP
-\fB--output-buffer-size\fR=\fIINT\fR
-Buffer size, in queries, for output thread (default 1000). When the
-number of results to be printed exceeds this size, the worker threads
-are halted until the backlog is cleared
+\fB\-\-output\-buffer\-size\fR=\fI\,INT\/\fR
+Buffer size, in queries, for output thread (default 1000). When the number
+of results to be printed exceeds this size, the worker threads are halted
+until the backlog is cleared
.TP
\fB\-F\fR, \fB\-\-fulllength\fR
Assume full\-length protein, starting with Met
.TP
-\fB\-\-cdsstart\fR=\fIINT\fR
-Translate codons from given nucleotide (1-based)
+\fB\-a\fR, \fB\-\-cdsstart\fR=\fI\,INT\/\fR
+Translate codons from given nucleotide (1\-based)
.TP
\fB\-T\fR, \fB\-\-truncate\fR
Truncate alignment around full\-length protein, Met to Stop
@@ -258,45 +271,57 @@ Options for SAM output
\fB\-\-no\-sam\-headers\fR
Do not print headers beginning with '@'
.TP
-\fB--sam-use-0M\fM
+\fB\-\-sam\-use\-0M\fR
Insert 0M in CIGAR between adjacent insertions and deletions
Required by Picard, but can cause errors in other tools
.TP
-\fB\-\-read\-group\-id\fR=\fISTRING\fR
-Value to put into read-group id (RG-ID) field
+\fB\-\-force\-xs\-dir\fR
+For RNA\-Seq alignments, disallows XS:A:? when the sense direction
+is unclear, and replaces this value arbitrarily with XS:A:+.
+May be useful for some programs, such as Cufflinks, that cannot
+handle XS:A:?. However, if you use this flag, the reported value
+of XS:A:+ in these cases will not be meaningful.
+.TP
+\fB\-\-md\-lowercase\-snp\fR
+In MD string, when known SNPs are given by the \fB\-v\fR flag,
+prints differing nucleotides as lower\-case when they
+differ from the reference but match a known alternate allele
.TP
-\fB\-\-read\-group\-name\fR=\fISTRING\fR
-Value to put into read-group name (RG-SM) field
+\fB\-\-read\-group\-id\fR=\fI\,STRING\/\fR
+Value to put into read\-group id (RG\-ID) field
.TP
-\fB--read-group-library\fR=\fISTRING\fR
-Value to put into read-group library (RG-LB) field
+\fB\-\-read\-group\-name\fR=\fI\,STRING\/\fR
+Value to put into read\-group name (RG\-SM) field
.TP
-\fB--read-group-platform\fR=\fISTRING\fR
-Value to put into read-group library (RG-PL) field
+\fB\-\-read\-group\-library\fR=\fI\,STRING\/\fR
+Value to put into read\-group library (RG\-LB) field
+.TP
+\fB\-\-read\-group\-platform\fR=\fI\,STRING\/\fR
+Value to put into read\-group platform (RG\-PL) field
.SS
Options for quality scores
.TP
-\fB--quality-protocol\fR=\fISTRING\fR
-Protocol for input quality scores. Allowed values:
- illumina (ASCII 64-126) (equivalent to \-J 64 \-j -31)
- sanger (ASCII 33-126) (equivalent to \-J 33 \-j 0)
+\fB\-\-quality\-protocol\fR=\fI\,STRING\/\fR
+Protocol for input quality scores. Allowed values:
+ illumina (ASCII 64\-126) (equivalent to \fB\-J\fR 64 \fB\-j\fR \fB\-31\fR)
+ sanger (ASCII 33\-126) (equivalent to \fB\-J\fR 33 \fB\-j\fR 0)
Default is sanger (no quality print shift)
-SAM output files should have quality scores in sanger protocol.
+SAM output files should have quality scores in sanger protocol
Or you can specify the print shift with this flag:
.TP
-\fB-j\fR, \fB--quality-print-shift\fR=\fIINT\fR
+\fB\-j\fR, \fB\-\-quality\-print\-shift\fR=\fI\,INT\/\fR
Shift FASTQ quality scores by this amount in output
-(default is 0 for sanger protocol; to change Illumina input to Sanger
-output, select -31)
+(default is 0 for sanger protocol; to change Illumina input
+to Sanger output, select \fB\-31\fR)
.SS
External map file options
.TP
-\fB\-M\fR, \fB\-\-mapdir\fR=\fIdirectory\fR
+\fB\-M\fR, \fB\-\-mapdir\fR=\fI\,directory\/\fR
Map directory
.TP
-\fB\-m\fR, \fB\-\-map\fR=\fIiitfile\fR
-Map file. If argument is '?' (with the quotes),
+\fB\-m\fR, \fB\-\-map\fR=\fI\,iitfile\/\fR
+Map file. If argument is '?' (with the quotes),
this lists available map files.
.TP
\fB\-e\fR, \fB\-\-mapexons\fR
@@ -305,7 +330,7 @@ Map each exon separately
\fB\-b\fR, \fB\-\-mapboth\fR
Report hits from both strands of genome
.TP
-\fB\-u\fR, \fB\-\-flanking\fR=\fIINT\fR
+\fB\-u\fR, \fB\-\-flanking\fR=\fI\,INT\/\fR
Show flanking hits (default 0)
.TP
\fB\-\-print\-comment\fR
@@ -316,18 +341,32 @@ Alignment output options
\fB\-N\fR, \fB\-\-nolengths\fR
No intron lengths in alignment
.TP
-\fB\-I\fR, \fB\-\-invertmode\fR=\fIINT\fR
+\fB\-I\fR, \fB\-\-invertmode\fR=\fI\,INT\/\fR
Mode for alignments to genomic (\-) strand:
- 0=Don't invert the cDNA (default)
- 1=Invert cDNA and print genomic (\-) strand
- 2=Invert cDNA and print genomic (+) strand
+0=Don't invert the cDNA (default)
+1=Invert cDNA and print genomic (\-) strand
+2=Invert cDNA and print genomic (+) strand
.TP
-\fB\-i\fR, \fB\-\-introngap\fR=\fIINT\fR
+\fB\-i\fR, \fB\-\-introngap\fR=\fI\,INT\/\fR
Nucleotides to show on each end of intron (default=3)
.TP
-\fB\-l\fR, \fB\-\-wraplength\fR=\fIINT\fR
+\fB\-l\fR, \fB\-\-wraplength\fR=\fI\,INT\/\fR
Wrap length for alignment (default=50)
.SS
+Filtering output options
+.TP
+\fB\-\-min\-trimmed\-coverage\fR=\fI\,FLOAT\/\fR
+Do not print alignments with trimmed coverage less than
+this value (default=0.0, which means no filtering)
+Note that chimeric alignments will be output regardless
+of this filter
+.TP
+\fB\-\-min\-identity\fR=\fI\,FLOAT\/\fR
+Do not print alignments with identity less than
+this value (default=0.0, which means no filtering)
+Note that chimeric alignments will be output regardless
+of this filter
+.SS
Help options
.TP
\fB\-\-version\fR
@@ -350,6 +389,5 @@ Report bugs to Thomas Wu <twu at gene.com>.
.SH COPYRIGHT
Copyright 2005 Genentech, Inc. All rights reserved.
.SH "SEE ALSO"
-\fBgmap_setup\fR(1), \fBgsnap\fR(1)
+\fBgmap_build\fR(1), \fBgsnap\fR(1)
.br
-http://research-pub.gene.com/gmap/
diff --git a/debian/gmap_build.1 b/debian/gmap_build.1
new file mode 100644
index 0000000..6ac1f0e
--- /dev/null
+++ b/debian/gmap_build.1
@@ -0,0 +1,93 @@
+.TH GMAP_BUILD "1" "July 2014" "GMAP 2014-06-10" "User Commands"
+.SH NAME
+gmap_build \- create a genome database for GMAP or GSNAP
+.SH SYNOPSIS
+.B gmap_build
+[\fI\,options\/\fR...] \fI\,-d <genomename> <fasta_files>\/\fR
+.SH DESCRIPTION
+.PP
+gmap_build:
+ Builds a gmap database for a genome to be used by GMAP or GSNAP.
+ Part of GMAP package, version 2014\-06\-10.
+ Starting with version 2013\-10\-28, the gmap_setup program has been removed and only gmap_build is supported.
+ The options of gmap_setup have been moved to gmap_build.
+.SH OPTIONS
+.TP
+\fB\-D\fR, \fB\-\-dir\fR=\fI\,STRING\/\fR
+Destination directory for installation (defaults to gmapdb directory specified at configure time)
+.TP
+\fB\-d\fR, \fB\-\-db\fR=\fI\,STRING\/\fR
+Genome name
+.TP
+\fB\-n\fR, \fB\-\-names\fR=\fI\,STRING\/\fR
+Substitute names for chromosomes, provided in a file. The file should have one line
+for each chromosome name to be changed, with the original FASTA name in column 1 and
+the desired chromosome name in column 2. This provides an easy way to change the
+names of chromosomes, for example, to add or remove the "chr" prefix. Column 2 may
+be blank, which indicates no name change. This file can also be combined with
+\fB\-\-sort\fR=\fI\,names\/\fR to provide a particular order for the chromosomes in the genome index.
+.TP
+\fB\-M\fR, \fB\-\-mdflag\fR=\fI\,STRING\/\fR
+Use MD file from NCBI for mapping contigs to chromosomal coordinates
+.TP
+\fB\-C\fR, \fB\-\-contigs\-are\-mapped\fR
+Find a chromosomal region in each FASTA header line. Useful for contigs that have been mapped
+to chromosomal coordinates. Ignored if the \fB\-\-mdflag\fR is provided.
+.TP
+\fB\-z\fR, \fB\-\-compression\fR=\fI\,STRING\/\fR
+Use given compression types (separated by commas; default is bitpack64)
+bitpack64 \- optimized for modern computers with SIMD instructions (recommended)
+all \- create all available compression types, currently bitpack64
+none \- do not compress offset files (I believe this is no longer supported)
+.TP
+\fB\-k\fR, \fB\-\-kmer\fR=\fI\,INT\/\fR
+k\-mer value for genomic index (allowed: 15 or less, default is 15)
+.TP
+\fB\-q\fR INT
+sampling interval for genome (allowed: 1\-3, default 3)
+.TP
+\fB\-s\fR, \fB\-\-sort\fR=\fI\,STRING\/\fR
+Sort chromosomes using given method:
+none \- use chromosomes as found in FASTA file(s)
+alpha \- sort chromosomes alphabetically (chr10 before chr 1)
+numeric\-alpha \- chr1, chr1U, chr2, chrM, chrU, chrX, chrY
+chrom \- chr1, chr2, chrM, chrX, chrY, chr1U, chrU
+names \- sort chromosomes based on file provided to \fB\-\-names\fR flag
+.TP
+\fB\-g\fR, \fB\-\-gunzip\fR
+Files are gzipped, so need to gunzip each file first
+.TP
+\fB\-E\fR, \fB\-\-fasta\-pipe\fR=\fI\,STRING\/\fR
+Interpret argument as a command, instead of a list of FASTA files
+.TP
+\fB\-w\fR INT
+Wait (sleep) this many seconds after each step (default 2)
+.TP
+\fB\-c\fR, \fB\-\-circular\fR=\fI\,STRING\/\fR
+Circular chromosomes (either a list of chromosomes separated by a comma, or
+a filename containing circular chromosomes, one per line). If you use the
+\fB\-\-names\fR feature, then you should use the original name of the chromosome,
+not the substitute name, for this option.
+.TP
+\fB\-e\fR, \fB\-\-nmessages\fR=\fI\,INT\/\fR
+Maximum number of messages (warnings, contig reports) to report (default 50)
+.TP
+\fB\-\-no\-sarray\fR
+Skip build of suffix array
+.SS "Obsolete options:"
+.TP
+\fB\-T\fR STRING
+Temporary build directory (may need to specify if you run out of space in your current directory)
+This is no longer necessary, since gmap_build now builds directly in the destination
+(or \fB\-D\fR) directory.
+.SH AUTHOR
+Thomas D. Wu and Colin K. Watanabe
+.SH "REPORTING BUGS"
+Report bugs to Thomas Wu <twu at gene.com>.
+.SH COPYRIGHT
+Copyright 2005 Genentech, Inc. All rights reserved.
+.SH "SEE ALSO"
+\fBgmap\fR(1), \fBgsnap\fR(1)
+.br
+http://research-pub.gene.com/gmap/
+
diff --git a/debian/gsnap.1 b/debian/gsnap.1
index f22267e..9157eef 100644
--- a/debian/gsnap.1
+++ b/debian/gsnap.1
@@ -1,345 +1,385 @@
-.TH GSNAP "1" "Jun 2012" "GMAP 2012-06-12" "User Commands"
+.TH GSNAP "1" "July 2014" "GMAP 2014-06-10" "User Commands"
.SH NAME
gsnap \- Genomic Short-read Nucleotide Alignment Program
.SH SYNOPSIS
.B gsnap
-\fB-d\fR\fIDB\fR [\fIOPTION\fR]... [\fIQUERY\fR]...
+[\fI\,OPTIONS\/\fR...] \fI\,<FASTA file>, or\/\fR cat <FASTA file> | gsnap [OPTIONS...]
.SH DESCRIPTION
-Align the sequences QUERY to the reference DB.
-With no QUERY, read standard input.
-.SH OPTIONS
.SS
-Input options
+Input options (must include \fB\-d\fR)
.TP
-\fB\-D\fR, \fB\-\-dir\fR=\fIdirectory\fR
+\fB\-D\fR, \fB\-\-dir\fR=\fI\,directory\/\fR
Genome directory
.TP
-\fB\-d\fR, \fB\-\-db\fR=\fISTRING\fR
+\fB\-d\fR, \fB\-\-db\fR=\fI\,STRING\/\fR
Genome database
.TP
-\fB-k\fR, \fB--kmer\fR=\fIINT\fR
-kmer size to use in genome database (allowed values: 16 or less).
+\fB\-\-use\-sarray\fR=\fI\,INT\/\fR
+Whether to use a suffix array, which will give increased speed.
+Allowed values: 0 (no) or 1 (yes, if available, default).
+Note that suffix arrays will bias against SNP alleles in
+SNP\-tolerant alignment.
+.TP
+\fB\-k\fR, \fB\-\-kmer\fR=\fI\,INT\/\fR
+kmer size to use in genome database (allowed values: 16 or less)
If not specified, the program will find the highest available
kmer size in the genome database
.TP
-\fB--basesize\fR=\fIINT\fR
-Base size to use in genome database. If not specified, the program
-will find the highest available base size in the genome database
-within selected k-mer size
-.TP
-\fB--sampling\fR=\fIINT\fR
+\fB\-\-sampling\fR=\fI\,INT\/\fR
Sampling to use in genome database. If not specified, the program
will find the smallest available sampling value in the genome database
-within selected basesize and k-mer size
+within selected k\-mer size
.TP
-\fB\-q\fR, \fB\-\-part\fR=\fIINT/INT\fR
+\fB\-q\fR, \fB\-\-part\fR=\fI\,INT\/\fR/INT
Process only the i\-th out of every n sequences
-e.g., 0/100 or 99/100 (useful for distributing jobs to a computer farm).
+e.g., 0/100 or 99/100 (useful for distributing jobs
+to a computer farm).
.TP
-\fB\-\-input\-buffer\fR=\fIINT\fR
+\fB\-\-input\-buffer\-size\fR=\fI\,INT\/\fR
Size of input buffer (program reads this many sequences
at a time for efficiency) (default 1000)
.TP
-\fB\-\-barcode\-length\fR=\fIINT\fR
-Amount of barcode to remove from start of read (default 0)
+\fB\-\-barcode\-length\fR=\fI\,INT\/\fR
+Amount of barcode to remove from start of read
+(default 0)
.TP
-\fB\-o\fR, \fB\-\-orientation=\fISTRING\fR
-Orientation of paired-end reads
-Allowed values: FR (fwd-rev, or typical Illumina; default),
-RF (rev-fwd, for circularized inserts), or FF (fwd-fwd, same strand)
+\fB\-o\fR, \fB\-\-orientation\fR=\fI\,STRING\/\fR
+Orientation of paired\-end reads
+Allowed values: FR (fwd\-rev, or typical Illumina; default),
+RF (rev\-fwd, for circularized inserts), or FF (fwd\-fwd, same strand)
.TP
-\fB--fastq-id-start\fR=\fIINT\fR
-Starting position of identifier in FASTQ header, space-delimited (>= 1)
+\fB\-\-fastq\-id\-start\fR=\fI\,INT\/\fR
+Starting position of identifier in FASTQ header, space\-delimited (>= 1)
.TP
-\fB--fastq-id-end\fR=\fIINT\fR
-Ending position of identifier in FASTQ header, space-delimited (>= 1)
+\fB\-\-fastq\-id\-end\fR=\fI\,INT\/\fR
+Ending position of identifier in FASTQ header, space\-delimited (>= 1)
Examples:
- @HWUSI-EAS100R:6:73:941:1973#0/1
- start=1, end=1 (default)
- => identifier is HWUSI-EAS100R:6:73:941:1973#0
- @SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
- start=1, end=1
- => identifier is SRR001666.1
- start=2, end=2
- => identifier is 071112_SLXA-EAS1_s_7:5:1:817:345
- start=1, end=2
- => identifier is SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345
-.TP
-\fB--filter-chastity\fR=\fISTRING\fR
+ @HWUSI\-EAS100R:6:73:941:1973#0/1
+ start=1, end=1 (default) => identifier is HWUSI\-EAS100R:6:73:941:1973#0
+ @SRR001666.1 071112_SLXA\-EAS1_s_7:5:1:817:345 length=36
+ start=1, end=1 => identifier is SRR001666.1
+ start=2, end=2 => identifier is 071112_SLXA\-EAS1_s_7:5:1:817:345
+ start=1, end=2 => identifier is SRR001666.1 071112_SLXA\-EAS1_s_7:5:1:817:345
+.TP
+\fB\-\-force\-single\-end\fR
+When multiple FASTQ files are provided on the command line, GSNAP assumes
+they are matching paired\-end files. This flag treats each file as single\-end.
+.TP
+\fB\-\-filter\-chastity\fR=\fI\,STRING\/\fR
Skips reads marked by the Illumina chastity program. Expecting a string
after the accession having a 'Y' after the first colon, like this:
- @accession 1:Y:0:CTTGTA
+.TP
+@accession 1:Y:0:CTTGTA
where the 'Y' signifies filtering by chastity.
Values: off (default), either, both. For 'either', a 'Y' on either end
-of a paired-end read will be filtered. For 'both', a 'Y' is required
-on both ends of a paired-end read (or on the only end of a single-end read).
+of a paired\-end read will be filtered. For 'both', a 'Y' is required
+on both ends of a paired\-end read (or on the only end of a single\-end read).
+.TP
+\fB\-\-allow\-pe\-name\-mismatch\fR
+Allows accession names of reads to mismatch in paired\-end files
+.TP
+\fB\-\-gunzip\fR
+Uncompress gzipped input files
.SS
Computation options
-.PP
+.IP
Note: GSNAP has an ultrafast algorithm for calculating mismatches up to and including
-((readlength+2)/kmer \- 2) ("ultrafast mismatches"). The program will run fastest if
+((readlength+2)/kmer \- 2) ("ultrafast mismatches"). The program will run fastest if
max\-mismatches (plus suboptimal\-levels) is within that value.
Also, indels, especially end indels, take longer to compute, although the algorithm
is still designed to be fast.
.TP
-\fB\-B\fR, \fB\-\-batch\fR=\fIINT\fR
- Mode Offsets Positions Genome
- 0 allocate mmap mmap
- 1 allocate mmap & preload mmap
- 2 allocate mmap & preload mmap & preload (default)
- 3 allocate allocate mmap & preload
- 4 allocate allocate allocate
- 5 expand allocate allocate
-
-Note: For a single sequence, all data structures use mmap.
-If mmap not available and allocate not chosen, then will use fileio
-(very slow)
-.TP
-\fB\-m\fR, \fB\-\-max\-mismatches\fR=\fIFLOAT\fR
+\fB\-B\fR, \fB\-\-batch\fR=\fI\,INT\/\fR
+ Batch mode (default = 2)
+ Mode Offsets Positions Genome Suffix array
+ 0 see note mmap mmap mmap
+ 1 see note mmap & preload mmap mmap
+ (default) 2 see note mmap & preload mmap & preload mmap & preload
+ 3 see note allocate mmap & preload mmap & preload
+ 4 see note allocate allocate mmap & preload
+ 5 see note allocate allocate allocate
+Note: For a single sequence, all data structures use mmap
+If mmap not available and allocate not chosen, then will use fileio (very slow)
+.TP
+Note about offsets: Expansion of offsets can be controlled
+independently by the \fB\-\-expand\-offsets\fR flag. However, offsets
+are accessed relatively fast in this version of GSNAP.
+.TP
+\fB\-\-expand\-offsets\fR=\fI\,INT\/\fR
+Whether to expand the genomic offsets index
+Values: 0 (no, default), or 1 (yes).
+Expansion gives faster alignment, but requires more memory
+.TP
+\fB\-m\fR, \fB\-\-max\-mismatches\fR=\fI\,FLOAT\/\fR
Maximum number of mismatches allowed (if not specified, then
-defaults to the ultrafast level of ((readlength+2)/kmer \- 2))
+defaults to the ultrafast level of ((readlength+index_interval\-1)/kmer \- 2))
+(By default, the genome index interval is 3, but this can be changed
+.TP
+by providing a different value for \fB\-q\fR to gmap_build when processing
+the genome.)
+.TP
If specified between 0.0 and 1.0, then treated as a fraction
-of each read length. Otherwise, treated as an integral number
+of each read length. Otherwise, treated as an integral number
of mismatches (including indel and splicing penalties)
-For RNA-Seq, you may need to increase this value slightly
+For RNA\-Seq, you may need to increase this value slightly
to align reads extending past the ends of an exon.
.TP
-\fB--query-unk-mismatch\fR=\fIINT\fR
+\fB\-\-query\-unk\-mismatch\fR=\fI\,INT\/\fR
Whether to count unknown (N) characters in the query as a mismatch
(0=no (default), 1=yes)
.TP
-\fB--genome-unk-mismatch\fR=\fIINT\fR
+\fB\-\-genome\-unk\-mismatch\fR=\fI\,INT\/\fR
Whether to count unknown (N) characters in the genome as a mismatch
(0=no, 1=yes (default))
.TP
-\fB--terminal-threshold\fR=\fIINT\fR
-Threshold for searching for a terminal alignment (from one end of the
-read to the best possible position at the other end) (default 2).
+\fB\-\-maxsearch\fR=\fI\,INT\/\fR
+Maximum number of alignments to find (default 1000).
+Must be larger than \fB\-\-npaths\fR, which is the number to report.
+Keeping this number large will allow for random selection among multiple alignments.
+Reducing this number can speed up the program.
+.TP
+\fB\-\-terminal\-threshold\fR=\fI\,INT\/\fR
+Threshold for computing a terminal alignment (from one end of the
+read to the best possible position at the other end) (default 2
+for standard, atoi\-stranded, and atoi\-nonstranded mode;
+default 1000 for cmet\-stranded and cmet\-nonstranded mode).
For example, if this value is 2, then if GSNAP finds an exact or
-1-mismatch alignment, it will not try to find a terminal alignment.
-Note that this default value may not be low enough if you want to
-obtain terminal alignments for very short reads, although such reads
-probably don't have enough specificity for terminal alignments anyway.
-To turn off terminal alignments, set this to a high value, greater
-than the value for \-\-max\-mismatches.
-.TP
-\fB\-i\fR, \fB\-\-indel\-penalty\fR=\fIINT\fR
+1\-mismatch alignment, it will not try to find a terminal alignment.
+To turn off the computation of terminal alignments, set this to a
+high value, greater than the value for \fB\-\-max\-mismatches\fR. However,
+note that terminal alignments are needed to help the GMAP algorithm
+find some alignments. Therefore, to avoid getting terminal alignments
+in the output, you should generally set \fB\-\-terminal\-output\-minlength\fR
+instead of this parameter.
+.TP
+\fB\-\-terminal\-output\-minlength\fR=\fI\,INT\/\fR
+Threshold alignment length in bp for a terminal alignment result to be printed
+.TP
+(in bp) (default 25 for RNA\-Seq standard, atoi\-stranded, and atoi\-nonstranded modes;
+default MAX_READLENGTH for other RNA\-Seq modes and for DNA\-Seq in all modes).
+Setting this parameter to a value of MAX_READLENGTH or more will prevent
+all terminal alignments from being printed.
+.TP
+\fB\-i\fR, \fB\-\-indel\-penalty\fR=\fI\,INT\/\fR
Penalty for an indel (default 2).
-Counts against mismatches allowed. To find indels, make
-indel-penalty less than or equal to max-mismatches.
+Counts against mismatches allowed. To find indels, make
+indel\-penalty less than or equal to max\-mismatches.
A value < 2 can lead to false positives at read ends
.TP
-\fB\-\-indel\-endlength\fR=\fIINT\fR
+\fB\-\-indel\-endlength\fR=\fI\,INT\/\fR
Minimum length at end required for indel alignments (default 4)
.TP
-\fB\-y\fR, \fB\-\-max\-middle\-insertions\fR=\fIINT\fR
+\fB\-y\fR, \fB\-\-max\-middle\-insertions\fR=\fI\,INT\/\fR
Maximum number of middle insertions allowed (default 9)
+.HP
+\fB\-z\fR, \fB\-\-max\-middle\-deletions\fR=\fI\,INT\/\fR Maximum number of middle deletions allowed (default 30)
.TP
-\fB\-z\fR, \fB\-\-max\-middle\-deletions\fR=\fIINT\fR
-Maximum number of middle deletions allowed (default 30)
-.TP
-\fB\-Y\fR, \fB\-\-max\-end\-insertions\fR=\fIINT\fR
+\fB\-Y\fR, \fB\-\-max\-end\-insertions\fR=\fI\,INT\/\fR
Maximum number of end insertions allowed (default 3)
.TP
-\fB\-Z\fR, \fB\-\-max\-end\-deletions\fR=\fIINT\fR
+\fB\-Z\fR, \fB\-\-max\-end\-deletions\fR=\fI\,INT\/\fR
Maximum number of end deletions allowed (default 6)
.TP
-\fB\-M\fR, \fB\-\-suboptimal\-levels\fR=\fIINT\fR
+\fB\-M\fR, \fB\-\-suboptimal\-levels\fR=\fI\,INT\/\fR
Report suboptimal hits beyond best hit (default 0)
-All hits with best score plus suboptimal-levels are reported
-.TP
-\fB-a\fR, \fB--adapter-strip\fR=\fISTRING\fR
-Method for removing adapters from reads. Currently allowed values:
-off, paired.
-Default is "paired", which removes adapters from paired-end reads if a
-concordant or paired alignment cannot be found from the original read.
-To turn off, use the value "off".
-.TP
-\fB\-\-trim\-mismatch\-score\fR=\fIINT\fR
-Score to use for mismatches when trimming at ends (default is -3;
-to turn off trimming, specify 0). Warning: turning trimming off
+All hits with best score plus suboptimal\-levels are reported
+.TP
+\fB\-a\fR, \fB\-\-adapter\-strip\fR=\fI\,STRING\/\fR
+Method for removing adapters from reads. Currently allowed values: off, paired.
+Default is "off". To turn on, specify "paired", which removes adapters
+from paired\-end reads if they appear to be present.
+.TP
+\fB\-\-trim\-mismatch\-score\fR=\fI\,INT\/\fR
+Score to use for mismatches when trimming at ends (default is \fB\-3\fR;
+to turn off trimming, specify 0). Warning: turning trimming off
will give false positive mismatches at the ends of reads
.TP
-\fB--trim-indel-score\fR=\fIINT\fR
-Score to use for indels when trimming at ends (default is -4;
-to turn off trimming, specify 0). Warning: turning trimming off
+\fB\-\-trim\-indel\-score\fR=\fI\,INT\/\fR
+Score to use for indels when trimming at ends (default is \fB\-4\fR;
+to turn off trimming, specify 0). Warning: turning trimming off
will give false positive indels at the ends of reads
.TP
-\fB\-V\fR, \fB\-\-snpsdir\fR=\fISTRING\fR
+\fB\-V\fR, \fB\-\-snpsdir\fR=\fI\,STRING\/\fR
Directory for SNPs index files (created using snpindex) (default is
-location of genome index files specified using \-D and \-d)
+location of genome index files specified using \fB\-D\fR and \fB\-d\fR)
.TP
-\fB\-v\fR, \fB\-\-use\-snps\fR=\fISTRING\fR
+\fB\-v\fR, \fB\-\-use\-snps\fR=\fI\,STRING\/\fR
Use database containing known SNPs (in <STRING>.iit, built
previously using snpindex) for tolerance to SNPs
.TP
-\fB\-\-cmetdir\fR=\fISTRING\fR
+\fB\-\-cmetdir\fR=\fI\,STRING\/\fR
Directory for methylcytosine index files (created using cmetindex)
-default is location of genome index files specified using \-D, \-V, and \-d)
+(default is location of genome index files specified using \fB\-D\fR, \fB\-V\fR, and \fB\-d\fR)
.TP
-\fB--atoidir\fR=\fISTRING\fR
-Directory for A-to-I RNA editing index files (created using atoiindex)
-(default is location of genome index files specified using \-D, \-V, and
-\-d)
+\fB\-\-atoidir\fR=\fI\,STRING\/\fR
+Directory for A\-to\-I RNA editing index files (created using atoiindex)
+(default is location of genome index files specified using \fB\-D\fR, \fB\-V\fR, and \fB\-d\fR)
.TP
-\fB--mode\fR=\fISTRING\fR
-Alignment mode: standard (default), cmet-stranded, cmet-nonstranded,
-atoi-stranded, or atoi-nonstranded. Non-standard modes requires you
+\fB\-\-mode\fR=\fI\,STRING\/\fR
+Alignment mode: standard (default), cmet\-stranded, cmet\-nonstranded,
+atoi\-stranded, or atoi\-nonstranded. Non\-standard modes require you
to have previously run the cmetindex or atoiindex programs on the genome
.TP
-\fB--tallydir\fR=\fISTRING\fR
-Directory for tally IIT file to resolve concordant multiple results
-(default is location of genome index files specified using \-D and \-d).
-Note: can just give full path name to \-\-use\-tally instead.
+\fB\-\-tallydir\fR=\fI\,STRING\/\fR
+Directory for tally IIT file to resolve concordant multiple results (default is
+location of genome index files specified using \fB\-D\fR and \fB\-d\fR). Note: can
+just give full path name to \fB\-\-use\-tally\fR instead.
.TP
-\fB--use-tally\fR=\fISTRING\fR
+\fB\-\-use\-tally\fR=\fI\,STRING\/\fR
Use this tally IIT file to resolve concordant multiple results
.TP
-\fB--runlengthdir\fR=\fISTRING\fR
-Directory for runlength IIT file to resolve concordant multiple
-results (default is location of genome index files specified using \-D
-and \-d).
-Note: can just give full path name to \-\-use\-runlength instead.
+\fB\-\-runlengthdir\fR=\fI\,STRING\/\fR
+Directory for runlength IIT file to resolve concordant multiple results (default is
+location of genome index files specified using \fB\-D\fR and \fB\-d\fR). Note: can
+just give full path name to \fB\-\-use\-runlength\fR instead.
.TP
-\fB--use-runlength\fR=\fISTRING\fR
+\fB\-\-use\-runlength\fR=\fI\,STRING\/\fR
Use this runlength IIT file to resolve concordant multiple results
.TP
-\fB\-t\fR, \fB\-\-nthreads\fR=\fIINT\fR
+\fB\-t\fR, \fB\-\-nthreads\fR=\fI\,INT\/\fR
Number of worker threads
.SS
Options for GMAP alignment within GSNAP
.TP
-\fB--gmap-mode\fR=\fISTRING\fR
-Cases to use GMAP for complex alignments containing multiple splices
-or indels.
-Allowed values: none, pairsearch, indel_knownsplice, terminal, improve
+\fB\-\-gmap\-mode\fR=\fI\,STRING\/\fR
+Cases to use GMAP for complex alignments containing multiple splices or indels
+Allowed values: none, all, pairsearch, indel_knownsplice, terminal, improve
+.TP
(or multiple values, separated by commas).
-Default: all on, i.e., pairsearch,indel_knownsplice,terminal,improve
+Default: all, i.e., pairsearch,indel_knownsplice,terminal,improve
.TP
-\fB--trigger-score-for-gmap\fR=\fIINT\fR
+\fB\-\-trigger\-score\-for\-gmap\fR=\fI\,INT\/\fR
Try GMAP pairsearch on nearby genomic regions if best score (the total
-of both ends if paired-end) exceeds this value (default 5)
+of both ends if paired\-end) exceeds this value (default 5)
+.TP
+\fB\-\-gmap\-min\-match\-length\fR=\fI\,INT\/\fR
+Keep GMAP hit only if it has this many consecutive matches (default 20)
.TP
-\fB--max-gmap-pairsearch\fR=\fIINT\fR
+\fB\-\-gmap\-allowance\fR=\fI\,INT\/\fR
+Extra mismatch/indel score allowed for GMAP alignments (default 3)
+.TP
+\fB\-\-max\-gmap\-pairsearch\fR=\fI\,INT\/\fR
Perform GMAP pairsearch on nearby genomic regions up to this many
-many candidate ends (default 10). Requires pairsearch in \-\-gmap\-mode
+candidate ends (default 10). Requires pairsearch in \fB\-\-gmap\-mode\fR
.TP
-\fB--max-gmap-terminal\fR=\fIINT\fR
+\fB\-\-max\-gmap\-terminal\fR=\fI\,INT\/\fR
Perform GMAP terminal on nearby genomic regions up to this many
-candidate ends (default 5). Requires terminal in \-\-gmap\-mode
+candidate ends (default 5). Requires terminal in \fB\-\-gmap\-mode\fR
.TP
-\fB--max-gmap-improvement\fR=\fIINT\fR
+\fB\-\-max\-gmap\-improvement\fR=\fI\,INT\/\fR
Perform GMAP improvement on nearby genomic regions up to this many
-candidate ends (default 5). Requires improve in \-\-gmap\-mode
+candidate ends (default 5). Requires improve in \fB\-\-gmap\-mode\fR
.TP
-\fB--microexon-spliceprob\fR=\fIFLOAT\fR
+\fB\-\-microexon\-spliceprob\fR=\fI\,FLOAT\/\fR
Allow microexons only if one of the splice site probabilities is
greater than this value (default 0.90)
.SS
Splicing options for RNA\-Seq
.TP
-.TP
-\fB-N,\fR \fB--novelsplicing\fR=\fIINT\fR
+\fB\-N\fR, \fB\-\-novelsplicing\fR=\fI\,INT\/\fR
Look for novel splicing (0=no (default), 1=yes)
.TP
-\fB--splicingdir\fR=\fISTRING\fR
+\fB\-\-splicingdir\fR=\fI\,STRING\/\fR
Directory for splicing involving known sites or known introns,
-as specified by the \-s or \-\-use\-splicing flag (default is
-directory computed from \-D and \-d flags).
-Note: can just give full pathname to the \-s flag instead.
+as specified by the \fB\-s\fR or \fB\-\-use\-splicing\fR flag (default is
+directory computed from \fB\-D\fR and \fB\-d\fR flags). Note: can
+just give full pathname to the \fB\-s\fR flag instead.
.TP
-\fB\-s\fR, \fB--use-splicing\fR=\fISTRING\fR
+\fB\-s\fR, \fB\-\-use\-splicing\fR=\fI\,STRING\/\fR
Look for splicing involving known sites or known introns
-(in <STRING>.iit), at short or long distances.
-See README instructions for the distinction between known sites and
-known introns
+(in <STRING>.iit), at short or long distances
+See README instructions for the distinction between known sites
+and known introns
.TP
-\fB--ambig-splice-noclip\fR
+\fB\-\-ambig\-splice\-noclip\fR
For ambiguous known splicing at ends of the read, do not clip at the
-splice site, but extend instead into the intron. This flag makes
-sense only if you provide the \-\-use\-splicing flag, and you are trying
-to eliminate all soft clipping with \-\-trim\-mismatch\-score=0
+splice site, but extend instead into the intron. This flag makes
+sense only if you provide the \fB\-\-use\-splicing\fR flag, and you are trying
+to eliminate all soft clipping with \fB\-\-trim\-mismatch\-score\fR=\fI\,0\/\fR
.TP
-\fB\-w\fR, \fB\-\-localsplicedist\fR=\fIINT\fR
+\fB\-w\fR, \fB\-\-localsplicedist\fR=\fI\,INT\/\fR
Definition of local novel splicing event (default 200000)
.TP
-\fB\-e\fR, \fB\-\-local\-splice\-penalty\fR=\fIINT\fR
-Penalty for a local splice (default 0). Counts against mismatches allowed
+\fB\-\-novelend\-splicedist\fR=\fI\,INT\/\fR
+Distance to look for novel splices at the ends of reads (default 50000)
+.TP
+\fB\-e\fR, \fB\-\-local\-splice\-penalty\fR=\fI\,INT\/\fR
+Penalty for a local splice (default 0). Counts against mismatches allowed
.TP
-\fB\-E\fR, \fB\-\-distant\-splice\-penalty\fR=\fIINT\fR
-Penalty for a distant splice (default 1). A distant splice is one where
-the intron length exceeds the value of \-w, or \-\-localsplicedist, or is an
+\fB\-E\fR, \fB\-\-distant\-splice\-penalty\fR=\fI\,INT\/\fR
+Penalty for a distant splice (default 1). A distant splice is one where
+the intron length exceeds the value of \fB\-w\fR, or \fB\-\-localsplicedist\fR, or is an
inversion, scramble, or translocation between two different chromosomes
Counts against mismatches allowed
.TP
-\fB\-K\fR, \fB\-\-distant\-splice\-endlength\fR=\fIINT\fR
-Minimum length at end required for distant spliced alignments (default 16, min
-allowed is the value of -k, or kmer size)
+\fB\-K\fR, \fB\-\-distant\-splice\-endlength\fR=\fI\,INT\/\fR
+Minimum length at end required for distant spliced alignments (default 20, min
+allowed is the value of \fB\-k\fR, or kmer size)
.TP
-\fB-l,\fR \fB\-\-shortend\-splice\-endlength\fR=\fIINT\fR
-Minimum length at end required for short-end spliced alignments (default 2)
-but unless known splice sites are provided with the \-s flag, GSNAP may still
-need the end length to be the value of \-k, or kmer size to find a given splice
+\fB\-l\fR, \fB\-\-shortend\-splice\-endlength\fR=\fI\,INT\/\fR
+Minimum length at end required for short\-end spliced alignments (default 2,
+but unless known splice sites are provided with the \fB\-s\fR flag, GSNAP may still
+need the end length to be the value of \fB\-k\fR, or kmer size to find a given splice)
.TP
-\fB\-\-distant\-splice\-identity\fR=\fIFLOAT\fR
+\fB\-\-distant\-splice\-identity\fR=\fI\,FLOAT\/\fR
Minimum identity at end required for distant spliced alignments (default 0.95)
.TP
-\fB--antistranded-penalty\fR=\fIINT\fR
+\fB\-\-antistranded\-penalty\fR=\fI\,INT\/\fR
(Not currently implemented)
-Penalty for antistranded splicing when using stranded RNA-Seq
-protocols. A positive value, such as 1, expects antisense on the
-first read and sense on the second read. Default is 0, which treats
-sense and antisense equally well
+Penalty for antistranded splicing when using stranded RNA\-Seq protocols.
+A positive value, such as 1, expects antisense on the first read
+and sense on the second read. Default is 0, which treats sense and antisense
+equally well
.TP
-\fB--merge-distant-samechr\fR
+\fB\-\-merge\-distant\-samechr\fR
Report distant splices on the same chromosome as a single splice, if possible.
Will produce a single SAM line instead of two SAM lines, which is also done
for translocations, inversions, and scramble events
.SS
Options for paired\-end reads
.TP
-\fB\-\-pairmax-dna\fR=\fIINT\fR
-Max total genomic length for DNA-Seq paired reads, or other reads
-without splicing (default 1000). Used if \-N or \-s is not specified.
+\fB\-\-pairmax\-dna\fR=\fI\,INT\/\fR
+Max total genomic length for DNA\-Seq paired reads, or other reads
+without splicing (default 1000). Used if \fB\-N\fR or \fB\-s\fR is not specified.
.TP
-\fB\-\-pairmax\-rna\fR=\fIINT\fR
-Max total genomic length for RNA-Seq paired reads, or other reads
-that could have a splice (default 200000). Used if \-N or \-s is specified.
-Should probably match the value for \-w, \-\-localsplicedist.
+\fB\-\-pairmax\-rna\fR=\fI\,INT\/\fR
+Max total genomic length for RNA\-Seq paired reads, or other reads
+that could have a splice (default 200000). Used if \fB\-N\fR or \fB\-s\fR is specified.
+Should probably match the value for \fB\-w\fR, \fB\-\-localsplicedist\fR.
.TP
-\fB--pairexpect\fR=\fIINT\fR
-Expected paired-end length, used for calling splices in medial part of
-paired-end reads (default 200)
+\fB\-\-pairexpect\fR=\fI\,INT\/\fR
+Expected paired\-end length, used for calling splices in medial part of
+paired\-end reads (default 200)
.TP
-\fB--pairdev\fR=\fIINT\fR
-Allowable deviation from expected paired-end length, used for
-calling splices in medial part of paired-end reads (default 25)
+\fB\-\-pairdev\fR=\fI\,INT\/\fR
+Allowable deviation from expected paired\-end length, used for
+calling splices in medial part of paired\-end reads (default 100)
.SS
Options for quality scores
.TP
-\fB\-\-quality\-protocol\fR=\fISTRING\fR
-Protocol for input quality scores. Allowed values:
-
- illumina (ASCII 64-126) (equivalent to \-J 64 \-j -31)
- sanger (ASCII 33-126) (equivalent to \-J 33 \-j 0)
+\fB\-\-quality\-protocol\fR=\fI\,STRING\/\fR
+Protocol for input quality scores. Allowed values:
+
+ illumina (ASCII 64\-126) (equivalent to \fB\-J\fR 64 \fB\-j\fR \fB\-31\fR)
+ sanger (ASCII 33\-126) (equivalent to \fB\-J\fR 33 \fB\-j\fR 0)
Default is sanger (no quality print shift)
SAM output files should have quality scores in sanger protocol
Or you can customize this behavior with these flags:
.TP
-\fB-J\fR, \fB\-\-quality\-zero\-score\fR=\fIINT\fR
+\fB\-J\fR, \fB\-\-quality\-zero\-score\fR=\fI\,INT\/\fR
FASTQ quality scores are zero at this ASCII value
(default is 33 for sanger protocol; for Illumina, select 64)
.TP
-\fB-j\fR, \fB\-\-quality\-print\-shift\fR=\fIINT\fR
+\fB\-j\fR, \fB\-\-quality\-print\-shift\fR=\fI\,INT\/\fR
Shift FASTQ quality scores by this amount in output
(default is 0 for sanger protocol; to change Illumina input
-to Sanger output, select -31)
+to Sanger output, select \fB\-31\fR)
.SS
Output options
.TP
-\fB\-n\fR, \fB\-\-npaths\fR=\fIINT\fR
+\fB\-n\fR, \fB\-\-npaths\fR=\fI\,INT\/\fR
Maximum number of paths to print (default 100).
.TP
\fB\-Q\fR, \fB\-\-quiet\-if\-excessive\fR
@@ -351,13 +391,12 @@ Print output in same order as input (relevant
only if there is more than one worker thread)
.TP
\fB\-\-show\-refdiff\fR
-For GSNAP output in SNP-tolerant alignment, shows all differences
+For GSNAP output in SNP\-tolerant alignment, shows all differences
relative to the reference genome as lower case (otherwise, it shows
all differences relative to both the reference and alternate genome)
.TP
-\fB--clip-overlap\fR
-For paired-end reads whose alignments overlap, clip the overlapping
-region.
+\fB\-\-clip\-overlap\fR
+For paired\-end reads whose alignments overlap, clip the overlapping region.
.TP
\fB\-\-print\-snps\fR
Print detailed information about SNPs in reads (works only if \fB\-v\fR also selected)
@@ -369,47 +408,71 @@ Print only failed alignments, those with no results
\fB\-\-nofails\fR
Exclude printing of failed alignments
.TP
-\fB\-\-fails\-as\-input\fR
-Print completely failed alignments as input FASTA or FASTQ format
-.TP
-\fB\-A\fR, \fB\-\-format\fR=\fISTRING\fR
+\fB\-A\fR, \fB\-\-format\fR=\fI\,STRING\/\fR
Another format type, other than default.
Currently implemented: sam
-Also allowed, but not installed at compile-time: goby
-(To install, need to re-compile with appropriate options)
-.TP
-\fB--output-buffer-size\fR=\fIINT\fR
-Buffer size, in queries, for output thread (default 1000). When the
-number of results to be printed exceeds this size, the worker threads
-are halted until the backlog is cleared
+Also allowed, but not installed at compile\-time: goby
+(To install, need to re\-compile with appropriate options)
+.TP
+\fB\-\-split\-output\fR=\fI\,STRING\/\fR
+Basename for multiple\-file output, separately for nomapping,
+halfmapping_uniq, halfmapping_mult, unpaired_uniq, unpaired_mult,
+paired_uniq, paired_mult, concordant_uniq, and concordant_mult results
+.TP
+\fB\-\-failed\-input\fR=\fI\,STRING\/\fR
+Print completely failed alignments as input FASTA or FASTQ format,
+to the given file, appending .1 or .2, for paired\-end data.
+If the \fB\-\-split\-output\fR flag is also given, this file is generated
+in addition to the output in the .nomapping file.
+.TP
+\fB\-\-append\-output\fR
+When \fB\-\-split\-output\fR or \fB\-\-failed\-input\fR is given, this flag will append output
+to the existing files. Otherwise, the default is to create new files.
+.TP
+\fB\-\-output\-buffer\-size\fR=\fI\,INT\/\fR
+Buffer size, in queries, for output thread (default 1000). When the number
+of results to be printed exceeds this size, the worker threads are halted
+until the backlog is cleared
.SS
Options for SAM output
.TP
\fB\-\-no\-sam\-headers\fR
Do not print headers beginning with '@'
.TP
-\fB\-\-sam\-headers\-batch\fR=\fIINT\fR
-Print headers only for this batch, as specified by -q
+\fB\-\-sam\-headers\-batch\fR=\fI\,INT\/\fR
+Print headers only for this batch, as specified by \fB\-q\fR
.TP
-\fB--sam-use-0M\fM
+\fB\-\-sam\-use\-0M\fR
Insert 0M in CIGAR between adjacent insertions and deletions
Required by Picard, but can cause errors in other tools
.TP
-\fB--sam-multiple-primaries\fR
+\fB\-\-sam\-multiple\-primaries\fR
Allows multiple alignments to be marked as primary if they
have equally good mapping scores
.TP
-\fB\-\-read\-group\-id\fR=\fISTRING\fR
-Value to put into read-group id (RG-ID) field
+\fB\-\-force\-xs\-dir\fR
+For RNA\-Seq alignments, disallows XS:A:? when the sense direction
+is unclear, and replaces this value arbitrarily with XS:A:+.
+May be useful for some programs, such as Cufflinks, that cannot
+handle XS:A:?. However, if you use this flag, the reported value
+of XS:A:+ in these cases will not be meaningful.
+.TP
+\fB\-\-md\-lowercase\-snp\fR
+In MD string, when known SNPs are given by the \fB\-v\fR flag,
+prints difference nucleotides as lower\-case when they
+differ from reference but match a known alternate allele
+.TP
+\fB\-\-read\-group\-id\fR=\fI\,STRING\/\fR
+Value to put into read\-group id (RG\-ID) field
.TP
-\fB\-\-read\-group\-name\fR=\fISTRING\fR
-Value to put into read-group name (RG-SM) field
+\fB\-\-read\-group\-name\fR=\fI\,STRING\/\fR
+Value to put into read\-group name (RG\-SM) field
.TP
-\fB--read-group-library\fR=\fISTRING\fR
-Value to put into read-group library (RG-LB) field
+\fB\-\-read\-group\-library\fR=\fI\,STRING\/\fR
+Value to put into read\-group library (RG\-LB) field
.TP
-\fB--read-group-platform\fR=\fISTRING\fR
-Value to put into read-group library (RG-PL) field
+\fB\-\-read\-group\-platform\fR=\fI\,STRING\/\fR
+Value to put into read\-group platform (RG\-PL) field
.SS
Help options
.TP
@@ -418,21 +481,21 @@ Show version
.TP
\fB\-\-help\fR
Show this help message
-.SH ENVIRONMENT
-.TP
-\fBGMAPDB\fR
-genome directory (eqivalent to \fB-D\fR)
-.SH FILES
-.TP
-~/.gmaprc
-configuration file
-.SH AUTHOR
-Thomas D. Wu and Colin K. Watanabe
-.SH "REPORTING BUGS"
-Report bugs to Thomas Wu <twu at gene.com>.
-.SH COPYRIGHT
-Copyright 2005 Genentech, Inc. All rights reserved.
-.SH "SEE ALSO"
-\fBgmap_setup\fR(1), \fBgmap\fR(1)
-.br
-http://research-pub.gene.com/gmap/
+.SH ENVIRONMENT
+.TP
+\fBGMAPDB\fR
+genome directory (equivalent to \fB-D\fR)
+.SH FILES
+.TP
+~/.gmaprc
+configuration file
+.SH AUTHOR
+Thomas D. Wu and Colin K. Watanabe
+.SH "REPORTING BUGS"
+Report bugs to Thomas Wu <twu at gene.com>.
+.SH COPYRIGHT
+Copyright 2005 Genentech, Inc. All rights reserved.
+.SH "SEE ALSO"
+\fBgmap_build\fR(1), \fBgmap\fR(1)
+.br
+http://research-pub.gene.com/gmap/
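The --fastq-id-start / --fastq-id-end examples in the gsnap.1 hunk above can be reproduced with a short Python sketch. This is not GSNAP's code: the function name fastq_identifier is hypothetical, and the stripping of a trailing /1 or /2 is an assumption inferred only from the first man page example.

```python
def fastq_identifier(header: str, start: int = 1, end: int = 1) -> str:
    """Select space-delimited fields start..end (1-based, inclusive) of a FASTQ header."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Drop the leading '@' that marks a FASTQ header line.
    text = header[1:] if header.startswith("@") else header
    fields = text.split(" ")
    ident = " ".join(fields[start - 1:end])
    # Assumption: a trailing /1 or /2 mate suffix is not part of the identifier,
    # as the first example in the man page suggests.
    if ident.endswith(("/1", "/2")):
        ident = ident[:-2]
    return ident


if __name__ == "__main__":
    illumina = "@HWUSI-EAS100R:6:73:941:1973#0/1"
    sra = "@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36"
    print(fastq_identifier(illumina))       # start=1, end=1 (default)
    print(fastq_identifier(sra, 1, 1))
    print(fastq_identifier(sra, 2, 2))
    print(fastq_identifier(sra, 1, 2))
```

The four calls correspond, in order, to the four "identifier is" lines in the man page text.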
diff --git a/debian/patches/install-data-local b/debian/patches/install-data-local
index 60307a9..f61b11e 100644
--- a/debian/patches/install-data-local
+++ b/debian/patches/install-data-local
@@ -1,3 +1,5 @@
+Description: Install data local
+
--- gmap.orig/Makefile.in
+++ gmap/Makefile.in
@@ -660,7 +660,7 @@
diff --git a/debian/rules b/debian/rules
index e1e517f..f5244b2 100755
--- a/debian/rules
+++ b/debian/rules
@@ -2,8 +2,6 @@
export DH_OPTIONS
pkg := $(shell dpkg-parsechangelog | sed -n 's/^Source: //p')
-mandir=$(CURDIR)/debian/$(pkg)/usr/share/man/man1
-version=$(shell dpkg-parsechangelog -ldebian/changelog | grep Version: | cut -f2 -d' ' | cut -f1 -d- )
%:
dh $@ --with autotools_dev
@@ -16,15 +14,3 @@ override_dh_install:
mkdir -p debian/$(pkg)/usr/bin
ln -s /usr/lib/gmap/gmap debian/$(pkg)/usr/bin/gmap
ln -s /usr/lib/gmap/gsnap debian/$(pkg)/usr/bin/gsnap
-
-override_dh_installman:
- dh_installman
-#generate man pages with help2man
- mkdir -p $(mandir)
- cd debian/$(pkg)/usr/lib/gmap/ && \
- for i in gmap gsnap; do \
- help2man --no-info --no-discard-stderr \
- --version-string="$(version)" \
- --help-option="--help" \
- ./$$i | grep -v "called with args" >$(mandir)/$$i.1; \
- done
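For reviewers checking the updated --max-mismatches text in gsnap.1: the new default, ((readlength+index_interval-1)/kmer - 2), reduces to the old ((readlength+2)/kmer - 2) when the index interval is 3. A minimal Python sketch of the arithmetic follows. The function name is hypothetical, integer division is an assumption (the man page does not say how the quotient is rounded), and the read length and kmer values are only examples.

```python
def ultrafast_mismatch_level(readlength: int, kmer: int, index_interval: int = 3) -> int:
    """Evaluate ((readlength + index_interval - 1) / kmer - 2) from the man page."""
    if kmer < 1 or index_interval < 1:
        raise ValueError("kmer and index_interval must be positive")
    # Assumption: the division truncates to an integer.
    return (readlength + index_interval - 1) // kmer - 2


if __name__ == "__main__":
    # Example values only: 100 bp reads, kmer size 15, index interval 3.
    new_default = ultrafast_mismatch_level(100, 15, 3)
    old_default = (100 + 2) // 15 - 2  # formula from the removed man page line
    print(new_default, old_default)
```

With an index interval of 3 the two formulas agree, so the wording change only matters for genomes built with a different -q value in gmap_build.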
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/gmap.git