[med-svn] [Git][med-team/bbmap][manpages] Add new manpages and move all manpages into debian/mans subdir

Andreas Tille gitlab at salsa.debian.org
Thu Apr 4 13:39:07 BST 2019



Andreas Tille pushed to branch manpages at Debian Med / bbmap


Commits:
3758d4c6 by Andreas Tille at 2019-04-04T12:14:22Z
Add new manpages and move all manpages into debian/mans subdir

- - - - -


8 changed files:

- debian/createmanpages
- debian/manpages
- + debian/mans/bbduk.sh.1
- debian/bbmap.sh.1 → debian/mans/bbmap.sh.1
- + debian/mans/bbnorm.sh.1
- debian/bloomfilter.sh.1 → debian/mans/bloomfilter.sh.1
- + debian/mans/dedupe.sh.1
- + debian/mans/reformat.sh.1


Changes:

=====================================
debian/createmanpages
=====================================
@@ -1,5 +1,5 @@
 #!/bin/sh
-MANDIR=debian
+MANDIR=debian/mans
 mkdir -p $MANDIR
 
 VERSION=`dpkg-parsechangelog | awk '/^Version:/ {print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'`
@@ -24,6 +24,30 @@ help2man --no-info --no-discard-stderr --help-option=" " \
             --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
 echo $AUTHOR >> $MANDIR/${progname}.1
 
+progname=bbnorm.sh
+help2man --no-info --no-discard-stderr --help-option=" " \
+         --name="Kmer-based error-correction and normalization tool" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+progname=dedupe.sh
+help2man --no-info --no-discard-stderr --help-option=" " \
+         --name="Simplifies assemblies by removing duplicate or contained" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+progname=reformat.sh
+help2man --no-info --no-discard-stderr --help-option=" " \
+         --name="Reformats reads between fasta/fastq/scarf/fasta+qual/sam, interleaved/paired, and ASCII-33/64" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+progname=bbduk.sh
+help2man --no-info --no-discard-stderr --help-option=" " \
+         --name="Filters, trims, or masks reads with kmer matches to an artifact/contaminant file" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
 echo "$MANDIR/*.1" > debian/manpages
 
 cat <<EOT
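
The VERSION pipeline in debian/createmanpages is easier to follow on a concrete value. The following is a minimal sketch, not part of the commit. It wraps the same three sed expressions in a hypothetical helper, upstream_version, and feeds it made-up version strings instead of calling dpkg-parsechangelog.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): applies the same sed expressions
# that debian/createmanpages uses to derive VERSION.
upstream_version() {
    printf '%s\n' "$1" | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'
}

# Made-up sample versions, chosen only to exercise each expression:
# a +dfsg suffix, an epoch with a ~dfsg suffix, and a plain Debian revision.
for v in '38.43+dfsg-1' '1:38.43~dfsg-2' '38.43-1'; do
    printf '%s -> %s\n' "$v" "$(upstream_version "$v")"
done
```

All three samples reduce to 38.43, the form that ends up in the .TH line of the generated pages. Note that the second expression cuts at the first hyphen, so this sketch (like the original pipeline) assumes the upstream version itself contains no hyphen.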


=====================================
debian/manpages
=====================================
@@ -1 +1 @@
-debian/*.1
+debian/mans/*.1


=====================================
debian/mans/bbduk.sh.1
=====================================
@@ -0,0 +1,488 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH BBDUK.SH "1" "April 2019" "bbduk.sh 38.43" "User Commands"
+.SH NAME
+bbduk.sh \- Filters, trims, or masks reads with kmer matches to an artifact/contaminant file
+.SH SYNOPSIS
+.B bbduk.sh
+\fI\,in=<input file> out=<output file> ref=<contaminant files>\/\fR
+.SH AUTHOR
+Written by Brian Bushnell
+Last modified March 21, 2019
+.PP
+Description:  Compares reads to the kmers in a reference dataset, optionally
+allowing an edit distance. Splits the reads into two outputs \- those that
+match the reference, and those that don't. Can also trim (remove) the matching
+parts of the reads rather than binning the reads.
+Please read bbmap/docs/guides/BBDukGuide.txt for more information.
+.PP
+Input may be stdin or a fasta or fastq file, compressed or uncompressed.
+If you pipe via stdin/stdout, please include the file type; e.g. for gzipped
+fasta input, set in=stdin.fa.gz
+.PP
+Input parameters:
+in=<file>           Main input. in=stdin.fq will pipe from stdin.
+in2=<file>          Input for 2nd read of pairs in a different file.
+ref=<file,file>     Comma\-delimited list of reference files.
+.TP
+In addition to filenames, you may also use the keywords:
+adapters, artifacts, phix, lambda, pjet, mtst, kapa
+.PP
+literal=<seq,seq>   Comma\-delimited list of literal reference sequences.
+touppercase=f       (tuc) Change all bases to upper\-case.
+interleaved=auto    (int) t/f overrides interleaved autodetection.
+qin=auto            Input quality offset: 33 (Sanger), 64, or auto.
+reads=\-1            If positive, quit after processing X reads or pairs.
+copyundefined=f     (cu) Process non\-AGCT IUPAC reference bases by making all
+.TP
+possible unambiguous copies.
+Intended for short motifs
+.IP
+or adapter barcodes, as time/memory use is exponential.
+.PP
+samplerate=1        Set lower to only process a fraction of input reads.
+samref=<file>       Optional reference fasta for processing sam files.
+.PP
+Output parameters:
+out=<file>          (outnonmatch) Write reads here that do not contain
+.TP
+kmers matching the database.
+\&'out=stdout.fq' will pipe
+.IP
+to standard out.
+.PP
+out2=<file>         (outnonmatch2) Use this to write 2nd read of pairs to a
+.IP
+different file.
+.PP
+outm=<file>         (outmatch) Write reads here that fail filters.  In default
+.TP
+kfilter mode, this means any read with a matching kmer.
+In any mode, it also includes reads that fail filters such
+as minlength, mingc, maxgc, entropy, etc.  In other words,
+it includes all reads that do not go to 'out'.
+.PP
+outm2=<file>        (outmatch2) Use this to write 2nd read of pairs to a
+.IP
+different file.
+.PP
+outs=<file>         (outsingle) Use this to write singleton reads whose mate
+.IP
+was trimmed shorter than minlen.
+.PP
+stats=<file>        Write statistics about which contaminants were detected.
+refstats=<file>     Write statistics on a per\-reference\-file basis.
+rpkm=<file>         Write RPKM for each reference sequence (for RNA\-seq).
+dump=<file>         Dump kmer tables to a file, in fasta format.
+duk=<file>          Write statistics in duk's format. *DEPRECATED*
+nzo=t               Only write statistics about ref sequences with nonzero hits.
+overwrite=t         (ow) Grant permission to overwrite files.
+showspeed=t         (ss) 'f' suppresses display of processing speed.
+ziplevel=2          (zl) Compression level; 1 (min) through 9 (max).
+fastawrap=70        Length of lines in fasta output.
+qout=auto           Output quality offset: 33 (Sanger), 64, or auto.
+statscolumns=3      (cols) Number of columns for stats output, 3 or 5.
+.IP
+5 includes base counts.
+.PP
+rename=f            Rename reads to indicate which sequences they matched.
+refnames=f          Use names of reference files rather than scaffold IDs.
+trd=f               Truncate read and ref names at the first whitespace.
+ordered=f           Set to true to output reads in same order as input.
+maxbasesout=\-1      If positive, quit after writing approximately this many
+.IP
+bases to out (outu/outnonmatch).
+.PP
+maxbasesoutm=\-1     If positive, quit after writing approximately this many
+.IP
+bases to outm (outmatch).
+.PP
+json=f              Print to screen in json format.
+.PP
+Histogram output parameters:
+bhist=<file>        Base composition histogram by position.
+qhist=<file>        Quality histogram by position.
+qchist=<file>       Count of bases with each quality value.
+aqhist=<file>       Histogram of average read quality.
+bqhist=<file>       Quality histogram designed for box plots.
+lhist=<file>        Read length histogram.
+phist=<file>        Polymer length histogram.
+gchist=<file>       Read GC content histogram.
+ihist=<file>        Insert size histogram, for paired reads in mapped sam.
+gcbins=100          Number of gchist bins.  Set to 'auto' to use read length.
+maxhistlen=6000     Set an upper bound for histogram lengths; higher uses
+.TP
+more memory.
+The default is 6000 for some histograms
+.IP
+and 80000 for others.
+.PP
+Histograms for mapped sam/bam files only:
+histbefore=t        Calculate histograms from reads before processing.
+ehist=<file>        Errors\-per\-read histogram.
+qahist=<file>       Quality accuracy histogram of error rates versus quality
+.IP
+score.
+.PP
+indelhist=<file>    Indel length histogram.
+mhist=<file>        Histogram of match, sub, del, and ins rates by position.
+idhist=<file>       Histogram of read count versus percent identity.
+idbins=100          Number of idhist bins.  Set to 'auto' to use read length.
+varfile=<file>      Ignore substitution errors listed in this file when
+.TP
+calculating error rates.
+Can be generated with
+.IP
+CallVariants.
+.PP
+vcf=<file>          Ignore substitution errors listed in this VCF file
+.IP
+when calculating error rates.
+.PP
+ignorevcfindels=t   Also ignore indels listed in the VCF.
+.PP
+Processing parameters:
+k=27                Kmer length used for finding contaminants.  Contaminants
+.TP
+shorter than k will not be found.
+k must be at least 1.
+.PP
+rcomp=t             Look for reverse\-complements of kmers in addition to
+.IP
+forward kmers.
+.PP
+maskmiddle=t        (mm) Treat the middle base of a kmer as a wildcard, to
+.IP
+increase sensitivity in the presence of errors.
+.PP
+minkmerhits=1       (mkh) Reads need at least this many matching kmers
+.IP
+to be considered as matching the reference.
+.PP
+minkmerfraction=0.0 (mkf) A read needs at least this fraction of its total
+.TP
+kmers to hit a ref, in order to be considered a match.
+If this and minkmerhits are set, the greater is used.
+.PP
+mincovfraction=0.0  (mcf) A read needs at least this fraction of its total
+.TP
+bases to be covered by ref kmers to be considered a match.
+If specified, mcf overrides mkh and mkf.
+.PP
+hammingdistance=0   (hdist) Maximum Hamming distance for ref kmers (subs only).
+.IP
+Memory use is proportional to (3*K)^hdist.
+.PP
+qhdist=0            Hamming distance for query kmers; impacts speed, not memory.
+editdistance=0      (edist) Maximum edit distance from ref kmers (subs
+.TP
+and indels).
+Memory use is proportional to (8*K)^edist.
+.PP
+hammingdistance2=0  (hdist2) Sets hdist for short kmers, when using mink.
+qhdist2=0           Sets qhdist for short kmers, when using mink.
+editdistance2=0     (edist2) Sets edist for short kmers, when using mink.
+forbidn=f           (fn) Forbids matching of read kmers containing N.
+.TP
+By default, these will match a reference 'A' if
+hdist>0 or edist>0, to increase sensitivity.
+.PP
+removeifeitherbad=t (rieb) Paired reads get sent to 'outmatch' if either is
+.TP
+match (or either is trimmed shorter than minlen).
+Set to false to require both.
+.PP
+trimfailures=f      Instead of discarding failed reads, trim them to 1bp.
+.IP
+This makes the statistics a bit odd.
+.PP
+findbestmatch=f     (fbm) If multiple matches, associate read with sequence
+.TP
+sharing most kmers.
+Reduces speed.
+.PP
+skipr1=f            Don't do kmer\-based operations on read 1.
+skipr2=f            Don't do kmer\-based operations on read 2.
+ecco=f              For overlapping paired reads only.  Performs error correction with BBMerge prior to kmer operations.
+recalibrate=f       (recal) Recalibrate quality scores.  Requires calibration
+.IP
+matrices generated by CalcTrueQuality.
+.PP
+sam=<file,file>     If recalibration is desired, and matrices have not already
+.IP
+been generated, BBDuk will create them from the sam file.
+.PP
+amino=f             Run in amino acid mode.  Some features have not been
+.TP
+tested, but kmer\-matching works fine.
+Maximum k is 12.
+.PP
+Speed and Memory parameters:
+threads=auto        (t) Set number of threads to use; default is number of
+.IP
+logical processors.
+.PP
+prealloc=f          Preallocate memory in table.  Allows faster table loading
+.IP
+and more efficient memory usage, for a large reference.
+.PP
+monitor=f           Kill this process if it crashes.  monitor=600,0.01 would
+.IP
+kill after 600 seconds under 1% usage.
+.PP
+minrskip=1          (mns) Force minimal skip interval when indexing reference
+.TP
+kmers.
+1 means use all, 2 means use every other kmer, etc.
+.PP
+maxrskip=1          (mxs) Restrict maximal skip interval when indexing
+.TP
+reference kmers. Normally all are used for scaffolds<100kb,
+but with longer scaffolds, up to maxrskip\-1 are skipped.
+.PP
+rskip=              Set both minrskip and maxrskip to the same value.
+.IP
+If not set, rskip will vary based on sequence length.
+.PP
+qskip=1             Skip query kmers to increase speed.  1 means use all.
+speed=0             Ignore this fraction of kmer space (0\-15 out of 16) in both
+.TP
+reads and reference.
+Increases speed and reduces memory.
+.PP
+Note: Do not use more than one of 'speed', 'qskip', and 'rskip'.
+.PP
+Trimming/Filtering/Masking parameters:
+Note \- if ktrim, kmask, and ksplit are unset, the default behavior is kfilter.
+All kmer processing modes are mutually exclusive.
+Reads only get sent to 'outm' purely based on kmer matches in kfilter mode.
+.PP
+ktrim=f             Trim reads to remove bases matching reference kmers.
+.TP
+Values:
+f (don't trim),
+r (trim to the right),
+l (trim to the left)
+.PP
+kmask=              Replace bases matching ref kmers with another symbol.
+.TP
+Allows any non\-whitespace character, and processes short
+kmers on both ends if mink is set.  'kmask=lc' will
+convert masked bases to lowercase.
+.PP
+maskfullycovered=f  (mfc) Only mask bases that are fully covered by kmers.
+ksplit=f            For single\-ended reads only.  Reads will be split into
+.TP
+pairs around the kmer.
+If the kmer is at the end of the
+.TP
+read, it will be trimmed instead.
+Singletons will go to
+.TP
+out, and pairs will go to outm.
+Do not use ksplit with
+.IP
+other operations such as quality\-trimming or filtering.
+.PP
+mink=0              Look for shorter kmers at read tips down to this length,
+.TP
+when k\-trimming or masking.
+0 means disabled.  Enabling
+.IP
+this will disable maskmiddle.
+.PP
+qtrim=f             Trim read ends to remove bases with quality below trimq.
+.TP
+Performed AFTER looking for kmers.
+Values:
+.TP
+rl (trim both ends),
+f (neither end),
+r (right end only),
+l (left end only),
+w (sliding window).
+.PP
+trimq=6             Regions with average quality BELOW this will be trimmed,
+.TP
+if qtrim is set to something other than f.
+Can be a
+.IP
+floating\-point number like 7.3.
+.PP
+trimclip=f          Trim soft\-clipped bases from sam files.
+minlength=10        (ml) Reads shorter than this after trimming will be
+.TP
+discarded.
+Pairs will be discarded if both are shorter.
+.PP
+mlf=0               (minlengthfraction) Reads shorter than this fraction of
+.IP
+original length after trimming will be discarded.
+.PP
+maxlength=          Reads longer than this after trimming will be discarded.
+.IP
+Pairs will be discarded only if both are longer.
+.PP
+minavgquality=0     (maq) Reads with average quality (after trimming) below
+.IP
+this will be discarded.
+.PP
+maqb=0              If positive, calculate maq from this many initial bases.
+minbasequality=0    (mbq) Reads with any base below this quality (after
+.IP
+trimming) will be discarded.
+.PP
+maxns=\-1            If non\-negative, reads with more Ns than this
+.IP
+(after trimming) will be discarded.
+.PP
+mcb=0               (minconsecutivebases) Discard reads without at least
+.IP
+this many consecutive called bases.
+.PP
+ottm=f              (outputtrimmedtomatch) Output reads trimmed to shorter
+.IP
+than minlength to outm rather than discarding.
+.PP
+tp=0                (trimpad) Trim this much extra around matching kmers.
+tbo=f               (trimbyoverlap) Trim adapters based on where paired
+.IP
+reads overlap.
+.PP
+strictoverlap=t     Adjust sensitivity for trimbyoverlap mode.
+minoverlap=14       Require this many bases of overlap for detection.
+mininsert=40        Require insert size of at least this for overlap.
+.IP
+Should be reduced to 16 for small RNA sequencing.
+.PP
+tpe=f               (trimpairsevenly) When kmer right\-trimming, trim both
+.IP
+reads to the minimum length of either.
+.PP
+forcetrimleft=0     (ftl) If positive, trim bases to the left of this position
+.IP
+(exclusive, 0\-based).
+.PP
+forcetrimright=0    (ftr) If positive, trim bases to the right of this position
+.IP
+(exclusive, 0\-based).
+.PP
+forcetrimright2=0   (ftr2) If positive, trim this many bases on the right end.
+forcetrimmod=0      (ftm) If positive, right\-trim length to be equal to zero,
+.IP
+modulo this number.
+.PP
+restrictleft=0      If positive, only look for kmer matches in the
+.IP
+leftmost X bases.
+.PP
+restrictright=0     If positive, only look for kmer matches in the
+.IP
+rightmost X bases.
+.PP
+mingc=0             Discard reads with GC content below this.
+maxgc=1             Discard reads with GC content above this.
+gcpairs=t           Use average GC of paired reads.
+.IP
+Also affects gchist.
+.PP
+tossjunk=f          Discard reads with invalid characters as bases.
+swift=f             Trim Swift sequences: Trailing C/T/N R1, leading G/A/N R2.
+.PP
+Header\-parsing parameters \- these require Illumina headers:
+chastityfilter=f    (cf) Discard reads with id containing ' 1:Y:' or ' 2:Y:'.
+barcodefilter=f     Remove reads with unexpected barcodes if barcodes is set,
+.TP
+or barcodes containing 'N' otherwise.
+A barcode must be
+.TP
+the last part of the read header.
+Values:
+.TP
+t:
+Remove reads with bad barcodes.
+.TP
+f:
+Ignore barcodes.
+.IP
+crash: Crash upon encountering bad barcodes.
+.PP
+barcodes=           Comma\-delimited list of barcodes or files of barcodes.
+xmin=\-1             If positive, discard reads with a lesser X coordinate.
+ymin=\-1             If positive, discard reads with a lesser Y coordinate.
+xmax=\-1             If positive, discard reads with a greater X coordinate.
+ymax=\-1             If positive, discard reads with a greater Y coordinate.
+.PP
+Polymer trimming:
+trimpolya=0         If greater than 0, trim poly\-A or poly\-T tails of
+.IP
+at least this length on either end of reads.
+.PP
+trimpolygleft=0     If greater than 0, trim poly\-G prefixes of at least this
+.TP
+length on the left end of reads.
+Does not trim poly\-C.
+.PP
+trimpolygright=0    If greater than 0, trim poly\-G tails of at least this
+.TP
+length on the right end of reads.
+Does not trim poly\-C.
+.PP
+trimpolyg=0         This sets both left and right at once.
+filterpolyg=0       If greater than 0, remove reads with a poly\-G prefix of
+.IP
+at least this length (on the left).
+.PP
+Note: there are also equivalent poly\-C flags.
+.PP
+Polymer tracking:
+pratio=base,base    'pratio=G,C' will print the ratio of G to C polymers.
+plen=20             Length of homopolymers to count.
+.PP
+Entropy/Complexity parameters:
+entropy=\-1          Set between 0 and 1 to filter reads with entropy below
+.TP
+that value.
+Higher is more stringent.
+.PP
+entropywindow=50    Calculate entropy using a sliding window of this length.
+entropyk=5          Calculate entropy using kmers of this length.
+minbasefrequency=0  Discard reads with a minimum base frequency below this.
+entropymask=f       Values:
+.TP
+f:
+Discard low\-entropy sequences.
+.TP
+t:
+Mask low\-entropy parts of sequences with N.
+.IP
+lc: Change low\-entropy parts of sequences to lowercase.
+.PP
+entropymark=f       Mark each base with its entropy value.  This is on a scale
+.TP
+of 0\-41 and is reported as quality scores, so the output
+should be fastq or fasta+qual.
+.PP
+Cardinality estimation:
+cardinality=f       (loglog) Count unique kmers using the LogLog algorithm.
+cardinalityout=f    (loglogout) Count unique kmers in output reads.
+loglogk=31          Use this kmer length for counting.
+loglogbuckets=1999  Use this many buckets for counting.
+.PP
+Java Parameters:
+.PP
+\fB\-Xmx\fR                This will set Java's memory usage, overriding autodetection.
+.TP
+\fB\-Xmx20g\fR will
+specify 20 gigs of RAM, and \fB\-Xmx200m\fR will specify 200 megs.
+The max is typically 85% of physical memory.
+.PP
+\fB\-eoom\fR               This flag will cause the process to exit if an
+.TP
+out\-of\-memory exception occurs.
+Requires Java 8u92+.
+.PP
+\fB\-da\fR                 Disable assertions.
+.PP
+Please contact Brian Bushnell at bbushnell at lbl.gov if you encounter any problems.
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
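
The bbduk.sh page above says memory use is proportional to (3*K)^hdist for hammingdistance and to (8*K)^edist for editdistance. A small sketch, assuming only those two formulas from the help text, shows how quickly the factor grows at the default k=27. The function names are made up for illustration, and the results are relative factors, not byte counts.

```shell
#!/bin/sh
# Hypothetical helper: integer power by repeated multiplication
# (POSIX sh arithmetic has no exponent operator).
ipow() {
    base=$1; exp=$2; result=1
    while [ "$exp" -gt 0 ]; do
        result=$((result * base))
        exp=$((exp - 1))
    done
    echo "$result"
}

# Relative memory factors as stated in the help text:
# (3*K)^hdist for hammingdistance, (8*K)^edist for editdistance.
hdist_factor() { ipow $((3 * $1)) "$2"; }
edist_factor() { ipow $((8 * $1)) "$2"; }

k=27  # default kmer length from the manpage
for d in 0 1 2; do
    echo "k=$k distance=$d hdist_factor=$(hdist_factor "$k" "$d") edist_factor=$(edist_factor "$k" "$d")"
done
```

At k=27 the factor for hdist=2 is 6561 and for edist=2 it is 46656, which suggests why the help text recommends small distances.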


=====================================
debian/bbmap.sh.1 → debian/mans/bbmap.sh.1
=====================================


=====================================
debian/mans/bbnorm.sh.1
=====================================
@@ -0,0 +1,139 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH BBNORM.SH "1" "April 2019" "bbnorm.sh 38.43" "User Commands"
+.SH NAME
+bbnorm.sh \- Kmer-based error-correction and normalization tool
+.SH SYNOPSIS
+.B bbnorm.sh
+\fI\,in=<input> out=<reads to keep> outt=<reads to toss> hist=<histogram output>\/\fR
+.SH AUTHOR
+Written by Brian Bushnell
+Last modified October 19, 2017
+.PP
+Description:  Normalizes read depth based on kmer counts.
+Can also error\-correct, bin reads by kmer depth, and generate a kmer depth histogram.
+However, Tadpole has superior error\-correction to BBNorm.
+Please read bbmap/docs/guides/BBNormGuide.txt for more information.
+.PP
+Input parameters:
+in=null             Primary input.  Use in2 for paired reads in a second file
+in2=null            Second input file for paired reads in two files
+extra=null          Additional files to use for input (generating hash table) but not for output
+fastareadlen=2^31   Break up FASTA reads longer than this.  Can be useful when processing scaffolded genomes
+tablereads=\-1       Use at most this many reads when building the hashtable (\fB\-1\fR means all)
+kmersample=1        Process every nth kmer, and skip the rest
+readsample=1        Process every nth read, and skip the rest
+interleaved=auto    May be set to true or false to override autodetection of whether the input file is paired and interleaved.
+qin=auto            ASCII offset for input quality.  May be 33 (Sanger), 64 (Illumina), or auto.
+.PP
+Output parameters:
+out=<file>          File for normalized or corrected reads.  Use out2 for paired reads in a second file
+outt=<file>         (outtoss) File for reads that were excluded from primary output
+reads=\-1            Only process this number of reads, then quit (\fB\-1\fR means all)
+sampleoutput=t      Use sampling on output as well as input (not used if sample rates are 1)
+keepall=f           Set to true to keep all reads (e.g. if you just want error correction).
+zerobin=f           Set to true if you want kmers with a count of 0 to go in the 0 bin instead of the 1 bin in histograms.
+.TP
+Default is false, to prevent confusion about how there can be 0\-count kmers.
+The reason is that based on the 'minq' and 'minprob' settings, some kmers may be excluded from the bloom filter.
+.PP
+tmpdir=      This will specify a directory for temp files (only needed for multipass runs).  If null, they will be written to the output directory.
+usetempdir=t        Allows enabling/disabling of temporary directory; if disabled, temp files will be written to the output directory.
+qout=auto           ASCII offset for output quality.  May be 33 (Sanger), 64 (Illumina), or auto (same as input).
+rename=f            Rename reads based on their kmer depth.
+.PP
+Hashing parameters:
+k=31                Kmer length (values under 32 are most efficient, but arbitrarily high values are supported)
+bits=32             Bits per cell in bloom filter; must be 2, 4, 8, 16, or 32.  Maximum kmer depth recorded is 2^cbits.  Automatically reduced to 16 in 2\-pass.
+.IP
+Large values decrease accuracy for a fixed amount of memory, so use the lowest number you can that will still capture highest\-depth kmers.
+.PP
+hashes=3            Number of times each kmer is hashed and stored.  Higher is slower.
+.IP
+Higher is MORE accurate if there is enough memory, and LESS accurate if there is not enough memory.
+.PP
+prefilter=f         True is slower, but generally more accurate; filters out low\-depth kmers from the main hashtable.  The prefilter is more memory\-efficient because it uses 2\-bit cells.
+prehashes=2         Number of hashes for prefilter.
+prefilterbits=2     (pbits) Bits per cell in prefilter.
+prefiltersize=0.35  Fraction of memory to allocate to prefilter.
+buildpasses=1       More passes can sometimes increase accuracy by iteratively removing low\-depth kmers
+minq=6              Ignore kmers containing bases with quality below this
+minprob=0.5         Ignore kmers with overall probability of correctness below this
+threads=auto        (t) Spawn exactly X hashing threads (default is number of logical processors).  Total active threads may exceed X due to I/O threads.
+rdk=t               (removeduplicatekmers) When true, a kmer's count will only be incremented once per read pair, even if that kmer occurs more than once.
+.PP
+Normalization parameters:
+fixspikes=f         (fs) Do a slower, high\-precision bloom filter lookup of kmers that appear to have an abnormally high depth due to collisions.
+target=100          (tgt) Target normalization depth.  NOTE:  All depth parameters control kmer depth, not read depth.
+.TP
+For kmer depth Dk, read depth Dr, read length R, and kmer size K:
+Dr=Dk*(R/(R\-K+1))
+.PP
+maxdepth=\-1         (max) Reads will not be downsampled when below this depth, even if they are above the target depth.
+mindepth=5          (min) Kmers with depth below this number will not be included when calculating the depth of a read.
+minkmers=15         (mgkpr) Reads must have at least this many kmers over min depth to be retained.  Aka 'mingoodkmersperread'.
+percentile=54.0     (dp) Read depth is by default inferred from the 54th percentile of kmer depth, but this may be changed to any number 1\-100.
+uselowerdepth=t     (uld) For pairs, use the depth of the lower read as the depth proxy.
+deterministic=t     (dr) Generate random numbers deterministically to ensure identical output between multiple runs.  May decrease speed with a huge number of threads.
+passes=2            (p) 1 pass is the basic mode.  2 passes (default) allow greater accuracy, error detection, and better control of output depth.
+.PP
+Error detection parameters:
+hdp=90.0            (highdepthpercentile) Position in sorted kmer depth array used as proxy of a read's high kmer depth.
+ldp=25.0            (lowdepthpercentile) Position in sorted kmer depth array used as proxy of a read's low kmer depth.
+tossbadreads=f      (tbr) Throw away reads detected as containing errors.
+requirebothbad=f    (rbb) Only toss bad pairs if both reads are bad.
+errordetectratio=125   (edr) Reads with a ratio of at least this much between their high and low depth kmers will be classified as error reads.
+highthresh=12       (ht) Threshold for high kmer.  Kmers at this or above are considered non\-error.
+lowthresh=3         (lt) Threshold for low kmer.  Kmers at this and below are always considered errors.
+.PP
+Error correction parameters:
+ecc=f               Set to true to correct errors.  NOTE: Tadpole is now preferred for ecc as it does a better job.
+ecclimit=3          Correct up to this many errors per read.  If more are detected, the read will remain unchanged.
+errorcorrectratio=140  (ecr) Adjacent kmers with a depth ratio of at least this much between will be classified as an error.
+echighthresh=22     (echt) Threshold for high kmer.  A kmer at this or above may be considered non\-error.
+eclowthresh=2       (eclt) Threshold for low kmer.  Kmers at this and below are considered errors.
+eccmaxqual=127      Do not correct bases with quality above this value.
+aec=f               (aggressiveErrorCorrection) Sets more aggressive values of ecr=100, ecclimit=7, echt=16, eclt=3.
+cec=f               (conservativeErrorCorrection) Sets more conservative values of ecr=180, ecclimit=2, echt=30, eclt=1, sl=4, pl=4.
+meo=f               (markErrorsOnly) Marks errors by reducing quality value of suspected errors; does not correct anything.
+mue=t               (markUncorrectableErrors) Marks errors only on uncorrectable reads; requires 'ecc=t'.
+overlap=f           (ecco) Error correct by read overlap.
+.PP
+Depth binning parameters:
+lowbindepth=10      (lbd) Cutoff for low depth bin.
+highbindepth=80     (hbd) Cutoff for high depth bin.
+outlow=<file>       Pairs in which both reads have a median below lbd go into this file.
+outhigh=<file>      Pairs in which both reads have a median above hbd go into this file.
+outmid=<file>       All other pairs go into this file.
+.PP
+Histogram parameters:
+hist=<file>         Specify a file to write the input kmer depth histogram.
+histout=<file>      Specify a file to write the output kmer depth histogram.
+histcol=3           (histogramcolumns) Number of histogram columns, 2 or 3.
+pzc=f               (printzerocoverage) Print lines in the histogram with zero coverage.
+histlen=1048576     Max kmer depth displayed in histogram.  Also affects statistics displayed, but does not affect normalization.
+.PP
+Peak calling parameters:
+peaks=<file>        Write the peaks to this file.  Default is stdout.
+minHeight=2         (h) Ignore peaks shorter than this.
+minVolume=5         (v) Ignore peaks with less area than this.
+minWidth=3          (w) Ignore peaks narrower than this.
+minPeak=2           (minp) Ignore peaks with an X\-value below this.
+maxPeak=BIG         (maxp) Ignore peaks with an X\-value above this.
+maxPeakCount=8      (maxpc) Print up to this many peaks (prioritizing height).
+.PP
+Java Parameters:
+\fB\-Xmx\fR                This will set Java's memory usage, overriding autodetection.
+.TP
+\fB\-Xmx20g\fR will specify 20 gigs of RAM, and \fB\-Xmx200m\fR will specify 200 megs.
+The max is typically 85% of physical memory.
+.PP
+\fB\-eoom\fR               This flag will cause the process to exit if an
+.TP
+out\-of\-memory exception occurs.
+Requires Java 8u92+.
+.PP
+\fB\-da\fR                 Disable assertions.
+.PP
+Please contact Brian Bushnell at bbushnell at lbl.gov if you encounter any problems.
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
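
The bbnorm.sh page above notes that all depth parameters control kmer depth, not read depth, and gives the conversion Dr=Dk*(R/(R-K+1)). The sketch below only evaluates that formula. The helper name read_depth and the 150 bp read length are assumptions chosen for illustration, while target=100 and k=31 are the defaults listed in the page.

```shell
#!/bin/sh
# Hypothetical helper: read depth implied by a kmer depth, using the formula
# from the help text: Dr = Dk * (R / (R - K + 1)).
# Arguments: kmer depth Dk, read length R, kmer size K.
read_depth() {
    awk -v dk="$1" -v r="$2" -v k="$3" \
        'BEGIN { printf "%.2f\n", dk * (r / (r - k + 1)) }'
}

# target=100 and k=31 are the documented defaults; R=150 is an assumed value.
read_depth 100 150 31
```

With these values the script prints 125.00, so the default target of 100 corresponds to roughly 125x read depth for 150 bp reads.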


=====================================
debian/bloomfilter.sh.1 → debian/mans/bloomfilter.sh.1
=====================================


=====================================
debian/mans/dedupe.sh.1
=====================================
@@ -0,0 +1,137 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH DEDUPE.SH "1" "April 2019" "dedupe.sh 38.43" "User Commands"
+.SH NAME
+dedupe.sh \- Simplifies assemblies by removing duplicate or contained
+.SH SYNOPSIS
+.B dedupe.sh
+\fI\,in=<file or stdin> out=<file or stdout>\/\fR
+.SH AUTHOR
+Written by Brian Bushnell and Jonathan Rood
+Last modified November 20, 2017
+.PP
+Description:  Accepts one or more files containing sets of sequences (reads or scaffolds).
+Removes duplicate sequences, which may be specified to be exact matches, subsequences, or sequences within some percent identity.
+Can also find overlapping sequences and group them into clusters.
+Please read bbmap/docs/guides/DedupeGuide.txt for more information.
+.PP
+An example of running Dedupe for clustering short reads:
+dedupe.sh in=x.fq am=f ac=f fo c pc rnc=f mcs=4 mo=100 s=1 pto cc qin=33 csf=stats.txt pattern=cluster_%.fq dot=graph.dot
+.PP
+Input may be fasta or fastq, compressed or uncompressed.
+Output may be stdout or a file.  With no output parameter, data will be written to stdout.
+If 'out=null', there will be no output, but statistics will still be printed.
+You can also use 'dedupe <infile> <outfile>' without the 'in=' and 'out='.
+.PP
+I/O parameters:
+in=<file,file>        A single file or a comma\-delimited list of files.
+out=<file>            Destination for all output contigs.
+pattern=<file>        Clusters will be written to individual files, where the '%' symbol in the pattern is replaced by cluster number.
+outd=<file>           Optional; removed duplicates will go here.
+csf=<file>            (clusterstatsfile) Write a list of cluster names and sizes.
+dot=<file>            (graph) Write a graph in dot format.  Requires 'fo' and 'pc' flags.
+threads=auto          (t) Set number of threads to use; default is number of logical processors.
+overwrite=t           (ow) Set to false to force the program to abort rather than overwrite an existing file.
+showspeed=t           (ss) Set to 'f' to suppress display of processing speed.
+minscaf=0             (ms) Ignore contigs/scaffolds shorter than this.
+interleaved=auto      If true, forces fastq input to be paired and interleaved.
+ziplevel=2            Set to 1 (lowest) through 9 (max) to change compression level; lower compression is faster.
+.PP
+Output format parameters:
+storename=t           (sn) Store scaffold names (set false to save memory).
+#addpairnum=f         Add .1 and .2 to numeric id of read1 and read2.
+storequality=t        (sq) Store quality values for fastq assemblies (set false to save memory).
+uniquenames=t         (un) Ensure all output scaffolds have unique names.  Uses more memory.
+numbergraphnodes=t    (ngn) Label dot graph nodes with read numbers rather than read names.
+sort=f                Sort output (otherwise it will be random).  Options:
+.TP
+length:
+Sort by length
+.TP
+quality: Sort by quality
+name:    Sort by name
+id:      Sort by input order
+.PP
+ascending=f           Sort in ascending order.
+ordered=f             Output sequences in input order.  Equivalent to sort=id ascending.
+renameclusters=f      (rnc) Rename contigs to indicate which cluster they are in.
+printlengthinedges=f  (ple) Print the length of contigs in edges.
+.PP
+Processing parameters:
+absorbrc=t            (arc) Absorb reverse\-complements as well as normal orientation.
+absorbmatch=t         (am) Absorb exact matches of contigs.
+absorbcontainment=t   (ac) Absorb full containments of contigs.
+#absorboverlap=f      (ao) Absorb (merge) non\-contained overlaps of contigs (TODO).
+findoverlap=f         (fo) Find overlaps between contigs (containments and non\-containments).  Necessary for clustering.
+uniqueonly=f          (uo) If true, all copies of duplicate reads will be discarded, rather than keeping 1.
+rmn=f                 (requirematchingnames) If true, both names and sequence must match.
+usejni=f              (jni) Do alignments in C code, which is faster, if an edit distance is allowed.
+.IP
+This will require compiling the C code; details are in \fI\,/jni/README.txt\/\fP.
+.PP
+Subset parameters:
+subsetcount=1         (sstc) Number of subsets used to process the data; higher uses less memory.
+subset=0              (sst) Only process reads whose ((ID%subsetcount)==subset).
+.PP
+Clustering parameters:
+cluster=f             (c) Group overlapping contigs into clusters.
+pto=f                 (preventtransitiveoverlaps) Do not look for new edges between nodes in the same cluster.
+minclustersize=1      (mcs) Do not output clusters smaller than this.
+pbr=f                 (pickbestrepresentative) Only output the single highest\-quality read per cluster.
+.PP
+Cluster postprocessing parameters:
+processclusters=f     (pc) Run the cluster processing phase, which performs the selected operations in this category.
+.IP
+For example, pc AND cc must be enabled to perform cc.
+.PP
+fixmultijoins=t       (fmj) Remove redundant overlaps between the same two contigs.
+removecycles=t        (rc) Remove all cycles so clusters form trees.
+cc=t                  (canonicizeclusters) Flip contigs so clusters have a single orientation.
+fcc=f                 (fixcanoncontradictions) Truncate graph at nodes with canonization disputes.
+foc=f                 (fixoffsetcontradictions) Truncate graph at nodes with offset disputes.
+mst=f                 (maxspanningtree) Remove cyclic edges, leaving only the longest edges that form a tree.
+.PP
+Overlap Detection Parameters
+exact=t               (ex) Only allow exact symbol matches.  When false, an 'N' will match any symbol.
+touppercase=t         (tuc) Convert input bases to upper\-case; otherwise, lower\-case will not match.
+maxsubs=0             (s) Allow up to this many mismatches (substitutions only, no indels).  May be set higher than maxedits.
+maxedits=0            (e) Allow up to this many edits (subs or indels).  Higher is slower.
+minidentity=100       (mid) Absorb contained sequences with percent identity of at least this (includes indels).
+minlengthpercent=0    (mlp) Smaller contig must be at least this percent of larger contig's length to be absorbed.
+minoverlappercent=0   (mop) Overlap must be at least this percent of smaller contig's length to cluster and merge.
+minoverlap=200        (mo) Overlap must be at least this long to cluster and merge.
+depthratio=0          (dr) When non\-zero, overlaps will only be formed between reads with a depth ratio of at most this.
+.TP
+Should be above 1.
+Depth is determined by parsing the read names; this information can be added
+.IP
+by running KmerNormalize (khist.sh, bbnorm.sh, or ecc.sh) with the flag 'rename'
+.PP
+k=31                  Seed length used for finding containments and overlaps.  Anything shorter than k will not be found.
+numaffixmaps=1        (nam) Number of prefixes/suffixes to index per contig. Higher is more sensitive, if edits are allowed.
+hashns=f              Set to true to search for matches using kmers containing Ns.  Can lead to extreme slowdown in some cases.
+#ignoreaffix1=f       (ia1) Ignore first affix (for testing).
+#storesuffix=f        (ss) Store suffix as well as prefix.  Automatically set to true when doing inexact matches.
+.PP
+Other Parameters
+qtrim=f               Set to qtrim=rl to trim leading and trailing Ns.
+trimq=6               Quality trim level.
+forcetrimleft=\-1      (ftl) If positive, trim bases to the left of this position (exclusive, 0\-based).
+forcetrimright=\-1     (ftr) If positive, trim bases to the right of this position (exclusive, 0\-based).
+.PP
+Note on Proteins / Amino Acids
+Dedupe supports amino acid space via the 'amino' flag.  This also changes the default kmer length to 10.
+In amino acid mode, all flags related to canonicity and reverse\-complementation are disabled,
+and nam (numaffixmaps) is currently limited to 2 per tip.
+.PP
+Java Parameters:
+\fB\-Xmx\fR                  This will set Java's memory usage, overriding autodetection.
+.TP
+\fB\-Xmx20g\fR will specify 20 gigs of RAM, and \fB\-Xmx200m\fR will specify 200 megs.
+The max is typically 85% of physical memory.
+.PP
+\fB\-eoom\fR                 This flag will cause the process to exit if an out\-of\-memory exception occurs.  Requires Java 8u92+.
+\fB\-da\fR                   Disable assertions.
+.PP
+Please contact Brian Bushnell at bbushnell at lbl.gov if you encounter any problems.
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
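
The subset rule in the dedupe.sh manpage above, "Only process reads whose ((ID%subsetcount)==subset)", can be sketched in plain shell. This is only an illustration of the modulo arithmetic, not code from BBMap: the helper name in_subset is hypothetical, and treating ID as a numeric read index is an assumption.

```shell
#!/bin/sh
# Hypothetical helper, not part of BBMap: succeeds when a numeric read ID
# falls into the given subset, following ((ID % subsetcount) == subset).
in_subset() {
    id=$1
    subsetcount=$2
    subset=$3
    [ $(( id % subsetcount )) -eq "$subset" ]
}

# With subsetcount=4 and subset=1, only every fourth read is selected.
for id in 0 1 2 3 4 5 6 7; do
    if in_subset "$id" 4 1; then
        echo "read $id belongs to subset 1"
    fi
done
```

Under these assumptions, a run with subsetcount=4 would need four passes (subset=0 through subset=3) to cover all reads, each pass holding roughly a quarter of the data in memory.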


=====================================
debian/mans/reformat.sh.1
=====================================
@@ -0,0 +1,206 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.8.
+.TH REFORMAT.SH "1" "April 2019" "reformat.sh 38.43" "User Commands"
+.SH NAME
+reformat.sh \- Reformats reads between fasta/fastq/scarf/fasta+qual/sam, interleaved/paired, and ASCII-33/64
+.SH SYNOPSIS
+.B reformat.sh
+\fI\,in=<file> in2=<file2> out=<outfile> out2=<outfile2>\/\fR
+.SH AUTHOR
+Written by Brian Bushnell
+Last modified February 21, 2019
+.PP
+Description:  Reformats reads to change ASCII quality encoding, interleaving, file format, or compression format.
+Optionally performs additional functions such as quality trimming, subsetting, and subsampling.
+Supports fastq, fasta, fasta+qual, scarf, oneline, sam, bam, gzip, bz2.
+Please read bbmap/docs/guides/ReformatGuide.txt for more information.
+.PP
+in2 and out2 are for paired reads and are optional.
+If input is paired and there is only one output file, it will be written interleaved.
+.PP
+Parameters and their defaults:
+.PP
+ow=f                    (overwrite) Overwrites files that already exist.
+app=f                   (append) Append to files that already exist.
+zl=4                    (ziplevel) Set compression level, 1 (low) to 9 (max).
+int=f                   (interleaved) Determines whether INPUT file is considered interleaved.
+fastawrap=70            Length of lines in fasta output.
+fastareadlen=0          Set to a non\-zero number to break fasta files into reads of at most this length.
+fastaminlen=1           Ignore fasta reads shorter than this.
+qin=auto                ASCII offset for input quality.  May be 33 (Sanger), 64 (Illumina), or auto.
+qout=auto               ASCII offset for output quality.  May be 33 (Sanger), 64 (Illumina), or auto (same as input).
+qfake=30                Quality value used for fasta to fastq reformatting.
+qfin=<.qual file>       Read qualities from this qual file, for the reads coming from 'in=<fasta file>'
+qfin2=<.qual file>      Read qualities from this qual file, for the reads coming from 'in2=<fasta file>'
+qfout=<.qual file>      Write qualities to this qual file, for the reads going to 'out=<fasta file>'
+qfout2=<.qual file>     Write qualities to this qual file, for the reads going to 'out2=<fasta file>'
+outsingle=<file>        (outs) If a read is longer than minlength and its mate is shorter, the longer one goes here.
+deleteinput=f           Delete input upon successful completion.
+ref=<file>              Optional reference fasta for sam processing.
+.PP
+Processing Parameters:
+.PP
+verifypaired=f          (vpair) When true, checks reads to see if the names look paired.  Prints an error message if not.
+verifyinterleaved=f     (vint) sets 'vpair' to true and 'interleaved' to true.
+allowidenticalnames=f   (ain) When verifying pair names, allows identical names, instead of requiring \fI\,/1\/\fP and \fI\,/2\/\fP or 1: and 2:
+tossbrokenreads=f       (tbr) Discard reads that have different numbers of bases and qualities.  By default this will be detected and cause a crash.
+ignorebadquality=f      (ibq) Fix out\-of\-range quality values instead of crashing with a warning.
+addslash=f              Append ' /1' and ' /2' to read names, if not already present.  Please include the flag 'int=t' if the reads are interleaved.
+spaceslash=t            Put a space before the slash in addslash mode.
+addcolon=f              Append ' 1:' and ' 2:' to read names, if not already present.  Please include the flag 'int=t' if the reads are interleaved.
+underscore=f            Change whitespace in read names to underscores.
+rcomp=f                 (rc) Reverse\-complement reads.
+rcompmate=f             (rcm) Reverse\-complement read 2 only.
+changequality=t         (cq) N bases always get a quality of 0 and ACGT bases get a min quality of 2.
+quantize=f              Quantize qualities to a subset of values like NextSeq.  Can also be used with comma\-delimited list, like quantize=0,8,13,22,27,32,37
+tuc=f                   (touppercase) Change lowercase letters in reads to uppercase.
+uniquenames=f           Make duplicate names unique by appending _<number>.
+remap=                  A set of pairs: remap=CTGN will transform C>T and G>N.
+.IP
+Use remap1 and remap2 to specify read 1 or 2.
+.PP
+iupacToN=f              (itn) Convert non\-ACGTN symbols to N.
+monitor=f               Kill this process if it crashes.  monitor=600,0.01 would kill after 600 seconds under 1% usage.
+crashjunk=t             Crash when encountering reads with invalid bases.
+tossjunk=f              Discard reads with invalid characters as bases.
+fixjunk=f               Convert invalid bases to N.
+fixheaders=f            Convert nonstandard header characters to standard ASCII.
+recalibrate=f           (recal) Recalibrate quality scores.  Must first generate matrices with CalcTrueQuality.
+maxcalledquality=41     Quality scores capped at this upper bound.
+mincalledquality=2      Quality scores of ACGT bases will be capped at lower bound.
+trimreaddescription=f   (trd) Trim the names of reads after the first whitespace.
+trimrname=f             For sam/bam files, trim rname/rnext fields after the first space.
+fixheaders=f            Replace characters in headers such as space, *, and | to make them valid file names.
+warnifnosequence=t      For fasta, issue a warning if a sequenceless header is encountered.
+warnfirsttimeonly=t     Issue a warning for only the first sequenceless header.
+utot=f                  Convert U to T (for RNA \-> DNA translation).
+padleft=0               Pad the left end of sequences with this many symbols.
+padright=0              Pad the right end of sequences with this many symbols.
+pad=0                   Set padleft and padright to the same value.
+padsymbol=N             Symbol to use for padding.
+.PP
+Histogram output parameters:
+.PP
+bhist=<file>            Base composition histogram by position.
+qhist=<file>            Quality histogram by position.
+qchist=<file>           Count of bases with each quality value.
+aqhist=<file>           Histogram of average read quality.
+bqhist=<file>           Quality histogram designed for box plots.
+lhist=<file>            Read length histogram.
+gchist=<file>           Read GC content histogram.
+gcbins=100              Number of gchist bins.  Set to 'auto' to use read length.
+gcplot=f                Add a graphical representation to the gchist.
+maxhistlen=6000         Set an upper bound for histogram lengths; higher uses more memory.
+.IP
+The default is 6000 for some histograms and 80000 for others.
+.PP
+Histograms for sam files only (requires sam format 1.4 or higher):
+.PP
+ehist=<file>            Errors\-per\-read histogram.
+qahist=<file>           Quality accuracy histogram of error rates versus quality score.
+indelhist=<file>        Indel length histogram.
+mhist=<file>            Histogram of match, sub, del, and ins rates by read location.
+ihist=<file>            Insert size histograms.  Requires paired reads in a sam file.
+idhist=<file>           Histogram of read count versus percent identity.
+idbins=100              Number of idhist bins.  Set to 'auto' to use read length.
+.PP
+Sampling parameters:
+.PP
+reads=\-1                Set to a positive number to only process this many INPUT reads (or pairs), then quit.
+skipreads=\-1            Skip (discard) this many INPUT reads before processing the rest.
+samplerate=1            Randomly output only this fraction of reads; 1 means sampling is disabled.
+sampleseed=\-1           Set to a positive number to use that prng seed for sampling (allowing deterministic sampling).
+samplereadstarget=0     (srt) Exact number of OUTPUT reads (or pairs) desired.
+samplebasestarget=0     (sbt) Exact number of OUTPUT bases desired.
+.IP
+Important: srt/sbt flags should not be used with stdin, samplerate, qtrim, minlength, or minavgquality.
+.PP
+upsample=f              Allow srt/sbt to upsample (duplicate reads) when the target is greater than input.
+prioritizelength=f      If true, calculate a length threshold to reach the target, and retain all reads of at least that length (must set srt or sbt).
+.PP
+Trimming and filtering parameters:
+.PP
+qtrim=f                 Trim read ends to remove bases with quality below trimq.
+.IP
+Values: t (trim both ends), f (neither end), r (right end only), l (left end only), w (sliding window).
+.PP
+trimq=6                 Regions with average quality BELOW this will be trimmed.  Can be a floating\-point number like 7.3.
+minlength=0             (ml) Reads shorter than this after trimming will be discarded.  Pairs will be discarded only if both are shorter.
+mlf=0                   (mlf) Reads shorter than this fraction of original length after trimming will be discarded.
+maxlength=0             If nonzero, reads longer than this after trimming will be discarded.
+breaklength=0           If nonzero, reads longer than this will be broken into multiple reads of this length.  Does not work for paired reads.
+requirebothbad=t        (rbb) Only discard pairs if both reads are shorter than minlen.
+invertfilters=f         (invert) Output failing reads instead of passing reads.
+minavgquality=0         (maq) Reads with average quality (after trimming) below this will be discarded.
+maqb=0                  If positive, calculate maq from this many initial bases.
+chastityfilter=f        (cf) Reads with names containing ' 1:Y:' or ' 2:Y:' will be discarded.
+barcodefilter=f         Remove reads with unexpected barcodes if barcodes is set, or barcodes containing 'N' otherwise.
+.IP
+A barcode must be the last part of the read header.
+.PP
+barcodes=               Comma\-delimited list of barcodes or files of barcodes.
+maxns=\-1                If 0 or greater, reads with more Ns than this (after trimming) will be discarded.
+minconsecutivebases=0   (mcb) Discard reads without at least this many consecutive called bases.
+forcetrimleft=0         (ftl) If nonzero, trim left bases of the read to this position (exclusive, 0\-based).
+forcetrimright=0        (ftr) If nonzero, trim right bases of the read after this position (exclusive, 0\-based).
+forcetrimright2=0       (ftr2) If positive, trim this many bases on the right end.
+forcetrimmod=5          (ftm) If positive, trim length to be equal to zero modulo this number.
+mingc=0                 Discard reads with GC content below this.
+maxgc=1                 Discard reads with GC content above this.
+gcpairs=t               Use average GC of paired reads.
+.IP
+Also affects gchist.
+.PP
+Sam and bam processing options:
+.PP
+mappedonly=f            Toss unmapped reads.
+unmappedonly=f          Toss mapped reads.
+pairedonly=f            Toss reads that are not mapped as proper pairs.
+unpairedonly=f          Toss reads that are mapped as proper pairs.
+primaryonly=f           Toss secondary alignments.  Set this to true for sam to fastq conversion.
+minmapq=\-1              If non\-negative, toss reads with mapq under this.
+maxmapq=\-1              If non\-negative, toss reads with mapq over this.
+requiredbits=0          (rbits) Toss sam lines with any of these flag bits unset.  Similar to samtools \fB\-f\fR.
+filterbits=0            (fbits) Toss sam lines with any of these flag bits set.  Similar to samtools \fB\-F\fR.
+stoptag=f               Set to true to write a tag indicating read stop location, prefixed by YS:i:
+sam=                    Set to 'sam=1.3' to convert '=' and 'X' cigar symbols (from sam 1.4+ format) to 'M'.
+.IP
+Set to 'sam=1.4' to convert 'M' to '=' and 'X' (sam=1.4 requires MD tags to be present, or ref to be specified).
+.PP
+Sam and bam alignment filtering options:
+These require = and X symbols in cigar strings, or MD tags, or a reference fasta.
+\fB\-1\fR means disabled; to filter reads with any of a symbol type, set to 0.
+.PP
+subfilter=\-1            Discard reads with more than this many substitutions.
+insfilter=\-1            Discard reads with more than this many insertions.
+delfilter=\-1            Discard reads with more than this many deletions.
+indelfilter=\-1          Discard reads with more than this many indels.
+editfilter=\-1           Discard reads with more than this many edits.
+inslenfilter=\-1         Discard reads with an insertion longer than this.
+dellenfilter=\-1         Discard reads with a deletion longer than this.
+idfilter=\-1.0           Discard reads with identity below this.
+clipfilter=\-1           Discard reads with more than this many soft\-clipped bases.
+.PP
+Kmer counting and cardinality estimation:
+k=0                     If positive, count the total number of kmers.
+cardinality=f           (loglog) Count unique kmers using the LogLog algorithm.
+loglogbuckets=1999      Use this many buckets for cardinality estimation.
+.PP
+Shortcuts:
+The # symbol will be replaced by 1 and 2.  The % symbol in out will be replaced by the input name minus extensions.
+For example:
+reformat.sh in=read#.fq out=%.fa
+\&...is equivalent to:
+reformat.sh in1=read1.fq in2=read2.fq out1=read1.fa out2=read2.fa
+.PP
+Java Parameters:
+\fB\-Xmx\fR                    This will set Java's memory usage, overriding autodetection.
+.TP
+\fB\-Xmx20g\fR will specify 20 gigs of RAM, and \fB\-Xmx200m\fR will specify 200 megs.
+The max is typically 85% of physical memory.
+.PP
+\fB\-eoom\fR                   This flag will cause the process to exit if an out\-of\-memory exception occurs.  Requires Java 8u92+.
+\fB\-da\fR                     Disable assertions.
+.PP
+Please contact Brian Bushnell at bbushnell at lbl.gov if you encounter any problems.
+.SH AUTHOR
+This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
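
The "Shortcuts" paragraph in the reformat.sh manpage above can be sketched as a small shell function. This is not how reformat.sh itself is implemented. The helper name expand_shortcuts is hypothetical, and "input name minus extensions" is assumed to mean everything before the first dot.

```shell
#!/bin/sh
# Hypothetical helper, not part of BBMap: prints the explicit command line
# that the '#' and '%' shortcuts stand for.
# Assumption: names contain no '/' or '&', which would confuse the sed calls.
expand_shortcuts() {
    in_pat=$1
    out_pat=$2
    ins=""
    outs=""
    for n in 1 2; do
        # Replace '#' in the input pattern with the read number.
        in_file=$(printf '%s\n' "$in_pat" | sed "s/#/$n/")
        # Input name minus extensions: drop everything from the first dot.
        base=${in_file%%.*}
        # Replace '%' in the output pattern with that base name.
        out_file=$(printf '%s\n' "$out_pat" | sed "s/%/$base/")
        ins="$ins in$n=$in_file"
        outs="$outs out$n=$out_file"
    done
    printf 'reformat.sh%s%s\n' "$ins" "$outs"
}

expand_shortcuts 'read#.fq' '%.fa'
```

For the manpage's own example (in=read#.fq out=%.fa) this prints the same expanded command that the manpage lists as equivalent.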



View it on GitLab: https://salsa.debian.org/med-team/bbmap/commit/3758d4c612caa1e1f97bf4d89e5ef85f1f08d4e2
