[med-svn] [Git][med-team/vsearch][master] 8 commits: Add autopkgtests

Nilesh Patra gitlab at salsa.debian.org
Fri Jun 12 12:27:27 BST 2020



Nilesh Patra pushed to branch master at Debian Med / vsearch


Commits:
a8d9cff3 by Nilesh Patra at 2020-06-12T16:05:53+05:30
Add autopkgtests

- - - - -
7d33e969 by Nilesh Patra at 2020-06-12T16:06:01+05:30
Add new binary: vsearch-examples
We do not intend to bloat the user's machine too much. vsearch-data doesn't seem to be in the archive or NEW queue either

- - - - -
55d260e6 by Nilesh Patra at 2020-06-12T16:08:31+05:30
Add vsearch binary

- - - - -
8d700db6 by Nilesh Patra at 2020-06-12T16:08:49+05:30
Install relevant examples

- - - - -
c84e575a by Nilesh Patra at 2020-06-12T16:16:35+05:30
Remove external URL

- - - - -
008570ae by Nilesh Patra at 2020-06-12T16:49:00+05:30
Add manpage

- - - - -
c6913e72 by Nilesh Patra at 2020-06-12T16:49:44+05:30
Install self-generated manpage - the one provided by upstream looks incompatible

- - - - -
8f3acaa9 by Nilesh Patra at 2020-06-12T16:53:45+05:30
Add myself to uploaders

- - - - -


15 changed files:

- − debian/_tests/run-unit-test
- debian/control
- + debian/createmanpages
- + debian/man/vsearch.1
- debian/rules
- + debian/tests/README
- debian/_tests/control → debian/tests/control
- + debian/tests/data/BioMarKs50k.fsa
- + debian/tests/data/query.fsa
- + debian/tests/expected-output/test1-expected.out
- + debian/tests/expected-output/test2-expected.out
- + debian/tests/run-unit-test
- + debian/vsearch-examples.docs
- + debian/vsearch.install
- debian/vsearch.manpages


Changes:

=====================================
debian/_tests/run-unit-test deleted
=====================================
@@ -1,29 +0,0 @@
-#!/bin/sh -e
-
-pkg=vsearch
-if [ "$ADTTMP" = "" ] ; then
-  ADTTMP=`mktemp -d /tmp/${pkg}-test.XXXXXX`
-fi
-cd $ADTTMP
-mkdir -p ${pkg}-data
-mkdir -p ${pkg}-test
-cd ${pkg}-test
-cp -a /usr/share/doc/${pkg}/test .
-VDATADIR=/usr/share/vsearch/data
-if [ ! -d ${VDATADIR} ] ; then
-    echo "You need to install vsearch-data package to run this test."
-    exit 1
-fi
-
-for datadir in `find ${VDATADIR} -type d | sed -e "s#${VDATADIR}/*##" -e "/^$/d"` ; do mkdir -p ../vsearch-data/${datadir} ; done
-for datafile in `find ${VDATADIR} -type f` ; do 
-    ln -s ${datafile} `echo ${datafile} | sed -e "s#${VDATADIR}/#../vsearch-data/#"`
-done
-zcat ../vsearch-data/BioMarKs.fsa.gz > ../vsearch-data/BioMarKs.fsa
-cd test
-for t in *.sh ; do
-    bash $t v
-done
-cd ..
-
-# rm -f $ADTTMP/*


=====================================
debian/control
=====================================
@@ -1,7 +1,8 @@
 Source: vsearch
 Maintainer: Debian Med Packaging Team <debian-med-packaging at lists.alioth.debian.org>
 Uploaders: Tim Booth <tbooth at ceh.ac.uk>,
-           Andreas Tille <tille at debian.org>
+           Andreas Tille <tille at debian.org>,
+           Nilesh Patra <npatra974 at gmail.com>
 Section: science
 Priority: optional
 Build-Depends: debhelper-compat (= 12),


=====================================
debian/createmanpages
=====================================
@@ -0,0 +1,29 @@
+#!/bin/sh
+MANDIR=debian/man
+mkdir -p $MANDIR
+
+VERSION=`dpkg-parsechangelog | awk '/^Version:/ {print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'`
+NAME=`grep "^Description:" debian/control | sed 's/^Description: *//' | head -n1`
+PROGNAME=`grep "^Package:" debian/control | sed 's/^Package: *//' | head -n1`
+
+AUTHOR=".SH AUTHOR\n \
+This manpage was written by $DEBFULLNAME for the Debian distribution and\n \
+can be used for any other usage of the program.\
+"
+
+# If program name is different from package name or title should be
+# different from package short description change this here
+progname=vsearch
+help2man --no-info --no-discard-stderr --help-option="-h" \
+         --name="$NAME" \
+            --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+
+echo "$MANDIR/*.1" > debian/manpages
+
+cat <<EOT
+Please enhance the help2man output.
+The following web page might be helpful in doing so:
+    http://liw.fi/manpages/
+EOT
+
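
Note on the VERSION pipeline in debian/createmanpages: the three sed expressions strip an epoch prefix, then everything from the first hyphen (the Debian revision), then a trailing +dfsg or ~dfsg repack suffix. A minimal sketch of that behaviour follows. The function name upstream_version and the sample version strings are illustrative assumptions, not part of the commit; only the sed expressions are taken from the script.

```shell
#!/bin/sh
# Hypothetical helper: applies the same sed expressions as debian/createmanpages
# to a version string passed as the first argument.
upstream_version() {
    printf '%s\n' "$1" | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'
}

# Sample inputs (assumed for illustration); each one prints 2.14.2
upstream_version '2.14.2-1'
upstream_version '1:2.14.2+dfsg-3'
upstream_version '2.14.2~dfsg-1'
```

Because 's/-.*//' cuts at the first hyphen, an upstream version that itself contains a hyphen (for example 1.0-rc1-2) would be reduced to 1.0. That is harmless for the version strings shown in this commit but worth keeping in mind if the script is reused elsewhere.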


=====================================
debian/man/vsearch.1
=====================================
@@ -0,0 +1,1114 @@
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.12.
+.TH VSEARCH "1" "June 2020" "vsearch 2.14.2" "User Commands"
+.SH NAME
+vsearch \- tool for processing metagenomic sequences
+.SH SYNOPSIS
+.B vsearch
+[\fI\,OPTIONS\/\fR]
+.SH DESCRIPTION
+vsearch v2.14.2_linux_x86_64, 7.5GB RAM, 8 cores
+https://github.com/torognes/vsearch
+.PP
+Rognes T, Flouri T, Nichols B, Quince C, Mahe F (2016)
+VSEARCH: a versatile open source tool for metagenomics
+PeerJ 4:e2584 doi: 10.7717/peerj.2584 https://doi.org/10.7717/peerj.2584
+.PP
+General options
+.TP
+\fB\-\-bzip2_decompress\fR
+decompress input with bzip2 (required if pipe)
+.TP
+\fB\-\-fasta_width\fR INT
+width of FASTA seq lines, 0 for no wrap (80)
+.TP
+\fB\-\-gzip_decompress\fR
+decompress input with gzip (required if pipe)
+.TP
+\fB\-\-help\fR | \fB\-h\fR
+display help information
+.TP
+\fB\-\-log\fR FILENAME
+write messages, timing and memory info to file
+.TP
+\fB\-\-maxseqlength\fR INT
+maximum sequence length (50000)
+.TP
+\fB\-\-minseqlength\fR INT
+min seq length (clust/derep/search: 32, other:1)
+.TP
+\fB\-\-no_progress\fR
+do not show progress indicator
+.TP
+\fB\-\-notrunclabels\fR
+do not truncate labels at first space
+.TP
+\fB\-\-quiet\fR
+output just warnings and fatal errors to stderr
+.TP
+\fB\-\-threads\fR INT
+number of threads to use, zero for all cores (0)
+.TP
+\fB\-\-version\fR | \fB\-v\fR
+display version information
+.PP
+Chimera detection
+.TP
+\fB\-\-uchime_denovo\fR FILENAME
+detect chimeras de novo
+.TP
+\fB\-\-uchime2_denovo\fR FILENAME
+detect chimeras de novo in denoised amplicons
+.TP
+\fB\-\-uchime3_denovo\fR FILENAME
+detect chimeras de novo in denoised amplicons
+.TP
+\fB\-\-uchime_ref\fR FILENAME
+detect chimeras using a reference database
+.IP
+Data
+.TP
+\fB\-\-db\fR FILENAME
+reference database for \fB\-\-uchime_ref\fR
+.IP
+Parameters
+.TP
+\fB\-\-abskew\fR REAL
+minimum abundance ratio (2.0, 16.0 for uchime3)
+.TP
+\fB\-\-dn\fR REAL
+\&'no' vote pseudo\-count (1.4)
+.TP
+\fB\-\-mindiffs\fR INT
+minimum number of differences in segment (3) *
+.TP
+\fB\-\-mindiv\fR REAL
+minimum divergence from closest parent (0.8) *
+.TP
+\fB\-\-minh\fR REAL
+minimum score (0.28) * ignored in uchime2/3
+.TP
+\fB\-\-sizein\fR
+propagate abundance annotation from input
+.TP
+\fB\-\-self\fR
+exclude identical labels for \fB\-\-uchime_ref\fR
+.TP
+\fB\-\-selfid\fR
+exclude identical sequences for \fB\-\-uchime_ref\fR
+.TP
+\fB\-\-xn\fR REAL
+\&'no' vote weight (8.0)
+.IP
+Output
+.TP
+\fB\-\-alignwidth\fR INT
+width of alignment in uchimealn output (80)
+.TP
+\fB\-\-borderline\fR FILENAME
+output borderline chimeric sequences to file
+.TP
+\fB\-\-chimeras\fR FILENAME
+output chimeric sequences to file
+.TP
+\fB\-\-fasta_score\fR
+include chimera score in fasta output
+.TP
+\fB\-\-nonchimeras\fR FILENAME
+output non\-chimeric sequences to file
+.TP
+\fB\-\-relabel\fR STRING
+relabel nonchimeras with this prefix string
+.TP
+\fB\-\-relabel_keep\fR
+keep the old label after the new when relabelling
+.TP
+\fB\-\-relabel_md5\fR
+relabel with md5 digest of normalized sequence
+.TP
+\fB\-\-relabel_self\fR
+relabel with the sequence itself as label
+.TP
+\fB\-\-relabel_sha1\fR
+relabel with sha1 digest of normalized sequence
+.TP
+\fB\-\-sizeout\fR
+include abundance information when relabelling
+.TP
+\fB\-\-uchimealns\fR FILENAME
+output chimera alignments to file
+.TP
+\fB\-\-uchimeout\fR FILENAME
+output to chimera info to tab\-separated file
+.TP
+\fB\-\-uchimeout5\fR
+make output compatible with uchime version 5
+.TP
+\fB\-\-xsize\fR
+strip abundance information in output
+.PP
+Clustering
+.TP
+\fB\-\-cluster_fast\fR FILENAME
+cluster sequences after sorting by length
+.TP
+\fB\-\-cluster_size\fR FILENAME
+cluster sequences after sorting by abundance
+.HP
+\fB\-\-cluster_smallmem\fR FILENAME cluster already sorted sequences (see \fB\-usersort\fR)
+.TP
+\fB\-\-cluster_unoise\fR FILENAME
+denoise Illumina amplicon reads
+.IP
+Parameters (most searching options also apply)
+.TP
+\fB\-\-cons_truncate\fR
+do not ignore terminal gaps in MSA for consensus
+.TP
+\fB\-\-id\fR REAL
+reject if identity lower, accepted values: 0\-1.0
+.TP
+\fB\-\-iddef\fR INT
+id definition, 0\-4=CD\-HIT,all,int,MBL,BLAST (2)
+.TP
+\fB\-\-qmask\fR none|dust|soft
+mask seqs with dust, soft or no method (dust)
+.TP
+\fB\-\-sizein\fR
+propagate abundance annotation from input
+.TP
+\fB\-\-strand\fR plus|both
+cluster using plus or both strands (plus)
+.TP
+\fB\-\-usersort\fR
+indicate sequences not pre\-sorted by length
+.TP
+\fB\-\-minsize\fR INT
+minimum abundance (unoise only) (8)
+.TP
+\fB\-\-unoise_alpha\fR REAL
+alpha parameter (unoise only) (2.0)
+.IP
+Output
+.TP
+\fB\-\-biomout\fR FILENAME
+filename for OTU table output in biom 1.0 format
+.TP
+\fB\-\-centroids\fR FILENAME
+output centroid sequences to FASTA file
+.TP
+\fB\-\-clusterout_id\fR
+add cluster id info to consout and profile files
+.TP
+\fB\-\-clusterout_sort\fR
+order msaout, consout, profile by decr abundance
+.TP
+\fB\-\-clusters\fR STRING
+output each cluster to a separate FASTA file
+.TP
+\fB\-\-consout\fR FILENAME
+output cluster consensus sequences to FASTA file
+.TP
+\fB\-\-mothur_shared_out\fR FN
+filename for OTU table output in mothur format
+.TP
+\fB\-\-msaout\fR FILENAME
+output multiple seq. alignments to FASTA file
+.TP
+\fB\-\-otutabout\fR FILENAME
+filename for OTU table output in classic format
+.TP
+\fB\-\-profile\fR FILENAME
+output sequence profile of each cluster to file
+.TP
+\fB\-\-relabel\fR STRING
+relabel centroids with this prefix string
+.TP
+\fB\-\-relabel_keep\fR
+keep the old label after the new when relabelling
+.TP
+\fB\-\-relabel_md5\fR
+relabel with md5 digest of normalized sequence
+.TP
+\fB\-\-relabel_self\fR
+relabel with the sequence itself as label
+.TP
+\fB\-\-relabel_sha1\fR
+relabel with sha1 digest of normalized sequence
+.TP
+\fB\-\-sizeorder\fR
+sort accepted centroids by abundance, AGC
+.TP
+\fB\-\-sizeout\fR
+write cluster abundances to centroid file
+.TP
+\fB\-\-uc\fR FILENAME
+specify filename for UCLUST\-like output
+.TP
+\fB\-\-xsize\fR
+strip abundance information in output
+.PP
+Convert SFF to FASTQ
+.TP
+\fB\-\-sff_convert\fR FILENAME
+convert given SFF file to FASTQ format
+.IP
+Parameters
+.TP
+\fB\-\-sff_clip\fR
+clip ends of sequences as indicated in file (no)
+.TP
+\fB\-\-fastq_asciiout\fR INT
+FASTQ output quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_qmaxout\fR INT
+maximum base quality value for FASTQ output (41)
+.TP
+\fB\-\-fastq_qminout\fR INT
+minimum base quality value for FASTQ output (0)
+.IP
+Output
+.TP
+\fB\-\-fastqout\fR FILENAME
+output converted sequences to given FASTQ file
+.PP
+Dereplication and rereplication
+.HP
+\fB\-\-derep_fulllength\fR FILENAME dereplicate sequences in the given FASTA file
+.TP
+\fB\-\-derep_prefix\fR FILENAME
+dereplicate sequences in file based on prefixes
+.TP
+\fB\-\-rereplicate\fR FILENAME
+rereplicate sequences in the given FASTA file
+.IP
+Parameters
+.TP
+\fB\-\-maxuniquesize\fR INT
+maximum abundance for output from dereplication
+.TP
+\fB\-\-minuniquesize\fR INT
+minimum abundance for output from dereplication
+.TP
+\fB\-\-sizein\fR
+propagate abundance annotation from input
+.TP
+\fB\-\-strand\fR plus|both
+dereplicate plus or both strands (plus)
+.IP
+Output
+.TP
+\fB\-\-output\fR FILENAME
+output FASTA file
+.TP
+\fB\-\-relabel\fR STRING
+relabel with this prefix string
+.TP
+\fB\-\-relabel_keep\fR
+keep the old label after the new when relabelling
+.TP
+\fB\-\-relabel_md5\fR
+relabel with md5 digest of normalized sequence
+.TP
+\fB\-\-relabel_self\fR
+relabel with the sequence itself as label
+.TP
+\fB\-\-relabel_sha1\fR
+relabel with sha1 digest of normalized sequence
+.TP
+\fB\-\-sizeout\fR
+write abundance annotation to output
+.TP
+\fB\-\-topn\fR INT
+output only n most abundant sequences after derep
+.TP
+\fB\-\-uc\fR FILENAME
+filename for UCLUST\-like dereplication output
+.TP
+\fB\-\-xsize\fR
+strip abundance information in derep output
+.PP
+FASTQ format conversion
+.TP
+\fB\-\-fastq_convert\fR FILENAME
+convert between FASTQ file formats
+.IP
+Parameters
+.TP
+\fB\-\-fastq_ascii\fR INT
+FASTQ input quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_asciiout\fR INT
+FASTQ output quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_qmax\fR INT
+maximum base quality value for FASTQ input (41)
+.TP
+\fB\-\-fastq_qmaxout\fR INT
+maximum base quality value for FASTQ output (41)
+.TP
+\fB\-\-fastq_qmin\fR INT
+minimum base quality value for FASTQ input (0)
+.TP
+\fB\-\-fastq_qminout\fR INT
+minimum base quality value for FASTQ output (0)
+.IP
+Output
+.TP
+\fB\-\-fastqout\fR FILENAME
+FASTQ output filename for converted sequences
+.PP
+FASTQ format detection and quality analysis
+.TP
+\fB\-\-fastq_chars\fR FILENAME
+analyse FASTQ file for version and quality range
+.IP
+Parameters
+.TP
+\fB\-\-fastq_tail\fR INT
+min length of tails to count for fastq_chars (4)
+.PP
+FASTQ quality statistics
+.TP
+\fB\-\-fastq_stats\fR FILENAME
+report statistics on FASTQ file
+.TP
+\fB\-\-fastq_eestats\fR FILENAME
+quality score and expected error statistics
+.TP
+\fB\-\-fastq_eestats2\fR FILENAME
+expected error and length cutoff statistics
+.IP
+Parameters
+.TP
+\fB\-\-ee_cutoffs\fR REAL,...
+fastq_eestats2 expected error cutoffs (0.5,1,2)
+.TP
+\fB\-\-fastq_ascii\fR INT
+FASTQ input quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_qmax\fR INT
+maximum base quality value for FASTQ input (41)
+.TP
+\fB\-\-fastq_qmin\fR INT
+minimum base quality value for FASTQ input (0)
+.HP
+\fB\-\-length_cutoffs\fR INT,INT,INT fastq_eestats2 length (min,max,incr) (50,*,50)
+.IP
+Output
+.TP
+\fB\-\-log\fR FILENAME
+output file for fastq_stats statistics
+.TP
+\fB\-\-output\fR FILENAME
+output file for fastq_eestats(2) statistics
+.PP
+Masking (new)
+.TP
+\fB\-\-fastx_mask\fR FILENAME
+mask sequences in the given FASTA or FASTQ file
+.IP
+Parameters
+.TP
+\fB\-\-fastq_ascii\fR INT
+FASTQ input quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_qmax\fR INT
+maximum base quality value for FASTQ input (41)
+.TP
+\fB\-\-fastq_qmin\fR INT
+minimum base quality value for FASTQ input (0)
+.TP
+\fB\-\-hardmask\fR
+mask by replacing with N instead of lower case
+.TP
+\fB\-\-max_unmasked_pct\fR
+max unmasked % of sequences to keep (100.0)
+.TP
+\fB\-\-min_unmasked_pct\fR
+min unmasked % of sequences to keep (0.0)
+.TP
+\fB\-\-qmask\fR none|dust|soft
+mask seqs with dust, soft or no method (dust)
+.IP
+Output
+.TP
+\fB\-\-fastaout\fR FILENAME
+output to specified FASTA file
+.TP
+\fB\-\-fastqout\fR FILENAME
+output to specified FASTQ file
+.PP
+Masking (old)
+.TP
+\fB\-\-maskfasta\fR FILENAME
+mask sequences in the given FASTA file
+.IP
+Parameters
+.TP
+\fB\-\-hardmask\fR
+mask by replacing with N instead of lower case
+.TP
+\fB\-\-qmask\fR none|dust|soft
+mask seqs with dust, soft or no method (dust)
+.IP
+Output
+.TP
+\fB\-\-output\fR FILENAME
+output to specified FASTA file
+.PP
+Paired\-end reads joining
+.TP
+\fB\-\-fastq_join\fR FILENAME
+join paired\-end reads into one sequence with gap
+.IP
+Data
+.TP
+\fB\-\-reverse\fR FILENAME
+specify FASTQ file with reverse reads
+.TP
+\fB\-\-join_padgap\fR STRING
+sequence string used for padding (NNNNNNNN)
+.TP
+\fB\-\-join_padgapq\fR STRING
+quality string used for padding (IIIIIIII)
+.IP
+Output
+.TP
+\fB\-\-fastaout\fR FILENAME
+FASTA output filename for joined sequences
+.TP
+\fB\-\-fastqout\fR FILENAME
+FASTQ output filename for joined sequences
+.PP
+Paired\-end reads merging
+.HP
+\fB\-\-fastq_mergepairs\fR FILENAME merge paired\-end reads into one sequence
+.IP
+Data
+.TP
+\fB\-\-reverse\fR FILENAME
+specify FASTQ file with reverse reads
+.IP
+Parameters
+.TP
+\fB\-\-fastq_allowmergestagger\fR
+allow merging of staggered reads
+.TP
+\fB\-\-fastq_ascii\fR INT
+FASTQ input quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_maxdiffpct\fR REAL
+maximum percentage diff. bases in overlap (100.0)
+.TP
+\fB\-\-fastq_maxdiffs\fR INT
+maximum number of different bases in overlap (10)
+.TP
+\fB\-\-fastq_maxee\fR REAL
+maximum expected error value for merged sequence
+.TP
+\fB\-\-fastq_maxmergelen\fR
+maximum length of entire merged sequence
+.TP
+\fB\-\-fastq_maxns\fR INT
+maximum number of N's
+.TP
+\fB\-\-fastq_minlen\fR INT
+minimum input read length after truncation (1)
+.TP
+\fB\-\-fastq_minmergelen\fR
+minimum length of entire merged sequence
+.TP
+\fB\-\-fastq_minovlen\fR
+minimum length of overlap between reads (10)
+.TP
+\fB\-\-fastq_nostagger\fR
+disallow merging of staggered reads (default)
+.TP
+\fB\-\-fastq_qmax\fR INT
+maximum base quality value for FASTQ input (41)
+.TP
+\fB\-\-fastq_qmaxout\fR INT
+maximum base quality value for FASTQ output (41)
+.TP
+\fB\-\-fastq_qmin\fR INT
+minimum base quality value for FASTQ input (0)
+.TP
+\fB\-\-fastq_qminout\fR INT
+minimum base quality value for FASTQ output (0)
+.TP
+\fB\-\-fastq_truncqual\fR INT
+base quality value for truncation
+.IP
+Output
+.TP
+\fB\-\-eetabbedout\fR FILENAME
+output error statistics to specified file
+.TP
+\fB\-\-fastaout\fR FILENAME
+FASTA output filename for merged sequences
+.HP
+\fB\-\-fastaout_notmerged_fwd\fR FN FASTA filename for non\-merged forward sequences
+.HP
+\fB\-\-fastaout_notmerged_rev\fR FN FASTA filename for non\-merged reverse sequences
+.TP
+\fB\-\-fastq_eeout\fR
+include expected errors (ee) in FASTQ output
+.TP
+\fB\-\-fastqout\fR FILENAME
+FASTQ output filename for merged sequences
+.HP
+\fB\-\-fastqout_notmerged_fwd\fR FN FASTQ filename for non\-merged forward sequences
+.HP
+\fB\-\-fastqout_notmerged_rev\fR FN FASTQ filename for non\-merged reverse sequences
+.TP
+\fB\-\-label_suffix\fR
+suffix to append to label of merged sequences
+.TP
+\fB\-\-xee\fR
+remove expected errors (ee) info from output
+.PP
+Pairwise alignment
+.TP
+\fB\-\-allpairs_global\fR FILENAME
+perform global alignment of all sequence pairs
+.IP
+Output (most searching options also apply)
+.TP
+\fB\-\-alnout\fR FILENAME
+filename for human\-readable alignment output
+.TP
+\fB\-\-acceptall\fR
+output all pairwise alignments
+.PP
+Restriction site cutting
+.TP
+\fB\-\-cut\fR FILENAME
+filename of FASTA formatted input sequences
+.IP
+Parameters
+.TP
+\fB\-\-cut_pattern\fR STRING
+pattern to match with ^ and _ at cut sites
+.IP
+Output
+.TP
+\fB\-\-fastaout\fR FILENAME
+FASTA filename for fragments on forward strand
+.TP
+\fB\-\-fastaout_rev\fR FILENAME
+FASTA filename for fragments on reverse strand
+.TP
+\fB\-\-fastaout_discarded\fR FN
+FASTA filename for non\-matching sequences
+.HP
+\fB\-\-fastaout_discarded_rev\fR FN FASTA filename for non\-matching, reverse compl.
+.PP
+Reverse complementation
+.TP
+\fB\-\-fastx_revcomp\fR FILENAME
+reverse\-complement seqs in FASTA or FASTQ file
+.IP
+Parameters
+.TP
+\fB\-\-fastq_ascii\fR INT
+FASTQ input quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_qmax\fR INT
+maximum base quality value for FASTQ input (41)
+.TP
+\fB\-\-fastq_qmin\fR INT
+minimum base quality value for FASTQ input (0)
+.IP
+Output
+.TP
+\fB\-\-fastaout\fR FILENAME
+FASTA output filename
+.TP
+\fB\-\-fastqout\fR FILENAME
+FASTQ output filename
+.TP
+\fB\-\-label_suffix\fR STRING
+label to append to identifier in the output
+.PP
+Searching
+.TP
+\fB\-\-search_exact\fR FILENAME
+filename of queries for exact match search
+.TP
+\fB\-\-usearch_global\fR FILENAME
+filename of queries for global alignment search
+.IP
+Data
+.TP
+\fB\-\-db\fR FILENAME
+name of UDB or FASTA database for search
+.IP
+Parameters
+.TP
+\fB\-\-dbmask\fR none|dust|soft
+mask db with dust, soft or no method (dust)
+.TP
+\fB\-\-fulldp\fR
+full dynamic programming alignment (always on)
+.TP
+\fB\-\-gapext\fR STRING
+penalties for gap extension (2I/1E)
+.TP
+\fB\-\-gapopen\fR STRING
+penalties for gap opening (20I/2E)
+.TP
+\fB\-\-hardmask\fR
+mask by replacing with N instead of lower case
+.TP
+\fB\-\-id\fR REAL
+reject if identity lower
+.TP
+\fB\-\-iddef\fR INT
+id definition, 0\-4=CD\-HIT,all,int,MBL,BLAST (2)
+.TP
+\fB\-\-idprefix\fR INT
+reject if first n nucleotides do not match
+.TP
+\fB\-\-idsuffix\fR INT
+reject if last n nucleotides do not match
+.TP
+\fB\-\-leftjust\fR
+reject if terminal gaps at alignment left end
+.TP
+\fB\-\-match\fR INT
+score for match (2)
+.TP
+\fB\-\-maxaccepts\fR INT
+number of hits to accept and show per strand (1)
+.TP
+\fB\-\-maxdiffs\fR INT
+reject if more substitutions or indels
+.TP
+\fB\-\-maxgaps\fR INT
+reject if more indels
+.TP
+\fB\-\-maxhits\fR INT
+maximum number of hits to show (unlimited)
+.TP
+\fB\-\-maxid\fR REAL
+reject if identity higher
+.TP
+\fB\-\-maxqsize\fR INT
+reject if query abundance larger
+.TP
+\fB\-\-maxqt\fR REAL
+reject if query/target length ratio higher
+.TP
+\fB\-\-maxrejects\fR INT
+number of non\-matching hits to consider (32)
+.TP
+\fB\-\-maxsizeratio\fR REAL
+reject if query/target abundance ratio higher
+.TP
+\fB\-\-maxsl\fR REAL
+reject if shorter/longer length ratio higher
+.TP
+\fB\-\-maxsubs\fR INT
+reject if more substitutions
+.TP
+\fB\-\-mid\fR REAL
+reject if percent identity lower, ignoring gaps
+.TP
+\fB\-\-mincols\fR INT
+reject if alignment length shorter
+.TP
+\fB\-\-minqt\fR REAL
+reject if query/target length ratio lower
+.TP
+\fB\-\-minsizeratio\fR REAL
+reject if query/target abundance ratio lower
+.TP
+\fB\-\-minsl\fR REAL
+reject if shorter/longer length ratio lower
+.TP
+\fB\-\-mintsize\fR INT
+reject if target abundance lower
+.TP
+\fB\-\-minwordmatches\fR INT
+minimum number of word matches required (12)
+.TP
+\fB\-\-mismatch\fR INT
+score for mismatch (\fB\-4\fR)
+.TP
+\fB\-\-pattern\fR STRING
+option is ignored
+.TP
+\fB\-\-qmask\fR none|dust|soft
+mask query with dust, soft or no method (dust)
+.TP
+\fB\-\-query_cov\fR REAL
+reject if fraction of query seq. aligned lower
+.TP
+\fB\-\-rightjust\fR
+reject if terminal gaps at alignment right end
+.TP
+\fB\-\-sizein\fR
+propagate abundance annotation from input
+.TP
+\fB\-\-self\fR
+reject if labels identical
+.TP
+\fB\-\-selfid\fR
+reject if sequences identical
+.TP
+\fB\-\-slots\fR INT
+option is ignored
+.TP
+\fB\-\-strand\fR plus|both
+search plus or both strands (plus)
+.TP
+\fB\-\-target_cov\fR REAL
+reject if fraction of target seq. aligned lower
+.TP
+\fB\-\-weak_id\fR REAL
+include aligned hits with >= id; continue search
+.TP
+\fB\-\-wordlength\fR INT
+length of words for database index 3\-15 (8)
+.IP
+Output
+.TP
+\fB\-\-alnout\fR FILENAME
+filename for human\-readable alignment output
+.TP
+\fB\-\-biomout\fR FILENAME
+filename for OTU table output in biom 1.0 format
+.TP
+\fB\-\-blast6out\fR FILENAME
+filename for blast\-like tab\-separated output
+.TP
+\fB\-\-dbmatched\fR FILENAME
+FASTA file for matching database sequences
+.TP
+\fB\-\-dbnotmatched\fR FILENAME
+FASTA file for non\-matching database sequences
+.TP
+\fB\-\-fastapairs\fR FILENAME
+FASTA file with pairs of query and target
+.TP
+\fB\-\-matched\fR FILENAME
+FASTA file for matching query sequences
+.TP
+\fB\-\-mothur_shared_out\fR FN
+filename for OTU table output in mothur format
+.TP
+\fB\-\-notmatched\fR FILENAME
+FASTA file for non\-matching query sequences
+.TP
+\fB\-\-otutabout\fR FILENAME
+filename for OTU table output in classic format
+.TP
+\fB\-\-output_no_hits\fR
+output non\-matching queries to output files
+.TP
+\fB\-\-rowlen\fR INT
+width of alignment lines in alnout output (64)
+.TP
+\fB\-\-samheader\fR
+include a header in the SAM output file
+.TP
+\fB\-\-samout\fR FILENAME
+filename for SAM format output
+.TP
+\fB\-\-sizeout\fR
+write abundance annotation to dbmatched file
+.TP
+\fB\-\-top_hits_only\fR
+output only hits with identity equal to the best
+.TP
+\fB\-\-uc\fR FILENAME
+filename for UCLUST\-like output
+.TP
+\fB\-\-uc_allhits\fR
+show all, not just top hit with uc output
+.TP
+\fB\-\-userfields\fR STRING
+fields to output in userout file
+.TP
+\fB\-\-userout\fR FILENAME
+filename for user\-defined tab\-separated output
+.PP
+Shuffling and sorting
+.TP
+\fB\-\-shuffle\fR FILENAME
+shuffle order of sequences in FASTA file randomly
+.TP
+\fB\-\-sortbylength\fR FILENAME
+sort sequences by length in given FASTA file
+.TP
+\fB\-\-sortbysize\fR FILENAME
+abundance sort sequences in given FASTA file
+.IP
+Parameters
+.TP
+\fB\-\-maxsize\fR INT
+maximum abundance for sortbysize
+.TP
+\fB\-\-minsize\fR INT
+minimum abundance for sortbysize
+.TP
+\fB\-\-randseed\fR INT
+seed for PRNG, zero to use random data source (0)
+.TP
+\fB\-\-sizein\fR
+propagate abundance annotation from input
+.IP
+Output
+.TP
+\fB\-\-output\fR FILENAME
+output to specified FASTA file
+.TP
+\fB\-\-relabel\fR STRING
+relabel sequences with this prefix string
+.TP
+\fB\-\-relabel_keep\fR
+keep the old label after the new when relabelling
+.TP
+\fB\-\-relabel_md5\fR
+relabel with md5 digest of normalized sequence
+.TP
+\fB\-\-relabel_self\fR
+relabel with the sequence itself as label
+.TP
+\fB\-\-relabel_sha1\fR
+relabel with sha1 digest of normalized sequence
+.TP
+\fB\-\-sizeout\fR
+include abundance information when relabelling
+.TP
+\fB\-\-topn\fR INT
+output just first n sequences
+.TP
+\fB\-\-xsize\fR
+strip abundance information in output
+.PP
+Subsampling
+.TP
+\fB\-\-fastx_subsample\fR FILENAME
+subsample sequences from given FASTA/FASTQ file
+.IP
+Parameters
+.TP
+\fB\-\-fastq_ascii\fR INT
+FASTQ input quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_qmax\fR INT
+maximum base quality value for FASTQ input (41)
+.TP
+\fB\-\-fastq_qmin\fR INT
+minimum base quality value for FASTQ input (0)
+.TP
+\fB\-\-randseed\fR INT
+seed for PRNG, zero to use random data source (0)
+.TP
+\fB\-\-sample_pct\fR REAL
+sampling percentage between 0.0 and 100.0
+.TP
+\fB\-\-sample_size\fR INT
+sampling size
+.TP
+\fB\-\-sizein\fR
+consider abundance info from input, do not ignore
+.IP
+Output
+.TP
+\fB\-\-fastaout\fR FILENAME
+output subsampled sequences to FASTA file
+.TP
+\fB\-\-fastaout_discarded\fR FILE
+output non\-subsampled sequences to FASTA file
+.TP
+\fB\-\-fastqout\fR FILENAME
+output subsampled sequences to FASTQ file
+.TP
+\fB\-\-fastqout_discarded\fR
+output non\-subsampled sequences to FASTQ file
+.TP
+\fB\-\-relabel\fR STRING
+relabel sequences with this prefix string
+.TP
+\fB\-\-relabel_keep\fR
+keep the old label after the new when relabelling
+.TP
+\fB\-\-relabel_md5\fR
+relabel with md5 digest of normalized sequence
+.TP
+\fB\-\-relabel_self\fR
+relabel with the sequence itself as label
+.TP
+\fB\-\-relabel_sha1\fR
+relabel with sha1 digest of normalized sequence
+.TP
+\fB\-\-sizeout\fR
+update abundance information in output
+.TP
+\fB\-\-xsize\fR
+strip abundance information in output
+.PP
+Taxonomic classification
+.TP
+\fB\-\-sintax\fR FILENAME
+classify sequences in given FASTA/FASTQ file
+.IP
+Parameters
+.TP
+\fB\-\-db\fR FILENAME
+taxonomic reference db in given FASTA or UDB file
+.TP
+\fB\-\-sintax_cutoff\fR REAL
+confidence value cutoff level (0.0)
+.IP
+Output
+.TP
+\fB\-\-tabbedout\fR FILENAME
+write results to given tab\-delimited file
+.PP
+Trimming and filtering
+.TP
+\fB\-\-fastx_filter\fR FILENAME
+trim and filter sequences in FASTA/FASTQ file
+.TP
+\fB\-\-fastq_filter\fR FILENAME
+trim and filter sequences in FASTQ file
+.TP
+\fB\-\-reverse\fR FILENAME
+FASTQ file with other end of paired\-end reads
+.IP
+Parameters
+.TP
+\fB\-\-fastq_ascii\fR INT
+FASTQ input quality score ASCII base char (33)
+.TP
+\fB\-\-fastq_maxee\fR REAL
+discard if expected error value is higher
+.TP
+\fB\-\-fastq_maxee_rate\fR REAL
+discard if expected error rate is higher
+.TP
+\fB\-\-fastq_maxlen\fR INT
+discard if length of sequence is longer
+.TP
+\fB\-\-fastq_maxns\fR INT
+discard if number of N's is higher
+.TP
+\fB\-\-fastq_minlen\fR INT
+discard if length of sequence is shorter
+.TP
+\fB\-\-fastq_qmax\fR INT
+maximum base quality value for FASTQ input (41)
+.TP
+\fB\-\-fastq_qmin\fR INT
+minimum base quality value for FASTQ input (0)
+.TP
+\fB\-\-fastq_stripleft\fR INT
+delete given number of bases from the 5' end
+.TP
+\fB\-\-fastq_stripright\fR INT
+delete given number of bases from the 3' end
+.TP
+\fB\-\-fastq_truncee\fR REAL
+truncate to given maximum expected error
+.TP
+\fB\-\-fastq_trunclen\fR INT
+truncate to given length (discard if shorter)
+.TP
+\fB\-\-fastq_trunclen_keep\fR INT
+truncate to given length (keep if shorter)
+.TP
+\fB\-\-fastq_truncqual\fR INT
+truncate to given minimum base quality
+.TP
+\fB\-\-maxsize\fR INT
+discard if abundance of sequence is above
+.TP
+\fB\-\-minsize\fR INT
+discard if abundance of sequence is below
+.IP
+Output
+.TP
+\fB\-\-eeout\fR
+include expected errors in output
+.TP
+\fB\-\-fastaout\fR FN
+FASTA filename for passed sequences
+.TP
+\fB\-\-fastaout_discarded\fR FN
+FASTA filename for discarded sequences
+.HP
+\fB\-\-fastaout_discarded_rev\fR FN FASTA filename for discarded reverse sequences
+.TP
+\fB\-\-fastaout_rev\fR FN
+FASTA filename for passed reverse sequences
+.TP
+\fB\-\-fastqout\fR FN
+FASTQ filename for passed sequences
+.TP
+\fB\-\-fastqout_discarded\fR FN
+FASTQ filename for discarded sequences
+.HP
+\fB\-\-fastqout_discarded_rev\fR FN FASTQ filename for discarded reverse sequences
+.TP
+\fB\-\-fastqout_rev\fR FN
+FASTQ filename for passed reverse sequences
+.TP
+\fB\-\-relabel\fR STRING
+relabel filtered sequences with given prefix
+.TP
+\fB\-\-relabel_keep\fR
+keep the old label after the new when relabelling
+.TP
+\fB\-\-relabel_md5\fR
+relabel filtered sequences with md5 digest
+.TP
+\fB\-\-relabel_self\fR
+relabel with the sequence itself as label
+.TP
+\fB\-\-relabel_sha1\fR
+relabel filtered sequences with sha1 digest
+.TP
+\fB\-\-sizeout\fR
+include abundance information when relabelling
+.TP
+\fB\-\-xee\fR
+remove expected errors (ee) info from output
+.TP
+\fB\-\-xsize\fR
+strip abundance information in output
+.PP
+UDB files
+.TP
+\fB\-\-makeudb_usearch\fR FILENAME
+make UDB file from given FASTA file
+.TP
+\fB\-\-udb2fasta\fR FILENAME
+output FASTA file from given UDB file
+.TP
+\fB\-\-udbinfo\fR FILENAME
+show information about UDB file
+.TP
+\fB\-\-udbstats\fR FILENAME
+report statistics about indexed words in UDB file
+.IP
+Parameters
+.TP
+\fB\-\-dbmask\fR none|dust|soft
+mask db with dust, soft or no method (dust)
+.TP
+\fB\-\-hardmask\fR
+mask by replacing with N instead of lower case
+.TP
+\fB\-\-wordlength\fR INT
+length of words for database index 3\-15 (8)
+.IP
+Output
+.TP
+\fB\-\-output\fR FILENAME
+UDB or FASTA output file
+.SH AUTHOR
+ This manpage was written by Nilesh Patra for the Debian distribution and
+ can be used for any other usage of the program.


=====================================
debian/rules
=====================================
@@ -27,6 +27,8 @@ export DEB_LDFLAGS_MAINT_APPEND = -flto
 override_dh_auto_build:
 	dh_auto_build
 	markdown README.md > README.html
+	# Remove redundant travis-build URL
+	sed -i 1d README.html
 
 VDATADIR=/usr/share/vsearch/data
 ifeq (,$(findstring nocheck,$(DEB_BUILD_OPTIONS)))
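
Note on the new sed call in debian/rules: "sed -i 1d" deletes the first line of README.html in place, which per the comment is where the travis-build URL ends up after the markdown conversion. A small sketch of the effect on a throwaway file, assuming GNU sed (for -i without a suffix). The file contents below are placeholders, not the real README.html.

```shell
#!/bin/sh
set -e
# Build a three-line stand-in for README.html (placeholder content).
tmp=$(mktemp)
printf '%s\n' '<p>badge line</p>' '<h1>Title</h1>' '<p>Body</p>' > "$tmp"

# Same command as in debian/rules, applied to the stand-in file.
sed -i 1d "$tmp"

result=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

The deletion is purely positional: if upstream ever moves or removes the badge line in README.md, the same command would drop whatever line happens to come first in the generated HTML.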


=====================================
debian/tests/README
=====================================
@@ -0,0 +1,5 @@
+Tests for vsearch
+=================
+
+The data for tests has been referenced from:
+	https://github.com/torognes/vsearch-data


=====================================
debian/_tests/control → debian/tests/control
=====================================
@@ -1,3 +1,4 @@
 Tests: run-unit-test
-Depends: @, vsearch-data
+Depends: @
 Restrictions: allow-stderr
+


=====================================
debian/tests/data/BioMarKs50k.fsa
=====================================
The diff for this file was not included because it is too large.

=====================================
debian/tests/data/query.fsa
=====================================
@@ -0,0 +1,3 @@
+>60caa38f93eb4a7ef8c0fa4d96a5a5f8;size=24
+agctccaatagcgtatattaaaattgttgcggttaaaacgctcgtagttggatatctgctaaggggttccggtccttcccagtgaagaatacgcggaactcttcttggcatttattcagggaaggtgtttgcactttgttgtgtgtcacatgatctgaatttttactttgaggaaatgagagtgtttcaagcaggctttcgccgtgaatatgatagcatggaataatagcacaggacccctttccaaagctgttggttttttggaacgaggtaatcagaataaggatagttgggggtattcgtatttaactgtcagaggtgaaattcttggattttttaaagacgaactattgcgaaggcatctgcccaggatgttttta
+


=====================================
debian/tests/expected-output/test1-expected.out
=====================================
The diff for this file was not included because it is too large.

=====================================
debian/tests/expected-output/test2-expected.out
=====================================
@@ -0,0 +1,35 @@
+vsearch --usearch_global query.fsa --db BioMarKs50k.fsa --id 0.9 --alnout test2.out
+vsearch v2.14.2_linux_x86_64, 7.5GB RAM, 8 cores
+
+Query >60caa38f93eb4a7ef8c0fa4d96a5a5f8;size=24
+ %Id   TLen  Target
+100%    380  60caa38f93eb4a7ef8c0fa4d96a5a5f8;size=24
+
+ Query 380nt >60caa38f93eb4a7ef8c0fa4d96a5a5f8;size=24
+Target 380nt >60caa38f93eb4a7ef8c0fa4d96a5a5f8;size=24
+
+Qry   1 + AGCTCCAATAGCGTATATTAAAATTGTTGCGGTTAAAACGCTCGTAGTTGGATATCTGCTAAGG 64
+          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+Tgt   1 + AGCTCCAATAGCGTATATTAAAATTGTTGCGGTTAAAACGCTCGTAGTTGGATATCTGCTAAGG 64
+
+Qry  65 + GGTTCCGGTCCTTCCCAGTGAAGAATACGCGGAACTCTTCTTGGCATTTATTCAGGGAAGGTGT 128
+          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+Tgt  65 + GGTTCCGGTCCTTCCCAGTGAAGAATACGCGGAACTCTTCTTGGCATTTATTCAGGGAAGGTGT 128
+
+Qry 129 + TTGCACTTTGTTGTGTGTCACATGATCTGAATTTTTACTTTGAGGAAATGAGAGTGTTTCAAGC 192
+          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+Tgt 129 + TTGCACTTTGTTGTGTGTCACATGATCTGAATTTTTACTTTGAGGAAATGAGAGTGTTTCAAGC 192
+
+Qry 193 + AGGCTTTCGCCGTGAATATGATAGCATGGAATAATAGCACAGGACCCCTTTCCAAAGCTGTTGG 256
+          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+Tgt 193 + AGGCTTTCGCCGTGAATATGATAGCATGGAATAATAGCACAGGACCCCTTTCCAAAGCTGTTGG 256
+
+Qry 257 + TTTTTTGGAACGAGGTAATCAGAATAAGGATAGTTGGGGGTATTCGTATTTAACTGTCAGAGGT 320
+          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+Tgt 257 + TTTTTTGGAACGAGGTAATCAGAATAAGGATAGTTGGGGGTATTCGTATTTAACTGTCAGAGGT 320
+
+Qry 321 + GAAATTCTTGGATTTTTTAAAGACGAACTATTGCGAAGGCATCTGCCCAGGATGTTTTTA 380
+          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
+Tgt 321 + GAAATTCTTGGATTTTTTAAAGACGAACTATTGCGAAGGCATCTGCCCAGGATGTTTTTA 380
+
+380 cols, 380 ids (100.0%), 0 gaps (0.0%)


=====================================
debian/tests/run-unit-test
=====================================
@@ -0,0 +1,26 @@
+#!/bin/bash
+set -e
+
+pkg=vsearch
+
+if [ "${AUTOPKGTEST_TMP}" = "" ] ; then
+  AUTOPKGTEST_TMP=$(mktemp -d /tmp/${pkg}-test.XXXXXX)
+  trap "rm -rf ${AUTOPKGTEST_TMP}" 0 INT QUIT ABRT PIPE TERM
+fi
+
+cp /usr/share/doc/${pkg}-examples/* -a "${AUTOPKGTEST_TMP}"
+
+cd "${AUTOPKGTEST_TMP}"
+gunzip -r *
+
+echo 'Test 1'
+vsearch --cluster_fast BioMarKs50k.fsa --id 0.97 --centroids test1.out
+diff -u test1.out test1-expected.out
+echo 'PASS'
+echo
+
+echo 'Test 2'
+vsearch --usearch_global query.fsa --db BioMarKs50k.fsa --id 0.9 --alnout test2.out
+diff -u <(tail -n +3 test2.out) <(tail -n +3 test2-expected.out)
+echo 'PASS'
+


=====================================
debian/vsearch-examples.docs
=====================================
@@ -0,0 +1,2 @@
+debian/tests/data/*
+debian/tests/expected-output/*


=====================================
debian/vsearch.install
=====================================
@@ -0,0 +1 @@
+bin/vsearch usr/bin


=====================================
debian/vsearch.manpages
=====================================
@@ -1 +1 @@
-man/*.1
+debian/man/*.1



View it on GitLab: https://salsa.debian.org/med-team/vsearch/-/compare/66e9bd7296847c305fe827682ce0a4d23765456a...8f3acaa9b461f159f966b13e25804dcd45f2a292

-- 
You're receiving this email because of your account on salsa.debian.org.

