[med-svn] [Git][med-team/vsearch][upstream] New upstream version 2.9.0

Steffen Möller gitlab at salsa.debian.org
Mon Oct 15 18:40:40 BST 2018


Steffen Möller pushed to branch upstream at Debian Med / vsearch


Commits:
1920c746 by Steffen Moeller at 2018-10-15T17:26:13Z
New upstream version 2.9.0
- - - - -


9 changed files:

- README.md
- configure.ac
- man/vsearch.1
- src/Makefile.am
- src/derep.cc
- + src/fastqjoin.cc
- + src/fastqjoin.h
- src/vsearch.cc
- src/vsearch.h


Changes:

=====================================
README.md
=====================================
@@ -24,7 +24,7 @@ Most of the nucleotide based commands and options in USEARCH version 7 are suppo
 
 ## Getting Help
 
-If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.8.5/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
+If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.9.0/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
 
 ## Example
 
@@ -37,9 +37,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
 **Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:
 
 ```
-wget https://github.com/torognes/vsearch/archive/v2.8.5.tar.gz
-tar xzf v2.8.5.tar.gz
-cd vsearch-2.8.5
+wget https://github.com/torognes/vsearch/archive/v2.9.0.tar.gz
+tar xzf v2.9.0.tar.gz
+cd vsearch-2.9.0
 ./autogen.sh
 ./configure
 make
@@ -70,36 +70,36 @@ Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (v
 Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.8.5/vsearch-2.8.5-linux-x86_64.tar.gz
-tar xzf vsearch-2.8.5-linux-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.9.0/vsearch-2.9.0-linux-x86_64.tar.gz
+tar xzf vsearch-2.9.0-linux-x86_64.tar.gz
 ```
 
 Or these commands if you are using a Linux ppc64le system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.8.5/vsearch-2.8.5-linux-ppc64le.tar.gz
-tar xzf vsearch-2.8.5-linux-ppc64le.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.9.0/vsearch-2.9.0-linux-ppc64le.tar.gz
+tar xzf vsearch-2.9.0-linux-ppc64le.tar.gz
 ```
 
 Or these commands if you are using a Mac:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.8.5/vsearch-2.8.5-macos-x86_64.tar.gz
-tar xzf vsearch-2.8.5-macos-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.9.0/vsearch-2.9.0-macos-x86_64.tar.gz
+tar xzf vsearch-2.9.0-macos-x86_64.tar.gz
 ```
 
 Or if you are using Windows, download and extract (unzip) the contents of this file:
 
 ```
-https://github.com/torognes/vsearch/releases/download/v2.8.5/vsearch-2.8.5-win-x86_64.zip
+https://github.com/torognes/vsearch/releases/download/v2.9.0/vsearch-2.9.0-win-x86_64.zip
 ```
 
-Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.8.5-linux-x86_64` or `vsearch-2.8.5-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
+Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.9.0-linux-x86_64` or `vsearch-2.9.0-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
 
-Windows: You will now have the binary distribution in a folder called `vsearch-2.8.5-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
+Windows: You will now have the binary distribution in a folder called `vsearch-2.9.0-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
 
 
-**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.8.5/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.8.5/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
+**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.9.0/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.9.0/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
 
 
 ## Plugins, packages, and wrappers


=====================================
configure.ac
=====================================
@@ -2,7 +2,7 @@
 # Process this file with autoconf to produce a configure script.
 
 AC_PREREQ([2.63])
-AC_INIT([vsearch], [2.8.5], [torognes at ifi.uio.no])
+AC_INIT([vsearch], [2.9.0], [torognes at ifi.uio.no])
 AC_CANONICAL_TARGET
 AM_INIT_AUTOMAKE([subdir-objects])
 AC_LANG([C++])


=====================================
man/vsearch.1
=====================================
@@ -1,5 +1,5 @@
 .\" ============================================================================
-.TH vsearch 1 "September 26, 2018" "version 2.8.5" "USER COMMANDS"
+.TH vsearch 1 "October 10, 2018" "version 2.9.0" "USER COMMANDS"
 .\" ============================================================================
 .SH NAME
 vsearch \(em chimera detection, clustering, dereplication and
@@ -55,6 +55,10 @@ FASTA/FASTQ file processing:
 \-\-fastaout_discarded | \-\-fastqout | \-\-fastqout_discarded)
 \fIoutputfile\fR [\fIoptions\fR]
 .PP
+\fBvsearch\fR \-\-fastq_join \fIfastqfile\fR \-\-reverse
+\fIfastqfile\fR (\-\-fastaout | \-\-fastqout) \fIoutputfile\fR
+[\fIoptions\fR]
+.PP
 \fBvsearch\fR \-\-fastq_mergepairs \fIfastqfile\fR \-\-reverse
 \fIfastqfile\fR (\-\-fastaout | \-\-fastqout |
 \-\-fastaout_notmerged_fwd | \-\-fastaout_notmerged_rev |
@@ -1091,6 +1095,22 @@ Shorten and/or filter sequences in the given FASTQ file. Similar to
 the \-\-fastx_filter command, but works only on FASTQ files. See
 \-\-fastx_filter for details.
 .TP
+.BI \-\-fastq_join\0 filename
+Join paired-end sequence reads into one sequence and add a gap between
+them using a padding sequence. The sequences are not merged as with
+the fastq_mergepairs command, but simply joined with a gap. The
+forward reads are specified as the argument to this option and the
+reverse reads are specified with the \-\-reverse option. The resulting
+sequences consist of the forward read, the padding sequence and the
+reverse complement of the reverse read. The padding sequence is
+specified with the \-\-join_padgap option and the padding quality is
+specified with the \-\-join_padgapq option. The default padding
+sequence string is NNNNNNNN and the default padding quality string is
+IIIIIIII, corresponding to a base quality score of 40 (a very high
+quality score with error probability 0.0001). The joined sequences are
+output to the file(s) specified with the \-\-fastaout or \-\-fastqout
+options.
+.TP
 .BI \-\-fastq_maxdiffs\~ "positive integer"
 When using \-\-fastq_mergepairs, specify the maximum number of
 non-matching nucleotides allowed in the overlap region. That option
@@ -1354,6 +1374,16 @@ file specified with the \-\-fastaout and/or \-\-fastqout options. If
 the input file is in FASTA format, the output can not be written back
 to a FASTQ file due to missing base quality scores.
 .TP
+.BI \-\-join_padgap\~ string
+When running \-\-fastq_join, use the \fIstring\fR as a sequence
+padding string. The default is NNNNNNNN (8 N's).
+.TP
+.BI \-\-join_padgapq\~ string
+When running \-\-fastq_join, use the \fIstring\fR as a quality padding
+string. The default is a string of I's equal in length to the sequence
+padding string. The letter I corresponds to a base quality score of 40
+indicating a very high quality base with error probability of 0.0001.
+.TP
 .BI \-\-label_suffix\~ string
 When using \-\-fastx_revcomp or \-\-fastq_mergepairs, add the suffix
 \fIstring\fR to sequence headers.
@@ -1389,8 +1419,8 @@ Please see the description of the same option under Chimera detection
 for details.
 .TP
 .BI \-\-reverse \0filename
-When using \-\-fastq_mergepairs, specify the FASTQ file containing
-containing the reverse reads.
+When using \-\-fastq_mergepairs or \-\-fastq_join, specify the FASTQ
+file containing the reverse reads.
 .TP
 .B \-\-xsize
 Strip abundance information from the headers when writing the output
@@ -3418,6 +3448,14 @@ are in effect.
 .BR v2.8.5\~ "released September 26th, 2018"
 Fixed a bug in fastq_eestats2 that caused the values for large lengths
 to be much too high when the input sequences had varying lengths.
+.TP
+.BR v2.8.6\~ "released October 9th, 2018"
+Fixed a bug introduced in version 2.8.2 that caused derep_fulllength
+to include the full FASTA header in its output instead of stopping at
+the first space (unless the notrunclabels option is in effect).
+.TP
+.BR v2.9.0\~ "released October 10th, 2018"
+Added the fastq_join command.
 .RE
 .LP
 .\" ============================================================================
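Note on the quality figures in the man page text above: the statement that the letter I corresponds to a base quality score of 40 with error probability 0.0001 can be checked with a short calculation. The following is a minimal sketch, assuming the Phred+33 FASTQ encoding (the offset under which 'I', ASCII 73, maps to 40). The helper names `phred_score` and `error_probability` are illustrative and are not part of the vsearch sources.

```cpp
#include <cmath>

// Hypothetical helpers, not taken from the vsearch sources.

// Convert a FASTQ quality character to a Phred score.
// Assumption: the ASCII offset is 33 (Phred+33 encoding).
int phred_score(char symbol, int ascii_offset = 33)
{
  return static_cast<int>(symbol) - ascii_offset;
}

// Error probability for a Phred score Q: p = 10^(-Q/10).
double error_probability(int q)
{
  return std::pow(10.0, -q / 10.0);
}
```

With these definitions, `phred_score('I')` is 73 - 33 = 40 and `error_probability(40)` is 10^-4, which agrees with the values quoted for the default padding quality string.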


=====================================
src/Makefile.am
=====================================
@@ -30,6 +30,7 @@ dynlibs.h \
 eestats.h \
 fasta.h \
 fastq.h \
+fastqjoin.h \
 fastqops.h \
 fastx.h \
 kmerhash.h \
@@ -108,6 +109,7 @@ dynlibs.cc \
 eestats.cc \
 fasta.cc \
 fastq.cc \
+fastqjoin.cc \
 fastqops.cc \
 fastx.cc \
 kmerhash.cc \


=====================================
src/derep.cc
=====================================
@@ -298,7 +298,7 @@ void derep_fulllength()
   double median = 0.0;
   double average = 0.0;
 
-  while(fastx_next(h, 0, chrmap_no_change))
+  while(fastx_next(h, ! opt_notrunclabels, chrmap_no_change))
     {
       int64_t seqlen = fastx_get_sequence_length(h);
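Note on the derep.cc change above: going by the v2.8.6 changelog entry in the man page, the second argument to `fastx_next` appears to control whether the header is cut at the first space, and the fix passes `! opt_notrunclabels` instead of a constant 0. The effect on a label can be sketched as follows. This is an illustration under that assumption; `output_label` is a hypothetical helper and not a vsearch function.

```cpp
#include <string>

// Hypothetical helper, not from the vsearch sources.
// Return the header up to (not including) the first space, or the
// whole header when truncation is disabled (the notrunclabels case).
std::string output_label(const std::string & header, bool truncate_at_space)
{
  if (! truncate_at_space)
    return header;
  // find() returns npos when there is no space; substr(0, npos)
  // then yields the complete header.
  return header.substr(0, header.find(' '));
}
```

For a header such as `seq1 extra info`, truncation yields `seq1`, while disabling truncation keeps the full header.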
 


=====================================
src/fastqjoin.cc
=====================================
@@ -0,0 +1,245 @@
+/*
+
+  VSEARCH: a versatile open source tool for metagenomics
+
+  Copyright (C) 2014-2018, Torbjorn Rognes, Frederic Mahe and Tomas Flouri
+  All rights reserved.
+
+  Contact: Torbjorn Rognes <torognes at ifi.uio.no>,
+  Department of Informatics, University of Oslo,
+  PO Box 1080 Blindern, NO-0316 Oslo, Norway
+
+  This software is dual-licensed and available under a choice
+  of one of two licenses, either under the terms of the GNU
+  General Public License version 3 or the BSD 2-Clause License.
+
+
+  GNU General Public License version 3
+
+  This program is free software: you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation, either version 3 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License
+  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+  The BSD 2-Clause License
+
+  Redistribution and use in source and binary forms, with or without
+  modification, are permitted provided that the following conditions
+  are met:
+
+  1. Redistributions of source code must retain the above copyright
+  notice, this list of conditions and the following disclaimer.
+
+  2. Redistributions in binary form must reproduce the above copyright
+  notice, this list of conditions and the following disclaimer in the
+  documentation and/or other materials provided with the distribution.
+
+  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+  COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+  LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+  CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+  ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+  POSSIBILITY OF SUCH DAMAGE.
+
+*/
+
+#include "vsearch.h"
+
+/* static variables */
+
+FILE * join_fileopenw(char * filename)
+{
+  FILE * fp = 0;
+  fp = fopen_output(filename);
+  if (!fp)
+    fatal("Unable to open file for writing (%s)", filename);
+  return fp;
+}
+
+void fastq_join()
+{
+  FILE * fp_fastqout = 0;
+  FILE * fp_fastaout = 0;
+
+  fastx_handle fastq_fwd = 0;
+  fastx_handle fastq_rev = 0;
+
+  uint64_t total = 0;
+
+  /* check input and options */
+
+  if (!opt_reverse)
+    fatal("No reverse reads file specified with --reverse");
+
+  if ((!opt_fastqout) && (!opt_fastaout))
+    fatal("No output files specified");
+
+  char * padgap = 0;
+  char * padgapq = 0;
+
+  if (opt_join_padgap)
+    padgap = xstrdup(opt_join_padgap);
+  else
+    padgap = xstrdup("NNNNNNNN");
+
+  uint64_t padlen = strlen(padgap);
+
+  if (opt_join_padgapq)
+    padgapq = xstrdup(opt_join_padgapq);
+  else
+    {
+      padgapq = (char *) xmalloc(padlen + 1);
+      for(uint64_t i = 0; i < padlen; i++)
+        padgapq[i] = 'I';
+      padgapq[padlen] = 0;
+    }
+
+  if (padlen != strlen(padgapq))
+    fatal("Strings given by --join_padgap and --join_padgapq differ in length");
+
+  /* open input files */
+
+  fastq_fwd = fastq_open(opt_fastq_join);
+  fastq_rev = fastq_open(opt_reverse);
+
+  /* open output files */
+
+  if (opt_fastqout)
+    fp_fastqout = join_fileopenw(opt_fastqout);
+  if (opt_fastaout)
+    fp_fastaout = join_fileopenw(opt_fastaout);
+
+  /* main */
+
+  uint64_t filesize = fastq_get_size(fastq_fwd);
+  progress_init("Joining reads", filesize);
+
+  /* do it */
+
+  total = 0;
+
+  uint64_t alloc = 0;
+  uint64_t len = 0;
+  char * seq = 0;
+  char * qual = 0;
+
+  while(fastq_next(fastq_fwd, 0, chrmap_no_change))
+    {
+      if (! fastq_next(fastq_rev, 0, chrmap_no_change))
+        fatal("More forward reads than reverse reads");
+
+      uint64_t fwd_seq_length = fastq_get_sequence_length(fastq_fwd);
+      uint64_t rev_seq_length = fastq_get_sequence_length(fastq_rev);
+
+      /* allocate enough mem */
+
+      uint64_t needed = fwd_seq_length + rev_seq_length + padlen + 1;
+      if (alloc < needed)
+        {
+          seq = (char *) xrealloc(seq, needed);
+          qual = (char *) xrealloc(qual, needed);
+          alloc = needed;
+        }
+
+      /* join them */
+
+      strcpy(seq, fastq_get_sequence(fastq_fwd));
+      strcpy(qual, fastq_get_quality(fastq_fwd));
+      len = fwd_seq_length;
+
+      strcpy(seq + len, padgap);
+      strcpy(qual + len, padgapq);
+      len += padlen;
+
+      /* reverse complement reverse read */
+
+      char * rev_seq = fastq_get_sequence(fastq_rev);
+      char * rev_qual = fastq_get_quality(fastq_rev);
+
+      for(uint64_t i = 0; i < rev_seq_length; i++)
+        {
+          uint64_t rev_pos = rev_seq_length - 1 - i;
+          seq[len]  = chrmap_complement[(int)(rev_seq[rev_pos])];
+          qual[len] = rev_qual[rev_pos];
+          len++;
+        }
+      seq[len] = 0;
+      qual[len] = 0;
+
+      /* write output */
+
+      if (opt_fastqout)
+        {
+          fastq_print_general(fp_fastqout,
+                              seq,
+                              len,
+                              fastq_get_header(fastq_fwd),
+                              fastq_get_header_length(fastq_fwd),
+                              qual,
+                              0,
+                              total + 1,
+                              0,
+                              0);
+        }
+
+      if (opt_fastaout)
+        {
+          fasta_print_general(fp_fastaout,
+                              0,
+                              seq,
+                              len,
+                              fastq_get_header(fastq_fwd),
+                              fastq_get_header_length(fastq_fwd),
+                              0,
+                              total + 1,
+                              -1,
+                              -1,
+                              0,
+                              0);
+        }
+
+      total++;
+      progress_update(fastq_get_position(fastq_fwd));
+    }
+
+  progress_done();
+
+  if (fastq_next(fastq_rev, 0, chrmap_no_change))
+    fatal("More reverse reads than forward reads");
+
+  fprintf(stderr,
+          "%" PRIu64 " pairs joined\n",
+          total);
+
+  /* clean up */
+
+  if (opt_fastaout)
+    fclose(fp_fastaout);
+  if (opt_fastqout)
+    fclose(fp_fastqout);
+
+  fastq_close(fastq_rev);
+  fastq_rev = 0;
+  fastq_close(fastq_fwd);
+  fastq_fwd = 0;
+
+  free(seq);
+  free(qual);
+  free(padgap);
+  free(padgapq);
+}
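Note on the joining step in fastqjoin.cc above: the core of the loop (forward read, then padding, then the reverse complement of the reverse read, with qualities reversed alongside) can be restated in a small standalone form. This is a simplified sketch and not the upstream code. It uses `std::string` instead of the manually managed buffers, a reduced complement table in place of vsearch's `chrmap_complement`, and treats an empty `padgapq` argument as "use the default", which differs from the option handling in the diff. The names `complement`, `JoinedRead` and `join_pair` are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Illustrative only: covers A, C, G, T and maps everything else to N.
// vsearch uses its own chrmap_complement lookup table instead.
char complement(char base)
{
  switch (base)
    {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'T': return 'A';
    default:  return 'N';
    }
}

struct JoinedRead
{
  std::string seq;
  std::string qual;
};

JoinedRead join_pair(const std::string & fwd_seq,
                     const std::string & fwd_qual,
                     const std::string & rev_seq,
                     const std::string & rev_qual,
                     const std::string & padgap = "NNNNNNNN",
                     std::string padgapq = "")
{
  // Default quality padding: one 'I' per padding base.
  if (padgapq.empty())
    padgapq.assign(padgap.size(), 'I');
  if (padgap.size() != padgapq.size())
    throw std::invalid_argument("padding strings differ in length");

  JoinedRead joined;
  joined.seq = fwd_seq + padgap;
  joined.qual = fwd_qual + padgapq;

  // Append the reverse complement of the reverse read. The quality
  // string is reversed so each score stays with its base.
  for (auto it = rev_seq.rbegin(); it != rev_seq.rend(); ++it)
    joined.seq += complement(*it);
  joined.qual.append(rev_qual.rbegin(), rev_qual.rend());

  return joined;
}
```

For example, joining forward read `ACGT` with reverse read `AACC` under the default padding gives the sequence `ACGTNNNNNNNNGGTT`, since the reverse complement of `AACC` is `GGTT`.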


=====================================
src/fastqjoin.h
=====================================
@@ -0,0 +1,61 @@
+/*
+
+  VSEARCH: a versatile open source tool for metagenomics
+
+  Copyright (C) 2014-2018, Torbjorn Rognes, Frederic Mahe and Tomas Flouri
+  All rights reserved.
+
+  Contact: Torbjorn Rognes <torognes at ifi.uio.no>,
+  Department of Informatics, University of Oslo,
+  PO Box 1080 Blindern, NO-0316 Oslo, Norway
+
+  This software is dual-licensed and available under a choice
+  of one of two licenses, either under the terms of the GNU
+  General Public License version 3 or the BSD 2-Clause License.
+
+
+  GNU General Public License version 3
+
+  This program is free software: you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation, either version 3 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License
+  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+  The BSD 2-Clause License
+
+  Redistribution and use in source and binary forms, with or without
+  modification, are permitted provided that the following conditions
+  are met:
+
+  1. Redistributions of source code must retain the above copyright
+  notice, this list of conditions and the following disclaimer.
+
+  2. Redistributions in binary form must reproduce the above copyright
+  notice, this list of conditions and the following disclaimer in the
+  documentation and/or other materials provided with the distribution.
+
+  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+  COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+  LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+  CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+  ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+  POSSIBILITY OF SUCH DAMAGE.
+
+*/
+
+void fastq_join();


=====================================
src/vsearch.cc
=====================================
@@ -108,6 +108,7 @@ char * opt_fastq_convert;
 char * opt_fastq_eestats;
 char * opt_fastq_eestats2;
 char * opt_fastq_filter;
+char * opt_fastq_join;
 char * opt_fastq_mergepairs;
 char * opt_fastq_stats;
 char * opt_fastqout;
@@ -118,6 +119,8 @@ char * opt_fastx_filter;
 char * opt_fastx_mask;
 char * opt_fastx_revcomp;
 char * opt_fastx_subsample;
+char * opt_join_padgap;
+char * opt_join_padgapq;
 char * opt_label_suffix;
 char * opt_log;
 char * opt_makeudb_usearch;
@@ -661,6 +664,7 @@ void args_init(int argc, char **argv)
   opt_fastq_eestats = 0;
   opt_fastq_eestats2 = 0;
   opt_fastq_filter = 0;
+  opt_fastq_join = 0;
   opt_fastq_maxdiffpct = 100.0;
   opt_fastq_maxdiffs = 10;
   opt_fastq_maxee = DBL_MAX;
@@ -713,6 +717,8 @@ void args_init(int argc, char **argv)
   opt_iddef = 2;
   opt_idprefix = 0;
   opt_idsuffix = 0;
+  opt_join_padgap = 0;
+  opt_join_padgapq = 0;
   opt_label_suffix = 0;
   opt_leftjust = 0;
   opt_length_cutoffs_increment = 50;
@@ -1024,6 +1030,9 @@ void args_init(int argc, char **argv)
     {"sintax_cutoff",         required_argument, 0, 0 },
     {"tabbedout",             required_argument, 0, 0 },
     {"fastq_maxdiffpct",      required_argument, 0, 0 },
+    {"fastq_join",            required_argument, 0, 0 },
+    {"join_padgap",           required_argument, 0, 0 },
+    {"join_padgapq",          required_argument, 0, 0 },
     { 0, 0, 0, 0 }
   };
 
@@ -1870,6 +1879,18 @@ void args_init(int argc, char **argv)
           opt_fastq_maxdiffpct = args_getdouble(optarg);
           break;
 
+        case 199:
+          opt_fastq_join = optarg;
+          break;
+
+        case 200:
+          opt_join_padgap = optarg;
+          break;
+
+        case 201:
+          opt_join_padgapq = optarg;
+          break;
+
         default:
           fatal("Internal error in option parsing");
         }
@@ -1957,6 +1978,8 @@ void args_init(int argc, char **argv)
     commands++;
   if (opt_sintax)
     commands++;
+  if (opt_fastq_join)
+    commands++;
 
 
   if (commands > 1)
@@ -2331,6 +2354,16 @@ void cmd_help()
               " Output\n"
               "  --output FILENAME           output to specified FASTA file\n"
               "\n"
+              "Paired-end reads joining\n"
+              "  --fastq_join FILENAME       join paired-end reads into one sequence with gap\n"
+              " Data\n"
+              "  --reverse FILENAME          specify FASTQ file with reverse reads\n"
+              "  --join_padgap STRING        sequence string used for padding (NNNNNNNN)\n"
+              "  --join_padgapq STRING       quality string used for padding (IIIIIIII)\n"
+              " Output\n"
+              "  --fastaout FILENAME         FASTA output filename for joined sequences\n"
+              "  --fastqout FILENAME         FASTQ output filename for joined sequences\n"
+              "\n"
               "Paired-end reads merging\n"
               "  --fastq_mergepairs FILENAME merge paired-end reads into one sequence\n"
               " Data\n"
@@ -2696,6 +2729,7 @@ void cmd_none()
             "vsearch --fastq_convert FILENAME --fastqout FILENAME --fastq_ascii 64\n"
             "vsearch --fastq_eestats FILENAME --output FILENAME\n"
             "vsearch --fastq_eestats2 FILENAME --output FILENAME\n"
+            "vsearch --fastq_join FILENAME --reverse FILENAME --fastqout FILENAME\n"
             "vsearch --fastq_mergepairs FILENAME --reverse FILENAME --fastqout FILENAME\n"
             "vsearch --fastq_stats FILENAME --log FILENAME\n"
             "vsearch --fastx_filter FILENAME --fastaout FILENAME --fastq_trunclen 100\n"
@@ -2939,6 +2973,8 @@ int main(int argc, char** argv)
     cmd_fastq_eestats();
   else if (opt_fastq_eestats2)
     cmd_fastq_eestats2();
+  else if (opt_fastq_join)
+    fastq_join();
   else if (opt_rereplicate)
     cmd_rereplicate();
   else if (opt_version)
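Note on the option handling in vsearch.cc above: the new entries are added to a table of `{"name", required_argument, 0, 0}` records and handled in numbered `case` labels (199 to 201), which suggests getopt-style long option parsing that dispatches on the index of the matched table entry. The sketch below shows that pattern in isolation using `getopt_long` from `<getopt.h>`. It is an assumption-laden illustration, not the upstream parser: the table holds only the three new options, so the indices are 0 to 2, and `JoinOptions` and `parse_join_options` are hypothetical names.

```cpp
#include <getopt.h>
#include <string>

// Illustrative container; the diff stores these values in global
// char * variables instead.
struct JoinOptions
{
  std::string fastq_join;
  std::string join_padgap;
  std::string join_padgapq;
  bool ok = true;
};

JoinOptions parse_join_options(int argc, char ** argv)
{
  // With flag == 0 and val == 0, getopt_long returns 0 for every long
  // option and reports which entry matched through the index argument.
  static const struct option long_options[] = {
    {"fastq_join",   required_argument, 0, 0},
    {"join_padgap",  required_argument, 0, 0},
    {"join_padgapq", required_argument, 0, 0},
    {0, 0, 0, 0}
  };

  JoinOptions opts;
  optind = 1;   // restart scanning in case the function is called again
  opterr = 0;   // suppress getopt's own error messages
  int index = 0;
  int c = 0;
  while ((c = getopt_long(argc, argv, "", long_options, &index)) != -1)
    {
      if (c != 0)
        {
          // Unknown option or missing argument.
          opts.ok = false;
          break;
        }
      switch (index)
        {
        case 0: opts.fastq_join = optarg; break;
        case 1: opts.join_padgap = optarg; break;
        case 2: opts.join_padgapq = optarg; break;
        }
    }
  return opts;
}
```

Dispatching on the table index keeps the option table and the handler in lockstep, which is why new entries in the diff are appended at the end of the table and given the next free case numbers.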


=====================================
src/vsearch.h
=====================================
@@ -211,6 +211,7 @@
 #include "udb.h"
 #include "kmerhash.h"
 #include "sintax.h"
+#include "fastqjoin.h"
 
 /* options */
 
@@ -260,6 +261,7 @@ extern char * opt_fastq_convert;
 extern char * opt_fastq_eestats2;
 extern char * opt_fastq_eestats;
 extern char * opt_fastq_filter;
+extern char * opt_fastq_join;
 extern char * opt_fastq_mergepairs;
 extern char * opt_fastq_stats;
 extern char * opt_fastqout;
@@ -270,6 +272,8 @@ extern char * opt_fastx_filter;
 extern char * opt_fastx_mask;
 extern char * opt_fastx_revcomp;
 extern char * opt_fastx_subsample;
+extern char * opt_join_padgap;
+extern char * opt_join_padgapq;
 extern char * opt_label_suffix;
 extern char * opt_log;
 extern char * opt_makeudb_usearch;



View it on GitLab: https://salsa.debian.org/med-team/vsearch/commit/1920c746ed222193d51c0701705924a095862a85
