[med-svn] [Git][med-team/vsearch][upstream] New upstream version 2.26.1

Étienne Mollier (@emollier) gitlab at salsa.debian.org
Mon Nov 27 20:52:30 GMT 2023



Étienne Mollier pushed to branch upstream at Debian Med / vsearch


Commits:
cc613f49 by Étienne Mollier at 2023-11-27T21:39:33+01:00
New upstream version 2.26.1
- - - - -


24 changed files:

- README.md
- configure.ac
- man/vsearch.1
- src/Makefile.am
- src/allpairs.cc
- src/chimera.cc
- src/cluster.cc
- src/db.cc
- src/derep.cc
- src/derepsmallmem.cc
- src/eestats.cc
- src/kmerhash.cc
- src/mask.cc
- src/mergepairs.cc
- src/otutable.cc
- src/results.cc
- src/results.h
- src/search.cc
- src/searchexact.cc
- src/sha1.h
- src/sintax.cc
- src/udb.cc
- src/vsearch.cc
- src/xstring.h


Changes:

=====================================
README.md
=====================================
@@ -37,7 +37,7 @@ Most of the nucleotide based commands and options in USEARCH version 7 are suppo
 
 ## Getting Help
 
-If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
+If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
 
 ## Example
 
@@ -50,9 +50,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
 **Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:
 
 ```
-wget https://github.com/torognes/vsearch/archive/v2.25.0.tar.gz
-tar xzf v2.25.0.tar.gz
-cd vsearch-2.25.0
+wget https://github.com/torognes/vsearch/archive/v2.26.1.tar.gz
+tar xzf v2.26.1.tar.gz
+cd vsearch-2.26.1
 ./autogen.sh
 ./configure CFLAGS="-O3" CXXFLAGS="-O3"
 make
@@ -81,48 +81,48 @@ Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (v
 Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-linux-x86_64.tar.gz
-tar xzf vsearch-2.25.0-linux-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-linux-x86_64.tar.gz
+tar xzf vsearch-2.26.1-linux-x86_64.tar.gz
 ```
 
 Or these commands if you are using a Linux ppc64le system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-linux-ppc64le.tar.gz
-tar xzf vsearch-2.25.0-linux-ppc64le.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-linux-ppc64le.tar.gz
+tar xzf vsearch-2.26.1-linux-ppc64le.tar.gz
 ```
 
 Or these commands if you are using a Linux aarch64 (arm64) system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-linux-aarch64.tar.gz
-tar xzf vsearch-2.25.0-linux-aarch64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-linux-aarch64.tar.gz
+tar xzf vsearch-2.26.1-linux-aarch64.tar.gz
 ```
 
 Or these commands if you are using a Mac with an Apple Silicon CPU:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-macos-aarch64.tar.gz
-tar xzf vsearch-2.25.0-macos-aarch64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-macos-aarch64.tar.gz
+tar xzf vsearch-2.26.1-macos-aarch64.tar.gz
 ```
 
 Or these commands if you are using a Mac with an Intel CPU:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-macos-x86_64.tar.gz
-tar xzf vsearch-2.25.0-macos-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-macos-x86_64.tar.gz
+tar xzf vsearch-2.26.1-macos-x86_64.tar.gz
 ```
 
 Or if you are using Windows, download and extract (unzip) the contents of this file:
 
 ```
-https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-win-x86_64.zip
+https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-win-x86_64.zip
 ```
 
-Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.25.0-linux-x86_64` or `vsearch-2.25.0-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`. Versions with statically compiled libraries are available for Linux systems. These have "-static" in their name, and could be used on systems that do not have all the necessary libraries installed.
+Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.26.1-linux-x86_64` or `vsearch-2.26.1-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`. Versions with statically compiled libraries are available for Linux systems. These have "-static" in their name, and could be used on systems that do not have all the necessary libraries installed.
 
 **Windows**: You will now have the binary distribution in a folder
-called `vsearch-2.25.0-win-x86_64`. The vsearch executable is called
+called `vsearch-2.26.1-win-x86_64`. The vsearch executable is called
 `vsearch.exe`. The manual in PDF format is called
 `vsearch_manual.pdf`. If you want to be able to call `vsearch.exe`
 from any command prompt window, you can put the vsearch executable in
@@ -133,7 +133,7 @@ searching for it in the Start menu, `Edit` user variables, add
 your changes.
 
 
-**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
+**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
 
 
 ## Packages, plugins, and wrappers


=====================================
configure.ac
=====================================
@@ -2,7 +2,7 @@
 # Process this file with autoconf to produce a configure script.
 
 AC_PREREQ([2.63])
-AC_INIT([vsearch], [2.25.0], [torognes@ifi.uio.no], [vsearch], [https://github.com/torognes/vsearch])
+AC_INIT([vsearch], [2.26.1], [torognes@ifi.uio.no], [vsearch], [https://github.com/torognes/vsearch])
 AC_CANONICAL_TARGET
 AM_INIT_AUTOMAKE([subdir-objects])
 AC_LANG([C++])


=====================================
man/vsearch.1
=====================================
@@ -1,5 +1,5 @@
 .\" ============================================================================
-.TH vsearch 1 "November 10, 2023" "version 2.25.0" "USER COMMANDS"
+.TH vsearch 1 "November 25, 2023" "version 2.26.1" "USER COMMANDS"
 .\" ============================================================================
 .SH NAME
 vsearch \(em a versatile open-source tool for microbiome analysis,
@@ -4834,6 +4834,15 @@ warnings for sha1.c. Compile for release (not debug) by default.
 .BR v2.25.0\~ "released November 10th, 2023"
 Allow a given percentage of mismatches between chimeras and parents
 for the experimental chimeras_denovo command.
+.TP
+.BR v2.26.0\~ "released November 24th, 2023"
+Enable the maxseqlength and minseqlength options for the chimera
+detection commands. When the usearch_global or search_exact commands
+are used, OTU tables will include samples and OTUs with no matches.
+.TP
+.BR v2.26.1\~ "released November 25th, 2023"
+No real changes, but the previous version was released without proper
+updates to the source code.
 .LP
 .\" ============================================================================
 .\" TODO:


=====================================
src/Makefile.am
=====================================
@@ -1,23 +1,25 @@
 bin_PROGRAMS = $(top_builddir)/bin/vsearch
 
+AM_CFLAGS = -Wall -Wsign-compare
+
 if TARGET_PPC
-AM_CFLAGS=-Wall -Wsign-compare -mcpu=powerpc64le -maltivec
+AM_CFLAGS += -mcpu=powerpc64le -maltivec
 else
 if TARGET_AARCH64
-AM_CFLAGS=-Wall -Wsign-compare -march=armv8-a+simd -mtune=generic
+AM_CFLAGS += -march=armv8-a+simd -mtune=generic
 else
-AM_CFLAGS=-Wall -Wsign-compare -march=x86-64 -mtune=generic
+AM_CFLAGS += -march=x86-64 -mtune=generic
 endif
 endif
 
 # Conditionally set NDEBUG based on ENABLE_DEBUG
 if ENABLE_DEBUG
-AM_CFLAGS += -UNDEBUG
+AM_CFLAGS += -UNDEBUG -Wcast-align -Wextra -Wfloat-equal
 else
 AM_CFLAGS += -DNDEBUG
 endif
 
-AM_CXXFLAGS=$(AM_CFLAGS) -std=c++11
+AM_CXXFLAGS = $(AM_CFLAGS) -std=c++11
 
 export MACOSX_DEPLOYMENT_TARGET=10.9
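
Note on the new debug flags: `-Wfloat-equal` warns whenever two floating-point values are compared with `==` or `!=`. The chimera.cc change in this commit avoids the warning by comparing the integer counts instead of the derived percentage. The sketch below illustrates that idea only; the helper names `percent_match` and `all_columns_match` are hypothetical and do not appear in vsearch, and the percentage formula is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical helper: percentage of alignment columns that match.
// The formula is an assumption for illustration, not taken from vsearch.
double percent_match(int matches, int columns) {
  assert(columns > 0);
  return 100.0 * matches / columns;
}

// Hypothetical helper: decide "100% match" on the integer counts, so no
// floating-point equality test is needed and -Wfloat-equal stays silent.
bool all_columns_match(int matches, int columns) {
  return matches == columns;
}
```

Calling `all_columns_match(match_count, column_count)` replaces a test such as `percent_match(match_count, column_count) == 100.0`; the percentage is then needed only for ordered comparisons (`<`, `>=`) and for reporting.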
 


=====================================
src/allpairs.cc
=====================================
@@ -135,8 +135,7 @@ void allpairs_output_results(int hit_count,
                           toreport,
                           query_head,
                           qsequence,
-                          qseqlen,
-                          qsequence_rc);
+                          qseqlen);
     }
 
   if (fp_samout)
@@ -146,7 +145,6 @@ void allpairs_output_results(int hit_count,
                           toreport,
                           query_head,
                           qsequence,
-                          qseqlen,
                           qsequence_rc);
     }
 
@@ -169,7 +167,6 @@ void allpairs_output_results(int hit_count,
                                           hp,
                                           query_head,
                                           qsequence,
-                                          qseqlen,
                                           qsequence_rc);
             }
 
@@ -186,11 +183,7 @@ void allpairs_output_results(int hit_count,
           if (fp_tsegout)
             {
               results_show_tsegout_one(fp_tsegout,
-                                       hp,
-                                       query_head,
-                                       qsequence,
-                                       qseqlen,
-                                       qsequence_rc);
+                                       hp);
             }
 
           if (fp_uc)
@@ -200,9 +193,7 @@ void allpairs_output_results(int hit_count,
                   results_show_uc_one(fp_uc,
                                       hp,
                                       query_head,
-                                      qsequence,
                                       qseqlen,
-                                      qsequence_rc,
                                       hp->target);
                 }
             }
@@ -222,9 +213,7 @@ void allpairs_output_results(int hit_count,
               results_show_blast6out_one(fp_blast6out,
                                          hp,
                                          query_head,
-                                         qsequence,
-                                         qseqlen,
-                                         qsequence_rc);
+                                         qseqlen);
             }
         }
     }
@@ -235,9 +224,7 @@ void allpairs_output_results(int hit_count,
           results_show_uc_one(fp_uc,
                               nullptr,
                               query_head,
-                              qsequence,
                               qseqlen,
-                              qsequence_rc,
                               0);
         }
 
@@ -258,9 +245,7 @@ void allpairs_output_results(int hit_count,
               results_show_blast6out_one(fp_blast6out,
                                          nullptr,
                                          query_head,
-                                         qsequence,
-                                         qseqlen,
-                                         qsequence_rc);
+                                         qseqlen);
             }
         }
     }


=====================================
src/chimera.cc
=====================================
@@ -59,6 +59,7 @@
 */
 
 #include "vsearch.h"
+#include <vector>
 
 /*
   This code implements the method described in this paper:
@@ -168,14 +169,18 @@ void realloc_arrays(struct chimera_info_s * ci)
 {
   if (opt_chimeras_denovo)
     {
-      if (opt_chimeras_parts == 0)
+      if (opt_chimeras_parts == 0) {
         parts = (ci->query_len + maxparts - 1) / maxparts;
-      else
+      }
+      else {
         parts = opt_chimeras_parts;
-      if (parts < 2)
+      }
+      if (parts < 2) {
         parts = 2;
-      else if (parts > maxparts)
+      }
+      else if (parts > maxparts) {
         parts = maxparts;
+      }
     }
   else
     {
@@ -183,7 +188,7 @@ void realloc_arrays(struct chimera_info_s * ci)
       parts = 4;
     }
 
-  int maxhlen = MAX(ci->query_head_len,1);
+  const int maxhlen = MAX(ci->query_head_len, 1);
   if (maxhlen > ci->head_alloc)
     {
       ci->head_alloc = maxhlen;
@@ -192,8 +197,8 @@ void realloc_arrays(struct chimera_info_s * ci)
 
   /* realloc arrays based on query length */
 
-  int maxqlen = MAX(ci->query_len, 1);
-  int maxpartlen = (maxqlen + parts - 1) / parts;
+  const int maxqlen = MAX(ci->query_len, 1);
+  const int maxpartlen = (maxqlen + parts - 1) / parts;
 
   if (maxqlen > ci->query_alloc)
     {
@@ -201,8 +206,7 @@ void realloc_arrays(struct chimera_info_s * ci)
 
       ci->query_seq = (char*) xrealloc(ci->query_seq, maxqlen + 1);
 
-      for(auto & i
-            : ci->si)
+      for(auto & i: ci->si)
         {
           i.qsequence = (char*) xrealloc(i.qsequence, maxpartlen + 1);
         }
@@ -221,16 +225,16 @@ void realloc_arrays(struct chimera_info_s * ci)
       ci->scan_q = (double *) xrealloc(ci->scan_q,
                                        (maxqlen + 1) * sizeof(double));
 
-      int maxalnlen = maxqlen + 2 * db_getlongestsequence();
+      const int maxalnlen = maxqlen + 2 * db_getlongestsequence();
       for (int f = 0; f < maxparents ; f++)
         {
-          ci->paln[f] = (char*) xrealloc(ci->paln[f], maxalnlen+1);
+          ci->paln[f] = (char*) xrealloc(ci->paln[f], maxalnlen + 1);
         }
-      ci->qaln = (char*) xrealloc(ci->qaln, maxalnlen+1);
-      ci->diffs = (char*) xrealloc(ci->diffs, maxalnlen+1);
-      ci->votes = (char*) xrealloc(ci->votes, maxalnlen+1);
-      ci->model = (char*) xrealloc(ci->model, maxalnlen+1);
-      ci->ignore = (char*) xrealloc(ci->ignore, maxalnlen+1);
+      ci->qaln = (char*) xrealloc(ci->qaln, maxalnlen + 1);
+      ci->diffs = (char*) xrealloc(ci->diffs, maxalnlen + 1);
+      ci->votes = (char*) xrealloc(ci->votes, maxalnlen + 1);
+      ci->model = (char*) xrealloc(ci->model, maxalnlen + 1);
+      ci->ignore = (char*) xrealloc(ci->ignore, maxalnlen + 1);
     }
 }
 
@@ -269,15 +273,15 @@ void find_matches(struct chimera_info_s * ci)
           switch (op)
             {
             case 'M':
-              for(int k=0; k<run; k++)
+              for(int k = 0; k < run; k++)
                 {
                   if (chrmap_4bit[(int)(qseq[qpos])] &
                       chrmap_4bit[(int)(tseq[tpos])])
                     {
                       ci->match[i * ci->query_len + qpos] = 1;
                     }
-                  qpos++;
-                  tpos++;
+                  ++qpos;
+                  ++tpos;
                 }
               break;
 
@@ -394,11 +398,7 @@ int find_best_parents_long(struct chimera_info_s * ci)
       best_parents[f].start = -1;
     }
 
-  bool position_used[ci->query_len];
-  for (int i = 0; i < ci->query_len; i++)
-    {
-      position_used[i] = false;
-    }
+  std::vector<bool> position_used(ci->query_len, false);
 
   int pos_remaining = ci->query_len;
   int parents_found = 0;
@@ -455,7 +455,7 @@ int find_best_parents_long(struct chimera_info_s * ci)
           best_parents[f].cand = best_cand;
           best_parents[f].start = best_start;
           best_parents[f].len = best_len;
-          parents_found++;
+          ++parents_found;
 
 #if 0
           if (f == 0)
@@ -502,7 +502,7 @@ int find_best_parents_long(struct chimera_info_s * ci)
     printf("Not covered completely (%d).\n", pos_remaining);
 #endif
 
-  return (parents_found > 1) && (pos_remaining == 0);
+  return (parents_found > 1) and (pos_remaining == 0);
 }
 
 int find_best_parents(struct chimera_info_s * ci)
@@ -517,10 +517,7 @@ int find_best_parents(struct chimera_info_s * ci)
       ci->best_parents[f] = -1;
     }
 
-  bool cand_selected[ci->cand_count];
-
-  for (int i = 0; i < ci->cand_count; i++)
-    cand_selected[i] = false;
+  std::vector<bool> cand_selected(ci->cand_count, false);
 
   for (int f = 0; f < 2; f++)
     {
@@ -556,7 +553,7 @@ int find_best_parents(struct chimera_info_s * ci)
 
       for(int i = 0; i < ci->cand_count; i++)
         {
-          if (! cand_selected[i])
+          if (not cand_selected[i])
             {
               int sum = 0;
               for(int qpos = 0; qpos < ci->query_len; qpos++)
@@ -582,17 +579,15 @@ int find_best_parents(struct chimera_info_s * ci)
 
       /* find parent with the most wins */
 
-      int wins[ci->cand_count];
-      for (int i = 0; i < ci->cand_count; i++)
-        wins[i] = 0;
+      std::vector<int> wins(ci->cand_count, 0);
 
-      for(int qpos = window-1; qpos < ci->query_len; qpos++)
+      for(int qpos = window - 1; qpos < ci->query_len; qpos++)
         {
           if (ci->maxsmooth[qpos] != 0)
             {
-              for(int i=0; i < ci->cand_count; i++)
+              for(int i = 0; i < ci->cand_count; i++)
                 {
-                  if (! cand_selected[i])
+                  if (not cand_selected[i])
                     {
                       int z = i * ci->query_len + qpos;
                       if (ci->smooth[z] == ci->maxsmooth[qpos])
@@ -619,8 +614,9 @@ int find_best_parents(struct chimera_info_s * ci)
 
       /* terminate loop if no parent found */
 
-      if (best_parent_cand[f] < 0)
+      if (best_parent_cand[f] < 0) {
         break;
+      }
 
 #if 0
       printf("Query %d: Best parent (%d) candidate: %d. Wins: %d\n",
@@ -633,7 +629,7 @@ int find_best_parents(struct chimera_info_s * ci)
 
   /* Check if at least 2 candidates selected */
 
-  return (best_parent_cand[0] >= 0) && (best_parent_cand[1] >= 0);
+  return (best_parent_cand[0] >= 0) and (best_parent_cand[1] >= 0);
 }
 
 
@@ -676,7 +672,7 @@ int find_max_alignment_length(struct chimera_info_s * ci)
 
   /* find total alignment length */
   int alnlen = 0;
-  for(int i=0; i < ci->query_len+1; i++)
+  for(int i = 0; i < ci->query_len + 1; i++)
     {
       alnlen += ci->maxi[i];
     }
@@ -728,11 +724,11 @@ void fill_alignment_parents(struct chimera_info_s * ci)
             }
           else
             {
-              for(int x=0; x < run; x++)
+              for(int x = 0; x < run; x++)
                 {
-                  if (!inserted)
+                  if (not inserted)
                     {
-                      for(int y=0; y < ci->maxi[qpos]; y++)
+                      for(int y = 0; y < ci->maxi[qpos]; y++)
                         {
                           *t++ = '-';
                         }
@@ -747,7 +743,7 @@ void fill_alignment_parents(struct chimera_info_s * ci)
                       *t++ = '-';
                     }
 
-                  qpos++;
+                  ++qpos;
                   inserted = 0;
                 }
             }
@@ -755,7 +751,7 @@ void fill_alignment_parents(struct chimera_info_s * ci)
 
       /* add any gaps at the end */
 
-      if (! inserted)
+      if (not inserted)
         {
           for(int x=0; x < ci->maxi[qpos]; x++)
             {
@@ -784,11 +780,11 @@ int eval_parents_long(struct chimera_info_s * ci)
   int m = 0;
   char * q = ci->qaln;
   int qpos = 0;
-  for (int i=0; i < ci->query_len; i++)
+  for (int i = 0; i < ci->query_len; i++)
     {
       if (qpos >= (ci->best_start[m] + ci->best_len[m]))
         m++;
-      for (int j=0; j < ci->maxi[i]; j++)
+      for (int j = 0; j < ci->maxi[i]; j++)
         {
           *q++ = '-';
           *pm++ = 'A' + m;
@@ -796,7 +792,7 @@ int eval_parents_long(struct chimera_info_s * ci)
       *q++ = chrmap_upcase[(int)(ci->query_seq[qpos++])];
       *pm++ = 'A' + m;
     }
-  for (int j=0; j < ci->maxi[ci->query_len]; j++)
+  for (int j = 0; j < ci->maxi[ci->query_len]; j++)
     {
       *q++ = '-';
       *pm++ = 'A' + m;
@@ -816,7 +812,7 @@ int eval_parents_long(struct chimera_info_s * ci)
       /* lower case parent symbols that differ from query */
 
       for (int f = 0; f < ci->parents_found; f++)
-        if (psym[f] && (psym[f] != qsym))
+        if (psym[f] and (psym[f] != qsym))
           ci->paln[f][i] = tolower(ci->paln[f][i]);
 
       /* compute diffs */
@@ -825,7 +821,7 @@ int eval_parents_long(struct chimera_info_s * ci)
 
       bool all_defined = qsym;
       for (int f = 0; f < ci->parents_found; f++)
-        if (!psym[f])
+        if (! psym[f])
           all_defined = false;
 
       if (all_defined)
@@ -897,7 +893,7 @@ int eval_parents_long(struct chimera_info_s * ci)
 
   xpthread_mutex_lock(&mutex_output);
 
-  if (opt_alnout && (status == 4))
+  if (opt_alnout and (status == 4))
     {
       fprintf(fp_uchimealns, "\n");
       fprintf(fp_uchimealns, "----------------------------------------"
@@ -1061,7 +1057,7 @@ int eval_parents(struct chimera_info_s * ci)
 
   char * q = ci->qaln;
   int qpos = 0;
-  for (int i=0; i < ci->query_len; i++)
+  for (int i = 0; i < ci->query_len; i++)
     {
       for (int j=0; j < ci->maxi[i]; j++)
         {
@@ -1069,7 +1065,7 @@ int eval_parents(struct chimera_info_s * ci)
         }
       *q++ = chrmap_upcase[(int)(ci->query_seq[qpos++])];
     }
-  for (int j=0; j < ci->maxi[ci->query_len]; j++)
+  for (int j = 0; j < ci->maxi[ci->query_len]; j++)
     {
       *q++ = '-';
     }
@@ -1110,12 +1106,12 @@ int eval_parents(struct chimera_info_s * ci)
 
       /* lower case parent symbols that differ from query */
 
-      if (p1sym && (p1sym != qsym))
+      if (p1sym and (p1sym != qsym))
         {
           ci->paln[0][i] = tolower(ci->paln[0][i]);
         }
 
-      if (p2sym && (p2sym != qsym))
+      if (p2sym and (p2sym != qsym))
         {
           ci->paln[1][i] = tolower(ci->paln[1][i]);
         }
@@ -1124,7 +1120,7 @@ int eval_parents(struct chimera_info_s * ci)
 
       char diff;
 
-      if (qsym && p1sym && p2sym)
+      if (qsym and p1sym and p2sym)
         {
           if (p1sym == p2sym)
             {
@@ -1171,21 +1167,21 @@ int eval_parents(struct chimera_info_s * ci)
 
   for (int i = 0; i < alnlen; i++)
     {
-      if (!ci->ignore[i])
+      if (not ci->ignore[i])
         {
           char diff = ci->diffs[i];
 
           if (diff == 'A')
             {
-              sumA++;
+              ++sumA;
             }
           else if (diff == 'B')
             {
-              sumB++;
+              ++sumB;
             }
           else if (diff != ' ')
             {
-              sumN++;
+              ++sumN;
             }
         }
 
@@ -1209,32 +1205,34 @@ int eval_parents(struct chimera_info_s * ci)
   int best_left_a = 0;
   int best_right_a = 0;
 
-  for (int i=0; i<alnlen; i++)
+  for (int i = 0; i < alnlen; i++)
     {
-      if(!ci->ignore[i])
+      if(not ci->ignore[i])
         {
           char diff = ci->diffs[i];
           if (diff != ' ')
             {
               if (diff == 'A')
                 {
-                  left_y++;
-                  right_n--;
+                  ++left_y;
+                  --right_n;
                 }
               else if (diff == 'B')
                 {
-                  left_n++;
-                  right_y--;
+                  ++left_n;
+                  --right_y;
                 }
               else
                 {
-                  left_a++;
-                  right_a--;
+                  ++left_a;
+                  --right_a;
                 }
 
-              double left_h, right_h, h;
+              double left_h = 0;
+              double right_h = 0;
+              double h = 0;
 
-              if ((left_y > left_n) && (right_y > right_n))
+              if ((left_y > left_n) and (right_y > right_n))
                 {
                   left_h = left_y / (opt_xn * (left_n + opt_dn) + left_a);
                   right_h = right_y / (opt_xn * (right_n + opt_dn) + right_a);
@@ -1253,7 +1251,7 @@ int eval_parents(struct chimera_info_s * ci)
                       best_right_a = right_a;
                     }
                 }
-              else if ((left_n > left_y) && (right_n > right_y))
+              else if ((left_n > left_y) and (right_n > right_y))
                 {
                   /* swap left/right and yes/no */
 
@@ -1369,7 +1367,7 @@ int eval_parents(struct chimera_info_s * ci)
 
       for(int i = 0; i < alnlen; i++)
         {
-          if (! ci->ignore[i])
+          if (not ci->ignore[i])
             {
               cols++;
 
@@ -1414,9 +1412,10 @@ int eval_parents(struct chimera_info_s * ci)
       int sumL = best_left_n + best_left_a + best_left_y;
       int sumR = best_right_n + best_right_a + best_right_y;
 
-      if (opt_uchime2_denovo || opt_uchime3_denovo)
+      if (opt_uchime2_denovo or opt_uchime3_denovo)
         {
-          if ((QM == 100.0) && (QT < 100.0))
+          // fix -Wfloat-equal: if match_QM == cols, then QM == 100.0
+          if ((match_QM == cols) and (QT < 100.0))
             {
               status = 4;
             }
@@ -1425,8 +1424,8 @@ int eval_parents(struct chimera_info_s * ci)
         if (best_h >= opt_minh)
           {
             status = 3;
-            if ((divdiff >= opt_mindiv) &&
-                (sumL >= opt_mindiffs) &&
+            if ((divdiff >= opt_mindiv) and
+                (sumL >= opt_mindiffs) and
                 (sumR >= opt_mindiffs))
               {
                 status = 4;
@@ -1437,7 +1436,7 @@ int eval_parents(struct chimera_info_s * ci)
 
       xpthread_mutex_lock(&mutex_output);
 
-      if (opt_uchimealns && (status == 4))
+      if (opt_uchimealns and (status == 4))
         {
           fprintf(fp_uchimealns, "\n");
           fprintf(fp_uchimealns, "----------------------------------------"
@@ -1617,6 +1616,7 @@ int eval_parents(struct chimera_info_s * ci)
   return status;
 }
 
+// refactoring: enum struct status {};
 /*
   new chimeric status:
   0: no parents, non-chimeric
@@ -1851,7 +1851,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
 
       if (opt_uchime_ref)
         {
-          if (fasta_next(query_fasta_h, ! opt_notrunclabels,
+          if (fasta_next(query_fasta_h, not opt_notrunclabels,
                          chrmap_no_change))
             {
               ci->query_head_len = fasta_get_header_length(query_fasta_h);
@@ -1909,13 +1909,13 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
 
       if (ci->query_len >= parts)
         {
-          for (int i=0; i<parts; i++)
+          for (int i = 0; i < parts; i++)
             {
               struct hit * hits;
               int hit_count;
-              search_onequery(ci->si+i, opt_qmask);
-              search_joinhits(ci->si+i, nullptr, & hits, & hit_count);
-              for(int j=0; j<hit_count; j++)
+              search_onequery(ci->si + i, opt_qmask);
+              search_joinhits(ci->si + i, nullptr, & hits, & hit_count);
+              for(int j = 0; j < hit_count; j++)
                 {
                   if (hits[j].accepted)
                     {
@@ -1926,7 +1926,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
             }
         }
 
-      for(int i=0; i < allhits_count; i++)
+      for(int i = 0; i < allhits_count; i++)
         {
           unsigned int target = allhits_list[i].target;
 
@@ -1968,7 +1968,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
                ci->snwgaps,
                ci->nwcigar);
 
-      for(int i=0; i < ci->cand_count; i++)
+      for(int i = 0; i < ci->cand_count; i++)
         {
           int64_t target = ci->cand_list[i];
           int64_t nwscore = ci->snwscore[i];
@@ -2114,7 +2114,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
           nonchimera_abundance += ci->query_size;
 
           /* output no parents, no chimeras */
-          if ((status < 2) && opt_uchimeout)
+          if ((status < 2) and opt_uchimeout)
             {
               fprintf(fp_uchimeout, "0.0000\t");
 
@@ -2160,13 +2160,13 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
       if (status < 3)
         {
           /* uchime_denovo: add non-chimeras to db */
-          if (opt_uchime_denovo || opt_uchime2_denovo || opt_uchime3_denovo || opt_chimeras_denovo)
+          if (opt_uchime_denovo or opt_uchime2_denovo or opt_uchime3_denovo or opt_chimeras_denovo)
             {
               dbindex_addsequence(seqno, opt_qmask);
             }
         }
 
-      for (int i=0; i < ci->cand_count; i++)
+      for (int i = 0; i < ci->cand_count; i++)
         {
           if (ci->nwcigar[i])
             {
@@ -2213,14 +2213,14 @@ void chimera_threads_run()
   xpthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
 
   /* create worker threads */
-  for(int64_t t=0; t<opt_threads; t++)
+  for(int64_t t = 0; t < opt_threads; t++)
     {
-      xpthread_create(pthread+t, & attr,
+      xpthread_create(pthread + t, & attr,
                       chimera_thread_worker, (void*)t);
     }
 
   /* finish worker threads */
-  for(int t=0; t<opt_threads; t++)
+  for(int t = 0; t < opt_threads; t++)
     {
       xpthread_join(pthread[t], nullptr);
     }
@@ -2233,7 +2233,7 @@ void open_chimera_file(FILE * * f, char * name)
   if (name)
     {
       *f = fopen_output(name);
-      if (!*f)
+      if (! *f)
         {
           fatal("Unable to open file %s for writing", name);
         }
@@ -2325,7 +2325,7 @@ void chimera()
             {
               dust_all();
             }
-          else if ((opt_dbmask == MASK_SOFT) && (opt_hardmask))
+          else if ((opt_dbmask == MASK_SOFT) and (opt_hardmask))
             {
               hardmask_all();
             }
@@ -2364,7 +2364,7 @@ void chimera()
         {
           dust_all();
         }
-      else if ((opt_qmask == MASK_SOFT) && (opt_hardmask))
+      else if ((opt_qmask == MASK_SOFT) and (opt_hardmask))
         {
           hardmask_all();
         }
@@ -2376,13 +2376,36 @@ void chimera()
 
   if (opt_log)
     {
-      fprintf(fp_log, "%8.2f  minh\n", opt_minh);
-      fprintf(fp_log, "%8.2f  xn\n", opt_xn);
-      fprintf(fp_log, "%8.2f  dn\n", opt_dn);
-      fprintf(fp_log, "%8.2f  xa\n", 1.0);
-      fprintf(fp_log, "%8.2f  mindiv\n", opt_mindiv);
+      if (opt_uchime_ref || opt_uchime_denovo)
+        {
+          fprintf(fp_log, "%8.2f  minh\n", opt_minh);
+        }
+
+      if (opt_uchime_ref ||
+          opt_uchime_denovo ||
+          opt_uchime2_denovo ||
+          opt_uchime3_denovo)
+        {
+          fprintf(fp_log, "%8.2f  xn\n", opt_xn);
+          fprintf(fp_log, "%8.2f  dn\n", opt_dn);
+          fprintf(fp_log, "%8.2f  xa\n", 1.0);
+        }
+
+      if (opt_uchime_ref || opt_uchime_denovo)
+        {
+          fprintf(fp_log, "%8.2f  mindiv\n", opt_mindiv);
+        }
+
       fprintf(fp_log, "%8.2f  id\n", opt_id);
-      fprintf(fp_log, "%8d  maxp\n", 2);
+
+      if (opt_uchime_ref ||
+          opt_uchime_denovo ||
+          opt_uchime2_denovo ||
+          opt_uchime3_denovo)
+        {
+          fprintf(fp_log, "%8d  maxp\n", 2);
+        }
+
       fprintf(fp_log, "\n");
     }
 
@@ -2393,7 +2416,7 @@ void chimera()
 
   progress_done();
 
-  if (!opt_quiet)
+  if (! opt_quiet)
     {
       if (total_count > 0)
         {


=====================================
src/cluster.cc
=====================================
@@ -430,7 +430,7 @@ void cluster_core_results_hit(struct hit * best,
     {
       results_show_uc_one(fp_uc,
                           best, query_head,
-                          qsequence, qseqlen, qsequence_rc,
+                          qseqlen,
                           clusterno);
     }
 
@@ -438,14 +438,14 @@ void cluster_core_results_hit(struct hit * best,
     {
       results_show_alnout(fp_alnout,
                           best, 1, query_head,
-                          qsequence, qseqlen, qsequence_rc);
+                          qsequence, qseqlen);
     }
 
   if (fp_samout)
     {
       results_show_samout(fp_samout,
                           best, 1, query_head,
-                          qsequence, qseqlen, qsequence_rc);
+                          qsequence, qsequence_rc);
     }
 
   if (fp_fastapairs)
@@ -454,7 +454,6 @@ void cluster_core_results_hit(struct hit * best,
                                   best,
                                   query_head,
                                   qsequence,
-                                  qseqlen,
                                   qsequence_rc);
     }
 
@@ -471,11 +470,7 @@ void cluster_core_results_hit(struct hit * best,
   if (fp_tsegout)
     {
       results_show_tsegout_one(fp_tsegout,
-                               best,
-                               query_head,
-                               qsequence,
-                               qseqlen,
-                               qsequence_rc);
+                               best);
     }
 
   if (fp_userout)
@@ -487,7 +482,7 @@ void cluster_core_results_hit(struct hit * best,
   if (fp_blast6out)
     {
       results_show_blast6out_one(fp_blast6out, best, query_head,
-                                 qsequence, qseqlen, qsequence_rc);
+                                 qseqlen);
     }
 
   if (opt_matched)
@@ -551,7 +546,7 @@ void cluster_core_results_nohit(int clusterno,
       if (fp_blast6out)
         {
           results_show_blast6out_one(fp_blast6out, nullptr, query_head,
-                                     qsequence, qseqlen, qsequence_rc);
+                                     qseqlen);
         }
     }
 


=====================================
src/db.cc
=====================================
@@ -255,8 +255,8 @@ void db_read(const char * filename, int upcase)
       if (sequences > 0)
         {
           fprintf(stderr,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs, "
-                  "min %'" PRIu64 ", max %'" PRIu64 ", avg %'.0f\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs, "
+                  "min %" PRIu64 ", max %" PRIu64 ", avg %.0f\n",
                   db_getnucleotidecount(),
                   db_getsequencecount(),
                   db_getshortestsequence(),
@@ -266,7 +266,7 @@ void db_read(const char * filename, int upcase)
       else
         {
           fprintf(stderr,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs\n",
                   db_getnucleotidecount(),
                   db_getsequencecount());
         }
@@ -277,8 +277,8 @@ void db_read(const char * filename, int upcase)
       if (sequences > 0)
         {
           fprintf(fp_log,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs, "
-                  "min %'" PRIu64 ", max %'" PRIu64 ", avg %'.0f\n\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs, "
+                  "min %" PRIu64 ", max %" PRIu64 ", avg %.0f\n\n",
                   db_getnucleotidecount(),
                   db_getsequencecount(),
                   db_getshortestsequence(),
@@ -288,7 +288,7 @@ void db_read(const char * filename, int upcase)
       else
         {
           fprintf(fp_log,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs\n\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs\n\n",
                   db_getnucleotidecount(),
                   db_getsequencecount());
         }
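
Note on the format-string change above: the `'` flag in `%'` asks for locale-dependent thousands grouping and is a POSIX extension rather than part of ISO C, so dropping it presumably makes these messages portable at the cost of ungrouped digits. A minimal sketch of the resulting style follows; `format_counts` is a hypothetical helper used only for illustration, not a vsearch function.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not in vsearch): formats two 64-bit counters
// using only the standard PRIu64 macro, without the POSIX-only ' flag.
std::string format_counts(std::uint64_t nucleotides, std::uint64_t sequences)
{
  char buffer[64];  // two 20-digit numbers plus the fixed text fit easily
  std::snprintf(buffer, sizeof buffer,
                "%" PRIu64 " nt in %" PRIu64 " seqs",
                nucleotides, sequences);
  return std::string(buffer);
}
```

Calling `format_counts(1234567, 42)` yields the digits without any grouping separator, whatever the current locale is.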


=====================================
src/derep.cc
=====================================
@@ -701,8 +701,8 @@ void derep(char * input_filename, bool use_header)
       if (sequencecount > 0)
         {
           fprintf(stderr,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64
-                  ", max %'" PRIu64 ", avg %'.0f\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64
+                  ", max %" PRIu64 ", avg %.0f\n",
                   nucleotidecount,
                   sequencecount,
                   shortest,
@@ -712,7 +712,7 @@ void derep(char * input_filename, bool use_header)
       else
         {
           fprintf(stderr,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs\n",
                   nucleotidecount,
                   sequencecount);
         }
@@ -723,8 +723,8 @@ void derep(char * input_filename, bool use_header)
       if (sequencecount > 0)
         {
           fprintf(fp_log,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64
-                  ", max %'" PRIu64 ", avg %'.0f\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64
+                  ", max %" PRIu64 ", avg %.0f\n",
                   nucleotidecount,
                   sequencecount,
                   shortest,
@@ -734,7 +734,7 @@ void derep(char * input_filename, bool use_header)
       else
         {
           fprintf(fp_log,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs\n",
                   nucleotidecount,
                   sequencecount);
         }


=====================================
src/derepsmallmem.cc
=====================================
@@ -117,20 +117,30 @@ double find_median()
             }
         }
 
-      double mid = below_count + cand_count + above_count;
-      if (mid == 0)
+      if (below_count + cand_count + above_count == 0U) // fix -Wfloat-equal
         return 0;
-      mid = mid / 2.0;
 
-      if (mid >= below_count)
+      if (above_count + cand_count >= below_count)
+        // mid >= below_count
         {
-          if (mid <= below_count + cand_count)
+          if (above_count <= below_count + cand_count)
+            // mid <= below_count + cand_count
             {
-              if (mid == below_count + cand_count)
+              if (above_count == below_count + cand_count)
+                // mid == below_count + cand_count
+                // same as:
+                // (below_count + cand_count + above_count) / 2 == below_count + cand_count
+                // which simplifies into:
+                // above_count == below_count + cand_count
                 {
                   return (cand + above) / 2.0;
                 }
-              else if (mid == below_count)
+              else if (above_count + cand_count == below_count)
+                // mid == below_count
+                // same as:
+                // (below_count + cand_count + above_count) / 2 == below_count
+                // which simplifies into:
+                // above_count + cand_count == below_count
                 {
                   return (below + cand) / 2.0;
                 }
@@ -233,7 +243,7 @@ void derep_smallmem(char * input_filename)
     }
   else
     {
-      fatal("Ouput file for dereplication must be specified with --fastaout");
+      fatal("Output file for dereplication must be specified with --fastaout");
     }
 
   uint64_t filesize = fastx_get_size(h);
@@ -412,8 +422,8 @@ void derep_smallmem(char * input_filename)
       if (sequencecount > 0)
         {
           fprintf(stderr,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64
-                  ", max %'" PRIu64 ", avg %'.0f\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64
+                  ", max %" PRIu64 ", avg %.0f\n",
                   nucleotidecount,
                   sequencecount,
                   shortest,
@@ -423,7 +433,7 @@ void derep_smallmem(char * input_filename)
       else
         {
           fprintf(stderr,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs\n",
                   nucleotidecount,
                   sequencecount);
         }
@@ -434,8 +444,8 @@ void derep_smallmem(char * input_filename)
       if (sequencecount > 0)
         {
           fprintf(fp_log,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64
-                  ", max %'" PRIu64 ", avg %'.0f\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64
+                  ", max %" PRIu64 ", avg %.0f\n",
                   nucleotidecount,
                   sequencecount,
                   shortest,
@@ -445,7 +455,7 @@ void derep_smallmem(char * input_filename)
       else
         {
           fprintf(fp_log,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs\n",
                   nucleotidecount,
                   sequencecount);
         }
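
Note on the find_median() hunk above: the comments in the patch state that each comparison against the floating-point `mid` can be rewritten as a comparison between integer counts. The sketch below checks that claim exhaustively for small counts. It assumes the three counters are unsigned 64-bit integers, and `integer_tests_match_midpoint` is a hypothetical name introduced for this check only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check (not in vsearch): for every triple of counts below
// `limit`, compare the old tests on a floating-point midpoint with the
// integer-only tests introduced by the patch.
bool integer_tests_match_midpoint(std::uint64_t limit)
{
  for (std::uint64_t below = 0; below < limit; ++below)
    for (std::uint64_t cand = 0; cand < limit; ++cand)
      for (std::uint64_t above = 0; above < limit; ++above)
        {
          const double mid = static_cast<double>(below + cand + above) / 2.0;
          const double lower = static_cast<double>(below);
          const double upper = static_cast<double>(below + cand);

          // mid >= below  <=>  above + cand >= below
          if ((mid >= lower) != (above + cand >= below)) return false;
          // mid <= below + cand  <=>  above <= below + cand
          if ((mid <= upper) != (above <= below + cand)) return false;
          // mid == below + cand  <=>  above == below + cand
          if ((mid == upper) != (above == below + cand)) return false;
          // mid == below  <=>  above + cand == below
          if ((mid == lower) != (above + cand == below)) return false;
        }
  return true;
}
```

Each equivalence comes from multiplying both sides of the original comparison by 2 and cancelling common terms, which is why the integer form no longer needs a floating-point equality test.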


=====================================
src/eestats.cc
=====================================
@@ -59,6 +59,8 @@
 */
 
 #include "vsearch.h"
+#include <algorithm>  // std::max
+
 
 inline int fastq_get_qual_eestats(char q)
 {
@@ -476,7 +478,10 @@ void fastq_eestats2()
       if (len > longest)
         {
           longest = len;
-          int new_len_steps = 1 + MAX(0, (MIN(longest, (uint64_t)opt_length_cutoffs_longest) - opt_length_cutoffs_shortest) / opt_length_cutoffs_increment);
+          // opt_length_cutoffs_longest is an int between 1 and INT_MAX
+          int high = MIN(longest, (uint64_t)(opt_length_cutoffs_longest));
+          int new_len_steps = 1 + MAX(0, ((high - opt_length_cutoffs_shortest)
+                                          / opt_length_cutoffs_increment));
 
           if (new_len_steps > len_steps)
             {


=====================================
src/kmerhash.cc
=====================================
@@ -59,6 +59,8 @@
 */
 
 #include "vsearch.h"
+#include <vector>
+
 
 #define HASH CityHash64
 
@@ -173,9 +175,7 @@ void kh_insert_kmers(struct kh_handle_s * kh, int k, char * seq, int len)
 
 int kh_find_best_diagonal(struct kh_handle_s * kh, int k, char * seq, int len)
 {
-  int diag_counts[kh->maxpos];
-
-  memset(diag_counts, 0, kh->maxpos * sizeof(int));
+  std::vector<int> diag_counts(kh->maxpos, 0);
 
   int kmers = 1 << (2 * k);
   unsigned int kmer_mask = kmers - 1;


=====================================
src/mask.cc
=====================================
@@ -179,6 +179,7 @@ static int seqcount = 0;
 
 void * dust_all_worker(void * vp)
 {
+  (void) vp; // not used, but required for thread creation
   while(true)
     {
       xpthread_mutex_lock(&mutex);


=====================================
src/mergepairs.cc
=====================================
@@ -59,6 +59,7 @@
 */
 
 #include "vsearch.h"
+#include <vector>
 
 /* chunk constants */
 
@@ -700,10 +701,10 @@ int64_t optimize(merge_data_t * ip,
 
   int kmers = 0;
 
-  int diags[ip->fwd_trunc + ip->rev_trunc];
+  std::vector<int> diags(ip->fwd_trunc + ip->rev_trunc, 0);
 
   kh_insert_kmers(kmerhash, k, ip->fwd_sequence, ip->fwd_trunc);
-  kh_find_diagonals(kmerhash, k, ip->rev_sequence, ip->rev_trunc, diags);
+  kh_find_diagonals(kmerhash, k, ip->rev_sequence, ip->rev_trunc, diags.data());
 
   for(int64_t i = i1; i <= i2; i++)
     {
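
Note on the kmerhash.cc and mergepairs.cc hunks above: variable-length arrays are a compiler extension in C++, and both hunks replace one with a zero-initialised `std::vector<int>`, passing `.data()` where a raw `int *` is still expected. A minimal sketch of that pattern follows; `mark_even_slots` and `count_marked` are hypothetical stand-ins, not the real `kh_find_diagonals` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style callee (not in vsearch) that writes into a
// caller-provided buffer, standing in for a pointer-based legacy API.
void mark_even_slots(int * buffer, std::size_t length)
{
  for (std::size_t i = 0; i < length; i += 2)
    {
      buffer[i] = 1;
    }
}

// The vector is sized at run time and zero-filled on construction, so no
// separate memset is needed; data() exposes the contiguous storage.
int count_marked(std::size_t length)
{
  std::vector<int> slots(length, 0);
  mark_even_slots(slots.data(), slots.size());

  int total = 0;
  for (const int value : slots)
    {
      total += value;
    }
  return total;
}
```

The vector's storage lives on the heap, so a large run-time size no longer risks exhausting the stack the way a variable-length array could.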


=====================================
src/otutable.cc
=====================================
@@ -154,107 +154,126 @@ void otutable_add(char * query_header, char * target_header, int64_t abundance)
 {
   /* read sample annotation in query */
 
-  int len_sample;
+  int len_sample = 0;
   char * start_sample = query_header;
+  char * sample_name = nullptr;
 
-#ifdef HAVE_REGEX_H
-  regmatch_t pmatch_sample[5];
-  if (!regexec(&otutable->regex_sample, query_header, 5, pmatch_sample, 0))
+  if (query_header)
     {
-      /* match: use the matching sample name */
-      len_sample = pmatch_sample[3].rm_eo - pmatch_sample[3].rm_so;
-      start_sample += pmatch_sample[3].rm_so;
-    }
+#ifdef HAVE_REGEX_H
+      regmatch_t pmatch_sample[5];
+      if (!regexec(&otutable->regex_sample, query_header, 5, pmatch_sample, 0))
+        {
+          /* match: use the matching sample name */
+          len_sample = pmatch_sample[3].rm_eo - pmatch_sample[3].rm_so;
+          start_sample += pmatch_sample[3].rm_so;
+        }
 #else
-  std::cmatch cmatch_sample;
-  if (regex_search(query_header, cmatch_sample, regex_sample))
-    {
-      len_sample = cmatch_sample.length(3);
-      start_sample += cmatch_sample.position(3);
-    }
+      std::cmatch cmatch_sample;
+      if (regex_search(query_header, cmatch_sample, regex_sample))
+        {
+          len_sample = cmatch_sample.length(3);
+          start_sample += cmatch_sample.position(3);
+        }
 #endif
-  else
-    {
-      /* no match: use first name in header with A-Za-z0-9_ */
-      len_sample = strspn(query_header,
-                          "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
-                          "abcdefghijklmnopqrstuvwxyz"
-                          "_"
-                          "0123456789");
+      else
+        {
+          /* no match: use first name in header with A-Za-z0-9_ */
+          len_sample = strspn(query_header,
+                              "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+                              "abcdefghijklmnopqrstuvwxyz"
+                              "_"
+                              "0123456789");
+        }
+
+      sample_name = (char *) xmalloc(len_sample+1);
+      strncpy(sample_name, start_sample, len_sample);
+      sample_name[len_sample] = 0;
     }
-  char * sample_name = (char *) xmalloc(len_sample+1);
-  strncpy(sample_name, start_sample, len_sample);
-  sample_name[len_sample] = 0;
 
 
   /* read OTU annotation in target */
 
-  int len_otu;
+  int len_otu = 0;
   char * start_otu = target_header;
+  char * otu_name = nullptr;
 
-#ifdef HAVE_REGEX_H
-  regmatch_t pmatch_otu[4];
-  if (!regexec(&otutable->regex_otu, target_header, 4, pmatch_otu, 0))
+  if (target_header)
     {
-      /* match: use the matching otu name */
-      len_otu = pmatch_otu[2].rm_eo - pmatch_otu[2].rm_so;
-      start_otu += pmatch_otu[2].rm_so;
-    }
+#ifdef HAVE_REGEX_H
+      regmatch_t pmatch_otu[4];
+      if (!regexec(&otutable->regex_otu, target_header, 4, pmatch_otu, 0))
+        {
+          /* match: use the matching otu name */
+          len_otu = pmatch_otu[2].rm_eo - pmatch_otu[2].rm_so;
+          start_otu += pmatch_otu[2].rm_so;
+        }
 #else
-  std::cmatch cmatch_otu;
-  if (regex_search(target_header, cmatch_otu, regex_otu))
-    {
-      len_otu = cmatch_otu.length(2);
-      start_otu += cmatch_otu.position(2);
-    }
+      std::cmatch cmatch_otu;
+      if (regex_search(target_header, cmatch_otu, regex_otu))
+        {
+          len_otu = cmatch_otu.length(2);
+          start_otu += cmatch_otu.position(2);
+        }
 #endif
-  else
-    {
-      /* no match: use first name in header up to ; */
-      len_otu = strcspn(target_header, ";");
-    }
-  char * otu_name = (char *) xmalloc(len_otu+1);
-  strncpy(otu_name, start_otu, len_otu);
-  otu_name[len_otu] = 0;
+      else
+        {
+          /* no match: use first name in header up to ; */
+          len_otu = strcspn(target_header, ";");
+        }
 
+      otu_name = (char *) xmalloc(len_otu+1);
+      strncpy(otu_name, start_otu, len_otu);
+      otu_name[len_otu] = 0;
 
-  /* read tax annotation in target */
+      /* read tax annotation in target */
 
 #ifdef HAVE_REGEX_H
-  char * start_tax = target_header;
+      char * start_tax = target_header;
 
-  regmatch_t pmatch_tax[4];
-  if (!regexec(&otutable->regex_tax, target_header, 4, pmatch_tax, 0))
-    {
-      /* match: use the matching tax name */
-      int len_tax = pmatch_tax[2].rm_eo - pmatch_tax[2].rm_so;
-      start_tax += pmatch_tax[2].rm_so;
-
-      char * tax_name = (char *) xmalloc(len_tax+1);
-      strncpy(tax_name, start_tax, len_tax);
-      tax_name[len_tax] = 0;
-      otutable->otu_tax_map[otu_name] = tax_name;
-      xfree(tax_name);
-    }
+      regmatch_t pmatch_tax[4];
+      if (!regexec(&otutable->regex_tax, target_header, 4, pmatch_tax, 0))
+        {
+          /* match: use the matching tax name */
+          int len_tax = pmatch_tax[2].rm_eo - pmatch_tax[2].rm_so;
+          start_tax += pmatch_tax[2].rm_so;
+
+          char * tax_name = (char *) xmalloc(len_tax+1);
+          strncpy(tax_name, start_tax, len_tax);
+          tax_name[len_tax] = 0;
+          otutable->otu_tax_map[otu_name] = tax_name;
+          xfree(tax_name);
+        }
 #else
-  std::cmatch cmatch_tax;
-  if (regex_search(target_header, cmatch_tax, regex_tax))
-    {
-      otutable->otu_tax_map[otu_name] = cmatch_tax.str(2);
-    }
+      std::cmatch cmatch_tax;
+      if (regex_search(target_header, cmatch_tax, regex_tax))
+        {
+          otutable->otu_tax_map[otu_name] = cmatch_tax.str(2);
+        }
 #endif
+    }
 
   /* store data */
 
-  otutable->sample_set.insert(sample_name);
-  otutable->otu_set.insert(otu_name);
-  otutable->sample_otu_count[string_pair_t(sample_name,otu_name)]
-    += abundance;
-  otutable->otu_sample_count[string_pair_t(otu_name,sample_name)]
-    += abundance;
+  if (sample_name)
+    otutable->sample_set.insert(sample_name);
+
+  if (otu_name)
+    otutable->otu_set.insert(otu_name);
+
+  if (sample_name && otu_name && abundance)
+    {
+      otutable->sample_otu_count[string_pair_t(sample_name,otu_name)]
+        += abundance;
+      otutable->otu_sample_count[string_pair_t(otu_name,sample_name)]
+        += abundance;
+    }
+
+  if (otu_name)
+    xfree(otu_name);
 
-  xfree(otu_name);
-  xfree(sample_name);
+  if (sample_name)
+    xfree(sample_name);
 }
 
 void otutable_print_otutabout(FILE * fp)
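
Note on the otutable_add() hunk above: the function now tolerates a null query or target header and only updates the count maps when both names are present and the abundance is non-zero. The simplified sketch below models that control flow with standard containers. `OtuTableSketch` and `add_entry` are hypothetical names, and the header parsing done by the real function is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified model of the table (not the vsearch structure).
struct OtuTableSketch
{
  std::set<std::string> samples;
  std::set<std::string> otus;
  std::map<std::pair<std::string, std::string>, std::int64_t> counts;
};

// Either name may be null: a lone sample or a lone OTU is still registered,
// but a count is recorded only when both are present and abundance != 0.
void add_entry(OtuTableSketch & table,
               const char * sample_name,
               const char * otu_name,
               std::int64_t abundance)
{
  if (sample_name)
    {
      table.samples.insert(sample_name);
    }
  if (otu_name)
    {
      table.otus.insert(otu_name);
    }
  if (sample_name && otu_name && abundance)
    {
      table.counts[std::make_pair(std::string(sample_name),
                                  std::string(otu_name))] += abundance;
    }
}
```

Under this model, a call with a null sample and zero abundance registers an OTU without creating any count entry, and a call with a null OTU registers only the sample.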


=====================================
src/results.cc
=====================================
@@ -64,7 +64,6 @@ void results_show_fastapairs_one(FILE * fp,
                                  struct hit * hp,
                                  char * query_head,
                                  char * qsequence,
-                                 int64_t qseqlen,
                                  char * rc)
 {
   /* http://www.drive5.com/usearch/manual/fastapairs.html */
@@ -144,11 +143,7 @@ void results_show_qsegout_one(FILE * fp,
 }
 
 void results_show_tsegout_one(FILE * fp,
-                              struct hit * hp,
-                              char * query_head,
-                              char * qsequence,
-                              int64_t qseqlen,
-                              char * rc)
+                              struct hit * hp)
 {
   if (hp)
     {
@@ -176,9 +171,7 @@ void results_show_tsegout_one(FILE * fp,
 void results_show_blast6out_one(FILE * fp,
                                 struct hit * hp,
                                 char * query_head,
-                                char * qsequence,
-                                int64_t qseqlen,
-                                char * rc)
+                                int64_t qseqlen)
 {
 
   /*
@@ -242,9 +235,7 @@ void results_show_blast6out_one(FILE * fp,
 void results_show_uc_one(FILE * fp,
                          struct hit * hp,
                          char * query_head,
-                         char * qsequence,
                          int64_t qseqlen,
-                         char * rc,
                          int clusterno)
 {
   /*
@@ -519,10 +510,7 @@ void results_show_userout_one(FILE * fp, struct hit * hp,
 void results_show_lcaout(FILE * fp,
                          struct hit * hits,
                          int hitcount,
-                         char * query_head,
-                         char * qsequence,
-                         int64_t qseqlen,
-                         char * rc)
+                         char * query_head)
 {
   /* Output last common ancestor (LCA) of the hits,
      in a similar way to the Sintax command */
@@ -665,8 +653,7 @@ void results_show_alnout(FILE * fp,
                          int hitcount,
                          char * query_head,
                          char * qsequence,
-                         int64_t qseqlen,
-                         char * rc)
+                         int64_t qseqlen)
 {
   /* http://drive5.com/usearch/manual/alnout.html */
 
@@ -904,7 +891,6 @@ void results_show_samout(FILE * fp,
                          int hitcount,
                          char * query_head,
                          char * qsequence,
-                         int64_t qseqlen,
                          char * rc)
 {
   /*


=====================================
src/results.h
=====================================
@@ -63,30 +63,22 @@ void results_show_alnout(FILE * fp,
                          int hitcount,
                          char * query_head,
                          char * qsequence,
-                         int64_t qseqlen,
-                         char * rc);
+                         int64_t qseqlen);
 
 void results_show_lcaout(FILE * fp,
                          struct hit * hits,
                          int hitcount,
-                         char * query_head,
-                         char * qsequence,
-                         int64_t qseqlen,
-                         char * rc);
+                         char * query_head);
 
 void results_show_blast6out_one(FILE * fp,
                                 struct hit * hp,
                                 char * query_head,
-                                char * qsequence,
-                                int64_t qseqlen,
-                                char * rc);
+                                int64_t qseqlen);
 
 void results_show_uc_one(FILE * fp,
                          struct hit * hp,
                          char * query_head,
-                         char * qsequence,
                          int64_t qseqlen,
-                         char * rc,
                          int clusterno);
 
 void results_show_userout_one(FILE * fp,
@@ -100,7 +92,6 @@ void results_show_fastapairs_one(FILE * fp,
                                  struct hit * hp,
                                  char * query_head,
                                  char * qsequence,
-                                 int64_t qseqlen,
                                  char * rc);
 
 void results_show_qsegout_one(FILE * fp,
@@ -111,11 +102,7 @@ void results_show_qsegout_one(FILE * fp,
                               char * rc);
 
 void results_show_tsegout_one(FILE * fp,
-                              struct hit * hp,
-                              char * query_head,
-                              char * qsequence,
-                              int64_t qseqlen,
-                              char * rc);
+                              struct hit * hp);
 
 void results_show_samheader(FILE * fp,
                             char * cmdline,
@@ -126,5 +113,4 @@ void results_show_samout(FILE * fp,
                          int hitcount,
                          char * query_head,
                          char * qsequence,
-                         int64_t qseqlen,
                          char * rc);


=====================================
src/search.cc
=====================================
@@ -118,8 +118,7 @@ void search_output_results(int hit_count,
                           toreport,
                           query_head,
                           qsequence,
-                          qseqlen,
-                          qsequence_rc);
+                          qseqlen);
     }
 
   if (fp_lcaout)
@@ -127,10 +126,7 @@ void search_output_results(int hit_count,
       results_show_lcaout(fp_lcaout,
                           hits,
                           toreport,
-                          query_head,
-                          qsequence,
-                          qseqlen,
-                          qsequence_rc);
+                          query_head);
     }
 
   if (fp_samout)
@@ -140,7 +136,6 @@ void search_output_results(int hit_count,
                           toreport,
                           query_head,
                           qsequence,
-                          qseqlen,
                           qsequence_rc);
     }
 
@@ -170,7 +165,6 @@ void search_output_results(int hit_count,
                                           hp,
                                           query_head,
                                           qsequence,
-                                          qseqlen,
                                           qsequence_rc);
             }
 
@@ -187,11 +181,7 @@ void search_output_results(int hit_count,
           if (fp_tsegout)
             {
               results_show_tsegout_one(fp_tsegout,
-                                       hp,
-                                       query_head,
-                                       qsequence,
-                                       qseqlen,
-                                       qsequence_rc);
+                                       hp);
             }
 
           if (fp_uc)
@@ -201,9 +191,7 @@ void search_output_results(int hit_count,
                   results_show_uc_one(fp_uc,
                                       hp,
                                       query_head,
-                                      qsequence,
                                       qseqlen,
-                                      qsequence_rc,
                                       hp->target);
                 }
             }
@@ -223,22 +211,25 @@ void search_output_results(int hit_count,
               results_show_blast6out_one(fp_blast6out,
                                          hp,
                                          query_head,
-                                         qsequence,
-                                         qseqlen,
-                                         qsequence_rc);
+                                         qseqlen);
             }
         }
     }
   else
     {
+      if (opt_otutabout || opt_mothur_shared_out || opt_biomout)
+        {
+          otutable_add(query_head,
+                       nullptr,
+                       qsize);
+        }
+
       if (fp_uc)
         {
           results_show_uc_one(fp_uc,
                               nullptr,
                               query_head,
-                              qsequence,
                               qseqlen,
-                              qsequence_rc,
                               0);
         }
 
@@ -259,9 +250,7 @@ void search_output_results(int hit_count,
               results_show_blast6out_one(fp_blast6out,
                                          nullptr,
                                          query_head,
-                                         qsequence,
-                                         qseqlen,
-                                         qsequence_rc);
+                                         qseqlen);
             }
         }
     }
@@ -904,6 +893,13 @@ void usearch_global(char * cmdline, char * progheader)
         }
     }
 
+
+  // Add OTUs with no matches to OTU table
+  if (opt_otutabout || opt_mothur_shared_out || opt_biomout)
+    for(int64_t i=0; i<seqcount; i++)
+      if (! dbmatched[i])
+        otutable_add(nullptr, db_getheader(i), 0);
+
   if (opt_biomout)
     {
       otutable_print_biomout(fp_biomout);


=====================================
src/searchexact.cc
=====================================
@@ -193,8 +193,7 @@ void search_exact_output_results(int hit_count,
                           toreport,
                           query_head,
                           qsequence,
-                          qseqlen,
-                          qsequence_rc);
+                          qseqlen);
     }
 
   if (fp_samout)
@@ -204,7 +203,6 @@ void search_exact_output_results(int hit_count,
                           toreport,
                           query_head,
                           qsequence,
-                          qseqlen,
                           qsequence_rc);
     }
 
@@ -234,7 +232,6 @@ void search_exact_output_results(int hit_count,
                                           hp,
                                           query_head,
                                           qsequence,
-                                          qseqlen,
                                           qsequence_rc);
             }
 
@@ -251,11 +248,7 @@ void search_exact_output_results(int hit_count,
           if (fp_tsegout)
             {
               results_show_tsegout_one(fp_tsegout,
-                                       hp,
-                                       query_head,
-                                       qsequence,
-                                       qseqlen,
-                                       qsequence_rc);
+                                       hp);
             }
 
           if (fp_uc)
@@ -265,9 +258,7 @@ void search_exact_output_results(int hit_count,
                   results_show_uc_one(fp_uc,
                                       hp,
                                       query_head,
-                                      qsequence,
                                       qseqlen,
-                                      qsequence_rc,
                                       hp->target);
                 }
             }
@@ -287,22 +278,25 @@ void search_exact_output_results(int hit_count,
               results_show_blast6out_one(fp_blast6out,
                                          hp,
                                          query_head,
-                                         qsequence,
-                                         qseqlen,
-                                         qsequence_rc);
+                                         qseqlen);
             }
         }
     }
   else
     {
+      if (opt_otutabout || opt_mothur_shared_out || opt_biomout)
+        {
+          otutable_add(query_head,
+                       nullptr,
+                       qsize);
+        }
+
       if (fp_uc)
         {
           results_show_uc_one(fp_uc,
                               nullptr,
                               query_head,
-                              qsequence,
                               qseqlen,
-                              qsequence_rc,
                               0);
         }
 
@@ -323,9 +317,7 @@ void search_exact_output_results(int hit_count,
               results_show_blast6out_one(fp_blast6out,
                                          nullptr,
                                          query_head,
-                                         qsequence,
-                                         qseqlen,
-                                         qsequence_rc);
+                                         qseqlen);
             }
         }
     }
@@ -912,6 +904,12 @@ void search_exact(char * cmdline, char * progheader)
         }
     }
 
+  // Add OTUs with no matches to OTU table
+  if (opt_otutabout || opt_mothur_shared_out || opt_biomout)
+    for(int64_t i=0; i<seqcount; i++)
+      if (! dbmatched[i])
+        otutable_add(nullptr, db_getheader(i), 0);
+
   if (fp_biomout)
     {
       otutable_print_biomout(fp_biomout);


=====================================
src/sha1.h
=====================================
@@ -17,7 +17,7 @@ typedef struct {
 #define SHA1_DIGEST_SIZE 20
 
 void SHA1_Init(SHA1_CTX* context);
-void SHA1_Update(SHA1_CTX* context, const uint8_t* data, const size_t len);
+void SHA1_Update(SHA1_CTX* context, const uint8_t* data, size_t len);
 void SHA1_Final(SHA1_CTX* context, uint8_t digest[SHA1_DIGEST_SIZE]);
 
 #ifdef __cplusplus


=====================================
src/sintax.cc
=====================================
@@ -99,7 +99,6 @@ static int classified = 0;
 void sintax_analyse(char * query_head,
                     int strand,
                     int best_seqno,
-                    int best_count,
                     int * all_seqno,
                     int count)
 {
@@ -207,10 +206,6 @@ void sintax_analyse(char * query_head,
         }
     }
 
-#if 0
-  fprintf(fp_tabbedout, "\t%d\t%d", best_count, count);
-#endif
-
   fprintf(fp_tabbedout, "\n");
   xpthread_mutex_unlock(&mutex_output);
 }
@@ -313,7 +308,6 @@ void sintax_query(int64_t t)
   sintax_analyse(query_head,
                  best_strand,
                  best_seqno[best_strand],
-                 best_count[best_strand],
                  all_seqno[best_strand],
                  boot_count[best_strand]);
 


=====================================
src/udb.cc
=====================================
@@ -565,7 +565,7 @@ void udb_read(const char * filename,
       if (seqcount > 0)
         {
           fprintf(stderr,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64 ", max %'" PRIu64 ", avg %'.0f\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64 ", max %" PRIu64 ", avg %.0f\n",
                   db_getnucleotidecount(),
                   db_getsequencecount(),
                   db_getshortestsequence(),
@@ -575,7 +575,7 @@ void udb_read(const char * filename,
       else
         {
           fprintf(stderr,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs\n",
                   db_getnucleotidecount(),
                   db_getsequencecount());
         }
@@ -586,7 +586,7 @@ void udb_read(const char * filename,
       if (seqcount > 0)
         {
           fprintf(fp_log,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64 ", max %'" PRIu64 ", avg %'.0f\n\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64 ", max %" PRIu64 ", avg %.0f\n\n",
                   db_getnucleotidecount(),
                   db_getsequencecount(),
                   db_getshortestsequence(),
@@ -596,7 +596,7 @@ void udb_read(const char * filename,
       else
         {
           fprintf(fp_log,
-                  "%'" PRIu64 " nt in %'" PRIu64 " seqs\n\n",
+                  "%" PRIu64 " nt in %" PRIu64 " seqs\n\n",
                   db_getnucleotidecount(),
                   db_getsequencecount());
         }


=====================================
src/vsearch.cc
=====================================
@@ -2728,6 +2728,8 @@ void args_init(int argc, char **argv)
         option_label_suffix,
         option_log,
         option_match,
+	option_maxseqlength,
+	option_minseqlength,
         option_mismatch,
         option_no_progress,
         option_nonchimeras,
@@ -4162,9 +4164,11 @@ void args_init(int argc, char **argv)
         option_lengthout,
         option_log,
         option_match,
+	option_maxseqlength,
         option_mindiffs,
         option_mindiv,
         option_minh,
+	option_minseqlength,
         option_mismatch,
         option_no_progress,
         option_nonchimeras,
@@ -4204,9 +4208,11 @@ void args_init(int argc, char **argv)
         option_lengthout,
         option_log,
         option_match,
+	option_maxseqlength,
         option_mindiffs,
         option_mindiv,
         option_minh,
+	option_minseqlength,
         option_mismatch,
         option_no_progress,
         option_nonchimeras,
@@ -4246,9 +4252,11 @@ void args_init(int argc, char **argv)
         option_lengthout,
         option_log,
         option_match,
+	option_maxseqlength,
         option_mindiffs,
         option_mindiv,
         option_minh,
+	option_minseqlength,
         option_mismatch,
         option_no_progress,
         option_nonchimeras,
@@ -4290,9 +4298,11 @@ void args_init(int argc, char **argv)
         option_lengthout,
         option_log,
         option_match,
+	option_maxseqlength,
         option_mindiffs,
         option_mindiv,
         option_minh,
+	option_minseqlength,
         option_mismatch,
         option_no_progress,
         option_nonchimeras,


=====================================
src/xstring.h
=====================================
@@ -110,7 +110,7 @@ class xstring
 
   void add_c(char c)
   {
-    size_t needed = 1;
+    const size_t needed = 1;
     if (length + needed + 1 > alloc)
       {
         alloc = length + needed + 1;
@@ -123,7 +123,7 @@ class xstring
 
   void add_d(int d)
   {
-    int needed = snprintf(nullptr, 0, "%d", d);
+    const int needed = snprintf(nullptr, 0, "%d", d);
     if (needed < 0)
       {
         fatal("snprintf failed");
@@ -140,7 +140,7 @@ class xstring
 
   void add_s(char * s)
   {
-    size_t needed = strlen(s);
+    const size_t needed = strlen(s);
     if (length + needed + 1 > alloc)
       {
         alloc = length + needed + 1;



View it on GitLab: https://salsa.debian.org/med-team/vsearch/-/commit/cc613f494df2cc0a08f39c9580db67638a98cc11


