[med-svn] [Git][med-team/vsearch][master] 5 commits: New upstream version 2.26.1
Étienne Mollier (@emollier)
gitlab at salsa.debian.org
Mon Nov 27 20:52:03 GMT 2023
Étienne Mollier pushed to branch master at Debian Med / vsearch
Commits:
cc613f49 by Étienne Mollier at 2023-11-27T21:39:33+01:00
New upstream version 2.26.1
- - - - -
67f1a8e3 by Étienne Mollier at 2023-11-27T21:39:33+01:00
routine-update: New upstream version
- - - - -
eb930647 by Étienne Mollier at 2023-11-27T21:39:34+01:00
Update upstream source from tag 'upstream/2.26.1'
Update to upstream version '2.26.1'
with Debian dir 014de38fab1ac44640a15bdcfd41ac80d0e1f17f
- - - - -
709c56c8 by Étienne Mollier at 2023-11-27T21:43:03+01:00
typo.patch: remove: applied upstream.
- - - - -
bdecc575 by Étienne Mollier at 2023-11-27T21:49:52+01:00
ready to upload to unstable.
- - - - -
27 changed files:
- README.md
- configure.ac
- debian/changelog
- debian/patches/series
- − debian/patches/typo.patch
- man/vsearch.1
- src/Makefile.am
- src/allpairs.cc
- src/chimera.cc
- src/cluster.cc
- src/db.cc
- src/derep.cc
- src/derepsmallmem.cc
- src/eestats.cc
- src/kmerhash.cc
- src/mask.cc
- src/mergepairs.cc
- src/otutable.cc
- src/results.cc
- src/results.h
- src/search.cc
- src/searchexact.cc
- src/sha1.h
- src/sintax.cc
- src/udb.cc
- src/vsearch.cc
- src/xstring.h
Changes:
=====================================
README.md
=====================================
@@ -37,7 +37,7 @@ Most of the nucleotide based commands and options in USEARCH version 7 are suppo
## Getting Help
-If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
+If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
## Example
@@ -50,9 +50,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
**Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:
```
-wget https://github.com/torognes/vsearch/archive/v2.25.0.tar.gz
-tar xzf v2.25.0.tar.gz
-cd vsearch-2.25.0
+wget https://github.com/torognes/vsearch/archive/v2.26.1.tar.gz
+tar xzf v2.26.1.tar.gz
+cd vsearch-2.26.1
./autogen.sh
./configure CFLAGS="-O3" CXXFLAGS="-O3"
make
@@ -81,48 +81,48 @@ Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (v
Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-linux-x86_64.tar.gz
-tar xzf vsearch-2.25.0-linux-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-linux-x86_64.tar.gz
+tar xzf vsearch-2.26.1-linux-x86_64.tar.gz
```
Or these commands if you are using a Linux ppc64le system:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-linux-ppc64le.tar.gz
-tar xzf vsearch-2.25.0-linux-ppc64le.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-linux-ppc64le.tar.gz
+tar xzf vsearch-2.26.1-linux-ppc64le.tar.gz
```
Or these commands if you are using a Linux aarch64 (arm64) system:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-linux-aarch64.tar.gz
-tar xzf vsearch-2.25.0-linux-aarch64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-linux-aarch64.tar.gz
+tar xzf vsearch-2.26.1-linux-aarch64.tar.gz
```
Or these commands if you are using a Mac with an Apple Silicon CPU:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-macos-aarch64.tar.gz
-tar xzf vsearch-2.25.0-macos-aarch64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-macos-aarch64.tar.gz
+tar xzf vsearch-2.26.1-macos-aarch64.tar.gz
```
Or these commands if you are using a Mac with an Intel CPU:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-macos-x86_64.tar.gz
-tar xzf vsearch-2.25.0-macos-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-macos-x86_64.tar.gz
+tar xzf vsearch-2.26.1-macos-x86_64.tar.gz
```
Or if you are using Windows, download and extract (unzip) the contents of this file:
```
-https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch-2.25.0-win-x86_64.zip
+https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch-2.26.1-win-x86_64.zip
```
-Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.25.0-linux-x86_64` or `vsearch-2.25.0-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`. Versions with statically compiled libraries are available for Linux systems. These have "-static" in their name, and could be used on systems that do not have all the necessary libraries installed.
+Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.26.1-linux-x86_64` or `vsearch-2.26.1-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`. Versions with statically compiled libraries are available for Linux systems. These have "-static" in their name, and could be used on systems that do not have all the necessary libraries installed.
**Windows**: You will now have the binary distribution in a folder
-called `vsearch-2.25.0-win-x86_64`. The vsearch executable is called
+called `vsearch-2.26.1-win-x86_64`. The vsearch executable is called
`vsearch.exe`. The manual in PDF format is called
`vsearch_manual.pdf`. If you want to be able to call `vsearch.exe`
from any command prompt window, you can put the vsearch executable in
@@ -133,7 +133,7 @@ searching for it in the Start menu, `Edit` user variables, add
your changes.
-**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.25.0/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
+**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.26.1/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
## Packages, plugins, and wrappers
=====================================
configure.ac
=====================================
@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.63])
-AC_INIT([vsearch], [2.25.0], [torognes at ifi.uio.no], [vsearch], [https://github.com/torognes/vsearch])
+AC_INIT([vsearch], [2.26.1], [torognes at ifi.uio.no], [vsearch], [https://github.com/torognes/vsearch])
AC_CANONICAL_TARGET
AM_INIT_AUTOMAKE([subdir-objects])
AC_LANG([C++])
=====================================
debian/changelog
=====================================
@@ -1,3 +1,10 @@
+vsearch (2.26.1-1) unstable; urgency=medium
+
+ * New upstream version 2.26.1
+ * typo.patch: remove: applied upstream.
+
+ -- Étienne Mollier <emollier at debian.org> Mon, 27 Nov 2023 21:46:01 +0100
+
vsearch (2.25.0-1) unstable; urgency=medium
* New upstream version 2.25.0
=====================================
debian/patches/series
=====================================
@@ -1,2 +1 @@
sysconf_memory_sizing.patch
-typo.patch
=====================================
debian/patches/typo.patch deleted
=====================================
@@ -1,17 +0,0 @@
-Description: fix minor typo caught by lintian.
-Author: Étienne Mollier <emollier at debian.org>
-Forwarded: https://github.com/torognes/vsearch/pull/540
-Last-Update: 2023-11-15
----
-This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
---- vsearch.orig/src/derepsmallmem.cc
-+++ vsearch/src/derepsmallmem.cc
-@@ -233,7 +233,7 @@
- }
- else
- {
-- fatal("Ouput file for dereplication must be specified with --fastaout");
-+ fatal("Output file for dereplication must be specified with --fastaout");
- }
-
- uint64_t filesize = fastx_get_size(h);
=====================================
man/vsearch.1
=====================================
@@ -1,5 +1,5 @@
.\" ============================================================================
-.TH vsearch 1 "November 10, 2023" "version 2.25.0" "USER COMMANDS"
+.TH vsearch 1 "November 25, 2023" "version 2.26.1" "USER COMMANDS"
.\" ============================================================================
.SH NAME
vsearch \(em a versatile open-source tool for microbiome analysis,
@@ -4834,6 +4834,15 @@ warnings for sha1.c. Compile for release (not debug) by default.
.BR v2.25.0\~ "released November 10th, 2023"
Allow a given percentage of mismatches between chimeras and parents
for the experimental chimeras_denovo command.
+.TP
+.BR v2.26.0\~ "released November 24th, 2023"
+Enable the maxseqlength and minseqlength options for the chimera
+detection commands. When the usearch_global or search_exact commands
+are used, OTU tables will include samples and OTUs with no matches.
+.TP
+.BR v2.26.1\~ "released November 25th, 2023"
+No real changes, but the previous version was released without proper
+updates to the source code.
.LP
.\" ============================================================================
.\" TODO:
=====================================
src/Makefile.am
=====================================
@@ -1,23 +1,25 @@
bin_PROGRAMS = $(top_builddir)/bin/vsearch
+AM_CFLAGS = -Wall -Wsign-compare
+
if TARGET_PPC
-AM_CFLAGS=-Wall -Wsign-compare -mcpu=powerpc64le -maltivec
+AM_CFLAGS += -mcpu=powerpc64le -maltivec
else
if TARGET_AARCH64
-AM_CFLAGS=-Wall -Wsign-compare -march=armv8-a+simd -mtune=generic
+AM_CFLAGS += -march=armv8-a+simd -mtune=generic
else
-AM_CFLAGS=-Wall -Wsign-compare -march=x86-64 -mtune=generic
+AM_CFLAGS += -march=x86-64 -mtune=generic
endif
endif
# Conditionally set NDEBUG based on ENABLE_DEBUG
if ENABLE_DEBUG
-AM_CFLAGS += -UNDEBUG
+AM_CFLAGS += -UNDEBUG -Wcast-align -Wextra -Wfloat-equal
else
AM_CFLAGS += -DNDEBUG
endif
-AM_CXXFLAGS=$(AM_CFLAGS) -std=c++11
+AM_CXXFLAGS = $(AM_CFLAGS) -std=c++11
export MACOSX_DEPLOYMENT_TARGET=10.9
=====================================
src/allpairs.cc
=====================================
@@ -135,8 +135,7 @@ void allpairs_output_results(int hit_count,
toreport,
query_head,
qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
if (fp_samout)
@@ -146,7 +145,6 @@ void allpairs_output_results(int hit_count,
toreport,
query_head,
qsequence,
- qseqlen,
qsequence_rc);
}
@@ -169,7 +167,6 @@ void allpairs_output_results(int hit_count,
hp,
query_head,
qsequence,
- qseqlen,
qsequence_rc);
}
@@ -186,11 +183,7 @@ void allpairs_output_results(int hit_count,
if (fp_tsegout)
{
results_show_tsegout_one(fp_tsegout,
- hp,
- query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ hp);
}
if (fp_uc)
@@ -200,9 +193,7 @@ void allpairs_output_results(int hit_count,
results_show_uc_one(fp_uc,
hp,
query_head,
- qsequence,
qseqlen,
- qsequence_rc,
hp->target);
}
}
@@ -222,9 +213,7 @@ void allpairs_output_results(int hit_count,
results_show_blast6out_one(fp_blast6out,
hp,
query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
}
}
@@ -235,9 +224,7 @@ void allpairs_output_results(int hit_count,
results_show_uc_one(fp_uc,
nullptr,
query_head,
- qsequence,
qseqlen,
- qsequence_rc,
0);
}
@@ -258,9 +245,7 @@ void allpairs_output_results(int hit_count,
results_show_blast6out_one(fp_blast6out,
nullptr,
query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
}
}
=====================================
src/chimera.cc
=====================================
@@ -59,6 +59,7 @@
*/
#include "vsearch.h"
+#include <vector>
/*
This code implements the method described in this paper:
@@ -168,14 +169,18 @@ void realloc_arrays(struct chimera_info_s * ci)
{
if (opt_chimeras_denovo)
{
- if (opt_chimeras_parts == 0)
+ if (opt_chimeras_parts == 0) {
parts = (ci->query_len + maxparts - 1) / maxparts;
- else
+ }
+ else {
parts = opt_chimeras_parts;
- if (parts < 2)
+ }
+ if (parts < 2) {
parts = 2;
- else if (parts > maxparts)
+ }
+ else if (parts > maxparts) {
parts = maxparts;
+ }
}
else
{
@@ -183,7 +188,7 @@ void realloc_arrays(struct chimera_info_s * ci)
parts = 4;
}
- int maxhlen = MAX(ci->query_head_len,1);
+ const int maxhlen = MAX(ci->query_head_len, 1);
if (maxhlen > ci->head_alloc)
{
ci->head_alloc = maxhlen;
@@ -192,8 +197,8 @@ void realloc_arrays(struct chimera_info_s * ci)
/* realloc arrays based on query length */
- int maxqlen = MAX(ci->query_len, 1);
- int maxpartlen = (maxqlen + parts - 1) / parts;
+ const int maxqlen = MAX(ci->query_len, 1);
+ const int maxpartlen = (maxqlen + parts - 1) / parts;
if (maxqlen > ci->query_alloc)
{
@@ -201,8 +206,7 @@ void realloc_arrays(struct chimera_info_s * ci)
ci->query_seq = (char*) xrealloc(ci->query_seq, maxqlen + 1);
- for(auto & i
- : ci->si)
+ for(auto & i: ci->si)
{
i.qsequence = (char*) xrealloc(i.qsequence, maxpartlen + 1);
}
@@ -221,16 +225,16 @@ void realloc_arrays(struct chimera_info_s * ci)
ci->scan_q = (double *) xrealloc(ci->scan_q,
(maxqlen + 1) * sizeof(double));
- int maxalnlen = maxqlen + 2 * db_getlongestsequence();
+ const int maxalnlen = maxqlen + 2 * db_getlongestsequence();
for (int f = 0; f < maxparents ; f++)
{
- ci->paln[f] = (char*) xrealloc(ci->paln[f], maxalnlen+1);
+ ci->paln[f] = (char*) xrealloc(ci->paln[f], maxalnlen + 1);
}
- ci->qaln = (char*) xrealloc(ci->qaln, maxalnlen+1);
- ci->diffs = (char*) xrealloc(ci->diffs, maxalnlen+1);
- ci->votes = (char*) xrealloc(ci->votes, maxalnlen+1);
- ci->model = (char*) xrealloc(ci->model, maxalnlen+1);
- ci->ignore = (char*) xrealloc(ci->ignore, maxalnlen+1);
+ ci->qaln = (char*) xrealloc(ci->qaln, maxalnlen + 1);
+ ci->diffs = (char*) xrealloc(ci->diffs, maxalnlen + 1);
+ ci->votes = (char*) xrealloc(ci->votes, maxalnlen + 1);
+ ci->model = (char*) xrealloc(ci->model, maxalnlen + 1);
+ ci->ignore = (char*) xrealloc(ci->ignore, maxalnlen + 1);
}
}
@@ -269,15 +273,15 @@ void find_matches(struct chimera_info_s * ci)
switch (op)
{
case 'M':
- for(int k=0; k<run; k++)
+ for(int k = 0; k < run; k++)
{
if (chrmap_4bit[(int)(qseq[qpos])] &
chrmap_4bit[(int)(tseq[tpos])])
{
ci->match[i * ci->query_len + qpos] = 1;
}
- qpos++;
- tpos++;
+ ++qpos;
+ ++tpos;
}
break;
@@ -394,11 +398,7 @@ int find_best_parents_long(struct chimera_info_s * ci)
best_parents[f].start = -1;
}
- bool position_used[ci->query_len];
- for (int i = 0; i < ci->query_len; i++)
- {
- position_used[i] = false;
- }
+ std::vector<bool> position_used(ci->query_len, false);
int pos_remaining = ci->query_len;
int parents_found = 0;
@@ -455,7 +455,7 @@ int find_best_parents_long(struct chimera_info_s * ci)
best_parents[f].cand = best_cand;
best_parents[f].start = best_start;
best_parents[f].len = best_len;
- parents_found++;
+ ++parents_found;
#if 0
if (f == 0)
@@ -502,7 +502,7 @@ int find_best_parents_long(struct chimera_info_s * ci)
printf("Not covered completely (%d).\n", pos_remaining);
#endif
- return (parents_found > 1) && (pos_remaining == 0);
+ return (parents_found > 1) and (pos_remaining == 0);
}
int find_best_parents(struct chimera_info_s * ci)
@@ -517,10 +517,7 @@ int find_best_parents(struct chimera_info_s * ci)
ci->best_parents[f] = -1;
}
- bool cand_selected[ci->cand_count];
-
- for (int i = 0; i < ci->cand_count; i++)
- cand_selected[i] = false;
+ std::vector<bool> cand_selected(ci->cand_count, false);
for (int f = 0; f < 2; f++)
{
@@ -556,7 +553,7 @@ int find_best_parents(struct chimera_info_s * ci)
for(int i = 0; i < ci->cand_count; i++)
{
- if (! cand_selected[i])
+ if (not cand_selected[i])
{
int sum = 0;
for(int qpos = 0; qpos < ci->query_len; qpos++)
@@ -582,17 +579,15 @@ int find_best_parents(struct chimera_info_s * ci)
/* find parent with the most wins */
- int wins[ci->cand_count];
- for (int i = 0; i < ci->cand_count; i++)
- wins[i] = 0;
+ std::vector<int> wins(ci->cand_count, 0);
- for(int qpos = window-1; qpos < ci->query_len; qpos++)
+ for(int qpos = window - 1; qpos < ci->query_len; qpos++)
{
if (ci->maxsmooth[qpos] != 0)
{
- for(int i=0; i < ci->cand_count; i++)
+ for(int i = 0; i < ci->cand_count; i++)
{
- if (! cand_selected[i])
+ if (not cand_selected[i])
{
int z = i * ci->query_len + qpos;
if (ci->smooth[z] == ci->maxsmooth[qpos])
@@ -619,8 +614,9 @@ int find_best_parents(struct chimera_info_s * ci)
/* terminate loop if no parent found */
- if (best_parent_cand[f] < 0)
+ if (best_parent_cand[f] < 0) {
break;
+ }
#if 0
printf("Query %d: Best parent (%d) candidate: %d. Wins: %d\n",
@@ -633,7 +629,7 @@ int find_best_parents(struct chimera_info_s * ci)
/* Check if at least 2 candidates selected */
- return (best_parent_cand[0] >= 0) && (best_parent_cand[1] >= 0);
+ return (best_parent_cand[0] >= 0) and (best_parent_cand[1] >= 0);
}
@@ -676,7 +672,7 @@ int find_max_alignment_length(struct chimera_info_s * ci)
/* find total alignment length */
int alnlen = 0;
- for(int i=0; i < ci->query_len+1; i++)
+ for(int i = 0; i < ci->query_len + 1; i++)
{
alnlen += ci->maxi[i];
}
@@ -728,11 +724,11 @@ void fill_alignment_parents(struct chimera_info_s * ci)
}
else
{
- for(int x=0; x < run; x++)
+ for(int x = 0; x < run; x++)
{
- if (!inserted)
+ if (not inserted)
{
- for(int y=0; y < ci->maxi[qpos]; y++)
+ for(int y = 0; y < ci->maxi[qpos]; y++)
{
*t++ = '-';
}
@@ -747,7 +743,7 @@ void fill_alignment_parents(struct chimera_info_s * ci)
*t++ = '-';
}
- qpos++;
+ ++qpos;
inserted = 0;
}
}
@@ -755,7 +751,7 @@ void fill_alignment_parents(struct chimera_info_s * ci)
/* add any gaps at the end */
- if (! inserted)
+ if (not inserted)
{
for(int x=0; x < ci->maxi[qpos]; x++)
{
@@ -784,11 +780,11 @@ int eval_parents_long(struct chimera_info_s * ci)
int m = 0;
char * q = ci->qaln;
int qpos = 0;
- for (int i=0; i < ci->query_len; i++)
+ for (int i = 0; i < ci->query_len; i++)
{
if (qpos >= (ci->best_start[m] + ci->best_len[m]))
m++;
- for (int j=0; j < ci->maxi[i]; j++)
+ for (int j = 0; j < ci->maxi[i]; j++)
{
*q++ = '-';
*pm++ = 'A' + m;
@@ -796,7 +792,7 @@ int eval_parents_long(struct chimera_info_s * ci)
*q++ = chrmap_upcase[(int)(ci->query_seq[qpos++])];
*pm++ = 'A' + m;
}
- for (int j=0; j < ci->maxi[ci->query_len]; j++)
+ for (int j = 0; j < ci->maxi[ci->query_len]; j++)
{
*q++ = '-';
*pm++ = 'A' + m;
@@ -816,7 +812,7 @@ int eval_parents_long(struct chimera_info_s * ci)
/* lower case parent symbols that differ from query */
for (int f = 0; f < ci->parents_found; f++)
- if (psym[f] && (psym[f] != qsym))
+ if (psym[f] and (psym[f] != qsym))
ci->paln[f][i] = tolower(ci->paln[f][i]);
/* compute diffs */
@@ -825,7 +821,7 @@ int eval_parents_long(struct chimera_info_s * ci)
bool all_defined = qsym;
for (int f = 0; f < ci->parents_found; f++)
- if (!psym[f])
+ if (! psym[f])
all_defined = false;
if (all_defined)
@@ -897,7 +893,7 @@ int eval_parents_long(struct chimera_info_s * ci)
xpthread_mutex_lock(&mutex_output);
- if (opt_alnout && (status == 4))
+ if (opt_alnout and (status == 4))
{
fprintf(fp_uchimealns, "\n");
fprintf(fp_uchimealns, "----------------------------------------"
@@ -1061,7 +1057,7 @@ int eval_parents(struct chimera_info_s * ci)
char * q = ci->qaln;
int qpos = 0;
- for (int i=0; i < ci->query_len; i++)
+ for (int i = 0; i < ci->query_len; i++)
{
for (int j=0; j < ci->maxi[i]; j++)
{
@@ -1069,7 +1065,7 @@ int eval_parents(struct chimera_info_s * ci)
}
*q++ = chrmap_upcase[(int)(ci->query_seq[qpos++])];
}
- for (int j=0; j < ci->maxi[ci->query_len]; j++)
+ for (int j = 0; j < ci->maxi[ci->query_len]; j++)
{
*q++ = '-';
}
@@ -1110,12 +1106,12 @@ int eval_parents(struct chimera_info_s * ci)
/* lower case parent symbols that differ from query */
- if (p1sym && (p1sym != qsym))
+ if (p1sym and (p1sym != qsym))
{
ci->paln[0][i] = tolower(ci->paln[0][i]);
}
- if (p2sym && (p2sym != qsym))
+ if (p2sym and (p2sym != qsym))
{
ci->paln[1][i] = tolower(ci->paln[1][i]);
}
@@ -1124,7 +1120,7 @@ int eval_parents(struct chimera_info_s * ci)
char diff;
- if (qsym && p1sym && p2sym)
+ if (qsym and p1sym and p2sym)
{
if (p1sym == p2sym)
{
@@ -1171,21 +1167,21 @@ int eval_parents(struct chimera_info_s * ci)
for (int i = 0; i < alnlen; i++)
{
- if (!ci->ignore[i])
+ if (not ci->ignore[i])
{
char diff = ci->diffs[i];
if (diff == 'A')
{
- sumA++;
+ ++sumA;
}
else if (diff == 'B')
{
- sumB++;
+ ++sumB;
}
else if (diff != ' ')
{
- sumN++;
+ ++sumN;
}
}
@@ -1209,32 +1205,34 @@ int eval_parents(struct chimera_info_s * ci)
int best_left_a = 0;
int best_right_a = 0;
- for (int i=0; i<alnlen; i++)
+ for (int i = 0; i < alnlen; i++)
{
- if(!ci->ignore[i])
+ if(not ci->ignore[i])
{
char diff = ci->diffs[i];
if (diff != ' ')
{
if (diff == 'A')
{
- left_y++;
- right_n--;
+ ++left_y;
+ --right_n;
}
else if (diff == 'B')
{
- left_n++;
- right_y--;
+ ++left_n;
+ --right_y;
}
else
{
- left_a++;
- right_a--;
+ ++left_a;
+ --right_a;
}
- double left_h, right_h, h;
+ double left_h = 0;
+ double right_h = 0;
+ double h = 0;
- if ((left_y > left_n) && (right_y > right_n))
+ if ((left_y > left_n) and (right_y > right_n))
{
left_h = left_y / (opt_xn * (left_n + opt_dn) + left_a);
right_h = right_y / (opt_xn * (right_n + opt_dn) + right_a);
@@ -1253,7 +1251,7 @@ int eval_parents(struct chimera_info_s * ci)
best_right_a = right_a;
}
}
- else if ((left_n > left_y) && (right_n > right_y))
+ else if ((left_n > left_y) and (right_n > right_y))
{
/* swap left/right and yes/no */
@@ -1369,7 +1367,7 @@ int eval_parents(struct chimera_info_s * ci)
for(int i = 0; i < alnlen; i++)
{
- if (! ci->ignore[i])
+ if (not ci->ignore[i])
{
cols++;
@@ -1414,9 +1412,10 @@ int eval_parents(struct chimera_info_s * ci)
int sumL = best_left_n + best_left_a + best_left_y;
int sumR = best_right_n + best_right_a + best_right_y;
- if (opt_uchime2_denovo || opt_uchime3_denovo)
+ if (opt_uchime2_denovo or opt_uchime3_denovo)
{
- if ((QM == 100.0) && (QT < 100.0))
+ // fix -Wfloat-equal: if match_QM == cols, then QM == 100.0
+ if ((match_QM == cols) and (QT < 100.0))
{
status = 4;
}
@@ -1425,8 +1424,8 @@ int eval_parents(struct chimera_info_s * ci)
if (best_h >= opt_minh)
{
status = 3;
- if ((divdiff >= opt_mindiv) &&
- (sumL >= opt_mindiffs) &&
+ if ((divdiff >= opt_mindiv) and
+ (sumL >= opt_mindiffs) and
(sumR >= opt_mindiffs))
{
status = 4;
@@ -1437,7 +1436,7 @@ int eval_parents(struct chimera_info_s * ci)
xpthread_mutex_lock(&mutex_output);
- if (opt_uchimealns && (status == 4))
+ if (opt_uchimealns and (status == 4))
{
fprintf(fp_uchimealns, "\n");
fprintf(fp_uchimealns, "----------------------------------------"
@@ -1617,6 +1616,7 @@ int eval_parents(struct chimera_info_s * ci)
return status;
}
+// refactoring: enum struct status {};
/*
new chimeric status:
0: no parents, non-chimeric
@@ -1851,7 +1851,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
if (opt_uchime_ref)
{
- if (fasta_next(query_fasta_h, ! opt_notrunclabels,
+ if (fasta_next(query_fasta_h, not opt_notrunclabels,
chrmap_no_change))
{
ci->query_head_len = fasta_get_header_length(query_fasta_h);
@@ -1909,13 +1909,13 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
if (ci->query_len >= parts)
{
- for (int i=0; i<parts; i++)
+ for (int i = 0; i < parts; i++)
{
struct hit * hits;
int hit_count;
- search_onequery(ci->si+i, opt_qmask);
- search_joinhits(ci->si+i, nullptr, & hits, & hit_count);
- for(int j=0; j<hit_count; j++)
+ search_onequery(ci->si + i, opt_qmask);
+ search_joinhits(ci->si + i, nullptr, & hits, & hit_count);
+ for(int j = 0; j < hit_count; j++)
{
if (hits[j].accepted)
{
@@ -1926,7 +1926,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
}
}
- for(int i=0; i < allhits_count; i++)
+ for(int i = 0; i < allhits_count; i++)
{
unsigned int target = allhits_list[i].target;
@@ -1968,7 +1968,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
ci->snwgaps,
ci->nwcigar);
- for(int i=0; i < ci->cand_count; i++)
+ for(int i = 0; i < ci->cand_count; i++)
{
int64_t target = ci->cand_list[i];
int64_t nwscore = ci->snwscore[i];
@@ -2114,7 +2114,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
nonchimera_abundance += ci->query_size;
/* output no parents, no chimeras */
- if ((status < 2) && opt_uchimeout)
+ if ((status < 2) and opt_uchimeout)
{
fprintf(fp_uchimeout, "0.0000\t");
@@ -2160,13 +2160,13 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
if (status < 3)
{
/* uchime_denovo: add non-chimeras to db */
- if (opt_uchime_denovo || opt_uchime2_denovo || opt_uchime3_denovo || opt_chimeras_denovo)
+ if (opt_uchime_denovo or opt_uchime2_denovo or opt_uchime3_denovo or opt_chimeras_denovo)
{
dbindex_addsequence(seqno, opt_qmask);
}
}
- for (int i=0; i < ci->cand_count; i++)
+ for (int i = 0; i < ci->cand_count; i++)
{
if (ci->nwcigar[i])
{
@@ -2213,14 +2213,14 @@ void chimera_threads_run()
xpthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
/* create worker threads */
- for(int64_t t=0; t<opt_threads; t++)
+ for(int64_t t = 0; t < opt_threads; t++)
{
- xpthread_create(pthread+t, & attr,
+ xpthread_create(pthread + t, & attr,
chimera_thread_worker, (void*)t);
}
/* finish worker threads */
- for(int t=0; t<opt_threads; t++)
+ for(int t = 0; t < opt_threads; t++)
{
xpthread_join(pthread[t], nullptr);
}
@@ -2233,7 +2233,7 @@ void open_chimera_file(FILE * * f, char * name)
if (name)
{
*f = fopen_output(name);
- if (!*f)
+ if (! *f)
{
fatal("Unable to open file %s for writing", name);
}
@@ -2325,7 +2325,7 @@ void chimera()
{
dust_all();
}
- else if ((opt_dbmask == MASK_SOFT) && (opt_hardmask))
+ else if ((opt_dbmask == MASK_SOFT) and (opt_hardmask))
{
hardmask_all();
}
@@ -2364,7 +2364,7 @@ void chimera()
{
dust_all();
}
- else if ((opt_qmask == MASK_SOFT) && (opt_hardmask))
+ else if ((opt_qmask == MASK_SOFT) and (opt_hardmask))
{
hardmask_all();
}
@@ -2376,13 +2376,36 @@ void chimera()
if (opt_log)
{
- fprintf(fp_log, "%8.2f minh\n", opt_minh);
- fprintf(fp_log, "%8.2f xn\n", opt_xn);
- fprintf(fp_log, "%8.2f dn\n", opt_dn);
- fprintf(fp_log, "%8.2f xa\n", 1.0);
- fprintf(fp_log, "%8.2f mindiv\n", opt_mindiv);
+ if (opt_uchime_ref || opt_uchime_denovo)
+ {
+ fprintf(fp_log, "%8.2f minh\n", opt_minh);
+ }
+
+ if (opt_uchime_ref ||
+ opt_uchime_denovo ||
+ opt_uchime2_denovo ||
+ opt_uchime3_denovo)
+ {
+ fprintf(fp_log, "%8.2f xn\n", opt_xn);
+ fprintf(fp_log, "%8.2f dn\n", opt_dn);
+ fprintf(fp_log, "%8.2f xa\n", 1.0);
+ }
+
+ if (opt_uchime_ref || opt_uchime_denovo)
+ {
+ fprintf(fp_log, "%8.2f mindiv\n", opt_mindiv);
+ }
+
fprintf(fp_log, "%8.2f id\n", opt_id);
- fprintf(fp_log, "%8d maxp\n", 2);
+
+ if (opt_uchime_ref ||
+ opt_uchime_denovo ||
+ opt_uchime2_denovo ||
+ opt_uchime3_denovo)
+ {
+ fprintf(fp_log, "%8d maxp\n", 2);
+ }
+
fprintf(fp_log, "\n");
}
@@ -2393,7 +2416,7 @@ void chimera()
progress_done();
- if (!opt_quiet)
+ if (! opt_quiet)
{
if (total_count > 0)
{
=====================================
src/cluster.cc
=====================================
@@ -430,7 +430,7 @@ void cluster_core_results_hit(struct hit * best,
{
results_show_uc_one(fp_uc,
best, query_head,
- qsequence, qseqlen, qsequence_rc,
+ qseqlen,
clusterno);
}
@@ -438,14 +438,14 @@ void cluster_core_results_hit(struct hit * best,
{
results_show_alnout(fp_alnout,
best, 1, query_head,
- qsequence, qseqlen, qsequence_rc);
+ qsequence, qseqlen);
}
if (fp_samout)
{
results_show_samout(fp_samout,
best, 1, query_head,
- qsequence, qseqlen, qsequence_rc);
+ qsequence, qsequence_rc);
}
if (fp_fastapairs)
@@ -454,7 +454,6 @@ void cluster_core_results_hit(struct hit * best,
best,
query_head,
qsequence,
- qseqlen,
qsequence_rc);
}
@@ -471,11 +470,7 @@ void cluster_core_results_hit(struct hit * best,
if (fp_tsegout)
{
results_show_tsegout_one(fp_tsegout,
- best,
- query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ best);
}
if (fp_userout)
@@ -487,7 +482,7 @@ void cluster_core_results_hit(struct hit * best,
if (fp_blast6out)
{
results_show_blast6out_one(fp_blast6out, best, query_head,
- qsequence, qseqlen, qsequence_rc);
+ qseqlen);
}
if (opt_matched)
@@ -551,7 +546,7 @@ void cluster_core_results_nohit(int clusterno,
if (fp_blast6out)
{
results_show_blast6out_one(fp_blast6out, nullptr, query_head,
- qsequence, qseqlen, qsequence_rc);
+ qseqlen);
}
}
=====================================
src/db.cc
=====================================
@@ -255,8 +255,8 @@ void db_read(const char * filename, int upcase)
if (sequences > 0)
{
fprintf(stderr,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs, "
- "min %'" PRIu64 ", max %'" PRIu64 ", avg %'.0f\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs, "
+ "min %" PRIu64 ", max %" PRIu64 ", avg %.0f\n",
db_getnucleotidecount(),
db_getsequencecount(),
db_getshortestsequence(),
@@ -266,7 +266,7 @@ void db_read(const char * filename, int upcase)
else
{
fprintf(stderr,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs\n",
db_getnucleotidecount(),
db_getsequencecount());
}
@@ -277,8 +277,8 @@ void db_read(const char * filename, int upcase)
if (sequences > 0)
{
fprintf(fp_log,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs, "
- "min %'" PRIu64 ", max %'" PRIu64 ", avg %'.0f\n\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs, "
+ "min %" PRIu64 ", max %" PRIu64 ", avg %.0f\n\n",
db_getnucleotidecount(),
db_getsequencecount(),
db_getshortestsequence(),
@@ -288,7 +288,7 @@ void db_read(const char * filename, int upcase)
else
{
fprintf(fp_log,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs\n\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs\n\n",
db_getnucleotidecount(),
db_getsequencecount());
}
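
Note on the src/db.cc hunks above: the `'` flag removed from the format strings requests locale-dependent thousands grouping. It is a POSIX extension to `printf`, not part of ISO C or C++. The commit messages do not state the motivation, so portability and warning cleanup are assumptions here. The sketch below shows the resulting portable form; `format_summary` is a hypothetical helper and is not part of vsearch.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper, not vsearch code: builds a summary line using the
// portable PRIu64 macro and no grouping flag.
std::string format_summary(uint64_t nucleotides, uint64_t sequences)
{
  char buffer[128];
  // "%" PRIu64 expands to the correct conversion for uint64_t on each
  // platform; the digits are printed without any separators.
  std::snprintf(buffer, sizeof(buffer),
                "%" PRIu64 " nt in %" PRIu64 " seqs",
                nucleotides, sequences);
  return std::string(buffer);
}
```

Without the flag, the printed numbers no longer depend on the active locale.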
=====================================
src/derep.cc
=====================================
@@ -701,8 +701,8 @@ void derep(char * input_filename, bool use_header)
if (sequencecount > 0)
{
fprintf(stderr,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64
- ", max %'" PRIu64 ", avg %'.0f\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64
+ ", max %" PRIu64 ", avg %.0f\n",
nucleotidecount,
sequencecount,
shortest,
@@ -712,7 +712,7 @@ void derep(char * input_filename, bool use_header)
else
{
fprintf(stderr,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs\n",
nucleotidecount,
sequencecount);
}
@@ -723,8 +723,8 @@ void derep(char * input_filename, bool use_header)
if (sequencecount > 0)
{
fprintf(fp_log,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64
- ", max %'" PRIu64 ", avg %'.0f\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64
+ ", max %" PRIu64 ", avg %.0f\n",
nucleotidecount,
sequencecount,
shortest,
@@ -734,7 +734,7 @@ void derep(char * input_filename, bool use_header)
else
{
fprintf(fp_log,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs\n",
nucleotidecount,
sequencecount);
}
=====================================
src/derepsmallmem.cc
=====================================
@@ -117,20 +117,30 @@ double find_median()
}
}
- double mid = below_count + cand_count + above_count;
- if (mid == 0)
+ if (below_count + cand_count + above_count == 0U) // fix -Wfloat-equal
return 0;
- mid = mid / 2.0;
- if (mid >= below_count)
+ if (above_count + cand_count >= below_count)
+ // mid >= below_count
{
- if (mid <= below_count + cand_count)
+ if (above_count <= below_count + cand_count)
+ // mid <= below_count + cand_count
{
- if (mid == below_count + cand_count)
+ if (above_count == below_count + cand_count)
+ // mid == below_count + cand_count
+ // same as:
+ // (below_count + cand_count + above_count) / 2 == below_count + cand_count
+ // which simplifies into:
+ // above_count == below_count + cand_count
{
return (cand + above) / 2.0;
}
- else if (mid == below_count)
+ else if (above_count + cand_count == below_count)
+ // mid == below_count
+ // same as:
+ // (below_count + cand_count + above_count) / 2 == below_count
+ // which simplifies into:
+ // above_count + cand_count == below_count
{
return (below + cand) / 2.0;
}
@@ -233,7 +243,7 @@ void derep_smallmem(char * input_filename)
}
else
{
- fatal("Ouput file for dereplication must be specified with --fastaout");
+ fatal("Output file for dereplication must be specified with --fastaout");
}
uint64_t filesize = fastx_get_size(h);
@@ -412,8 +422,8 @@ void derep_smallmem(char * input_filename)
if (sequencecount > 0)
{
fprintf(stderr,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64
- ", max %'" PRIu64 ", avg %'.0f\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64
+ ", max %" PRIu64 ", avg %.0f\n",
nucleotidecount,
sequencecount,
shortest,
@@ -423,7 +433,7 @@ void derep_smallmem(char * input_filename)
else
{
fprintf(stderr,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs\n",
nucleotidecount,
sequencecount);
}
@@ -434,8 +444,8 @@ void derep_smallmem(char * input_filename)
if (sequencecount > 0)
{
fprintf(fp_log,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64
- ", max %'" PRIu64 ", avg %'.0f\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64
+ ", max %" PRIu64 ", avg %.0f\n",
nucleotidecount,
sequencecount,
shortest,
@@ -445,7 +455,7 @@ void derep_smallmem(char * input_filename)
else
{
fprintf(fp_log,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs\n",
nucleotidecount,
sequencecount);
}
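Note on the `find_median()` hunk above: the new integer comparisons follow from the old floating-point ones by multiplying both sides by 2, as the added comments spell out. The sketch below checks that equivalence exhaustively for small counts. All names here (`MidChecks`, `checks_with_double`, `checks_with_integers`, `checks_agree`) are hypothetical and not part of vsearch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; none of these exist in vsearch.
struct MidChecks
{
  bool at_or_above_below;  // mid >= below_count
  bool within_cand;        // mid <= below_count + cand_count
  bool on_upper_edge;      // mid == below_count + cand_count
  bool on_lower_edge;      // mid == below_count
};

// Old style: compute the midpoint as a double and compare against it.
MidChecks checks_with_double(uint64_t below, uint64_t cand, uint64_t above)
{
  const double mid = (below + cand + above) / 2.0;
  return { mid >= below, mid <= below + cand,
           mid == below + cand, mid == below };
}

// New style: the same four tests using only unsigned integer arithmetic.
MidChecks checks_with_integers(uint64_t below, uint64_t cand, uint64_t above)
{
  return { above + cand >= below, above <= below + cand,
           above == below + cand, above + cand == below };
}

// Returns true if both variants agree for every triple of counts
// in the range [0, limit).
bool checks_agree(uint64_t limit)
{
  for (uint64_t b = 0; b < limit; b++)
    for (uint64_t c = 0; c < limit; c++)
      for (uint64_t a = 0; a < limit; a++)
        {
          const MidChecks d = checks_with_double(b, c, a);
          const MidChecks i = checks_with_integers(b, c, a);
          if (d.at_or_above_below != i.at_or_above_below ||
              d.within_cand != i.within_cand ||
              d.on_upper_edge != i.on_upper_edge ||
              d.on_lower_edge != i.on_lower_edge)
            return false;
        }
  return true;
}
```

Calling `checks_agree(16)` compares both variants over 4096 triples. The integer form also avoids the `-Wfloat-equal` warning mentioned in the diff.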
=====================================
src/eestats.cc
=====================================
@@ -59,6 +59,8 @@
*/
#include "vsearch.h"
+#include <algorithm> // std::max
+
inline int fastq_get_qual_eestats(char q)
{
@@ -476,7 +478,10 @@ void fastq_eestats2()
if (len > longest)
{
longest = len;
- int new_len_steps = 1 + MAX(0, (MIN(longest, (uint64_t)opt_length_cutoffs_longest) - opt_length_cutoffs_shortest) / opt_length_cutoffs_increment);
+ // opt_length_cutoffs_longest is an int between 1 and INT_MAX
+ int high = MIN(longest, (uint64_t)(opt_length_cutoffs_longest));
+ int new_len_steps = 1 + MAX(0, ((high - opt_length_cutoffs_shortest)
+ / opt_length_cutoffs_increment));
if (new_len_steps > len_steps)
{
=====================================
src/kmerhash.cc
=====================================
@@ -59,6 +59,8 @@
*/
#include "vsearch.h"
+#include <vector>
+
#define HASH CityHash64
@@ -173,9 +175,7 @@ void kh_insert_kmers(struct kh_handle_s * kh, int k, char * seq, int len)
int kh_find_best_diagonal(struct kh_handle_s * kh, int k, char * seq, int len)
{
- int diag_counts[kh->maxpos];
-
- memset(diag_counts, 0, kh->maxpos * sizeof(int));
+ std::vector<int> diag_counts(kh->maxpos, 0);
int kmers = 1 << (2 * k);
unsigned int kmer_mask = kmers - 1;
=====================================
src/mask.cc
=====================================
@@ -179,6 +179,7 @@ static int seqcount = 0;
void * dust_all_worker(void * vp)
{
+ (void) vp; // not used, but required for thread creation
while(true)
{
xpthread_mutex_lock(&mutex);
=====================================
src/mergepairs.cc
=====================================
@@ -59,6 +59,7 @@
*/
#include "vsearch.h"
+#include <vector>
/* chunk constants */
@@ -700,10 +701,10 @@ int64_t optimize(merge_data_t * ip,
int kmers = 0;
- int diags[ip->fwd_trunc + ip->rev_trunc];
+ std::vector<int> diags(ip->fwd_trunc + ip->rev_trunc, 0);
kh_insert_kmers(kmerhash, k, ip->fwd_sequence, ip->fwd_trunc);
- kh_find_diagonals(kmerhash, k, ip->rev_sequence, ip->rev_trunc, diags);
+ kh_find_diagonals(kmerhash, k, ip->rev_sequence, ip->rev_trunc, diags.data());
for(int64_t i = i1; i <= i2; i++)
{
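Note on the `kmerhash.cc` and `mergepairs.cc` hunks above: both replace a variable-length array, which is a compiler extension in C++, with a `std::vector<int>`, and pass `.data()` where a plain `int *` is still expected. A minimal sketch of that pattern, with a hypothetical C-style callee `fill_counts` standing in for a function such as `kh_find_diagonals`:

```cpp
#include <cassert>  // only needed for the checks in the usage note
#include <vector>

// Hypothetical stand-in for a C-style API that writes into a
// caller-provided buffer of n ints (not a vsearch function).
void fill_counts(int n, int * out)
{
  for (int i = 0; i < n; i++)
    out[i] += i;
}

// Allocates a zero-initialised buffer whose size is only known at
// run time, then hands the raw pointer to the C-style callee.
std::vector<int> make_counts(int n)
{
  std::vector<int> counts(n, 0);  // replaces: int counts[n]; memset(...)
  fill_counts(n, counts.data());  // .data() yields the underlying int *
  return counts;
}
```

With this sketch, `make_counts(4)` holds `0, 1, 2, 3` and `make_counts(0)` is empty. The vector's storage lives on the heap, so a large run-time size no longer consumes stack space.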
=====================================
src/otutable.cc
=====================================
@@ -154,107 +154,126 @@ void otutable_add(char * query_header, char * target_header, int64_t abundance)
{
/* read sample annotation in query */
- int len_sample;
+ int len_sample = 0;
char * start_sample = query_header;
+ char * sample_name = nullptr;
-#ifdef HAVE_REGEX_H
- regmatch_t pmatch_sample[5];
- if (!regexec(&otutable->regex_sample, query_header, 5, pmatch_sample, 0))
+ if (query_header)
{
- /* match: use the matching sample name */
- len_sample = pmatch_sample[3].rm_eo - pmatch_sample[3].rm_so;
- start_sample += pmatch_sample[3].rm_so;
- }
+#ifdef HAVE_REGEX_H
+ regmatch_t pmatch_sample[5];
+ if (!regexec(&otutable->regex_sample, query_header, 5, pmatch_sample, 0))
+ {
+ /* match: use the matching sample name */
+ len_sample = pmatch_sample[3].rm_eo - pmatch_sample[3].rm_so;
+ start_sample += pmatch_sample[3].rm_so;
+ }
#else
- std::cmatch cmatch_sample;
- if (regex_search(query_header, cmatch_sample, regex_sample))
- {
- len_sample = cmatch_sample.length(3);
- start_sample += cmatch_sample.position(3);
- }
+ std::cmatch cmatch_sample;
+ if (regex_search(query_header, cmatch_sample, regex_sample))
+ {
+ len_sample = cmatch_sample.length(3);
+ start_sample += cmatch_sample.position(3);
+ }
#endif
- else
- {
- /* no match: use first name in header with A-Za-z0-9_ */
- len_sample = strspn(query_header,
- "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
- "abcdefghijklmnopqrstuvwxyz"
- "_"
- "0123456789");
+ else
+ {
+ /* no match: use first name in header with A-Za-z0-9_ */
+ len_sample = strspn(query_header,
+ "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+ "abcdefghijklmnopqrstuvwxyz"
+ "_"
+ "0123456789");
+ }
+
+ sample_name = (char *) xmalloc(len_sample+1);
+ strncpy(sample_name, start_sample, len_sample);
+ sample_name[len_sample] = 0;
}
- char * sample_name = (char *) xmalloc(len_sample+1);
- strncpy(sample_name, start_sample, len_sample);
- sample_name[len_sample] = 0;
/* read OTU annotation in target */
- int len_otu;
+ int len_otu = 0;
char * start_otu = target_header;
+ char * otu_name = nullptr;
-#ifdef HAVE_REGEX_H
- regmatch_t pmatch_otu[4];
- if (!regexec(&otutable->regex_otu, target_header, 4, pmatch_otu, 0))
+ if (target_header)
{
- /* match: use the matching otu name */
- len_otu = pmatch_otu[2].rm_eo - pmatch_otu[2].rm_so;
- start_otu += pmatch_otu[2].rm_so;
- }
+#ifdef HAVE_REGEX_H
+ regmatch_t pmatch_otu[4];
+ if (!regexec(&otutable->regex_otu, target_header, 4, pmatch_otu, 0))
+ {
+ /* match: use the matching otu name */
+ len_otu = pmatch_otu[2].rm_eo - pmatch_otu[2].rm_so;
+ start_otu += pmatch_otu[2].rm_so;
+ }
#else
- std::cmatch cmatch_otu;
- if (regex_search(target_header, cmatch_otu, regex_otu))
- {
- len_otu = cmatch_otu.length(2);
- start_otu += cmatch_otu.position(2);
- }
+ std::cmatch cmatch_otu;
+ if (regex_search(target_header, cmatch_otu, regex_otu))
+ {
+ len_otu = cmatch_otu.length(2);
+ start_otu += cmatch_otu.position(2);
+ }
#endif
- else
- {
- /* no match: use first name in header up to ; */
- len_otu = strcspn(target_header, ";");
- }
- char * otu_name = (char *) xmalloc(len_otu+1);
- strncpy(otu_name, start_otu, len_otu);
- otu_name[len_otu] = 0;
+ else
+ {
+ /* no match: use first name in header up to ; */
+ len_otu = strcspn(target_header, ";");
+ }
+ otu_name = (char *) xmalloc(len_otu+1);
+ strncpy(otu_name, start_otu, len_otu);
+ otu_name[len_otu] = 0;
- /* read tax annotation in target */
+ /* read tax annotation in target */
#ifdef HAVE_REGEX_H
- char * start_tax = target_header;
+ char * start_tax = target_header;
- regmatch_t pmatch_tax[4];
- if (!regexec(&otutable->regex_tax, target_header, 4, pmatch_tax, 0))
- {
- /* match: use the matching tax name */
- int len_tax = pmatch_tax[2].rm_eo - pmatch_tax[2].rm_so;
- start_tax += pmatch_tax[2].rm_so;
-
- char * tax_name = (char *) xmalloc(len_tax+1);
- strncpy(tax_name, start_tax, len_tax);
- tax_name[len_tax] = 0;
- otutable->otu_tax_map[otu_name] = tax_name;
- xfree(tax_name);
- }
+ regmatch_t pmatch_tax[4];
+ if (!regexec(&otutable->regex_tax, target_header, 4, pmatch_tax, 0))
+ {
+ /* match: use the matching tax name */
+ int len_tax = pmatch_tax[2].rm_eo - pmatch_tax[2].rm_so;
+ start_tax += pmatch_tax[2].rm_so;
+
+ char * tax_name = (char *) xmalloc(len_tax+1);
+ strncpy(tax_name, start_tax, len_tax);
+ tax_name[len_tax] = 0;
+ otutable->otu_tax_map[otu_name] = tax_name;
+ xfree(tax_name);
+ }
#else
- std::cmatch cmatch_tax;
- if (regex_search(target_header, cmatch_tax, regex_tax))
- {
- otutable->otu_tax_map[otu_name] = cmatch_tax.str(2);
- }
+ std::cmatch cmatch_tax;
+ if (regex_search(target_header, cmatch_tax, regex_tax))
+ {
+ otutable->otu_tax_map[otu_name] = cmatch_tax.str(2);
+ }
#endif
+ }
/* store data */
- otutable->sample_set.insert(sample_name);
- otutable->otu_set.insert(otu_name);
- otutable->sample_otu_count[string_pair_t(sample_name,otu_name)]
- += abundance;
- otutable->otu_sample_count[string_pair_t(otu_name,sample_name)]
- += abundance;
+ if (sample_name)
+ otutable->sample_set.insert(sample_name);
+
+ if (otu_name)
+ otutable->otu_set.insert(otu_name);
+
+ if (sample_name && otu_name && abundance)
+ {
+ otutable->sample_otu_count[string_pair_t(sample_name,otu_name)]
+ += abundance;
+ otutable->otu_sample_count[string_pair_t(otu_name,sample_name)]
+ += abundance;
+ }
+
+ if (otu_name)
+ xfree(otu_name);
- xfree(otu_name);
- xfree(sample_name);
+ if (sample_name)
+ xfree(sample_name);
}
void otutable_print_otutabout(FILE * fp)
=====================================
src/results.cc
=====================================
@@ -64,7 +64,6 @@ void results_show_fastapairs_one(FILE * fp,
struct hit * hp,
char * query_head,
char * qsequence,
- int64_t qseqlen,
char * rc)
{
/* http://www.drive5.com/usearch/manual/fastapairs.html */
@@ -144,11 +143,7 @@ void results_show_qsegout_one(FILE * fp,
}
void results_show_tsegout_one(FILE * fp,
- struct hit * hp,
- char * query_head,
- char * qsequence,
- int64_t qseqlen,
- char * rc)
+ struct hit * hp)
{
if (hp)
{
@@ -176,9 +171,7 @@ void results_show_tsegout_one(FILE * fp,
void results_show_blast6out_one(FILE * fp,
struct hit * hp,
char * query_head,
- char * qsequence,
- int64_t qseqlen,
- char * rc)
+ int64_t qseqlen)
{
/*
@@ -242,9 +235,7 @@ void results_show_blast6out_one(FILE * fp,
void results_show_uc_one(FILE * fp,
struct hit * hp,
char * query_head,
- char * qsequence,
int64_t qseqlen,
- char * rc,
int clusterno)
{
/*
@@ -519,10 +510,7 @@ void results_show_userout_one(FILE * fp, struct hit * hp,
void results_show_lcaout(FILE * fp,
struct hit * hits,
int hitcount,
- char * query_head,
- char * qsequence,
- int64_t qseqlen,
- char * rc)
+ char * query_head)
{
/* Output last common ancestor (LCA) of the hits,
in a similar way to the Sintax command */
@@ -665,8 +653,7 @@ void results_show_alnout(FILE * fp,
int hitcount,
char * query_head,
char * qsequence,
- int64_t qseqlen,
- char * rc)
+ int64_t qseqlen)
{
/* http://drive5.com/usearch/manual/alnout.html */
@@ -904,7 +891,6 @@ void results_show_samout(FILE * fp,
int hitcount,
char * query_head,
char * qsequence,
- int64_t qseqlen,
char * rc)
{
/*
=====================================
src/results.h
=====================================
@@ -63,30 +63,22 @@ void results_show_alnout(FILE * fp,
int hitcount,
char * query_head,
char * qsequence,
- int64_t qseqlen,
- char * rc);
+ int64_t qseqlen);
void results_show_lcaout(FILE * fp,
struct hit * hits,
int hitcount,
- char * query_head,
- char * qsequence,
- int64_t qseqlen,
- char * rc);
+ char * query_head);
void results_show_blast6out_one(FILE * fp,
struct hit * hp,
char * query_head,
- char * qsequence,
- int64_t qseqlen,
- char * rc);
+ int64_t qseqlen);
void results_show_uc_one(FILE * fp,
struct hit * hp,
char * query_head,
- char * qsequence,
int64_t qseqlen,
- char * rc,
int clusterno);
void results_show_userout_one(FILE * fp,
@@ -100,7 +92,6 @@ void results_show_fastapairs_one(FILE * fp,
struct hit * hp,
char * query_head,
char * qsequence,
- int64_t qseqlen,
char * rc);
void results_show_qsegout_one(FILE * fp,
@@ -111,11 +102,7 @@ void results_show_qsegout_one(FILE * fp,
char * rc);
void results_show_tsegout_one(FILE * fp,
- struct hit * hp,
- char * query_head,
- char * qsequence,
- int64_t qseqlen,
- char * rc);
+ struct hit * hp);
void results_show_samheader(FILE * fp,
char * cmdline,
@@ -126,5 +113,4 @@ void results_show_samout(FILE * fp,
int hitcount,
char * query_head,
char * qsequence,
- int64_t qseqlen,
char * rc);
=====================================
src/search.cc
=====================================
@@ -118,8 +118,7 @@ void search_output_results(int hit_count,
toreport,
query_head,
qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
if (fp_lcaout)
@@ -127,10 +126,7 @@ void search_output_results(int hit_count,
results_show_lcaout(fp_lcaout,
hits,
toreport,
- query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ query_head);
}
if (fp_samout)
@@ -140,7 +136,6 @@ void search_output_results(int hit_count,
toreport,
query_head,
qsequence,
- qseqlen,
qsequence_rc);
}
@@ -170,7 +165,6 @@ void search_output_results(int hit_count,
hp,
query_head,
qsequence,
- qseqlen,
qsequence_rc);
}
@@ -187,11 +181,7 @@ void search_output_results(int hit_count,
if (fp_tsegout)
{
results_show_tsegout_one(fp_tsegout,
- hp,
- query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ hp);
}
if (fp_uc)
@@ -201,9 +191,7 @@ void search_output_results(int hit_count,
results_show_uc_one(fp_uc,
hp,
query_head,
- qsequence,
qseqlen,
- qsequence_rc,
hp->target);
}
}
@@ -223,22 +211,25 @@ void search_output_results(int hit_count,
results_show_blast6out_one(fp_blast6out,
hp,
query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
}
}
else
{
+ if (opt_otutabout || opt_mothur_shared_out || opt_biomout)
+ {
+ otutable_add(query_head,
+ nullptr,
+ qsize);
+ }
+
if (fp_uc)
{
results_show_uc_one(fp_uc,
nullptr,
query_head,
- qsequence,
qseqlen,
- qsequence_rc,
0);
}
@@ -259,9 +250,7 @@ void search_output_results(int hit_count,
results_show_blast6out_one(fp_blast6out,
nullptr,
query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
}
}
@@ -904,6 +893,13 @@ void usearch_global(char * cmdline, char * progheader)
}
}
+
+ // Add OTUs with no matches to OTU table
+ if (opt_otutabout || opt_mothur_shared_out || opt_biomout)
+ for(int64_t i=0; i<seqcount; i++)
+ if (! dbmatched[i])
+ otutable_add(nullptr, db_getheader(i), 0);
+
if (opt_biomout)
{
otutable_print_biomout(fp_biomout);
=====================================
src/searchexact.cc
=====================================
@@ -193,8 +193,7 @@ void search_exact_output_results(int hit_count,
toreport,
query_head,
qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
if (fp_samout)
@@ -204,7 +203,6 @@ void search_exact_output_results(int hit_count,
toreport,
query_head,
qsequence,
- qseqlen,
qsequence_rc);
}
@@ -234,7 +232,6 @@ void search_exact_output_results(int hit_count,
hp,
query_head,
qsequence,
- qseqlen,
qsequence_rc);
}
@@ -251,11 +248,7 @@ void search_exact_output_results(int hit_count,
if (fp_tsegout)
{
results_show_tsegout_one(fp_tsegout,
- hp,
- query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ hp);
}
if (fp_uc)
@@ -265,9 +258,7 @@ void search_exact_output_results(int hit_count,
results_show_uc_one(fp_uc,
hp,
query_head,
- qsequence,
qseqlen,
- qsequence_rc,
hp->target);
}
}
@@ -287,22 +278,25 @@ void search_exact_output_results(int hit_count,
results_show_blast6out_one(fp_blast6out,
hp,
query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
}
}
else
{
+ if (opt_otutabout || opt_mothur_shared_out || opt_biomout)
+ {
+ otutable_add(query_head,
+ nullptr,
+ qsize);
+ }
+
if (fp_uc)
{
results_show_uc_one(fp_uc,
nullptr,
query_head,
- qsequence,
qseqlen,
- qsequence_rc,
0);
}
@@ -323,9 +317,7 @@ void search_exact_output_results(int hit_count,
results_show_blast6out_one(fp_blast6out,
nullptr,
query_head,
- qsequence,
- qseqlen,
- qsequence_rc);
+ qseqlen);
}
}
}
@@ -912,6 +904,12 @@ void search_exact(char * cmdline, char * progheader)
}
}
+ // Add OTUs with no matches to OTU table
+ if (opt_otutabout || opt_mothur_shared_out || opt_biomout)
+ for(int64_t i=0; i<seqcount; i++)
+ if (! dbmatched[i])
+ otutable_add(nullptr, db_getheader(i), 0);
+
if (fp_biomout)
{
otutable_print_biomout(fp_biomout);
=====================================
src/sha1.h
=====================================
@@ -17,7 +17,7 @@ typedef struct {
#define SHA1_DIGEST_SIZE 20
void SHA1_Init(SHA1_CTX* context);
-void SHA1_Update(SHA1_CTX* context, const uint8_t* data, const size_t len);
+void SHA1_Update(SHA1_CTX* context, const uint8_t* data, size_t len);
void SHA1_Final(SHA1_CTX* context, uint8_t digest[SHA1_DIGEST_SIZE]);
#ifdef __cplusplus
=====================================
src/sintax.cc
=====================================
@@ -99,7 +99,6 @@ static int classified = 0;
void sintax_analyse(char * query_head,
int strand,
int best_seqno,
- int best_count,
int * all_seqno,
int count)
{
@@ -207,10 +206,6 @@ void sintax_analyse(char * query_head,
}
}
-#if 0
- fprintf(fp_tabbedout, "\t%d\t%d", best_count, count);
-#endif
-
fprintf(fp_tabbedout, "\n");
xpthread_mutex_unlock(&mutex_output);
}
@@ -313,7 +308,6 @@ void sintax_query(int64_t t)
sintax_analyse(query_head,
best_strand,
best_seqno[best_strand],
- best_count[best_strand],
all_seqno[best_strand],
boot_count[best_strand]);
=====================================
src/udb.cc
=====================================
@@ -565,7 +565,7 @@ void udb_read(const char * filename,
if (seqcount > 0)
{
fprintf(stderr,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64 ", max %'" PRIu64 ", avg %'.0f\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64 ", max %" PRIu64 ", avg %.0f\n",
db_getnucleotidecount(),
db_getsequencecount(),
db_getshortestsequence(),
@@ -575,7 +575,7 @@ void udb_read(const char * filename,
else
{
fprintf(stderr,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs\n",
db_getnucleotidecount(),
db_getsequencecount());
}
@@ -586,7 +586,7 @@ void udb_read(const char * filename,
if (seqcount > 0)
{
fprintf(fp_log,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs, min %'" PRIu64 ", max %'" PRIu64 ", avg %'.0f\n\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs, min %" PRIu64 ", max %" PRIu64 ", avg %.0f\n\n",
db_getnucleotidecount(),
db_getsequencecount(),
db_getshortestsequence(),
@@ -596,7 +596,7 @@ void udb_read(const char * filename,
else
{
fprintf(fp_log,
- "%'" PRIu64 " nt in %'" PRIu64 " seqs\n\n",
+ "%" PRIu64 " nt in %" PRIu64 " seqs\n\n",
db_getnucleotidecount(),
db_getsequencecount());
}
=====================================
src/vsearch.cc
=====================================
@@ -2728,6 +2728,8 @@ void args_init(int argc, char **argv)
option_label_suffix,
option_log,
option_match,
+ option_maxseqlength,
+ option_minseqlength,
option_mismatch,
option_no_progress,
option_nonchimeras,
@@ -4162,9 +4164,11 @@ void args_init(int argc, char **argv)
option_lengthout,
option_log,
option_match,
+ option_maxseqlength,
option_mindiffs,
option_mindiv,
option_minh,
+ option_minseqlength,
option_mismatch,
option_no_progress,
option_nonchimeras,
@@ -4204,9 +4208,11 @@ void args_init(int argc, char **argv)
option_lengthout,
option_log,
option_match,
+ option_maxseqlength,
option_mindiffs,
option_mindiv,
option_minh,
+ option_minseqlength,
option_mismatch,
option_no_progress,
option_nonchimeras,
@@ -4246,9 +4252,11 @@ void args_init(int argc, char **argv)
option_lengthout,
option_log,
option_match,
+ option_maxseqlength,
option_mindiffs,
option_mindiv,
option_minh,
+ option_minseqlength,
option_mismatch,
option_no_progress,
option_nonchimeras,
@@ -4290,9 +4298,11 @@ void args_init(int argc, char **argv)
option_lengthout,
option_log,
option_match,
+ option_maxseqlength,
option_mindiffs,
option_mindiv,
option_minh,
+ option_minseqlength,
option_mismatch,
option_no_progress,
option_nonchimeras,
=====================================
src/xstring.h
=====================================
@@ -110,7 +110,7 @@ class xstring
void add_c(char c)
{
- size_t needed = 1;
+ const size_t needed = 1;
if (length + needed + 1 > alloc)
{
alloc = length + needed + 1;
@@ -123,7 +123,7 @@ class xstring
void add_d(int d)
{
- int needed = snprintf(nullptr, 0, "%d", d);
+ const int needed = snprintf(nullptr, 0, "%d", d);
if (needed < 0)
{
fatal("snprintf failed");
@@ -140,7 +140,7 @@ class xstring
void add_s(char * s)
{
- size_t needed = strlen(s);
+ const size_t needed = strlen(s);
if (length + needed + 1 > alloc)
{
alloc = length + needed + 1;
View it on GitLab: https://salsa.debian.org/med-team/vsearch/-/compare/0d922f889f969a5e618bced9f95d665b7a8ddc7b...bdecc575410efd6452b766585b6aa617fcf1a6fd