[med-svn] [Git][med-team/vsearch][upstream] New upstream version 2.24.0

Étienne Mollier (@emollier) gitlab at salsa.debian.org
Wed Nov 8 10:12:20 GMT 2023



Étienne Mollier pushed to branch upstream at Debian Med / vsearch


Commits:
5e7c1018 by Étienne Mollier at 2023-11-08T10:13:36+01:00
New upstream version 2.24.0
- - - - -


10 changed files:

- README.md
- configure.ac
- man/vsearch.1
- src/Makefile.am
- src/chimera.cc
- src/chimera.h
- src/mergepairs.cc
- src/sha1.c
- src/vsearch.cc
- src/vsearch.h


Changes:

=====================================
README.md
=====================================
@@ -26,8 +26,8 @@ VSEARCH stands for vectorized search, as the tool takes advantage of parallelism
 
 Various packages, plugins and wrappers are also available from other sources - see [below](https://github.com/torognes/vsearch#packages-plugins-and-wrappers).
 
-The source code compiles correctly with `gcc` (versions 4.8.5 to 12.0)
-and `llvm-clang` (3.8 to 15.0). The source code should also compile on
+The source code compiles correctly with `gcc` (versions 4.8.5 to 13.0)
+and `llvm-clang` (3.8 to 17.0). The source code should also compile on
 [FreeBSD](https://www.freebsd.org/) and
 [NetBSD](https://www.netbsd.org/) systems.
 
@@ -37,7 +37,7 @@ Most of the nucleotide based commands and options in USEARCH version 7 are suppo
 
 ## Getting Help
 
-If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
+If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
 
 ## Example
 
@@ -50,9 +50,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
 **Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:
 
 ```
-wget https://github.com/torognes/vsearch/archive/v2.23.0.tar.gz
-tar xzf v2.23.0.tar.gz
-cd vsearch-2.23.0
+wget https://github.com/torognes/vsearch/archive/v2.24.0.tar.gz
+tar xzf v2.24.0.tar.gz
+cd vsearch-2.24.0
 ./autogen.sh
 ./configure CFLAGS="-O3" CXXFLAGS="-O3"
 make
@@ -81,48 +81,48 @@ Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (v
 Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-linux-x86_64.tar.gz
-tar xzf vsearch-2.23.0-linux-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch-2.24.0-linux-x86_64.tar.gz
+tar xzf vsearch-2.24.0-linux-x86_64.tar.gz
 ```
 
 Or these commands if you are using a Linux ppc64le system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-linux-ppc64le.tar.gz
-tar xzf vsearch-2.23.0-linux-ppc64le.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch-2.24.0-linux-ppc64le.tar.gz
+tar xzf vsearch-2.24.0-linux-ppc64le.tar.gz
 ```
 
 Or these commands if you are using a Linux aarch64 (arm64) system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-linux-aarch64.tar.gz
-tar xzf vsearch-2.23.0-linux-aarch64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch-2.24.0-linux-aarch64.tar.gz
+tar xzf vsearch-2.24.0-linux-aarch64.tar.gz
 ```
 
 Or these commands if you are using a Mac with an Apple Silicon CPU:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-macos-aarch64.tar.gz
-tar xzf vsearch-2.23.0-macos-aarch64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch-2.24.0-macos-aarch64.tar.gz
+tar xzf vsearch-2.24.0-macos-aarch64.tar.gz
 ```
 
 Or these commands if you are using a Mac with an Intel CPU:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-macos-x86_64.tar.gz
-tar xzf vsearch-2.23.0-macos-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch-2.24.0-macos-x86_64.tar.gz
+tar xzf vsearch-2.24.0-macos-x86_64.tar.gz
 ```
 
 Or if you are using Windows, download and extract (unzip) the contents of this file:
 
 ```
-https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-win-x86_64.zip
+https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch-2.24.0-win-x86_64.zip
 ```
 
-Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.23.0-linux-x86_64` or `vsearch-2.23.0-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`. Versions with statically compiled libraries are available for Linux systems. These have "-static" in their name, and could be used on systems that do not have all the necessary libraries installed.
+Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.24.0-linux-x86_64` or `vsearch-2.24.0-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`. Versions with statically compiled libraries are available for Linux systems. These have "-static" in their name, and could be used on systems that do not have all the necessary libraries installed.
 
 **Windows**: You will now have the binary distribution in a folder
-called `vsearch-2.23.0-win-x86_64`. The vsearch executable is called
+called `vsearch-2.24.0-win-x86_64`. The vsearch executable is called
 `vsearch.exe`. The manual in PDF format is called
 `vsearch_manual.pdf`. If you want to be able to call `vsearch.exe`
 from any command prompt window, you can put the vsearch executable in
@@ -133,7 +133,7 @@ searching for it in the Start menu, `Edit` user variables, add
 your changes.
 
 
-**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
+**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.24.0/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
 
 
 ## Packages, plugins, and wrappers


=====================================
configure.ac
=====================================
@@ -2,7 +2,7 @@
 # Process this file with autoconf to produce a configure script.
 
 AC_PREREQ([2.63])
-AC_INIT([vsearch], [2.23.0], [torognes at ifi.uio.no], [vsearch], [https://github.com/torognes/vsearch])
+AC_INIT([vsearch], [2.24.0], [torognes at ifi.uio.no], [vsearch], [https://github.com/torognes/vsearch])
 AC_CANONICAL_TARGET
 AM_INIT_AUTOMAKE([subdir-objects])
 AC_LANG([C++])
@@ -69,6 +69,16 @@ AS_IF([test "x$enable_pdfman" != "xno"], [
   fi
 ])
 
+
+# Check for --enable-debug option
+AC_ARG_ENABLE([debug],
+  [AS_HELP_STRING([--enable-debug], [Enable debug build])],
+  [enable_debug=$enableval],
+  [enable_debug=no])
+
+# Define AM_CONDITIONAL for debug
+AM_CONDITIONAL([ENABLE_DEBUG], [test "x$enable_debug" = "xyes"])
+
 have_man_html=no
 
 case $target in


=====================================
man/vsearch.1
=====================================
@@ -1,5 +1,5 @@
 .\" ============================================================================
-.TH vsearch 1 "July 7, 2023" "version 2.23.0" "USER COMMANDS"
+.TH vsearch 1 "October 26, 2023" "version 2.24.0" "USER COMMANDS"
 .\" ============================================================================
 .SH NAME
 vsearch \(em a versatile open-source tool for microbiome analysis,
@@ -1575,8 +1575,8 @@ information from headers.
 .BI \-\-eetabbedout \0filename
 When specified with the \-\-fastq_mergepairs command, write statistics
 with expected errors of each merged read to the given file. The file
-is a tab separated file with four columns: The number of errors
-expected in the forward read, the number of expected errors in the
+is a tab separated file with four columns: The number of expected
+errors in the forward read, the number of expected errors in the
 reverse read, the number of observed errors in the forward read, and
 the number of observed errors in the reverse read. The observed number
 of errors are the number of differences in the overlap region of the
@@ -1816,7 +1816,7 @@ discard sequences with more than the specified number of bases.
 .TP
 .BI \-\-fastq_maxmergelen\~ "positive integer"
 When using \-\-fastq_mergepairs, specify the maximum length of the
-merged sequence. By default there is no limit.
+merged sequence (default is 1,000,000).
 .TAG fastq_maxns
 .TP
 .BI \-\-fastq_maxns\~ "positive integer"
@@ -2000,12 +2000,14 @@ position with a quality \fIQ\fR below 5, 10, 15 or 20 (option
 .TP
 .BI \-\-fastq_stripleft\~ "positive integer"
 When using \-\-fastq_filter or \-\-fastx_filter, strip the specified
-number of bases from the left end of the reads.
+number of bases from the left end of the reads. If the length of the
+resulting read is null, then the read is discarded.
 .TAG fastq_stripright
 .TP
 .BI \-\-fastq_stripright\~ "positive integer"
 When using \-\-fastq_filter or \-\-fastx_filter, strip the specified
-number of bases from the right end of the reads.
+number of bases from the right end of the reads. If the length of the
+resulting read is null, then the read is discarded.
 .TAG fastq_tail
 .TP
 .BI \-\-fastq_tail\~ "positive integer"
@@ -2031,9 +2033,9 @@ the specified length. Shorter sequences are not discarded.
 .TAG fastq_truncqual
 .TP
 .BI \-\-fastq_truncqual\~ "positive integer"
-When using \-\-fastq_filter or \-\-fastx_filter, truncate sequences
-starting from the first base with the specified base quality score
-value or lower.
+When using \-\-fastq_filter, \-\-fastq_mergepairs or \-\-fastx_filter,
+truncate sequences starting from the first base with the specified
+base quality score value or lower.
 .TAG fastqout
 .TP
 .BI \-\-fastqout \0filename
@@ -2597,7 +2599,7 @@ columns). The field is set to 0 if there is no alignment.
 integer value).
 .IP \n+[step].
 \fIopens\fR: number of columns containing a gap opening (zero or
-positive integer value).
+positive integer value, excluding terminal gaps).
 .IP \n+[step].
 \fIqlo\fR: first nucleotide of the query aligned with the
 target. Always equal to 1 if there is an alignment, 0 otherwise (see
@@ -3708,7 +3710,8 @@ Number of columns containing a gap extension (zero or positive integer
 value).
 .TP
 .B gaps
-Number of columns containing a gap (zero or positive integer value).
+Number of columns containing a gap (zero or positive integer value,
+excluding terminal gaps).
 .TP
 .B id
 The percentage of identity, according to the identity definition
@@ -3752,7 +3755,7 @@ value).
 .TP
 .B opens
 Number of columns containing a gap opening (zero or positive integer
-value).
+value, excluding terminal gaps).
 .TP
 .B pairs
 Number of columns containing only nucleotides. That value corresponds
@@ -4822,6 +4825,11 @@ precision for eeout option. Add warning about sintax algorithm, random
 seed and multiple threads. Refactor chimera detection code. Add
 undocumented experimental long_chimeras_denovo command. Fix segfault
 with clustering. Add more references.
+.TP
+.BR v2.24.0\~ "released October 26th, 2023"
+Update documentation. Improve code. Allow up to 20 parents for the
+undocumented and experimental chimeras_denovo command. Fix compilation
+warnings for sha1.c. Compile for release (not debug) by default.
 .LP
 .\" ============================================================================
 .\" TODO:


=====================================
src/Makefile.am
=====================================
@@ -10,6 +10,13 @@ AM_CFLAGS=-Wall -Wsign-compare -march=x86-64 -mtune=generic
 endif
 endif
 
+# Conditionally set NDEBUG based on ENABLE_DEBUG
+if ENABLE_DEBUG
+AM_CFLAGS += -UNDEBUG
+else
+AM_CFLAGS += -DNDEBUG
+endif
+
 AM_CXXFLAGS=$(AM_CFLAGS) -std=c++11
 
 export MACOSX_DEPLOYMENT_TARGET=10.9


=====================================
src/chimera.cc
=====================================
@@ -73,7 +73,6 @@
 /* global constants/data, no need for synchronization */
 static int parts = 0;
 const int maxparts = 100;
-const int maxparents = 4; /* max, could be fewer */
 const int window = 64;
 const int few = 4;
 const int maxcandidates = few * maxparts;


=====================================
src/chimera.h
=====================================
@@ -58,4 +58,6 @@
 
 */
 
+const int maxparents = 20; /* max, could be fewer */
+
 void chimera();


=====================================
src/mergepairs.cc
=====================================
@@ -281,19 +281,28 @@ inline int get_qual(char q)
   return qual;
 }
 
-inline double q_to_p(int q)
+
+inline auto q_to_p(int quality_symbol) -> double
 {
-  int x = q - opt_fastq_ascii;
-  if (x < 2)
-    {
-      return 0.75;
-    }
-  else
-    {
-      return exp10(-x/10.0);
-    }
+  static constexpr int low_quality_threshold = 2;
+  static constexpr double max_probability = 0.75;
+  static constexpr double quality_divider = 10.0;
+  static constexpr double power_base = 10.0;
+
+  assert(quality_symbol >= 33);
+  assert(quality_symbol <= 126);
+
+  const auto quality_value = static_cast<int>(quality_symbol - opt_fastq_ascii);
+
+  // refactor: extract branch to a separate operation
+  if (quality_value < low_quality_threshold) {
+    return max_probability;
+  }
+  // probability = 10^-(quality / 10)
+  return std::pow(power_base, -quality_value / quality_divider);
 }
 
+
 void precompute_qual()
 {
   /* Precompute tables of scores etc */
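
For reference, the conversion done by the rewritten q_to_p can be sketched as a standalone function. This is only an illustration under stated assumptions, not the upstream code: the name `quality_to_probability` is hypothetical, and the `ascii_offset` parameter stands in for the global `opt_fastq_ascii` (33 or 64).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical standalone version of the conversion shown in the hunk above.
// ascii_offset plays the role of opt_fastq_ascii (33 or 64 in vsearch).
inline auto quality_to_probability(int quality_symbol, int ascii_offset) -> double
{
  // These checks are compiled out when NDEBUG is defined, which this
  // commit makes the default unless --enable-debug is given.
  assert(quality_symbol >= 33);
  assert(quality_symbol <= 126);

  const int quality_value = quality_symbol - ascii_offset;
  if (quality_value < 2) {
    return 0.75;  // cap applied to very low quality values
  }
  // probability = 10^-(quality / 10)
  return std::pow(10.0, -quality_value / 10.0);
}
```

With an offset of 33, the symbol 'I' (quality 40) maps to an error probability of about 0.0001, and '!' (quality 0) maps to the 0.75 cap.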


=====================================
src/sha1.c
=====================================
@@ -237,14 +237,21 @@ void SHA1_Final(SHA1_CTX* context, uint8_t digest[SHA1_DIGEST_SIZE])
 {
     uint32_t i;
     uint8_t  finalcount[8];
+    uint8_t padding_buffer[64];
+
+    for (i = 0; i < 64; i++) {
+      padding_buffer[i] = 0;
+    }
 
     for (i = 0; i < 8; i++) {
         finalcount[i] = (unsigned char)((context->count[(i >= 4 ? 0 : 1)]
          >> ((3-(i & 3)) * 8) ) & 255);  /* Endian independent */
     }
-    SHA1_Update(context, (uint8_t *)"\200", 1);
+    padding_buffer[0] = 0x80;
+    SHA1_Update(context, padding_buffer, 1);
+    padding_buffer[0] = 0x00;
     while ((context->count[0] & 504) != 448) {
-        SHA1_Update(context, (uint8_t *)"\0", 1);
+        SHA1_Update(context, padding_buffer, 1);
     }
     SHA1_Update(context, finalcount, 8);  /* Should cause a SHA1_Transform() */
     for (i = 0; i < SHA1_DIGEST_SIZE; i++) {
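
The padding loop kept by this hunk stops when the byte count is 56 modulo 64, leaving room for the 8-byte length field. A small sketch of that arithmetic follows. The helper name `sha1_padding_bytes` is hypothetical, and `bit_count` only models the low word `context->count[0]`; nothing here is taken from sha1.c beyond the loop condition.

```cpp
#include <cstdint>

// Hypothetical helper: number of padding bytes (the 0x80 marker plus zero
// bytes) appended before the 8-byte length field, mirroring the loop
// condition in the hunk above.
inline auto sha1_padding_bytes(std::uint32_t message_bytes) -> std::uint32_t
{
  std::uint32_t bit_count = message_bytes * 8U;
  std::uint32_t padding = 1;  // the single 0x80 marker byte
  bit_count += 8U;
  // 504 == 0x1F8 selects bits 3..8, i.e. (byte count mod 64) * 8;
  // 448 == 56 * 8, so the loop stops 8 bytes short of a full 64-byte block.
  while ((bit_count & 504U) != 448U) {
    bit_count += 8U;  // one zero byte
    ++padding;
  }
  return padding;
}
```

An empty message therefore gets 56 padding bytes, a 55-byte message gets only the marker byte, and a 56-byte message spills into a second block with 64 padding bytes.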


=====================================
src/vsearch.cc
=====================================
@@ -4692,7 +4692,7 @@ void args_init(int argc, char **argv)
 
   if (opt_min_unmasked_pct > opt_max_unmasked_pct)
     {
-      fatal("The argument to --min_unmasked_pct cannot be larger than to --max_unmasked_pct");
+      fatal("The argument to --min_unmasked_pct cannot be larger than --max_unmasked_pct");
     }
 
   if ((opt_fastq_ascii != 33) && (opt_fastq_ascii != 64))
@@ -4702,7 +4702,7 @@ void args_init(int argc, char **argv)
 
   if (opt_fastq_qmin > opt_fastq_qmax)
     {
-      fatal("The argument to --fastq_qmin cannot be larger than to --fastq_qmax");
+      fatal("The argument to --fastq_qmin cannot be equal to or greater than --fastq_qmax");
     }
 
   if (opt_fastq_ascii + opt_fastq_qmin < 33)
@@ -4717,7 +4717,7 @@ void args_init(int argc, char **argv)
 
   if (opt_fastq_qminout > opt_fastq_qmaxout)
     {
-      fatal("The argument to --fastq_qminout cannot be larger than to --fastq_qmaxout");
+      fatal("The argument to --fastq_qminout cannot be larger than --fastq_qmaxout");
     }
 
   if ((opt_fastq_asciiout != 33) && (opt_fastq_asciiout != 64))
@@ -4775,9 +4775,11 @@ void args_init(int argc, char **argv)
       fatal("The argument to chimeras_length_min must be at least 1");
     }
 
-  if ((opt_chimeras_parents_max < 2) || (opt_chimeras_parents_max > 4))
+  if ((opt_chimeras_parents_max < 2) || (opt_chimeras_parents_max > maxparents))
     {
-      fatal("The argument to chimeras_parents_max must be in the range 2 to 4");
+      char maxparents_string[25];
+      snprintf(maxparents_string, 25, "%d", maxparents);
+      fatal("The argument to chimeras_parents_max must be in the range 2 to %s.\n", maxparents_string);
     }
 
   if (options_selected[option_chimeras_parts] &&
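
The range check above now takes its upper bound from the `maxparents` constant instead of a hard-coded 4. One possible way to express the same check, shown purely as a sketch: `check_parents_max` is a hypothetical helper, the constant is redeclared locally so the example compiles on its own, and `std::to_string` is swapped in for the `snprintf` buffer used upstream.

```cpp
#include <string>

// Assumed to match the constant moved to src/chimera.h in this commit.
const int maxparents = 20;

// Hypothetical helper: returns an empty string when the value is accepted,
// otherwise the text of the error message.
inline auto check_parents_max(int value) -> std::string
{
  if ((value < 2) || (value > maxparents)) {
    return "The argument to chimeras_parents_max must be in the range 2 to "
      + std::to_string(maxparents);
  }
  return "";
}
```

Because the bound appears only once, raising `maxparents` again would update both the check and the message.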


=====================================
src/vsearch.h
=====================================
@@ -83,6 +83,7 @@
 #include <map>
 #include <set>
 #include <string>
+#include <cassert>
 
 /* include appropriate regex library */
 



View it on GitLab: https://salsa.debian.org/med-team/vsearch/-/commit/5e7c1018719731336974166c041c20050311b9da
