[med-svn] [Git][med-team/vsearch][master] 4 commits: New upstream version 2.15.1
Nilesh Patra
gitlab at salsa.debian.org
Fri Oct 30 19:22:45 GMT 2020
Nilesh Patra pushed to branch master at Debian Med / vsearch
Commits:
b20dc72e by Nilesh Patra at 2020-10-31T00:46:13+05:30
New upstream version 2.15.1
- - - - -
7a3f99ac by Nilesh Patra at 2020-10-31T00:46:13+05:30
routine-update: New upstream version
- - - - -
4b7737af by Nilesh Patra at 2020-10-31T00:46:15+05:30
Update upstream source from tag 'upstream/2.15.1'
Update to upstream version '2.15.1'
with Debian dir 080c9c547d7535a31bd1c36f687190b338fd8d67
- - - - -
63ce66fe by Nilesh Patra at 2020-10-31T00:46:32+05:30
routine-update: Ready to upload to unstable
- - - - -
10 changed files:
- README.md
- configure.ac
- debian/changelog
- man/vsearch.1
- src/arch.cc
- src/arch.h
- src/derep.cc
- src/dynlibs.cc
- src/filter.cc
- src/vsearch.cc
Changes:
=====================================
README.md
=====================================
@@ -34,7 +34,7 @@ Most of the nucleotide based commands and options in USEARCH version 7 are suppo
## Getting Help
-If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.15.0/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
+If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.15.1/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
## Example
@@ -47,9 +47,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
**Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:
```
-wget https://github.com/torognes/vsearch/archive/v2.15.0.tar.gz
-tar xzf v2.15.0.tar.gz
-cd vsearch-2.15.0
+wget https://github.com/torognes/vsearch/archive/v2.15.1.tar.gz
+tar xzf v2.15.1.tar.gz
+cd vsearch-2.15.1
./autogen.sh
./configure
make
@@ -78,43 +78,43 @@ Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (v
Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.15.0/vsearch-2.15.0-linux-x86_64.tar.gz
-tar xzf vsearch-2.15.0-linux-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.15.1/vsearch-2.15.1-linux-x86_64.tar.gz
+tar xzf vsearch-2.15.1-linux-x86_64.tar.gz
```
Or these commands if you are using a Linux ppc64le system:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.15.0/vsearch-2.15.0-linux-ppc64le.tar.gz
-tar xzf vsearch-2.15.0-linux-ppc64le.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.15.1/vsearch-2.15.1-linux-ppc64le.tar.gz
+tar xzf vsearch-2.15.1-linux-ppc64le.tar.gz
```
Or these commands if you are using a Linux aarch64 system:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.15.0/vsearch-2.15.0-linux-aarch64.tar.gz
-tar xzf vsearch-2.15.0-linux-aarch64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.15.1/vsearch-2.15.1-linux-aarch64.tar.gz
+tar xzf vsearch-2.15.1-linux-aarch64.tar.gz
```
Or these commands if you are using a Mac:
```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.15.0/vsearch-2.15.0-macos-x86_64.tar.gz
-tar xzf vsearch-2.15.0-macos-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.15.1/vsearch-2.15.1-macos-x86_64.tar.gz
+tar xzf vsearch-2.15.1-macos-x86_64.tar.gz
```
Or if you are using Windows, download and extract (unzip) the contents of this file:
```
-https://github.com/torognes/vsearch/releases/download/v2.15.0/vsearch-2.15.0-win-x86_64.zip
+https://github.com/torognes/vsearch/releases/download/v2.15.1/vsearch-2.15.1-win-x86_64.zip
```
-Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.15.0-linux-x86_64` or `vsearch-2.15.0-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
+Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.15.1-linux-x86_64` or `vsearch-2.15.1-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`.
-Windows: You will now have the binary distribution in a folder called `vsearch-2.15.0-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
+Windows: You will now have the binary distribution in a folder called `vsearch-2.15.1-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
-**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.15.0/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.15.0/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
+**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.15.1/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.15.1/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
## Packages, plugins, and wrappers
@@ -156,9 +156,11 @@ When compiling VSEARCH the header files for the following two optional libraries
* libz (zlib library) (zlib.h header file) (optional)
* libbz2 (bzip2lib library) (bzlib.h header file) (optional)
+VSEARCH will automatically check whether these libraries are available and load them dynamically.
+
On Windows these libraries are called zlib1.dll and bz2.dll.
-VSEARCH will automatically check whether these libraries are available and load them dynamically.
+Unfortunately, VSEARCH will not work properly with all the different variants of the `zlib1.dll` file on Windows. One that works well is provided by the MinGW-w64 project and is found in the `bin` folder within the [zlib-1.2.5-bin-x64.zip](https://sourceforge.net/projects/mingw-w64/files/External%20binary%20packages%20%28Win64%20hosted%29/Binaries%20%2864-bit%29/zlib-1.2.5-bin-x64.zip) archive available on SourceForge. The MD5 of the `zlib1.dll` file should be `0f67ee0b965d3d29388c238aebcf60bc`.
To create the PDF file with the manual the ps2pdf tool is required. It is part of the ghostscript package.
=====================================
configure.ac
=====================================
@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.63])
-AC_INIT([vsearch], [2.15.0], [torognes at ifi.uio.no])
+AC_INIT([vsearch], [2.15.1], [torognes at ifi.uio.no])
AC_CANONICAL_TARGET
AM_INIT_AUTOMAKE([subdir-objects])
AC_LANG([C++])
=====================================
debian/changelog
=====================================
@@ -1,3 +1,9 @@
+vsearch (2.15.1-1) unstable; urgency=medium
+
+ * New upstream version
+
+ -- Nilesh Patra <npatra974 at gmail.com> Sat, 31 Oct 2020 00:46:32 +0530
+
vsearch (2.15.0-1) unstable; urgency=medium
* New upstream version 2.15.0
=====================================
man/vsearch.1
=====================================
@@ -1,5 +1,5 @@
.\" ============================================================================
-.TH vsearch 1 "June 19, 2020" "version 2.15.0" "USER COMMANDS"
+.TH vsearch 1 "October 28, 2020" "version 2.15.1" "USER COMMANDS"
.\" ============================================================================
.SH NAME
vsearch \(em chimera detection, clustering, dereplication and
@@ -977,7 +977,7 @@ Label of the centroid sequence (H), or set to '*' (S, C).
.TP
.BI \-\-unoise_alpha\~ real
Specify the alpha parameter to the \-\-cluster_unoise command. The
-default i 2.0.
+default is 2.0.
.TAG usersort
.TP
.B \-\-usersort
@@ -1324,11 +1324,11 @@ command can be used to convert SFF files to FASTQ.
.TAG eeout
.TP 9
.B \-\-eeout
-When using \-\-fastq_filter or \-\-fastq_mergepairs, include the
-number of expected errors (ee) in the sequence header of FASTQ and
-FASTA files. This option is a synonym of the \-\-fastq_eeout
-option. Use the \-\-xee option to remove this information from
-headers.
+When using \-\-fastq_filter, \-\-fastx_filter or \-\-fastq_mergepairs,
+include the number of expected errors (ee) in the sequence header of
+FASTQ and FASTA output files. This option is a synonym of the
+\-\-fastq_eeout option. Use the \-\-xee option to remove this
+information from headers.
.TAG eetabbedout
.TP
.BI \-\-eetabbedout \0filename
@@ -1792,24 +1792,30 @@ detected. If the input consists of paired sequences, an input file
with reverse reads may be specified with the \-\-reverse option, and
corresponding output will be written to the files specified with the
\-\-fastqout_rev, \-\-fastaout_rev, \-\-fastqout_discarded_rev, and
-\-\-fastaout_discarded_rev options. Output can not be written to FASTQ files
-if the input is in FASTA format. The sequences are first trimmed and
-then filtered based on the remaining bases. Sequences may be trimmed
-using the options \-\-fastq_stripleft, \-\-fastq_stripright,
+\-\-fastaout_discarded_rev options. Output can not be written to FASTQ
+files if the input is in FASTA format. The sequences are first trimmed
+and then filtered based on the remaining bases. Sequences may be
+trimmed using the options \-\-fastq_stripleft, \-\-fastq_stripright,
\-\-fastq_truncee, \-\-fastq_trunclen, \-\-fastq_trunclen_keep and
-\-\-fastq_truncqual. The sequences may be filtered using the options
+\-\-fastq_truncqual. The sequences may be filtered using the options
\-\-fastq_maxee, \-\-fastq_maxee_rate, \-\-fastq_maxlen,
\-\-fastq_maxns, \-\-fastq_minlen (default 1), \-\-fastq_trunclen,
\-\-maxsize, and \-\-minsize. Sequences not satisfying the
requirements are discarded. For pairs of sequences, both sequences in
-a pair must satisfy the requirements, otherwise both are
-discarded. If no shortening or filtering options are given, all
-sequences are written to the output files, possibly after conversion
-from FASTQ to FASTA format. The \-\-relabel option may be used to
-relabel the output sequences. The \-\-eeout option may be used to output the
-expected number of errors in each sequence. After all sequences have
-been processed, the number of kept and discarded sequences will be
-shown, as well as how many of the kept sequences were trimmed.
+a pair must satisfy the requirements, otherwise both are discarded. If
+no shortening or filtering options are given, all sequences are
+written to the output files, possibly after conversion from FASTQ to
+FASTA format. The \-\-relabel option may be used to relabel the output
+sequences. The \-\-eeout option may be used to output the expected
+number of errors in each sequence. After all sequences have been
+processed, the number of kept and discarded sequences will be shown,
+as well as how many of the kept sequences were trimmed. When the input
+is in FASTA format, the following options are not accepted because
+quality scores are not available: \-\-eeout, \-\-fastq_ascii,
+\-\-fastq_eeout, \-\-fastq_maxee, \-\-fastq_maxee_rate, \-\-fastq_out,
+\-\-fastq_qmax, \-\-fastq_qmin, \-\-fastq_truncee,
+\-\-fastq_truncqual, \-\-fastqout_discarded,
+\-\-fastqout_discarded_rev, \-\-fastqout_rev.
.TAG fastx_revcomp
.TP
.BI \-\-fastx_revcomp \0filename
@@ -4285,6 +4291,13 @@ error messages when parsing FASTQ files. Add missing fastq_qminout
option and fix label_suffix option for fastq_mergepairs. Add derep_id
command that dereplicates based on both label and sequence. Remove
compilation warnings.
+.TP
+.BR v2.15.1\~ "released October 28th, 2020"
+Fix for dereplication when including reverse complement sequences and
+headers. Make some extra checks when loading compression libraries and
+add more diagnostic output about them to the output of the version
+command. Report an error when fastx_filter is used with FASTA input
+and options that require FASTQ input. Update manual.
.LP
.\" ============================================================================
.\" TODO:
=====================================
src/arch.cc
=====================================
@@ -307,3 +307,16 @@ const char * xstrcasestr(const char * haystack, const char * needle)
return strcasestr(haystack, needle);
#endif
}
+
+#ifdef _WIN32
+FARPROC arch_dlsym(HMODULE handle, const char * symbol)
+#else
+void * arch_dlsym(void * handle, const char * symbol)
+#endif
+{
+#ifdef _WIN32
+ return GetProcAddress(handle, symbol);
+#else
+ return dlsym(handle, symbol);
+#endif
+}
=====================================
src/arch.h
=====================================
@@ -83,3 +83,9 @@ int xopen_read(const char * path);
int xopen_write(const char * path);
const char * xstrcasestr(const char * haystack, const char * needle);
+
+#ifdef _WIN32
+FARPROC arch_dlsym(HMODULE handle, const char * symbol);
+#else
+void * arch_dlsym(void * handle, const char * symbol);
+#endif
=====================================
src/derep.cc
=====================================
@@ -387,9 +387,13 @@ void derep(char * input_filename, bool use_header)
collision when the number of sequences is about 5e9.
*/
- uint64_t hash = HASH(seq_up, seqlen);
+ uint64_t hash_header;
if (use_header)
- hash ^= HASH(header, headerlen);
+ hash_header = HASH(header, headerlen);
+ else
+ hash_header = 0;
+
+ uint64_t hash = HASH(seq_up, seqlen) ^ hash_header;
uint64_t j = hash & hash_mask;
struct bucket * bp = hashtable + j;
@@ -408,7 +412,7 @@ void derep(char * input_filename, bool use_header)
/* no match on plus strand */
/* check minus strand as well */
- uint64_t rc_hash = HASH(rc_seq_up, seqlen);
+ uint64_t rc_hash = HASH(rc_seq_up, seqlen) ^ hash_header;
uint64_t k = rc_hash & hash_mask;
struct bucket * rc_bp = hashtable + k;
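
Note on the src/derep.cc hunks above: the header hash is now computed once and XORed into both the plus-strand and the minus-strand hash, so a reverse-complement lookup lands in the same bucket as the entry it should match when headers are part of the key. The following is a minimal sketch of that idea, not the vsearch code. `toy_hash`, `reverse_complement` and `lookup_hash` are hypothetical names, and `std::hash` stands in for the project's `HASH` macro.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Stand-in for vsearch's HASH macro (assumption: any 64-bit string hash
// is good enough to illustrate the point).
inline uint64_t toy_hash(const std::string & s)
{
  return static_cast<uint64_t>(std::hash<std::string>{}(s));
}

// Reverse-complement an upper-case nucleotide sequence.
// Characters other than A, C, G, T are reversed but left unchanged.
inline std::string reverse_complement(const std::string & seq)
{
  std::string rc(seq.rbegin(), seq.rend());
  for (char & c : rc)
    {
      switch (c)
        {
        case 'A': c = 'T'; break;
        case 'C': c = 'G'; break;
        case 'G': c = 'C'; break;
        case 'T': c = 'A'; break;
        default: break;
        }
    }
  return rc;
}

// Combine the sequence hash with the header hash, as in the patched code:
// the header contributes 0 when headers are not part of the key.
inline uint64_t lookup_hash(const std::string & seq,
                            const std::string & header,
                            bool use_header)
{
  const uint64_t hash_header = use_header ? toy_hash(header) : 0;
  return toy_hash(seq) ^ hash_header;
}
```

With this scheme, hashing the reverse complement of a query together with its header gives the same value as the plus-strand hash under which an identical entry was stored. The removed line hashed the reverse complement without the header, so it pointed at a different bucket whenever `use_header` was set.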
=====================================
src/dynlibs.cc
=====================================
@@ -72,13 +72,11 @@ const char gz_libname[] = "libz.so";
# endif
void * gz_lib;
# endif
-gzFile (*gzdopen_p)(int, const char *);
-int (*gzclose_p)(gzFile);
-int (*gzread_p)(gzFile, void *, unsigned);
-int (*gzgetc_p)(gzFile);
-int (*gzrewind_p)(gzFile);
-int (*gzungetc_p)(int, gzFile);
-const char * (*gzerror_p)(gzFile, int*);
+
+gzFile ZEXPORT (*gzdopen_p) OF((int, const char *));
+int ZEXPORT (*gzclose_p) OF((gzFile));
+int ZEXPORT (*gzread_p) OF((gzFile, void *, unsigned));
+
#endif
#ifdef HAVE_BZLIB_H
@@ -98,19 +96,6 @@ void (*BZ2_bzReadClose_p)(int*, BZFILE*);
int (*BZ2_bzRead_p)(int*, BZFILE*, void*, int);
#endif
-#ifdef _WIN32
-FARPROC arch_dlsym(HMODULE handle, const char * symbol)
-#else
-void * arch_dlsym(void * handle, const char * symbol)
-#endif
-{
-#ifdef _WIN32
- return GetProcAddress(handle, symbol);
-#else
- return dlsym(handle, symbol);
-#endif
-}
-
void dynlibs_open()
{
#ifdef HAVE_ZLIB_H
@@ -124,10 +109,8 @@ void dynlibs_open()
gzdopen_p = (gzFile (*)(int, const char*)) arch_dlsym(gz_lib, "gzdopen");
gzclose_p = (int (*)(gzFile)) arch_dlsym(gz_lib, "gzclose");
gzread_p = (int (*)(gzFile, void*, unsigned)) arch_dlsym(gz_lib, "gzread");
- gzgetc_p = (int (*)(gzFile)) arch_dlsym(gz_lib, "gzgetc");
- gzrewind_p = (int (*)(gzFile)) arch_dlsym(gz_lib, "gzrewind");
- gzerror_p = (const char * (*)(gzFile, int*)) arch_dlsym(gz_lib, "gzerror");
- gzungetc_p = (int (*)(int, gzFile)) arch_dlsym(gz_lib, "gzungetc");
+ if (!(gzdopen_p && gzclose_p && gzread_p))
+ fatal("Invalid compression library (zlib)");
}
#endif
@@ -145,6 +128,8 @@ void dynlibs_open()
arch_dlsym(bz2_lib, "BZ2_bzReadClose");
BZ2_bzRead_p = (int (*)(int*, BZFILE*, void*, int))
arch_dlsym(bz2_lib, "BZ2_bzRead");
+ if (!(BZ2_bzReadOpen_p && BZ2_bzReadClose_p && BZ2_bzRead_p))
+ fatal("Invalid compression library (bz2)");
}
#endif
}
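
Note on the src/dynlibs.cc hunks above: after loading, the code now refuses to continue unless every required symbol was resolved. A possible variation, sketched below under stated assumptions, reports which symbol is missing. `first_missing_symbol` is a hypothetical helper that is not part of vsearch, and the `lookup` callback is assumed to wrap something like `arch_dlsym(gz_lib, name)` (on Windows the `FARPROC` result would need a cast).

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: returns the name of the first symbol that `lookup`
// fails to resolve, or an empty string when every symbol resolves.
inline std::string first_missing_symbol(
    const std::function<void *(const char *)> & lookup,
    const std::vector<std::string> & names)
{
  for (const std::string & name : names)
    {
      if (lookup(name.c_str()) == nullptr)
        {
          return name;
        }
    }
  return std::string();
}
```

A caller could pass the list `{"gzdopen", "gzclose", "gzread"}` and include the returned name in the fatal error message, which may help when diagnosing an incompatible `zlib1.dll`.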
=====================================
src/filter.cc
=====================================
@@ -223,11 +223,29 @@ void filter(bool fastq_only, char * filename)
if (!h1)
fatal("Unrecognized file type (not proper FASTA or FASTQ format)");
- if (fastq_only && ! h1->is_fastq)
- fatal("FASTA input files not allowed with fastq_filter, consider using fastx_filter command instead");
-
- if ((opt_fastqout || opt_fastqout_discarded) && ! h1->is_fastq)
- fatal("Cannot write FASTQ output with FASTA input file (no quality scores)");
+ if (! h1->is_fastq)
+ {
+ if (fastq_only)
+ {
+ fatal("FASTA input files not allowed with fastq_filter, consider using fastx_filter command instead");
+ }
+ else if (opt_eeout ||
+ (opt_fastq_ascii != 33) ||
+ opt_fastq_eeout ||
+ (opt_fastq_maxee < DBL_MAX) ||
+ (opt_fastq_maxee_rate < DBL_MAX) ||
+ opt_fastqout ||
+ (opt_fastq_qmax < 41) ||
+ (opt_fastq_qmin > 0) ||
+ (opt_fastq_truncee < DBL_MAX) ||
+ (opt_fastq_truncqual < LONG_MIN) ||
+ opt_fastqout_discarded ||
+ opt_fastqout_discarded_rev ||
+ opt_fastqout_rev)
+ {
+ fatal("The following options are not accepted with the fastx_filter command when the input is a FASTA file, because quality scores are not available: eeout, fastq_ascii, fastq_eeout, fastq_maxee, fastq_maxee_rate, fastq_out, fastq_qmax, fastq_qmin, fastq_truncee, fastq_truncqual, fastqout_discarded, fastqout_discarded_rev, fastqout_rev");
+ }
+ }
uint64_t filesize = fastx_get_size(h1);
@@ -238,11 +256,32 @@ void filter(bool fastq_only, char * filename)
if (!h2)
fatal("Unrecognized file type (not proper FASTA or FASTQ format) for reverse reads");
- if (fastq_only && ! h2->is_fastq)
- fatal("FASTA input files not allowed with fastq_filter, consider using fastx_filter command instead");
+ if (h1->is_fastq != h2->is_fastq)
+ fatal("The forward and reverse input sequence must in the same format, either FASTA or FASTQ");
- if ((opt_fastqout_rev || opt_fastqout_discarded_rev) && ! h2->is_fastq)
- fatal("Cannot write FASTQ output with a FASTA input file, lacking quality scores");
+ if (! h2->is_fastq)
+ {
+ if (fastq_only)
+ {
+ fatal("FASTA input files not allowed with fastq_filter, consider using fastx_filter command instead");
+ }
+ else if (opt_eeout ||
+ (opt_fastq_ascii != 33) ||
+ opt_fastq_eeout ||
+ (opt_fastq_maxee < DBL_MAX) ||
+ (opt_fastq_maxee_rate < DBL_MAX) ||
+ opt_fastqout ||
+ (opt_fastq_qmax < 41) ||
+ (opt_fastq_qmin > 0) ||
+ (opt_fastq_truncee < DBL_MAX) ||
+ (opt_fastq_truncqual < LONG_MIN) ||
+ opt_fastqout_discarded ||
+ opt_fastqout_discarded_rev ||
+ opt_fastqout_rev)
+ {
+ fatal("The following options are not accepted with the fastx_filter command when the input is a FASTA file, because quality scores are not available: eeout, fastq_ascii, fastq_eeout, fastq_maxee, fastq_maxee_rate, fastq_out, fastq_qmax, fastq_qmin, fastq_truncee, fastq_truncqual, fastqout_discarded, fastqout_discarded_rev, fastqout_rev");
+ }
+ }
}
FILE * fp_fastaout = 0;
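
Note on the src/filter.cc hunks above: the same option test appears twice, once for the forward file and once for the reverse file. The sketch below shows how it could be factored into a single predicate. It is an illustration only: `FilterOptions` and `needs_quality_scores` are hypothetical names, the output file options are modelled as booleans rather than file names, and the defaults are inferred from the comparisons in the diff. One assumption deserves attention: the diff tests `opt_fastq_truncqual < LONG_MIN`, which cannot be true for a `long`, so the sketch assumes `> LONG_MIN` was intended.

```cpp
#include <cfloat>
#include <climits>

// Hypothetical subset of the filter options. Defaults are inferred from
// the comparisons in the diff, not taken from the vsearch sources.
struct FilterOptions
{
  bool eeout = false;
  bool fastq_eeout = false;
  long fastq_ascii = 33;
  double fastq_maxee = DBL_MAX;
  double fastq_maxee_rate = DBL_MAX;
  long fastq_qmax = 41;
  long fastq_qmin = 0;
  double fastq_truncee = DBL_MAX;
  long fastq_truncqual = LONG_MIN;
  bool fastqout = false;               // true when an output file was named
  bool fastqout_discarded = false;
  bool fastqout_discarded_rev = false;
  bool fastqout_rev = false;
};

// True if any option that depends on quality scores differs from its
// default, i.e. the options cannot be honoured for FASTA input.
inline bool needs_quality_scores(const FilterOptions & o)
{
  return o.eeout ||
    (o.fastq_ascii != 33) ||
    o.fastq_eeout ||
    (o.fastq_maxee < DBL_MAX) ||
    (o.fastq_maxee_rate < DBL_MAX) ||
    o.fastqout ||
    (o.fastq_qmax < 41) ||
    (o.fastq_qmin > 0) ||
    (o.fastq_truncee < DBL_MAX) ||
    (o.fastq_truncqual > LONG_MIN) ||  // assumption: the diff has `<` here
    o.fastqout_discarded ||
    o.fastqout_discarded_rev ||
    o.fastqout_rev;
}
```

With such a predicate, both the forward and the reverse input checks could call the same function before emitting the error message.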
=====================================
src/vsearch.cc
=====================================
@@ -4192,7 +4192,23 @@ void cmd_version()
#ifdef HAVE_ZLIB_H
printf("Compiled with support for gzip-compressed files,");
if (gz_lib)
- printf(" and the library is loaded.\n");
+ {
+ printf(" and the library is loaded.\n");
+
+ char * (*zlibVersion_p)();
+ zlibVersion_p = (char * (*)()) arch_dlsym(gz_lib,
+ "zlibVersion");
+ char * gz_version = (*zlibVersion_p)();
+ uLong (*zlibCompileFlags_p)(void);
+ zlibCompileFlags_p = (uLong (*)()) arch_dlsym(gz_lib,
+ "zlibCompileFlags");
+ uLong flags = (*zlibCompileFlags_p)();
+
+ printf("zlib version %s, compile flags %lx", gz_version, flags);
+ if (flags & 0x0400)
+ printf(" (ZLIB_WINAPI)");
+ printf("\n");
+ }
else
printf(" but the library was not found.\n");
#else
View it on GitLab: https://salsa.debian.org/med-team/vsearch/-/compare/91780227165a2cd116ba9198a8bf8990fca730d9...63ce66fe858819f1b3391d959bdbf10f946b34bc