[med-svn] [Git][med-team/vsearch][upstream] New upstream version 2.23.0

Andreas Tille (@tille) gitlab at salsa.debian.org
Thu Jul 13 12:31:45 BST 2023



Andreas Tille pushed to branch upstream at Debian Med / vsearch


Commits:
e00e6bba by Andreas Tille at 2023-07-13T13:24:27+02:00
New upstream version 2.23.0
- - - - -


27 changed files:

- .travis.yml
- + CITATION.cff
- README.md
- configure.ac
- man/vsearch.1
- src/align.cc
- src/align_simd.cc
- src/attributes.cc
- src/attributes.h
- src/chimera.cc
- src/cluster.cc
- src/derep.cc
- src/derepsmallmem.cc
- src/fasta.cc
- src/fastq.cc
- src/getseq.cc
- src/mergepairs.cc
- src/msa.cc
- src/results.cc
- src/search.cc
- src/searchcore.cc
- src/searchcore.h
- src/searchexact.cc
- src/udb.cc
- src/vsearch.cc
- src/vsearch.h
- src/xstring.h


Changes:

=====================================
.travis.yml
=====================================
@@ -4,7 +4,7 @@ language:
 arch:
 #- amd64
 - arm64
-- ppc64le
+#- ppc64le
 
 os:
 - linux


=====================================
CITATION.cff
=====================================
@@ -0,0 +1,46 @@
+cff-version: 1.2.0
+message: "If you use this software, please cite it as below."
+authors:
+- family-names: "Rognes"
+  given-names: "Torbjørn"
+  orcid: "https://orcid.org/0000-0002-9329-9974"
+- family-names: "Flouri"
+  given-names: "Tomas"
+  orcid: "https://orcid.org/0000-0002-8474-9507"
+- family-names: "Nichols"
+  given-names: "Benjamin"
+- family-names: "Quince"
+  given-names: "Christopher"
+  orcid: "https://orcid.org/0000-0003-1884-8440"
+- family-names: "Mahé"
+  given-names: "Frédéric"
+  orcid: "https://orcid.org/0000-0002-2808-0984"
+title: "VSEARCH: versatile open-source tool for microbiome analysis"
+version: 2.22.1
+date-released: 2022-09-19
+url: "https://github.com/torognes/vsearch"
+preferred-citation:
+  type: article
+  authors:
+  - family-names: "Rognes"
+    given-names: "Torbjørn"
+    orcid: "https://orcid.org/0000-0002-9329-9974"
+  - family-names: "Flouri"
+    given-names: "Tomas"
+    orcid: "https://orcid.org/0000-0002-8474-9507"
+  - family-names: "Nichols"
+    given-names: "Ben"
+  - family-names: "Quince"
+    given-names: "Christopher"
+    orcid: "https://orcid.org/0000-0003-1884-8440"
+  - family-names: "Mahé"
+    given-names: "Frédéric"
+    orcid: "https://orcid.org/0000-0002-2808-0984"
+  doi: "10.7717/peerj.2584"
+  journal: "PeerJ"
+  day: 18
+  month: 10
+  start: e2584 # First page number
+  title: "VSEARCH: a versatile open source tool for metagenomics"
+  volume: 4
+  year: 2016


=====================================
README.md
=====================================
@@ -1,4 +1,4 @@
-[![Build Status](https://travis-ci.com/torognes/vsearch.svg?branch=master)](https://travis-ci.com/torognes/vsearch)
+[![Build Status](https://app.travis-ci.com/torognes/vsearch.svg?branch=master)](https://app.travis-ci.com/torognes/vsearch)
 
 # VSEARCH
 
@@ -37,7 +37,7 @@ Most of the nucleotide based commands and options in USEARCH version 7 are suppo
 
 ## Getting Help
 
-If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.22.1/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
+If you can't find an answer in the [VSEARCH documentation](https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch_manual.pdf), please visit the [VSEARCH Web Forum](https://groups.google.com/forum/#!forum/vsearch-forum) to post a question or start a discussion.
 
 ## Example
 
@@ -50,9 +50,9 @@ In the example below, VSEARCH will identify sequences in the file database.fsa t
 **Source distribution** To download the source distribution from a [release](https://github.com/torognes/vsearch/releases) and build the executable and the documentation, use the following commands:
 
 ```
-wget https://github.com/torognes/vsearch/archive/v2.22.1.tar.gz
-tar xzf v2.22.1.tar.gz
-cd vsearch-2.22.1
+wget https://github.com/torognes/vsearch/archive/v2.23.0.tar.gz
+tar xzf v2.23.0.tar.gz
+cd vsearch-2.23.0
 ./autogen.sh
 ./configure CFLAGS="-O3" CXXFLAGS="-O3"
 make
@@ -61,7 +61,7 @@ make install  # as root or sudo make install
 
 You may customize the installation directory using the `--prefix=DIR` option to `configure`. If the compression libraries [zlib](https://www.zlib.net) and/or [bzip2](https://www.sourceware.org/bzip2/) are installed on the system, they will be detected automatically and support for compressed files will be included in vsearch. Support for compressed files may be disabled using the `--disable-zlib` and `--disable-bzip2` options to `configure`. A PDF version of the manual will be created from the `vsearch.1` manual file if `ps2pdf` is available, unless disabled using the `--disable-pdfman` option to `configure`. It is recommended to run configure with the options `CFLAGS="-O3"` and `CXXFLAGS="-O3"`. Other  options may also be applied to `configure`, please run `configure -h` to see them all. GNU autoconf (version 2.63 or later), automake and the GCC C++ compiler is required to build vsearch. Version 3.82 or later of Make may be required on Linux, while version 3.81 is sufficient on macOS.
 
-The distributed Linux ppc64le and aarch64 binaries and the Windows binary were compiled using the [Mingw-w64](http://mingw-w64.org/) C++ cross-compiler.
+The distributed Linux ppc64le and aarch64 binaries were compiled using the C++ cross-compiler. The Windows binary was built using [Mingw-w64](http://mingw-w64.org/).
 
 **Cloning the repo** Instead of downloading the source distribution as a compressed archive, you could clone the repo and build it as shown below. The options to `configure` as described above are still valid.
 
@@ -81,43 +81,59 @@ Binary distributions are provided for x86-64 systems running GNU/Linux, macOS (v
 Download the appropriate executable for your system using the following commands if you are using a Linux x86_64 system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.22.1/vsearch-2.22.1-linux-x86_64.tar.gz
-tar xzf vsearch-2.22.1-linux-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-linux-x86_64.tar.gz
+tar xzf vsearch-2.23.0-linux-x86_64.tar.gz
 ```
 
 Or these commands if you are using a Linux ppc64le system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.22.1/vsearch-2.22.1-linux-ppc64le.tar.gz
-tar xzf vsearch-2.22.1-linux-ppc64le.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-linux-ppc64le.tar.gz
+tar xzf vsearch-2.23.0-linux-ppc64le.tar.gz
 ```
 
 Or these commands if you are using a Linux aarch64 (arm64) system:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.22.1/vsearch-2.22.1-linux-aarch64.tar.gz
-tar xzf vsearch-2.22.1-linux-aarch64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-linux-aarch64.tar.gz
+tar xzf vsearch-2.23.0-linux-aarch64.tar.gz
 ```
 
-Or these commands if you are using a Mac:
+Or these commands if you are using a Mac with an Apple Silicon CPU:
 
 ```sh
-wget https://github.com/torognes/vsearch/releases/download/v2.22.1/vsearch-2.22.1-macos-x86_64.tar.gz
-tar xzf vsearch-2.22.1-macos-x86_64.tar.gz
+wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-macos-aarch64.tar.gz
+tar xzf vsearch-2.23.0-macos-aarch64.tar.gz
+```
+
+Or these commands if you are using a Mac with an Intel CPU:
+
+```sh
+wget https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-macos-x86_64.tar.gz
+tar xzf vsearch-2.23.0-macos-x86_64.tar.gz
 ```
 
 Or if you are using Windows, download and extract (unzip) the contents of this file:
 
 ```
-https://github.com/torognes/vsearch/releases/download/v2.22.1/vsearch-2.22.1-win-x86_64.zip
+https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch-2.23.0-win-x86_64.zip
 ```
 
-Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.22.1-linux-x86_64` or `vsearch-2.22.1-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`. Versions with statically compiled libraries are available for Linux systems. These have "-static" in their name, and could be used on systems that do not have all the necessary libraries installed.
+Linux and Mac: You will now have the binary distribution in a folder called `vsearch-2.23.0-linux-x86_64` or `vsearch-2.23.0-macos-x86_64` in which you will find three subfolders `bin`, `man` and `doc`. We recommend making a copy or a symbolic link to the vsearch binary `bin/vsearch` in a folder included in your `$PATH`, and a copy or a symbolic link to the vsearch man page `man/vsearch.1` in a folder included in your `$MANPATH`. The PDF version of the manual is available in `doc/vsearch_manual.pdf`. Versions with statically compiled libraries are available for Linux systems. These have "-static" in their name, and could be used on systems that do not have all the necessary libraries installed.
 
-Windows: You will now have the binary distribution in a folder called `vsearch-2.22.1-win-x86_64`. The vsearch executable is called `vsearch.exe`. The manual in PDF format is called `vsearch_manual.pdf`.
+**Windows**: You will now have the binary distribution in a folder
+called `vsearch-2.23.0-win-x86_64`. The vsearch executable is called
+`vsearch.exe`. The manual in PDF format is called
+`vsearch_manual.pdf`. If you want to be able to call `vsearch.exe`
+from any command prompt window, you can put the vsearch executable in
+a folder (for instance `C:\Users\<yourname>\bin`), and add the new
+folder to the user `Path`: open the `Environment Variables` window by
+searching for it in the Start menu, `Edit` user variables, add
+`;C:\Users\<yourname>\bin` to the end of the `Path` variable, and save
+your changes.
 
 
-**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.22.1/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or a create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.22.1/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
+**Documentation** The VSEARCH user's manual is available in the `man` folder in the form of a [man page](https://github.com/torognes/vsearch/blob/master/man/vsearch.1). A pdf version ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch_manual.pdf)) will be generated by `make`. To install the manpage manually, copy the `vsearch.1` file or create a symbolic link to `vsearch.1` in a folder included in your `$MANPATH`. The manual in both formats is also available with the binary distribution. The manual in PDF form ([vsearch_manual.pdf](https://github.com/torognes/vsearch/releases/download/v2.23.0/vsearch_manual.pdf)) is also attached to the latest [release](https://github.com/torognes/vsearch/releases).
 
 
 ## Packages, plugins, and wrappers
@@ -314,16 +330,26 @@ and the [Protist Ribosomal Reference Database (PR<sup>2</sup>)](https://github.c
 *Bioinformatics*, 26 (19): 2460-2461.
 doi:[10.1093/bioinformatics/btq461](https://doi.org/10.1093/bioinformatics/btq461)
 
-* Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R (2011)
-**UCHIME improves sensitivity and speed of chimera detection.**
-*Bioinformatics*, 27 (16): 2194-2200.
-doi:[10.1093/bioinformatics/btr381](https://doi.org/10.1093/bioinformatics/btr381)
+* Edgar RC (2016)
+**SINTAX: a simple non-Bayesian taxonomy classifier for 16S and ITS sequences.**
+*bioRxiv*.
+doi:[10.1101/074161](https://doi.org/10.1101/074161)
+
+* Edgar RC (2016)
+**UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequencing.**
+*bioRxiv*.
+doi:[10.1101/081257](https://doi.org/10.1101/081257)
 
 * Edgar RC, Flyvbjerg H (2015)
 **Error filtering, pair assembly and error correction for next-generation sequencing reads.**
 *Bioinformatics*, 31 (21): 3476-3482.
 doi:[10.1093/bioinformatics/btv401](https://doi.org/10.1093/bioinformatics/btv401)
 
+* Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R (2011)
+**UCHIME improves sensitivity and speed of chimera detection.**
+*Bioinformatics*, 27 (16): 2194-2200.
+doi:[10.1093/bioinformatics/btr381](https://doi.org/10.1093/bioinformatics/btr381)
+
 * Guillou L, Bachar D, Audic S, Bass D, Berney C, Bittner L, Boutte C, Burgaud G, de Vargas C, Decelle J, del Campo J, Dolan J, Dunthorn M, Edvardsen B, Holzmann M, Kooistra W, Lara E, Lebescot N, Logares R, Mahé F, Massana R, Montresor M, Morard R, Not F, Pawlowski J, Probert I, Sauvadet A-L, Siano R, Stoeck T, Vaulot D, Zimmermann P & Christen R (2013)
 **The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy.**
 *Nucleic Acids Research*, 41 (D1), D597-D604.


=====================================
configure.ac
=====================================
@@ -2,7 +2,7 @@
 # Process this file with autoconf to produce a configure script.
 
 AC_PREREQ([2.63])
-AC_INIT([vsearch], [2.22.1], [torognes at ifi.uio.no], [vsearch], [https://github.com/torognes/vsearch])
+AC_INIT([vsearch], [2.23.0], [torognes at ifi.uio.no], [vsearch], [https://github.com/torognes/vsearch])
 AC_CANONICAL_TARGET
 AM_INIT_AUTOMAKE([subdir-objects])
 AC_LANG([C++])


=====================================
man/vsearch.1
=====================================
@@ -1,5 +1,5 @@
 .\" ============================================================================
-.TH vsearch 1 "September 19, 2022" "version 2.22.1" "USER COMMANDS"
+.TH vsearch 1 "July 7, 2023" "version 2.23.0" "USER COMMANDS"
 .\" ============================================================================
 .SH NAME
 vsearch \(em a versatile open-source tool for microbiome analysis,
@@ -123,7 +123,7 @@ Masking:
 .RE
 Orienting:
 .RS
-\fBvsearch\fR \-\-orient \fIfastxfile\fR \-\-db \fIfastafile\fR
+\fBvsearch\fR \-\-orient \fIfastxfile\fR \-\-db \fIfastxfile\fR
 (\-\-fastaout | \-\-fastqout | \-\-notmatched | \-\-tabbedout)
 \fIoutputfile\fR [\fIoptions\fR]
 .PP
@@ -285,12 +285,17 @@ dereplication. To do so, file names can be replaced with:
 .RS
 .IP - 2
 the symbol '-', representing '/dev/stdin' for input files
-or '/dev/stdout' for output files,
+or '/dev/stdout' for output files (with an exception for '\-\-db \-',
+see * below),
 .IP -
 a named pipe created with the command mkfifo,
 .IP -
 a process substitution '<(command)' as input or '>(command)' as
 output.
+.IP *
+\-\-db \- is not accepted, to prevent potential concurrent reads from
+stdin. A workaround for advanced users is to call '\-\-db /dev/stdin'
+directly.
 .RE
 .PP
 \fBvsearch\fR can automatically read compressed gzip or bzip2 files if
@@ -381,6 +386,7 @@ All \fBvsearch\fR operations discard sequences shorter than
 \fIinteger\fR: 1 nucleotide by default for sorting or shuffling, 32
 nucleotides for clustering and dereplication as well as the commands
 \-\-makeudb_usearch, \-\-sintax, and \-\-usearch_global.
+.\" note: minseqlength can be set to zero (keep empty entries)
 .TAG no_progress
 .TP
 .B \-\-no_progress
@@ -488,6 +494,11 @@ as a chimera (less false positives, but also more false negatives).
 Add the chimera score to the headers in the fasta output files for
 chimeras, non-chimeras and borderline sequences, using the
 format ';uchime_denovo=\fIfloat\fR;'.
+.TAG lengthout
+.TP
+.B \-\-lengthout
+Write sequence length information to the output files in FASTA format
+by adding a ";length=\fIinteger\fR" attribute in the header.
 .TAG mindiffs
 .TP
 .BI \-\-mindiffs\~ "positive integer"
@@ -667,15 +678,20 @@ When using \-\-uchimeout, write chimera detection results using a
 17\-field, tab\-separated uchime\-like format (drop the 5th field of
 \-\-uchimeout), compatible with usearch version 5 and earlier
 versions.
+.TAG xlength
 .TP
+.B \-\-xlength
+Strip sequence length information from the headers when writing the
+output file. This information is added by the \-\-lengthout option.
 .TAG xn
+.TP
 .BI \-\-xn\~ "strictly positive real number"
 weight of no votes, corresponding to the parameter \fIbeta\fR in the
 scoring function (default value is 8.0). Increasing \-\-xn reduces the
 likelihood of tagging a sequence as a chimera (less false positives,
 but also more false negatives).
-.TP
 .TAG xsize
+.TP
 .B \-\-xsize
 Strip abundance information from the headers when writing the output
 file.
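The `--lengthout` and `--xlength` options documented in this hunk and the one before it add and strip a `;length=integer` header attribute. The sketch below illustrates that idea only. The function names are hypothetical, and the handling of trailing semicolons and attribute order is an assumption, since the man page text does not specify it; this is not vsearch's implementation.

```python
import re

# Matches a ";length=<digits>" attribute anywhere in a header (assumed layout).
_LENGTH_ATTR = re.compile(r";length=\d+")


def strip_length(header: str) -> str:
    """Remove any ';length=N' attribute, loosely mirroring --xlength."""
    return _LENGTH_ATTR.sub("", header)


def add_length(header: str, sequence: str) -> str:
    """Append ';length=N' for the given sequence, loosely mirroring --lengthout.

    Any existing length attribute is dropped first so it is not duplicated.
    """
    base = strip_length(header).rstrip(";")
    return f"{base};length={len(sequence)}"


if __name__ == "__main__":
    print(add_length("seq1;size=5", "ACGT"))
    print(strip_length("seq1;size=5;length=4"))
```

Stripping before appending keeps the sketch idempotent: applying `add_length` twice yields a single attribute.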
@@ -775,10 +791,20 @@ by decreasing sequence length, unless \-\-usersort is used.
 .BI \-\-cluster_unoise \0filename
 Perform denoising of the fasta sequences in \fIfilename\fR according
 to the UNOISE version 3 algorithm by Robert Edgar, but without the
-chimera removal step. The options \-\-minsize (default 8) and
-\-\-unoise_alpha (default 2.0) may be specified. Chimera removal
-(\fIde novo\fR) should be performed afterwards with
-\-\-uchime3_denovo.
+\fIde novo\fR chimera removal step, which may be performed afterwards
+with \-\-uchime3_denovo. The options \-\-minsize (default 8) and
+\-\-unoise_alpha (default 2.0) may be specified. In this
+algorithm, clustering of sequences depends on both the sequence
+distance and the abundance ratio. The abundance ratio (skew) is the
+abundance of a new sequence divided by the abundance of the centroid
+sequence. This skew must not be larger than beta if the sequences
+should be clustered together. Beta is calculated as 2 raised to the
+power of (minus 1 minus alpha times the sequence distance). The sequence
+distance used is the number of mismatches in the alignment, ignoring
+gaps. This means that the abundance must be exponentially lower as the
+distance increases from the centroid for a new sequence to be included
+in the cluster. Nearer sequences with higher abundances will form
+their own new clusters.
 .TAG clusters
 .TP
 .BI \-\-clusters \0string
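The skew rule added to the `--cluster_unoise` description above can be written out as a short calculation. This is a minimal sketch of the formula as the man page states it (beta = 2 to the power of -1 - alpha * distance). The function names are hypothetical, and it ignores every other clustering criterion that vsearch applies.

```python
def unoise_beta(distance: int, alpha: float = 2.0) -> float:
    """Largest abundance skew allowed at a given mismatch distance."""
    return 2.0 ** (-1.0 - alpha * distance)


def may_join_cluster(new_abundance: int, centroid_abundance: int,
                     distance: int, alpha: float = 2.0) -> bool:
    """True if the skew (new / centroid abundance) does not exceed beta."""
    skew = new_abundance / centroid_abundance
    return skew <= unoise_beta(distance, alpha)


if __name__ == "__main__":
    for d in range(1, 4):
        print(f"distance {d}: beta = {unoise_beta(d)}")
    # A sequence at one tenth of the centroid abundance:
    print(may_join_cluster(10, 100, distance=1))
    print(may_join_cluster(10, 100, distance=2))
```

With the default alpha of 2.0, beta drops by a factor of four per additional mismatch, which is the exponential decrease the text describes.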
@@ -789,11 +815,13 @@ filenames.
 .TP
 .BI \-\-consout \0filename
 Output cluster consensus sequences to \fIfilename\fR. For each
-cluster, a multiple alignment is computed, and a consensus sequence is
+cluster, a center-star multiple sequence alignment is computed with
+the centroid as the center, using a fast algorithm (not accurate when
+using low pairwise identity thresholds). A consensus sequence is
 constructed by taking the majority symbol (nucleotide or gap) from
 each column of the alignment. Columns containing a majority of gaps
-are skipped, except for terminal gaps. If the \-\-sizein
-option is specified, sequence abundances will be taken into account.
+are skipped, except for terminal gaps. If the \-\-sizein option is
+specified, sequence abundances will be taken into account.
 .TAG cons_truncate
 .TP
 .B \-\-cons_truncate
@@ -836,6 +864,11 @@ BLAST definition, equivalent to \-\-iddef 1 in a context of global
 pairwise alignment.
 .RE
 .RE
+.TAG lengthout
+.TP
+.B \-\-lengthout
+Write sequence length information to the output files in FASTA format
+by adding a ";length=\fIinteger\fR" attribute in the header.
 .TAG minsize
 .TP
 .BI \-\-minsize\~ "positive integer"
@@ -1027,6 +1060,11 @@ default is 2.0.
 .B \-\-usersort
 When using \-\-cluster_smallmem, allow any sequence input order, not
 just a decreasing length ordering.
+.TAG xlength
+.TP
+.B \-\-xlength
+Strip sequence length information from the headers when writing the
+output file. This information is added by the \-\-lengthout option.
 .TAG xsize
 .TP
 .B \-\-xsize
@@ -1072,9 +1110,10 @@ a special UCLUST-like file specified with the \-\-uc option. The
 according to the abundance of each input sequence. Other valid options
 are \-\-fastq_ascii, \-\-fastq_asciiout, \-\-fastq_qmax,
 \-\-fastq_qmaxout, \-\-fastq_qmin, \-\-fastq_qminout,
-\-\-fastq_qout_max, \-\-maxuniquesize, \-\-minuniquesize, \-\-relabel,
-\-\-relabel_keep, \-\-relabel_md5, \-\-relabel_self, \-\-relabel_sha1,
-\-\-sizein, \-\-sizeout, \-\-strand, \-\-topn, and \-\-xsize.
+\-\-fastq_qout_max, \-\-lengthout, \-\-maxuniquesize,
+\-\-minuniquesize, \-\-relabel, \-\-relabel_keep, \-\-relabel_md5,
+\-\-relabel_self, \-\-relabel_sha1, \-\-sizein, \-\-sizeout,
+\-\-strand, \-\-topn, \-\-xlength, and \-\-xsize.
 .PP
 .TAG derep_fulllength
 .TP 9
@@ -1098,15 +1137,16 @@ Merge strictly identical sequences contained in \fIfilename\fR, as
 with the \-\-derep_fulllength command, but using much less memory. The
 output is written to a FASTA file specified with the \-\-fastaout
 option. The output is written in the order that the sequences first
-appear in the input, and not in decending abundance order as with the
+appear in the input, and not in descending abundance order as with the
 other dereplication commands. It can read, but not write FASTQ
 files. This command cannot read from a pipe, it must be a proper file,
 as it is read twice. Dereplication is performed with a 128 bit hash
 function and it is not verified that grouped sequences are identical,
 however the probability that two different sequences are grouped in a
-dataset of 1 000 000 000 unique sequences is approximately 1e-21.
-Multithreading and the options \-\-topn, \-\-uc, or \-\-tabbedout are
-not supported.
+dataset of 1 000 000 000 unique sequences is approximately
+1e-21. Memory footprint is approximately 24 bytes times the number of
+unique sequences. Multithreading and the options \-\-topn, \-\-uc, or
+\-\-tabbedout are not supported.
 .TAG derep_prefix
 .TP
 .BI \-\-derep_prefix \0filename
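The two figures quoted for `--derep_smallmem` above (a collision probability of about 1e-21 and about 24 bytes per unique sequence) can be sanity-checked with a few lines of arithmetic. The man page does not say how the probability was derived; the birthday-bound approximation used here is an assumption, and the function names are hypothetical.

```python
def collision_probability(n_unique: int, hash_bits: int = 128) -> float:
    """Birthday-bound approximation: n^2 / 2^(bits + 1).

    Assumes an ideal, uniformly distributed hash; valid when the result
    is much smaller than 1.
    """
    return n_unique ** 2 / 2.0 ** (hash_bits + 1)


def approx_memory_bytes(n_unique: int, bytes_per_entry: int = 24) -> int:
    """Rough memory footprint using the 24 bytes per unique sequence figure."""
    return n_unique * bytes_per_entry


if __name__ == "__main__":
    n = 1_000_000_000
    print(f"collision probability: {collision_probability(n):.2e}")
    print(f"memory: {approx_memory_bytes(n) / 1e9:.0f} GB")
```

Under these assumptions the estimate for one billion unique sequences is on the order of 1e-21, consistent with the value quoted in the man page.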
@@ -1201,6 +1241,12 @@ output files will correspond to the average error probability of the
 nucleotides in the each position. If the \-\-fastq_qout_max option is
 given, the quality score will be the highest (best) quality score
 observed in each position.
+.TAG lengthout
+.TP
+.B \-\-lengthout
+Write sequence length information to the output files in FASTA and
+FASTQ format by adding a ";length=\fIinteger\fR" attribute in the
+header.
 .TAG maxuniquesize
 .TP
 .BI \-\-maxuniquesize\~ "positive integer"
@@ -1335,6 +1381,11 @@ Label of the centroid sequence (H), or set to '*' (S, C).
 .RE
 .PP
 .RS
+.TAG xlength
+.TP
+.B \-\-xlength
+Strip sequence length information from the headers when writing the
+output file. This information is added by the \-\-lengthout option.
 .TAG xsize
 .TP
 .B \-\-xsize
@@ -1722,10 +1773,32 @@ When using \-\-fastq_filter, \-\-fastq_mergepairs or \-\-fastx_filter,
 discard sequences with an expected error greater than the specified
 number (value ranging from 0.0 to infinity). For a given sequence, the
 expected error is the sum of error probabilities for all the positions
-in the sequence. In practice, the expected error is greater than zero
-(error probabilities can be small but not null), and at most equal to
-the length of the sequence (when all positions have an error
-probability of 1.0).
+in the sequence. Since error probabilities can be small but not null,
+the expected error is always greater than zero, and at most equal to
+the length of the sequence when all positions in the sequence have an
+error probability of 1.0.
+
+Using the expected error as the \fIlambda\fR parameter in the Poisson
+distribution, it is possible to compute the probability of observing
+\fIk\fR errors. For instance, a read with an expected error of 1.0
+has:
+.RS
+.IP - 2
+36.8% chance of having zero error,
+.IP -
+36.8% chance of having one error,
+.IP -
+18.4% chance of having two errors,
+.IP -
+6.1% chance of having three errors,
+.IP -
+1.5% chance of having four errors,
+.IP -
+0.3% chance of having five errors,
+.IP -
+etc.
+.RE
+.PP
 .TAG fastq_maxee_rate
 .TP
 .BI \-\-fastq_maxee_rate\~ real
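The percentages listed in the `--fastq_maxee` text above follow from the Poisson probability mass function with lambda set to the expected error. The sketch below reproduces them and also shows the expected error as a sum of per-base error probabilities. The function names are hypothetical, and the Phred conversion 10^(-Q/10) is the standard definition rather than something stated in this hunk.

```python
import math


def expected_error(phred_scores: list[int]) -> float:
    """Sum of per-base error probabilities, using p = 10 ** (-Q / 10)."""
    return sum(10.0 ** (-q / 10.0) for q in phred_scores)


def prob_k_errors(expected: float, k: int) -> float:
    """Poisson probability of exactly k errors given the expected error."""
    return math.exp(-expected) * expected ** k / math.factorial(k)


if __name__ == "__main__":
    for k in range(6):
        print(f"{k} errors: {100 * prob_k_errors(1.0, k):.1f}%")
    # 36.8%, 36.8%, 18.4%, 6.1%, 1.5%, 0.3%
```

For example, a read of ten bases that all have quality 40 has an expected error of 10 × 0.0001 = 0.001.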
@@ -1791,7 +1864,7 @@ Other relevant options are: \-\-fastq_ascii, \-\-fastq_maxee,
 .TP
 .BI \-\-fastq_minlen\~ "positive integer"
 When using \-\-fastq_filter, \-\-fastq_mergepairs or \-\-fastx_filter,
-discard sequences with less than the specified number of bases
+discard input sequences with less than the specified number of bases
 (default 1).
 .TAG fastq_minmergelen
 .TP
@@ -2049,6 +2122,12 @@ When running \-\-fastq_join, use the \fIstring\fR as a quality padding
 string. The default is a string of I's equal in length to the sequence
 padding string. The letter I corresponds to a base quality score of 40
 indicating a very high quality base with error probability of 0.0001.
+.TAG lengthout
+.TP
+.B \-\-lengthout
+Write sequence length information to the output files in FASTA or
+FASTQ format by adding a ";length=\fIinteger\fR" attribute in the
+header.
 .TAG maxsize
 .TP
 .BI \-\-maxsize\~ "positive integer"
@@ -2114,6 +2193,11 @@ limited using the \-\-fastq_qminout and \-\-fastq_qmaxout options.
 Specifies that the sequences converted by the \-\-sff_convert command
 should be clipped in both ends as indicated in the SFF file. By
 default no clipping is performed.
+.TAG xlength
+.TP
+.B \-\-xlength
+Strip sequence length information from the headers when writing the
+output file. This information is added by the \-\-lengthout option.
 .TAG xsize
 .TP
 .B \-\-xsize
@@ -2221,7 +2305,7 @@ only to the \-\-fastx_mask command.
 .BI \-\-fastx_mask \0filename
 Mask regions in sequences contained
 in the specified fasta or fastq file. The default is to mask using
-DUST (use \-\-qmask to modify that behavior). The output files
+DUST (use \-\-qmask to modify that behaviour). The output files
 are specified with the \-\-fastaout and \-\-fastqout options. The
 minimum and maximum percentage of unmasked residues may be specified
 with the \-\-min_unmasked_pct and \-\-max_unmasked_pct options,
@@ -2234,11 +2318,11 @@ replace the masked regions by lower case letters.
 .TAG maskfasta
 .TP
 .BI \-\-maskfasta \0filename
-Mask regions in sequences contained
-in the fasta file \fIfilename\fR. The default is to mask using
-\fIdust\fR (use \-\-qmask to modify that behavior). The output file is
-specified with the \-\-output option. This command is depreciated,
-please use \-\-fastx_mask instead.
+Mask regions in sequences contained in the fasta file
+\fIfilename\fR. The default is to mask using \fIdust\fR (use \-\-qmask
+to modify that behaviour). The output file is specified with the
+\-\-output option. This command is deprecated; please use
+\-\-fastx_mask instead.
 .TAG max_unmasked_pct
 .TP
 .BI \-\-max_unmasked_pct \0real
@@ -2273,12 +2357,12 @@ The \-\-orient command can be used to orient the sequences in a given
 file in either the forward or the reverse complementary direction
 based on a reference database specified with the \-\-db option. The
 two strands of each input sequence are compared to the reference
-database using nucleotide words. If one of the strands share many more
-words with at least one sequence in the database than the other, that
-strand is chosen. The correctly oriented sequences may be written to a
-FASTA file specified with the \-\-fastaout, and to a FASTQ file
+database using nucleotide words. If one of the strands shares many
+more words with at least one sequence in the database than the other,
+that strand is chosen. The correctly oriented sequences may be written
+to a FASTA file specified with the \-\-fastaout, and to a FASTQ file
 specified with the \-\-fastqout option (as long as the input was also
-in FASTA format). If the result is uncertain, because the number of
+in FASTQ format). If the result is uncertain, because the number of
 matching words is too similar, the original sequence is written to the
 file specified with the \-\-notmatched option. The results may also be
 written to a tab-delimited text file specified with the \-\-tabbedout
@@ -2318,11 +2402,11 @@ the original format.
 Orient the sequences in the given file.
 .TAG tabbedout
 .TP
-.BI \-\-tabbedout \0 filename
+.BI \-\-tabbedout \0filename
 Write the results to a tab-delimited text file with the specified
-filename. This file will contain the query label, the direction (+, -
-or ?), the number of matching words on the forward strand, and the
-number of matching words on the reverse complementary strand.
+\fIfilename\fR. This file will contain the query label, the direction
+(+, - or ?), the number of matching words on the forward strand, and
+the number of matching words on the reverse complementary strand.
 .RE
 .PP
 .\" ----------------------------------------------------------------------------
@@ -2709,6 +2793,11 @@ useful. The fraction of matching hits required may be adjusted by the
 .TP
 .B \-\-leftjust
 Reject the sequence match if the pairwise alignment begins with gaps.
+.TAG lengthout
+.TP
+.B \-\-lengthout
+Write sequence length information to the output files in FASTA format
+by adding a ";length=\fIinteger\fR" attribute in the header.
 .TAG match
 .TP
 .BI \-\-match\~ "integer"
@@ -3117,6 +3206,11 @@ words. Memory requirements for a part of the index increase with a
 factor of 4 each time word length increases by one nucleotide, and
 this generally becomes significant for long words (12 or more). The
 default value is 8.
+.TAG xlength
+.TP
+.B \-\-xlength
+Strip sequence length information from the headers when writing the
+output file. This information is added by the \-\-lengthout option.
 .RE
 .PP
 .\" ----------------------------------------------------------------------------
@@ -3125,6 +3219,11 @@ Shuffling options:
 .RS
 Fasta entries in the input file are outputted in a pseudo-random
 order.
+.TAG lengthout
+.TP
+.B \-\-lengthout
+Write sequence length information to the output files in FASTA format
+by adding a ";length=\fIinteger\fR" attribute in the header.
 .TAG output
 .TP 9
 .BI \-\-output \0filename
@@ -3134,7 +3233,7 @@ Write the shuffled sequences to \fIfilename\fR, in fasta format.
 .BI \-\-randseed\~ "positive integer"
 When shuffling sequence order, use \fIinteger\fR as seed. A given seed
 always produces the same output order (useful for replicability). Set
-to 0 to use a pseudo-random seed (default behavior).
+to 0 to use a pseudo-random seed (default behaviour).
 .TAG relabel
 .TP
 .BI \-\-relabel \0string
@@ -3190,6 +3289,11 @@ Pseudo-randomly shuffle the order of sequences contained in
 .BI \-\-topn\~ "positive integer"
 Output only the first \fIinteger\fR sequences after pseudo-random
 reordering.
+.TAG xlength
+.TP
+.B \-\-xlength
+Strip sequence length information from the headers when writing the
+output file. This information is added by the \-\-lengthout option.
 .TAG xsize
 .TP
 .B \-\-xsize
@@ -3211,6 +3315,11 @@ sorting performed during chimera checking (\-\-uchime_denovo),
 dereplication (\-\-derep_fulllength), and clustering (\-\-cluster_fast
 and \-\-cluster_size).
 .PP
+.TAG lengthout
+.TP
+.B \-\-lengthout
+Write sequence length information to the output files in FASTA format
+by adding a ";length=\fIinteger\fR" attribute in the header.
 .TAG maxsize
 .TP 9
 .BI \-\-maxsize\~ "positive integer"
@@ -3272,6 +3381,11 @@ sequences.
 .BI \-\-topn\~ "positive integer"
 Output only the top \fIinteger\fR sequences (i.e. the longest or the
 most abundant).
+.TAG xlength
+.TP
+.B \-\-xlength
+Strip sequence length information from the headers when writing the
+output file. This information is added by the \-\-lengthout option.
 .TAG xsize
 .TP
 .B \-\-xsize
@@ -3342,13 +3456,18 @@ format. Requires input in fastq format.
 .BI \-\-fastx_subsample \0filename
 Perform subsampling from the sequences in the specified input file
 that is in FASTA or FASTQ format.
+.TAG lengthout
+.TP
+.B \-\-lengthout
+Write sequence length information to the output files in FASTA format
+by adding a ";length=\fIinteger\fR" attribute in the header.
 .TAG randseed
 .TP
 .BI \-\-randseed\~ "positive integer"
 Use \fIinteger\fR as a seed for the pseudo-random generator. A given
 seed always produces the same output, which is useful for
 replicability. Set to 0 to use a pseudo-random seed (default
-behavior).
+behaviour).
 .TAG relabel
 .TP
 .BI \-\-relabel \0string
@@ -3406,6 +3525,11 @@ otherwise the abundance of each sequence is considered to be 1.
 .TP
 .B \-\-sizeout
 Write abundance information to the output file.
+.TAG xlength
+.TP
+.B \-\-xlength
+Strip sequence length information from the headers when writing the
+output file. This information is added by the \-\-lengthout option.
 .TAG xsize
 .TP
 .B \-\-xsize
@@ -3431,7 +3555,11 @@ specified with the \-\-tabbedout option. The \-\-sintax_cutoff option
 may be used to set a minimum level of bootstrap support for the
 taxonomic ranks to be reported. The `--randseed` option may be
 included to specify a seed for initialisation of the random number
-generator used by the algorithm.
+generator used by the algorithm. Please note that when using multiple
+threads, the `--randseed` option may not work as intended, because
+sequences may be processed in a random order by different threads. To
+ensure the same results each time, use a single thread (`--threads 1`)
+in combination with a fixed random seed specified with `--randseed`.
 .PP
 Multithreading is supported. Databases in UDB files are supported.
 The strand option may be specified.
@@ -3461,7 +3589,8 @@ UDB format. These sequences need to be annotated with taxonomy.
 Use \fIinteger\fR as seed for the random number generator used in the
 Sintax algorithm. A given seed always produces the same output order
 (useful for replicability). Set to 0 to use a pseudo-random seed
-(default behavior).
+(default behaviour). Does not work correctly with multiple threads;
+please use `--threads 1` to ensure correct behaviour.
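
To illustrate why a fixed seed may not be enough with several threads, here is a small sketch that assumes a single shared generator whose draws are handed out in processing order. std::mt19937 and the function draw_in_order are stand-ins chosen for the example, not the generator or code used by vsearch.

```cpp
#include <random>
#include <vector>

// Assign one random draw to each item, in the given processing order.
// `order` must be a permutation of 0 .. order.size()-1.
std::vector<unsigned int> draw_in_order(unsigned int seed,
                                        const std::vector<int> & order)
{
  std::mt19937 rng(seed);
  std::vector<unsigned int> per_item(order.size());
  for (int item : order)
    per_item[item] = static_cast<unsigned int>(rng());
  return per_item;
}
```

With the same seed and the same order, every item receives the same draw on each run. If the order changes, as it can when threads pick up sequences at different times, the same seed gives different draws per item.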
 .TAG sintax_cutoff
 .TP
 .BI \-\-sintax_cutoff\~ "real"
@@ -3796,7 +3925,7 @@ dereplication.
 sequences labels as secondary or tertiary keys.
 .PP
 \fBvsearch\fR by default uses the DUST algorithm for masking
-low-complexity regions. Masking behavior is also slightly changed to
+low-complexity regions. Masking behaviour is also slightly changed to
 be more consistent.
 .PP
 .\" ============================================================================
@@ -3954,7 +4083,7 @@ Source code and binaries are available at
 .PP
 .\" ============================================================================
 .SH COPYRIGHT
-Copyright (C) 2014-2021, Torbjørn Rognes, Frédéric Mahé and Tomás
+Copyright (C) 2014-2023, Torbjørn Rognes, Frédéric Mahé and Tomás
 Flouri
 .PP
 All rights reserved.
@@ -4281,7 +4410,7 @@ Fixed bug in aligned sequences produced with \-\-fastapairs and
 \-\-userout (qrow, trow) options.
 .TP
 .BR v1.9.7\~ "released January 12th, 2016"
-Masking behavior is changed somewhat to keep the letter case of the
+Masking behaviour is changed somewhat to keep the letter case of the
 input sequences unchanged when no masking is performed. Masking is now
 performed also during chimera detection. Documentation updated.
 .TP
@@ -4681,6 +4810,18 @@ Add the derep_smallmem command for dereplication using little memory.
 .TP
 .BR v2.22.1\~ "released September 19th, 2022"
 Fix compiler warning.
+.TP
+.BR v2.23.0\~ "released July 7th, 2023"
+Update documentation. Add citation file. Modernize and improve
+code. Fix several minor bugs. Fix compilation with GCC 13. Print stats
+after fastq_mergepairs to log file instead of stderr. Handle sizein
+option correctly with dbmatched option for usearch_global. Allow
+maxseqlength option for makeudb_usearch. Fix memory allocation problem
+with chimera detection. Add lengthout and xlength options. Increase
+precision for eeout option. Add warning about sintax algorithm, random
+seed and multiple threads. Refactor chimera detection code. Add
+undocumented experimental long_chimeras_denovo command. Fix segfault
+with clustering. Add more references.
 .LP
 .\" ============================================================================
 .\" TODO:


=====================================
src/align.cc
=====================================
@@ -84,8 +84,9 @@ inline void pushop(char newop, char ** cigarendp, char * op, int * count)
       *--*cigarendp = *op;
       if (*count > 1)
         {
-          char buf[25];
-          int len = sprintf(buf, "%d", *count);
+          const int size = 25;
+          char buf[size];
+          int len = snprintf(buf, size, "%d", *count);
           *cigarendp -= len;
           memcpy(*cigarendp, buf, (size_t)len);
         }
@@ -101,8 +102,9 @@ inline void finishop(char ** cigarendp, char * op, int * count)
     *--*cigarendp = *op;
     if (*count > 1)
     {
-      char buf[25];
-      int len = sprintf(buf, "%d", *count);
+      const int size = 25;
+      char buf[size];
+      int len = snprintf(buf, size, "%d", *count);
       *cigarendp -= len;
       memcpy(*cigarendp, buf, (size_t)len);
     }
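
The change above swaps sprintf for snprintf so that the formatted run length cannot overrun the local buffer. Below is a minimal, self-contained sketch of the same pattern, writing a CIGAR run backwards from the end of a buffer. The names prepend_count and run_to_cigar are hypothetical, and the explicit check of the return value is an addition that is not part of the patch.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: write the decimal digits of `count` just before *endp
// and move *endp back to the first digit.
void prepend_count(char ** endp, int count)
{
  const int size = 25;
  char buf[size];
  int len = std::snprintf(buf, size, "%d", count);
  if (len < 0 || len >= size)
    return;  // encoding error or truncation: write nothing
  *endp -= len;
  std::memcpy(*endp, buf, static_cast<std::size_t>(len));
}

// Hypothetical helper: build one CIGAR run such as "12M" from right to left.
std::string run_to_cigar(char op, int count)
{
  char cigar[32];
  char * end = cigar + sizeof(cigar) - 1;
  *end = 0;     // terminator at the very end of the buffer
  *--end = op;  // operation letter goes in first
  if (count > 1)
    prepend_count(&end, count);
  return std::string(end);
}
```

As in the patched functions, a run of length 1 is written without a count.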


=====================================
src/align_simd.cc
=====================================
@@ -783,8 +783,9 @@ inline void pushop(s16info_s * s, char newop)
       *--s->cigarend = s->op;
       if (s->opcount > 1)
         {
-          char buf[11];
-          int len = sprintf(buf, "%d", s->opcount);
+          const int size = 11;
+          char buf[size];
+          int len = snprintf(buf, size, "%d", s->opcount);
           s->cigarend -= len;
           memcpy(s->cigarend, buf, len);
         }
@@ -800,8 +801,9 @@ inline void finishop(s16info_s * s)
       *--s->cigarend = s->op;
       if (s->opcount > 1)
         {
-          char buf[11];
-          int len = sprintf(buf, "%d", s->opcount);
+          const int size = 11;
+          char buf[size];
+          int len = snprintf(buf, size, "%d", s->opcount);
           s->cigarend -= len;
           memcpy(s->cigarend, buf, len);
         }


=====================================
src/attributes.cc
=====================================
@@ -157,15 +157,23 @@ int64_t header_get_size(char * header, int header_length)
   return abundance;
 }
 
-void header_fprint_strip_size_ee(FILE * fp,
-                                 char * header,
-                                 int header_length,
-                                 bool strip_size,
-                                 bool strip_ee)
+void swap(int * a, int * b)
+{
+  int temp = *a;
+  *a = *b;
+  *b = temp;
+}
+
+void header_fprint_strip(FILE * fp,
+                         char * header,
+                         int header_length,
+                         bool strip_size,
+                         bool strip_ee,
+                         bool strip_length)
 {
   int attributes = 0;
-  int attribute_start[2];
-  int attribute_end[2];
+  int attribute_start[3];
+  int attribute_end[3];
 
   /* look for size attribute */
 
@@ -209,21 +217,43 @@ void header_fprint_strip_size_ee(FILE * fp,
       attributes++;
     }
 
+  /* look for length attribute */
+
+  int length_start = 0;
+  int length_end = 0;
+  bool length_found = false;
+  if (strip_length)
+    {
+      length_found = header_find_attribute(header,
+                                           header_length,
+                                           "length=",
+                                           & length_start,
+                                           & length_end,
+                                           true);
+    }
+  if (length_found)
+    {
+      attribute_start[attributes] = length_start;
+      attribute_end[attributes] = length_end;
+      attributes++;
+    }
+
   /* sort */
 
-  if (attributes > 1)
+  int last_swap = 0;
+  int limit = attributes - 1;
+  while (limit > 0)
     {
-      if (attribute_start[0] > attribute_start[1])
+      for(int i = 0; i < limit; i++)
         {
-          /* swap */
-
-          int s = attribute_start[0];
-          int e = attribute_end[0];
-          attribute_start[0] = attribute_start[1];
-          attribute_end[0] = attribute_end[1];
-          attribute_start[1] = s;
-          attribute_end[1] = e;
+          if (attribute_start[i] > attribute_start[i+1])
+            {
+              swap(attribute_start + i, attribute_start + i + 1);
+              swap(attribute_end   + i, attribute_end   + i + 1);
+              last_swap = i;
+            }
         }
+      limit = last_swap;
     }
 
   /* print */
@@ -256,14 +286,3 @@ void header_fprint_strip_size_ee(FILE * fp,
         }
     }
 }
-
-void header_fprint_strip_size(FILE * fp,
-                              char * header,
-                              int header_length)
-{
-  header_fprint_strip_size_ee(fp,
-                              header,
-                              header_length,
-                              true,
-                              false);
-}
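
The sort step in header_fprint_strip orders up to three attribute spans by start position, keeping the start and end arrays in step. The following is a sketch of that bubble sort with the "last swap" shortcut, under the assumption that the spans are held in two parallel int arrays. The function name sort_attribute_spans is hypothetical. Note that this sketch resets last_swap at the start of every pass. In the patch as shown, last_swap is declared once before the loop, so a pass that makes no swaps would appear to leave limit unchanged.

```cpp
#include <utility>

// Sort `start` ascending and apply the same swaps to `end`.
void sort_attribute_spans(int * start, int * end, int n)
{
  int limit = n - 1;
  while (limit > 0)
    {
      int last_swap = 0;  // reset per pass, so a pass without swaps ends the loop
      for (int i = 0; i < limit; i++)
        {
          if (start[i] > start[i + 1])
            {
              std::swap(start[i], start[i + 1]);
              std::swap(end[i], end[i + 1]);
              last_swap = i;
            }
        }
      // everything after index last_swap is already in its final place
      limit = last_swap;
    }
}
```

An input such as starts {10, 30, 20} needs one swap in the first pass and none in the second, which is the case where the reset matters.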


=====================================
src/attributes.h
=====================================
@@ -60,12 +60,9 @@
 
 int64_t header_get_size(char * header, int header_length);
 
-void header_fprint_strip_size(FILE * fp,
-                              char * header,
-                              int header_length);
-
-void header_fprint_strip_size_ee(FILE * fp,
-                                 char * header,
-                                 int header_length,
-                                 bool strip_size,
-                                 bool strip_ee);
+void header_fprint_strip(FILE * fp,
+                         char * header,
+                         int header_length,
+                         bool strip_size,
+                         bool strip_ee,
+                         bool strip_length);


=====================================
src/chimera.cc
=====================================
@@ -67,13 +67,16 @@
   and Rob Knight (2011)
   UCHIME improves sensitivity and speed of chimera detection
   Bioinformatics, 27, 16, 2194-2200
-  http://dx.doi.org/10.1093/bioinformatics/btr381
+  https://doi.org/10.1093/bioinformatics/btr381
 */
 
 /* global constants/data, no need for synchronization */
-const int parts = 4;
+static int parts = 0;
+const int maxparts = 100;
+const int maxparents = 4; /* max, could be fewer */
+const int window = 64;
 const int few = 4;
-const int maxcandidates = few * parts;
+const int maxcandidates = few * maxparts;
 const int rejects = 16;
 const double chimera_id = 0.55;
 static int tophits;
@@ -113,7 +116,7 @@ struct chimera_info_s
   char * query_seq;
   int query_len;
 
-  struct searchinfo_s si[parts];
+  struct searchinfo_s si[maxparts];
 
   unsigned int cand_list[maxcandidates];
   int cand_count;
@@ -133,16 +136,20 @@ struct chimera_info_s
 
   int match_size;
   int * match;
+  int * insert;
   int * smooth;
   int * maxsmooth;
 
-  int best_parents[2];
+  int parents_found;
+  int best_parents[maxparents];
+  int best_start[maxparents];
+  int best_len[maxparents];
 
   int best_target;
   char * best_cigar;
 
   int * maxi;
-  char * paln[2];
+  char * paln[maxparents];
   char * qaln;
   char * diffs;
   char * votes;
@@ -157,6 +164,23 @@ static struct chimera_info_s * cia;
 
 void realloc_arrays(struct chimera_info_s * ci)
 {
+  if (opt_chimeras_denovo)
+    {
+      if (opt_chimeras_parts == 0)
+        parts = (ci->query_len + maxparts - 1) / maxparts;
+      else
+        parts = opt_chimeras_parts;
+      if (parts < 2)
+        parts = 2;
+      else if (parts > maxparts)
+        parts = maxparts;
+    }
+  else
+    {
+      /* default for uchime, uchime2, and uchime3 */
+      parts = 4;
+    }
+
   int maxhlen = MAX(ci->query_head_len,1);
   if (maxhlen > ci->head_alloc)
     {
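
The block added to realloc_arrays picks the number of query parts. The sketch below restates that selection as a pure function, together with the ceiling division used for the part length. The names choose_parts and part_length are hypothetical, and the options are passed as plain parameters instead of the opt_ globals.

```cpp
const int maxparts = 100;

// Mirror of the selection logic in the hunk above.
// requested_parts == 0 means "derive the number of parts from the query length".
int choose_parts(int query_len, int requested_parts, bool chimeras_denovo)
{
  if (! chimeras_denovo)
    return 4;  // default for uchime, uchime2, and uchime3

  int parts = requested_parts;
  if (parts == 0)
    parts = (query_len + maxparts - 1) / maxparts;  // ceil(query_len / maxparts)

  if (parts < 2)
    parts = 2;
  else if (parts > maxparts)
    parts = maxparts;
  return parts;
}

// Length of the longest part when a query is split into `parts` pieces.
int part_length(int query_len, int parts)
{
  return (query_len + parts - 1) / parts;
}
```

For a 1500 nt query with no explicit request, this gives 15 parts of at most 100 nt each.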
@@ -166,7 +190,9 @@ void realloc_arrays(struct chimera_info_s * ci)
 
   /* realloc arrays based on query length */
 
-  int maxqlen = MAX(ci->query_len,1);
+  int maxqlen = MAX(ci->query_len, 1);
+  int maxpartlen = (maxqlen + parts - 1) / parts;
+
   if (maxqlen > ci->query_alloc)
     {
       ci->query_alloc = maxqlen;
@@ -176,21 +202,23 @@ void realloc_arrays(struct chimera_info_s * ci)
       for(auto & i
             : ci->si)
         {
-          int maxpartlen = (maxqlen + parts - 1) / parts;
-          i.qsequence = (char*) xrealloc(i.qsequence,
-                                         maxpartlen + 1);
+          i.qsequence = (char*) xrealloc(i.qsequence, maxpartlen + 1);
         }
 
       ci->maxi = (int *) xrealloc(ci->maxi, (maxqlen + 1) * sizeof(int));
       ci->maxsmooth = (int*) xrealloc(ci->maxsmooth, maxqlen * sizeof(int));
       ci->match = (int*) xrealloc(ci->match,
                                   maxcandidates * maxqlen * sizeof(int));
+      ci->insert = (int*) xrealloc(ci->insert,
+                                   maxcandidates * maxqlen * sizeof(int));
       ci->smooth = (int*) xrealloc(ci->smooth,
                                    maxcandidates * maxqlen * sizeof(int));
 
       int maxalnlen = maxqlen + 2 * db_getlongestsequence();
-      ci->paln[0] = (char*) xrealloc(ci->paln[0], maxalnlen+1);
-      ci->paln[1] = (char*) xrealloc(ci->paln[1], maxalnlen+1);
+      for (int f = 0; f < maxparents ; f++)
+        {
+          ci->paln[f] = (char*) xrealloc(ci->paln[f], maxalnlen+1);
+        }
       ci->qaln = (char*) xrealloc(ci->qaln, maxalnlen+1);
       ci->diffs = (char*) xrealloc(ci->diffs, maxalnlen+1);
       ci->votes = (char*) xrealloc(ci->votes, maxalnlen+1);
@@ -199,19 +227,22 @@ void realloc_arrays(struct chimera_info_s * ci)
     }
 }
 
-
-int find_best_parents(struct chimera_info_s * ci)
+void find_matches(struct chimera_info_s * ci)
 {
-  ci->best_parents[0] = -1;
-  ci->best_parents[1] = -1;
-
   /* find the positions with matches for each potential parent */
+  /* also note the positions with inserts in front */
 
   char * qseq = ci->query_seq;
 
-  memset(ci->match, 0, ci->cand_count * ci->query_len * sizeof(int));
+  for (int i = 0; i < ci->cand_count; i++)
+    for (int j = 0; j < ci->query_len; j++)
+      {
+        int x = i * ci->query_len + j;
+        ci->match[x] = 0;
+        ci->insert[x] = 0;
+      }
 
-  for(int i=0; i < ci->cand_count; i++)
+  for(int i = 0; i < ci->cand_count; i++)
     {
       char * tseq = db_getsequence(ci->cand_list[i]);
 
@@ -244,6 +275,7 @@ int find_best_parents(struct chimera_info_s * ci)
               break;
 
             case 'I':
+              ci->insert[i * ci->query_len + qpos] = run;
               tpos += run;
               break;
 
@@ -253,103 +285,197 @@ int find_best_parents(struct chimera_info_s * ci)
             }
         }
     }
+}
 
-  /* Compute smoothed identity score in a window for each candidate,   */
-  /* and record max smoothed score for each position among candidates. */
+struct parents_info_s
+{
+  int cand;
+  int start;
+  int len;
+};
 
-  memset(ci->maxsmooth, 0, ci->query_len * sizeof(int));
+int compare_positions(const void * a, const void * b)
+{
+  const int x = ((const parents_info_s *) a)->start;
+  const int y = ((const parents_info_s *) b)->start;
 
-  const int window = 32;
+  if (x < y)
+    return -1;
+  else if (x > y)
+    return +1;
+  else
+    return 0;
+}
 
-  for(int i = 0; i < ci->cand_count; i++)
+int find_best_parents_long(struct chimera_info_s * ci)
+{
+  /* find parents with longest perfect match regions,
+     excluding regions matched by previously identified parents */
+
+  find_matches(ci);
+
+  struct parents_info_s best_parents[maxparents];
+
+  for (int f = 0; f < maxparents; f++)
     {
-      int sum = 0;
-      for(int qpos = 0; qpos < ci->query_len; qpos++)
-        {
-          int z = i * ci->query_len + qpos;
-          sum += ci->match[z];
-          if (qpos >= window)
-            {
-              sum -= ci->match[z-window];
-            }
-          if (qpos >= window-1)
-            {
-              ci->smooth[z] = sum;
-              if (ci->smooth[z] > ci->maxsmooth[qpos])
-                {
-                  ci->maxsmooth[qpos] = ci->smooth[z];
-                }
-            }
-        }
+      best_parents[f].cand = -1;
+      best_parents[f].start = -1;
     }
 
-  /* find first parent */
-
-  int wins[ci->cand_count];
+  bool position_used[ci->query_len];
+  for (int i = 0; i < ci->query_len; i++)
+    {
+      position_used[i] = false;
+    }
 
-  memset(wins, 0, ci->cand_count * sizeof(int));
+  int pos_remaining = ci->query_len;
+  int parents_found = 0;
 
-  for(int qpos = window-1; qpos < ci->query_len; qpos++)
+  for (int f = 0; f < opt_chimeras_parents_max; f++)
     {
-      if (ci->maxsmooth[qpos] != 0)
+      /* scan each candidate and find longest matching region */
+
+      int best_start = 0;
+      int best_len = 0;
+      int best_cand = -1;
+
+      for (int i = 0; i < ci->cand_count; i++)
         {
-          for(int i=0; i < ci->cand_count; i++)
+          int start = 0;
+          int len = 0;
+
+          for (int j = 0; j < ci->query_len; j++)
             {
-              int z = i * ci->query_len + qpos;
-              if (ci->smooth[z] == ci->maxsmooth[qpos])
+              if ((position_used[j] == false) &&
+                  (ci->match[i * ci->query_len + j] == 1) &&
+                  ((len == 0) || (ci->insert[i * ci->query_len + j] == 0)))
+                {
+                  if (len == 0)
+                    {
+                      start = j;
+                    }
+                  len++;
+                  if (len > best_len)
+                    {
+                      best_cand = i;
+                      best_start = start;
+                      best_len = len;
+                    }
+                }
+              else
                 {
-                  wins[i]++;
+                  len = 0;
                 }
             }
         }
+
+      if (best_len >= opt_chimeras_length_min)
+        {
+          best_parents[f].cand = best_cand;
+          best_parents[f].start = best_start;
+          best_parents[f].len = best_len;
+          parents_found++;
+
+#if 0
+          if (f == 0)
+            printf("\n");
+          printf("Best parents long: %d %d %d %d %s %s\n",
+                 f,
+                 best_cand,
+                 best_start,
+                 best_len,
+                 ci->query_head,
+                 db_getheader(ci->cand_list[best_cand]));
+#endif
+
+          /* mark positions used */
+          for (int j = best_start; j < best_start + best_len; j++)
+            {
+              position_used[j] = true;
+            }
+          pos_remaining -= best_len;
+        }
+      else
+        break;
     }
 
-  int best1_w = -1;
-  int best1_i = -1;
-  int best2_w = -1;
-  int best2_i = -1;
+  /* sort parents by position */
+  qsort(best_parents,
+        parents_found,
+        sizeof(struct parents_info_s),
+        compare_positions);
+
+  ci->parents_found = parents_found;
 
-  for(int i=0; i < ci->cand_count; i++)
+  for (int f = 0; f < parents_found; f++)
     {
-      int w = wins[i];
-      if (w > best1_w)
-        {
-          best1_w = w;
-          best1_i = i;
-        }
+      ci->best_parents[f] = best_parents[f].cand;
+      ci->best_start[f] = best_parents[f].start;
+      ci->best_len[f] = best_parents[f].len;
     }
 
-  if (best1_w >= 0)
+#if 0
+  if (pos_remaining == 0)
+    printf("Fully covered!\n");
+  else
+    printf("Not covered completely (%d).\n", pos_remaining);
+#endif
+
+  return (parents_found > 1) && (pos_remaining == 0);
+}
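
find_best_parents_long repeatedly takes the longest run of matching, not yet covered query positions across all candidates, then sorts the chosen runs by start position. The following is a simplified sketch of that greedy loop using std::vector. The names Segment and greedy_cover are hypothetical, and the insert check from the patch is left out, so this is not a drop-in equivalent.

```cpp
#include <algorithm>
#include <vector>

struct Segment
{
  int cand;
  int start;
  int len;
};

// match[c][j] is true when candidate c matches the query at position j.
std::vector<Segment> greedy_cover(const std::vector<std::vector<bool>> & match,
                                  int query_len,
                                  int max_parents,
                                  int min_len)
{
  std::vector<bool> used(query_len, false);
  std::vector<Segment> found;

  for (int f = 0; f < max_parents; f++)
    {
      Segment best = {-1, 0, 0};
      for (int c = 0; c < static_cast<int>(match.size()); c++)
        {
          int start = 0;
          int len = 0;
          for (int j = 0; j < query_len; j++)
            {
              if (! used[j] && match[c][j])
                {
                  if (len == 0)
                    start = j;
                  len++;
                  if (len > best.len)
                    best = {c, start, len};
                }
              else
                len = 0;
            }
        }

      if (best.len < min_len)
        break;  // no sufficiently long run is left

      for (int j = best.start; j < best.start + best.len; j++)
        used[j] = true;
      found.push_back(best);
    }

  std::sort(found.begin(), found.end(),
            [](const Segment & a, const Segment & b)
            { return a.start < b.start; });
  return found;
}
```

In the patch, the query is reported as chimeric only when more than one parent is found and every position is covered, which corresponds here to the segment lengths summing to query_len.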
+
+int find_best_parents(struct chimera_info_s * ci)
+{
+  find_matches(ci);
+
+  int best_parent_cand[maxparents];
+
+  for (int f = 0; f < 2; f++)
     {
-      /* find second parent */
+      best_parent_cand[f] = -1;
+      ci->best_parents[f] = -1;
+    }
 
-      /* wipe out matches in positions covered by first parent */
+  bool cand_selected[ci->cand_count];
 
-      for(int qpos = window - 1; qpos < ci->query_len; qpos++)
+  for (int i = 0; i < ci->cand_count; i++)
+    cand_selected[i] = false;
+
+  for (int f = 0; f < 2; f++)
+    {
+      if (f > 0)
         {
-          int z = best1_i * ci->query_len + qpos;
-          if (ci->smooth[z] == ci->maxsmooth[qpos])
+          /* for all parents except the first */
+
+          /* wipe out matches for all candidates in positions
+             covered by the previous parent */
+
+          for(int qpos = window - 1; qpos < ci->query_len; qpos++)
             {
-              for(int i = qpos + 1 - window; i <= qpos; i++)
+              int z = best_parent_cand[f-1] * ci->query_len + qpos;
+              if (ci->smooth[z] == ci->maxsmooth[qpos])
                 {
-                  for(int j = 0; j < ci->cand_count; j++)
+                  for(int i = qpos + 1 - window; i <= qpos; i++)
                     {
-                      ci->match[j * ci->query_len + i] = 0;
+                      for(int j = 0; j < ci->cand_count; j++)
+                        {
+                          ci->match[j * ci->query_len + i] = 0;
+                        }
                     }
                 }
             }
         }
 
-      /*
-        recompute smoothed identity over window, and record max smoothed
-        score for each position among remaining candidates
-      */
 
-      memset(ci->maxsmooth, 0, ci->query_len * sizeof(int));
+      /* Compute smoothed score in a 32bp window for each candidate. */
+      /* Record max smoothed score for each position among candidates left. */
+
+      for (int j = 0; j < ci->query_len; j++)
+        ci->maxsmooth[j] = 0;
 
       for(int i = 0; i < ci->cand_count; i++)
         {
-          if (i != best1_i)
+          if (! cand_selected[i])
             {
               int sum = 0;
               for(int qpos = 0; qpos < ci->query_len; qpos++)
@@ -372,9 +498,12 @@ int find_best_parents(struct chimera_info_s * ci)
             }
         }
 
-      /* find second parent */
 
-      memset(wins, 0, ci->cand_count * sizeof(int));
+      /* find parent with the most wins */
+
+      int wins[ci->cand_count];
+      for (int i = 0; i < ci->cand_count; i++)
+        wins[i] = 0;
 
       for(int qpos = window-1; qpos < ci->query_len; qpos++)
         {
@@ -382,7 +511,7 @@ int find_best_parents(struct chimera_info_s * ci)
             {
               for(int i=0; i < ci->cand_count; i++)
                 {
-                  if (i != best1_i)
+                  if (! cand_selected[i])
                     {
                       int z = i * ci->query_len + qpos;
                       if (ci->smooth[z] == ci->maxsmooth[qpos])
@@ -394,35 +523,49 @@ int find_best_parents(struct chimera_info_s * ci)
             }
         }
 
-      for(int i=0; i < ci->cand_count; i++)
+      /* select best parent based on most wins */
+
+      int maxwins = 0;
+      for(int i = 0; i < ci->cand_count; i++)
         {
           int w = wins[i];
-          if (w > best2_w)
+          if (w > maxwins)
             {
-              best2_w = w;
-              best2_i = i;
+              maxwins = w;
+              best_parent_cand[f] = i;
             }
         }
+
+      /* terminate loop if no parent found */
+
+      if (best_parent_cand[f] < 0)
+        break;
+
+#if 0
+      printf("Query %d: Best parent (%d) candidate: %d. Wins: %d\n",
+             ci->query_no, f, best_parent_cand[f], maxwins);
+#endif
+
+      ci->best_parents[f] = best_parent_cand[f];
+      cand_selected[best_parent_cand[f]] = true;
     }
 
-  ci->best_parents[0] = best1_i;
-  ci->best_parents[1] = best2_i;
+  /* Check if at least 2 candidates selected */
 
-  return (best1_w >= 0) && (best2_w >= 0);
+  return (best_parent_cand[0] >= 0) && (best_parent_cand[1] >= 0);
 }
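
find_best_parents scores each candidate with a sliding-window sum of its per-position matches. Below is a minimal sketch of that running sum for one candidate. The function name smooth_matches is hypothetical, and the window size is a parameter here, whereas the patch uses the file-level constant window.

```cpp
#include <cstddef>
#include <vector>

// smooth[q] holds the number of matches in the window ending at position q.
// Positions before the first full window stay at 0.
std::vector<int> smooth_matches(const std::vector<int> & match, int window)
{
  std::vector<int> smooth(match.size(), 0);
  const std::size_t w = static_cast<std::size_t>(window);
  int sum = 0;
  for (std::size_t q = 0; q < match.size(); q++)
    {
      sum += match[q];
      if (q >= w)
        sum -= match[q - w];  // drop the position that left the window
      if (q + 1 >= w)
        smooth[q] = sum;
    }
  return smooth;
}
```

A candidate "wins" a position when its smoothed score equals the maximum over all remaining candidates at that position, and the candidate with the most wins is selected as the next parent.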
 
-int eval_parents(struct chimera_info_s * ci)
-{
-  int status = 1;
-
-  /* create msa */
 
+int find_max_alignment_length(struct chimera_info_s * ci)
+{
   /* find max insertions in front of each position in the query sequence */
-  memset(ci->maxi, 0, (ci->query_len + 1) * sizeof(int));
 
-  for(int best_parent
-        : ci->best_parents)
+  for (int i = 0; i <= ci->query_len; i++)
+    ci->maxi[i] = 0;
+
+  for (int f = 0; f < ci->parents_found; f++)
     {
+      int best_parent = ci->best_parents[f];
       char * p = ci->nwcigar[best_parent];
       char * e = p + strlen(p);
       int pos = 0;
@@ -458,40 +601,27 @@ int eval_parents(struct chimera_info_s * ci)
     }
   alnlen += ci->query_len;
 
-  /* fill in alignment string for query */
-
-  char * q = ci->qaln;
-  int qpos = 0;
-  for (int i=0; i < ci->query_len; i++)
-    {
-      for (int j=0; j < ci->maxi[i]; j++)
-        {
-          *q++ = '-';
-        }
-      *q++ = chrmap_upcase[(int)(ci->query_seq[qpos++])];
-    }
-  for (int j=0; j < ci->maxi[ci->query_len]; j++)
-    {
-      *q++ = '-';
-    }
-  *q = 0;
+  return alnlen;
+}
 
-  /* fill in alignment strings for the 2 parents */
+void fill_alignment_parents(struct chimera_info_s * ci)
+{
+  /* fill in alignment strings for the parents */
 
-  for(int j=0; j<2; j++)
+  for(int j = 0; j < ci->parents_found; j++)
     {
       int cand = ci->best_parents[j];
       int target_seqno = ci->cand_list[cand];
       char * target_seq = db_getsequence(target_seqno);
 
       int inserted = 0;
-      qpos = 0;
+      int qpos = 0;
       int tpos = 0;
 
       char * t = ci->paln[j];
-
       char * p = ci->nwcigar[cand];
       char * e = p + strlen(p);
+
       while (p < e)
         {
           int run = 1;
@@ -544,7 +674,7 @@ int eval_parents(struct chimera_info_s * ci)
 
       /* add any gaps at the end */
 
-      if (!inserted)
+      if (! inserted)
         {
           for(int x=0; x < ci->maxi[qpos]; x++)
             {
@@ -555,8 +685,317 @@ int eval_parents(struct chimera_info_s * ci)
       /* end of sequence string */
       *t = 0;
     }
+}
+
+
+int eval_parents_long(struct chimera_info_s * ci)
+{
+  /* always chimeric if called */
+  int status = 4;
+
+  int alnlen = find_max_alignment_length(ci);
+
+  fill_alignment_parents(ci);
+
+  /* fill in alignment string for query */
+
+  char * pm = ci->model;
+  int m = 0;
+  char * q = ci->qaln;
+  int qpos = 0;
+  for (int i=0; i < ci->query_len; i++)
+    {
+      if (qpos >= (ci->best_start[m] + ci->best_len[m]))
+        m++;
+      for (int j=0; j < ci->maxi[i]; j++)
+        {
+          *q++ = '-';
+          *pm++ = 'A' + m;
+        }
+      *q++ = chrmap_upcase[(int)(ci->query_seq[qpos++])];
+      *pm++ = 'A' + m;
+    }
+  for (int j=0; j < ci->maxi[ci->query_len]; j++)
+    {
+      *q++ = '-';
+      *pm++ = 'A' + m;
+    }
+  *q = 0;
+  *pm = 0;
+
+  for(int i = 0; i < alnlen; i++)
+    {
+      unsigned int qsym = chrmap_4bit[(int)(ci->qaln[i])];
+      unsigned int psym[maxparents];
+      for (int f = 0; f < maxparents; f++)
+	psym[f] = 0;
+      for (int f = 0; f < ci->parents_found; f++)
+        psym[f] = chrmap_4bit[(int)(ci->paln[f][i])];
+
+      /* lower case parent symbols that differ from query */
+
+      for (int f = 0; f < ci->parents_found; f++)
+        if (psym[f] && (psym[f] != qsym))
+          ci->paln[f][i] = tolower(ci->paln[f][i]);
+
+      /* compute diffs */
+
+      char diff = ' ';
+
+      bool all_defined = qsym;
+      for (int f = 0; f < ci->parents_found; f++)
+        if (!psym[f])
+          all_defined = false;
+
+      if (all_defined)
+        {
+          bool parents_equal = true;
+          for (int f = 1; f < ci->parents_found; f++)
+            if (psym[f] != psym[0])
+              parents_equal = false;
+
+          if (! parents_equal)
+            diff = ci->model[i];
+        }
+
+      ci->diffs[i] = diff;
+    }
+
+  ci->diffs[alnlen] = 0;
+
+
+  /* count matches */
+
+  int match_QP[maxparents];
+  int cols = 0;
+
+  for(int f = 0; f < ci->parents_found; f++)
+    match_QP[f] = 0;
+
+  for(int i = 0; i < alnlen; i++)
+    {
+      cols++;
+
+      char qsym = chrmap_4bit[(int)(ci->qaln[i])];
+
+      for(int f = 0; f < ci->parents_found; f++)
+	{
+	  char psym = chrmap_4bit[(int)(ci->paln[f][i])];
+	  if (qsym == psym)
+	    match_QP[f]++;
+	}
+    }
+
+
+  int seqno_a = ci->cand_list[ci->best_parents[0]];
+  int seqno_b = ci->cand_list[ci->best_parents[1]];
+  int seqno_c = -1;
+  if (ci->parents_found > 2)
+    seqno_c = ci->cand_list[ci->best_parents[2]];
+
+  double QP[maxparents];
+  double QT = 0.0;
+
+  for (int f = 0; f < maxparents; f++)
+    {
+      if (f < ci->parents_found)
+        QP[f] = 100.0 * match_QP[f] / cols;
+      else
+        QP[f] = 0.0;
+      if (QP[f] > QT)
+        QT = QP[f];
+    }
+
+  double QA = QP[0];
+  double QB = QP[1];
+  double QC = ci->parents_found > 2 ? QP[2] : 0.00;
+  double QM = 100.00;
+  double divfrac = 100.00 * (QM - QT) / QT;
+
+  xpthread_mutex_lock(&mutex_output);
+
+  if (opt_alnout && (status == 4))
+    {
+      fprintf(fp_uchimealns, "\n");
+      fprintf(fp_uchimealns, "----------------------------------------"
+              "--------------------------------\n");
+      fprintf(fp_uchimealns, "Query   (%5d nt) ",
+              ci->query_len);
+      header_fprint_strip(fp_uchimealns,
+                          ci->query_head,
+                          ci->query_head_len,
+                          opt_xsize,
+                          opt_xee,
+                          opt_xlength);
+
+      for (int f = 0; f < ci->parents_found; f++)
+        {
+          int seqno = ci->cand_list[ci->best_parents[f]];
+          fprintf(fp_uchimealns, "\nParent%c (%5" PRIu64 " nt) ",
+                  'A' + f,
+                  db_getsequencelen(seqno));
+          header_fprint_strip(fp_uchimealns,
+                              db_getheader(seqno),
+                              db_getheaderlen(seqno),
+                              opt_xsize,
+                              opt_xee,
+                              opt_xlength);
+        }
+
+      fprintf(fp_uchimealns, "\n\n");
+
+
+      int width = opt_alignwidth > 0 ? opt_alignwidth : alnlen;
+      qpos = 0;
+      int ppos[maxparents];
+      for (int f = 0; f < ci->parents_found; f++)
+        ppos[f] = 0;
+      int rest = alnlen;
+
+      for(int i = 0; i < alnlen; i += width)
+        {
+          /* count non-gap symbols on current line */
+
+          int qnt = 0;
+          int pnt[maxparents];
+          for (int f = 0; f < ci->parents_found; f++)
+            pnt[f] = 0;
+
+          int w = MIN(rest, width);
+
+          for(int j=0; j<w; j++)
+            {
+              if (ci->qaln[i+j] != '-')
+                {
+                  qnt++;
+                }
+
+              for (int f = 0; f < ci->parents_found; f++)
+                if (ci->paln[f][i+j] != '-')
+                  {
+                    pnt[f]++;
+                  }
+            }
+
+          fprintf(fp_uchimealns, "Q %5d %.*s %d\n",
+                  qpos+1,  w, ci->qaln+i,    qpos+qnt);
+
+          for (int f = 0; f < ci->parents_found; f++)
+            {
+              fprintf(fp_uchimealns, "%c %5d %.*s %d\n",
+                      'A' + f,
+                      ppos[f] + 1, w, ci->paln[f] + i, ppos[f] + pnt[f]);
+            }
+
+          fprintf(fp_uchimealns, "Diffs   %.*s\n", w, ci->diffs+i);
+          fprintf(fp_uchimealns, "Model   %.*s\n", w, ci->model+i);
+          fprintf(fp_uchimealns, "\n");
+
+          rest -= width;
+          qpos += qnt;
+          for (int f = 0; f < ci->parents_found; f++)
+            ppos[f] += pnt[f];
+        }
+
+      fprintf(fp_uchimealns, "Ids.  QA %.2f%%, QB %.2f%%, QC %.2f%%, "
+              "QT %.2f%%, QModel %.2f%%, Div. %+.2f%%\n",
+              QA, QB, QC, QT, QM, divfrac);
+    }
+
+  if (opt_tabbedout)
+    {
+      fprintf(fp_uchimeout, "%.4f\t", 99.9999);
+
+      header_fprint_strip(fp_uchimeout,
+                          ci->query_head,
+                          ci->query_head_len,
+                          opt_xsize,
+                          opt_xee,
+                          opt_xlength);
+      fprintf(fp_uchimeout, "\t");
+      header_fprint_strip(fp_uchimeout,
+                          db_getheader(seqno_a),
+                          db_getheaderlen(seqno_a),
+                          opt_xsize,
+                          opt_xee,
+                          opt_xlength);
+      fprintf(fp_uchimeout, "\t");
+      header_fprint_strip(fp_uchimeout,
+                          db_getheader(seqno_b),
+                          db_getheaderlen(seqno_b),
+                          opt_xsize,
+                          opt_xee,
+                          opt_xlength);
+      fprintf(fp_uchimeout, "\t");
+      if (seqno_c >= 0)
+        {
+          header_fprint_strip(fp_uchimeout,
+                              db_getheader(seqno_c),
+                              db_getheaderlen(seqno_c),
+                              opt_xsize,
+                              opt_xee,
+                              opt_xlength);
+        }
+      else
+        {
+          fprintf(fp_uchimeout, "*");
+        }
+      fprintf(fp_uchimeout, "\t");
+
+      fprintf(fp_uchimeout,
+              "%.2f\t%.2f\t%.2f\t%.2f\t%.2f\t"
+              "%d\t%d\t%d\t%d\t%d\t%d\t%.2f\t%c\n",
+              QM,
+              QA,
+              QB,
+              QC,
+              QT,
+              0, /* ignore, left yes */
+              0, /* ignore, left no */
+              0, /* ignore, left abstain */
+              0, /* ignore, right yes */
+              0, /* ignore, right no */
+              0, /* ignore, right abstain */
+              0.00,
+              status == 4 ? 'Y' : (status == 2 ? 'N' : '?'));
+    }
+
+  xpthread_mutex_unlock(&mutex_output);
+
+  return status;
+}
+
+int eval_parents(struct chimera_info_s * ci)
+{
+  int status = 1;
+  ci->parents_found = 2;
+
+  int alnlen = find_max_alignment_length(ci);
+
+  fill_alignment_parents(ci);
+
+  /* fill in alignment string for query */
+
+  char * q = ci->qaln;
+  int qpos = 0;
+  for (int i=0; i < ci->query_len; i++)
+    {
+      for (int j=0; j < ci->maxi[i]; j++)
+        {
+          *q++ = '-';
+        }
+      *q++ = chrmap_upcase[(int)(ci->query_seq[qpos++])];
+    }
+  for (int j=0; j < ci->maxi[ci->query_len]; j++)
+    {
+      *q++ = '-';
+    }
+  *q = 0;
+
+  /* mark positions to ignore in voting */
 
-  memset(ci->ignore, 0, alnlen);
+  for (int i = 0; i < alnlen; i++)
+    ci->ignore[i] = 0;
 
   for(int i = 0; i < alnlen; i++)
     {
@@ -564,17 +1003,15 @@ int eval_parents(struct chimera_info_s * ci)
       unsigned int p1sym = chrmap_4bit[(int)(ci->paln[0][i])];
       unsigned int p2sym = chrmap_4bit[(int)(ci->paln[1][i])];
 
-      /* mark positions to ignore in voting */
-
       /* ignore gap positions and those next to the gap */
       if ((!qsym) || (!p1sym) || (!p2sym))
         {
           ci->ignore[i] = 1;
-          if (i>0)
+          if (i > 0)
             {
               ci->ignore[i-1] = 1;
             }
-          if (i<alnlen-1)
+          if (i < alnlen - 1)
             {
               ci->ignore[i+1] = 1;
             }
@@ -659,13 +1096,11 @@ int eval_parents(struct chimera_info_s * ci)
             {
               sumA++;
             }
-          else if
-            (diff == 'B')
+          else if (diff == 'B')
             {
               sumB++;
             }
-          else if
-            (diff != ' ')
+          else if (diff != ' ')
             {
               sumN++;
             }
@@ -926,42 +1361,31 @@ int eval_parents(struct chimera_info_s * ci)
                   "--------------------------------\n");
           fprintf(fp_uchimealns, "Query   (%5d nt) ",
                   ci->query_len);
-          if (opt_xsize)
-            {
-              header_fprint_strip_size(fp_uchimealns,
-                                       ci->query_head,
-                                       ci->query_head_len);
-            }
-          else
-            {
-              fprintf(fp_uchimealns, "%s", ci->query_head);
-            }
+
+          header_fprint_strip(fp_uchimealns,
+                              ci->query_head,
+                              ci->query_head_len,
+                              opt_xsize,
+                              opt_xee,
+                              opt_xlength);
 
           fprintf(fp_uchimealns, "\nParentA (%5" PRIu64 " nt) ",
                   db_getsequencelen(seqno_a));
-          if (opt_xsize)
-            {
-              header_fprint_strip_size(fp_uchimealns,
-                                       db_getheader(seqno_a),
-                                       db_getheaderlen(seqno_a));
-            }
-          else
-            {
-              fprintf(fp_uchimealns, "%s", db_getheader(seqno_a));
-            }
+          header_fprint_strip(fp_uchimealns,
+                              db_getheader(seqno_a),
+                              db_getheaderlen(seqno_a),
+                              opt_xsize,
+                              opt_xee,
+                              opt_xlength);
 
           fprintf(fp_uchimealns, "\nParentB (%5" PRIu64 " nt) ",
                   db_getsequencelen(seqno_b));
-          if (opt_xsize)
-            {
-              header_fprint_strip_size(fp_uchimealns,
-                                       db_getheader(seqno_b),
-                                       db_getheaderlen(seqno_b));
-            }
-          else
-            {
-              fprintf(fp_uchimealns, "%s", db_getheader(seqno_b));
-            }
+          header_fprint_strip(fp_uchimealns,
+                              db_getheader(seqno_b),
+                              db_getheaderlen(seqno_b),
+                              opt_xsize,
+                              opt_xee,
+                              opt_xlength);
           fprintf(fp_uchimealns, "\n\n");
 
           int width = opt_alignwidth > 0 ? opt_alignwidth : alnlen;
@@ -1042,59 +1466,49 @@ int eval_parents(struct chimera_info_s * ci)
         {
           fprintf(fp_uchimeout, "%.4f\t", best_h);
 
-          if (opt_xsize)
-            {
-              header_fprint_strip_size(fp_uchimeout,
-                                       ci->query_head,
-                                       ci->query_head_len);
-              fprintf(fp_uchimeout, "\t");
-              header_fprint_strip_size(fp_uchimeout,
-                                       db_getheader(seqno_a),
-                                       db_getheaderlen(seqno_a));
-              fprintf(fp_uchimeout, "\t");
-              header_fprint_strip_size(fp_uchimeout,
-                                       db_getheader(seqno_b),
-                                       db_getheaderlen(seqno_b));
-              fprintf(fp_uchimeout, "\t");
-            }
-          else
-            {
-              fprintf(fp_uchimeout,
-                      "%s\t%s\t%s\t",
-                      ci->query_head,
-                      db_getheader(seqno_a),
-                      db_getheader(seqno_b));
-            }
+          header_fprint_strip(fp_uchimeout,
+                              ci->query_head,
+                              ci->query_head_len,
+                              opt_xsize,
+                              opt_xee,
+                              opt_xlength);
+          fprintf(fp_uchimeout, "\t");
+          header_fprint_strip(fp_uchimeout,
+                              db_getheader(seqno_a),
+                              db_getheaderlen(seqno_a),
+                              opt_xsize,
+                              opt_xee,
+                              opt_xlength);
+          fprintf(fp_uchimeout, "\t");
+          header_fprint_strip(fp_uchimeout,
+                              db_getheader(seqno_b),
+                              db_getheaderlen(seqno_b),
+                              opt_xsize,
+                              opt_xee,
+                              opt_xlength);
+          fprintf(fp_uchimeout, "\t");
 
           if(! opt_uchimeout5)
             {
-              if (opt_xsize)
+              if (QA >= QB)
                 {
-                  if (QA >= QB)
-                    {
-                      header_fprint_strip_size(fp_uchimeout,
-                                               db_getheader(seqno_a),
-                                               db_getheaderlen(seqno_a));
-                    }
-                  else
-                    {
-                      header_fprint_strip_size(fp_uchimeout,
-                                               db_getheader(seqno_b),
-                                               db_getheaderlen(seqno_b));
-                    }
-                  fprintf(fp_uchimeout, "\t");
+                  header_fprint_strip(fp_uchimeout,
+                                      db_getheader(seqno_a),
+                                      db_getheaderlen(seqno_a),
+                                      opt_xsize,
+                                      opt_xee,
+                                      opt_xlength);
                 }
               else
                 {
-                  if (QA >= QB)
-                    {
-                      fprintf(fp_uchimeout, "%s\t", db_getheader(seqno_a));
-                    }
-                  else
-                    {
-                      fprintf(fp_uchimeout, "%s\t", db_getheader(seqno_b));
-                    }
+                  header_fprint_strip(fp_uchimeout,
+                                      db_getheader(seqno_b),
+                                      db_getheaderlen(seqno_b),
+                                      opt_xsize,
+                                      opt_xee,
+                                      opt_xlength);
                 }
+              fprintf(fp_uchimeout, "\t");
             }
 
           fprintf(fp_uchimeout,
@@ -1134,8 +1548,7 @@ void query_init(struct searchinfo_s * si)
   si->qsequence = nullptr;
   si->kmers = nullptr;
   si->hits = (struct hit *) xmalloc(sizeof(struct hit) * tophits);
-  si->kmers = (count_t *) xmalloc(db_getsequencecount() *
-                                  sizeof(count_t) + 32);
+  si->kmers = (count_t *) xmalloc(db_getsequencecount()*sizeof(count_t) + 32);
   si->hit_count = 0;
   si->uh = unique_init();
   si->s = search16_init(opt_match,
@@ -1166,14 +1579,17 @@ void query_exit(struct searchinfo_s * si)
   if (si->qsequence)
     {
       xfree(si->qsequence);
+      si->qsequence = nullptr;
     }
   if (si->hits)
     {
       xfree(si->hits);
+      si->hits = nullptr;
     }
   if (si->kmers)
     {
       xfree(si->kmers);
+      si->kmers = nullptr;
     }
 }
 
@@ -1181,9 +1597,9 @@ void partition_query(struct chimera_info_s * ci)
 {
   int rest = ci->query_len;
   char * p = ci->query_seq;
-  for (int i=0; i<parts; i++)
+  for (int i = 0; i < parts; i++)
     {
-      int len = (rest+(parts-1-i))/(parts-i);
+      int len = (rest + (parts - i - 1)) / (parts - i);
 
       struct searchinfo_s * si = ci->si + i;
 
@@ -1210,16 +1626,20 @@ void chimera_thread_init(struct chimera_info_s * ci)
   ci->maxi = nullptr;
   ci->maxsmooth = nullptr;
   ci->match = nullptr;
+  ci->insert = nullptr;
   ci->smooth = nullptr;
-  ci->paln[0] = nullptr;
-  ci->paln[1] = nullptr;
   ci->qaln = nullptr;
   ci->diffs = nullptr;
   ci->votes = nullptr;
   ci->model = nullptr;
   ci->ignore = nullptr;
 
-  for(int i = 0; i < parts; i++)
+  for (int f = 0; f < maxparents; f++)
+    {
+      ci->paln[f] = nullptr;
+    }
+
+  for(int i = 0; i < maxparts; i++)
     {
       query_init(ci->si + i);
     }
@@ -1244,7 +1664,7 @@ void chimera_thread_exit(struct chimera_info_s * ci)
 {
   search16_exit(ci->s);
 
-  for(int i = 0; i < parts; i++)
+  for(int i = 0; i < maxparts; i++)
     {
       query_exit(ci->si + i);
     }
@@ -1257,6 +1677,10 @@ void chimera_thread_exit(struct chimera_info_s * ci)
     {
       xfree(ci->match);
     }
+  if (ci->insert)
+    {
+      xfree(ci->insert);
+    }
   if (ci->smooth)
     {
       xfree(ci->smooth);
@@ -1285,14 +1709,6 @@ void chimera_thread_exit(struct chimera_info_s * ci)
     {
       xfree(ci->qaln);
     }
-  if (ci->paln[0])
-    {
-      xfree(ci->paln[0]);
-    }
-  if (ci->paln[1])
-    {
-      xfree(ci->paln[1]);
-    }
   if (ci->query_seq)
     {
       xfree(ci->query_seq);
@@ -1301,6 +1717,12 @@ void chimera_thread_exit(struct chimera_info_s * ci)
     {
       xfree(ci->query_head);
     }
+
+  for (int f = 0; f < maxparents; f++)
+    if (ci->paln[f])
+      {
+        xfree(ci->paln[f]);
+      }
 }
 
 uint64_t chimera_thread_core(struct chimera_info_s * ci)
@@ -1434,6 +1856,7 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
           if (allhits_list[i].nwalignment)
             {
               xfree(allhits_list[i].nwalignment);
+              allhits_list[i].nwalignment = nullptr;
             }
         }
 
@@ -1509,13 +1932,28 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
 
       /* find the best pair of parents, then compute score for them */
 
-      if (find_best_parents(ci))
+      if (opt_chimeras_denovo)
         {
-          status = eval_parents(ci);
+          /* long high-quality reads */
+          if (find_best_parents_long(ci))
+            {
+              status = eval_parents_long(ci);
+            }
+          else
+            {
+              status = 0;
+            }
         }
       else
         {
-          status = 0;
+          if (find_best_parents(ci))
+            {
+              status = eval_parents(ci);
+            }
+          else
+            {
+              status = 0;
+            }
         }
 
       /* output results */
@@ -1587,16 +2025,12 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
             {
               fprintf(fp_uchimeout, "0.0000\t");
 
-              if (opt_xsize)
-                {
-                  header_fprint_strip_size(fp_uchimeout,
-                                           ci->query_head,
-                                           ci->query_head_len);
-                }
-              else
-                {
-                  fprintf(fp_uchimeout, "%s", ci->query_head);
-                }
+              header_fprint_strip(fp_uchimeout,
+                                  ci->query_head,
+                                  ci->query_head_len,
+                                  opt_xsize,
+                                  opt_xee,
+                                  opt_xlength);
 
               if (opt_uchimeout5)
                 {
@@ -1610,12 +2044,6 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
                 }
             }
 
-          /* uchime_denovo: add non-chimeras to db */
-          if (opt_uchime_denovo || opt_uchime2_denovo || opt_uchime3_denovo)
-            {
-              dbindex_addsequence(seqno, opt_qmask);
-            }
-
           if (opt_nonchimeras)
             {
               fasta_print_general(fp_nonchimeras,
@@ -1636,13 +2064,21 @@ uint64_t chimera_thread_core(struct chimera_info_s * ci)
             }
         }
 
+      if (status < 3)
+        {
+          /* uchime_denovo: add non-chimeras to db */
+          if (opt_uchime_denovo || opt_uchime2_denovo || opt_uchime3_denovo || opt_chimeras_denovo)
+            {
+              dbindex_addsequence(seqno, opt_qmask);
+            }
+        }
+
       for (int i=0; i < ci->cand_count; i++)
         {
           if (ci->nwcigar[i])
             {
               xfree(ci->nwcigar[i]);
             }
-
         }
 
       if (opt_uchime_ref)
@@ -1727,10 +2163,20 @@ void chimera()
 {
   open_chimera_file(&fp_chimeras, opt_chimeras);
   open_chimera_file(&fp_nonchimeras, opt_nonchimeras);
-  open_chimera_file(&fp_uchimealns, opt_uchimealns);
-  open_chimera_file(&fp_uchimeout, opt_uchimeout);
   open_chimera_file(&fp_borderline, opt_borderline);
 
+  if (opt_chimeras_denovo)
+    {
+      open_chimera_file(&fp_uchimealns, opt_alnout);
+      open_chimera_file(&fp_uchimeout, opt_tabbedout);
+    }
+  else
+    {
+      open_chimera_file(&fp_uchimealns, opt_uchimealns);
+      open_chimera_file(&fp_uchimeout, opt_uchimeout);
+    }
+
+
   /* override any options the user might have set */
   opt_maxaccepts = few;
   opt_maxrejects = rejects;
@@ -1741,7 +2187,7 @@ void chimera()
       fatal("Only --strand plus is allowed with uchime_ref.");
     }
 
-  if (opt_uchime_denovo || opt_uchime2_denovo || opt_uchime3_denovo)
+  if (! opt_uchime_ref)
     {
       opt_self = 1;
       opt_selfid = 1;
@@ -1808,10 +2254,16 @@ void chimera()
         {
           denovo_dbname = opt_uchime2_denovo;
         }
-      else
-        { // opt_uchime3_denovo
+      else if (opt_uchime3_denovo)
+        {
           denovo_dbname = opt_uchime3_denovo;
         }
+      else if (opt_chimeras_denovo)
+        {
+          denovo_dbname = opt_chimeras_denovo;
+        }
+      else
+        fatal("Internal error");
 
       db_read(denovo_dbname, 0);
 
@@ -1852,62 +2304,122 @@ void chimera()
     {
       if (total_count > 0)
         {
-          fprintf(stderr,
-                  "Found %d (%.1f%%) chimeras, "
-                  "%d (%.1f%%) non-chimeras,\n"
-                  "and %d (%.1f%%) borderline sequences "
-                  "in %u unique sequences.\n",
-                  chimera_count,
-                  100.0 * chimera_count / total_count,
-                  nonchimera_count,
-                  100.0 * nonchimera_count / total_count,
-                  borderline_count,
-                  100.0 * borderline_count / total_count,
-                  total_count);
+          if (opt_chimeras_denovo)
+            {
+              fprintf(stderr,
+                      "Found %d (%.1f%%) chimeras and "
+                      "%d (%.1f%%) non-chimeras "
+                      "in %u unique sequences.\n",
+                      chimera_count,
+                      100.0 * chimera_count / total_count,
+                      nonchimera_count,
+                      100.0 * nonchimera_count / total_count,
+                      total_count);
+            }
+          else
+            {
+              fprintf(stderr,
+                      "Found %d (%.1f%%) chimeras, "
+                      "%d (%.1f%%) non-chimeras,\n"
+                      "and %d (%.1f%%) borderline sequences "
+                      "in %u unique sequences.\n",
+                      chimera_count,
+                      100.0 * chimera_count / total_count,
+                      nonchimera_count,
+                      100.0 * nonchimera_count / total_count,
+                      borderline_count,
+                      100.0 * borderline_count / total_count,
+                      total_count);
+            }
         }
       else
         {
-          fprintf(stderr,
-                  "Found %d chimeras, "
-                  "%d non-chimeras,\n"
-                  "and %d borderline sequences "
-                  "in %u unique sequences.\n",
-                  chimera_count,
-                  nonchimera_count,
-                  borderline_count,
-                  total_count);
+          if (opt_chimeras_denovo)
+            {
+              fprintf(stderr,
+                      "Found %d chimeras and "
+                      "%d non-chimeras "
+                      "in %u unique sequences.\n",
+                      chimera_count,
+                      nonchimera_count,
+                      total_count);
+            }
+          else
+            {
+              fprintf(stderr,
+                      "Found %d chimeras, "
+                      "%d non-chimeras,\n"
+                      "and %d borderline sequences "
+                      "in %u unique sequences.\n",
+                      chimera_count,
+                      nonchimera_count,
+                      borderline_count,
+                      total_count);
+            }
         }
 
       if (total_abundance > 0)
         {
-          fprintf(stderr,
-                  "Taking abundance information into account, "
-                  "this corresponds to\n"
-                  "%" PRId64 " (%.1f%%) chimeras, "
-                  "%" PRId64 " (%.1f%%) non-chimeras,\n"
-                  "and %" PRId64 " (%.1f%%) borderline sequences "
-                  "in %" PRId64 " total sequences.\n",
-                  chimera_abundance,
-                  100.0 * chimera_abundance / total_abundance,
-                  nonchimera_abundance,
-                  100.0 * nonchimera_abundance / total_abundance,
-                  borderline_abundance,
-                  100.0 * borderline_abundance / total_abundance,
-                  total_abundance);
+          if (opt_chimeras_denovo)
+            {
+              fprintf(stderr,
+                      "Taking abundance information into account, "
+                      "this corresponds to\n"
+                      "%" PRId64 " (%.1f%%) chimeras and "
+                      "%" PRId64 " (%.1f%%) non-chimeras "
+                      "in %" PRId64 " total sequences.\n",
+                      chimera_abundance,
+                      100.0 * chimera_abundance / total_abundance,
+                      nonchimera_abundance,
+                      100.0 * nonchimera_abundance / total_abundance,
+                      total_abundance);
+            }
+          else
+            {
+              fprintf(stderr,
+                      "Taking abundance information into account, "
+                      "this corresponds to\n"
+                      "%" PRId64 " (%.1f%%) chimeras, "
+                      "%" PRId64 " (%.1f%%) non-chimeras,\n"
+                      "and %" PRId64 " (%.1f%%) borderline sequences "
+                      "in %" PRId64 " total sequences.\n",
+                      chimera_abundance,
+                      100.0 * chimera_abundance / total_abundance,
+                      nonchimera_abundance,
+                      100.0 * nonchimera_abundance / total_abundance,
+                      borderline_abundance,
+                      100.0 * borderline_abundance / total_abundance,
+                      total_abundance);
+            }
         }
       else
         {
-          fprintf(stderr,
-                  "Taking abundance information into account, "
-                  "this corresponds to\n"
-                  "%" PRId64 " chimeras, "
-                  "%" PRId64 " non-chimeras,\n"
-                  "and %" PRId64 " borderline sequences "
-                  "in %" PRId64 " total sequences.\n",
-                  chimera_abundance,
-                  nonchimera_abundance,
-                  borderline_abundance,
-                  total_abundance);
+          if (opt_chimeras_denovo)
+            {
+              fprintf(stderr,
+                      "Taking abundance information into account, "
+                      "this corresponds to\n"
+                      "%" PRId64 " chimeras, "
+                      "%" PRId64 " non-chimeras "
+                      "in %" PRId64 " total sequences.\n",
+                      chimera_abundance,
+                      nonchimera_abundance,
+                      total_abundance);
+            }
+          else
+            {
+              fprintf(stderr,
+                      "Taking abundance information into account, "
+                      "this corresponds to\n"
+                      "%" PRId64 " chimeras, "
+                      "%" PRId64 " non-chimeras,\n"
+                      "and %" PRId64 " borderline sequences "
+                      "in %" PRId64 " total sequences.\n",
+                      chimera_abundance,
+                      nonchimera_abundance,
+                      borderline_abundance,
+                      total_abundance);
+            }
         }
     }
 


=====================================
src/cluster.cc
=====================================
@@ -177,7 +177,7 @@ inline void cluster_query_core(struct searchinfo_s * si)
   /* the main core function for clustering */
 
   /* get sequence etc */
-  int seqno = si->query_no;
+  const int seqno = si->query_no;
   si->query_head_len = db_getheaderlen(seqno);
   si->query_head = db_getheader(seqno);
   si->qsize = db_getabundance(seqno);
@@ -375,13 +375,15 @@ char * relabel_otu(int clusterno, char * sequence, int seqlen)
   char * label = nullptr;
   if (opt_relabel)
     {
-      label = (char*) xmalloc(strlen(opt_relabel) + 21);
-      sprintf(label, "%s%d", opt_relabel, clusterno+1);
+      int size = strlen(opt_relabel) + 21;
+      label = (char*) xmalloc(size);
+      snprintf(label, size, "%s%d", opt_relabel, clusterno + 1);
     }
   else if (opt_relabel_self)
     {
-      label = (char*) xmalloc(seqlen + 1);
-      sprintf(label, "%.*s", seqlen, sequence);
+      int size = seqlen + 1;
+      label = (char*) xmalloc(size);
+      snprintf(label, size, "%.*s", seqlen, sequence);
     }
   else if (opt_relabel_sha1)
     {
@@ -529,11 +531,12 @@ void cluster_core_results_nohit(int clusterno,
   if (opt_uc)
     {
       fprintf(fp_uc, "S\t%d\t%d\t*\t*\t*\t*\t*\t", clusters, qseqlen);
-      header_fprint_strip_size_ee(fp_uc,
-                                  query_head,
-                                  strlen(query_head),
-                                  opt_xsize,
-                                  opt_xee);
+      header_fprint_strip(fp_uc,
+                          query_head,
+                          strlen(query_head),
+                          opt_xsize,
+                          opt_xee,
+                          opt_xlength);
       fprintf(fp_uc, "\t*\n");
     }
 
@@ -591,8 +594,8 @@ void cluster_core_parallel()
   /* create threads and set them in stand-by mode */
   threads_init();
 
-  const int queries_per_thread = 1;
-  int max_queries = queries_per_thread * opt_threads;
+  constexpr static int queries_per_thread = 1;
+  const int max_queries = queries_per_thread * opt_threads;
 
   /* allocate memory for the search information for each query;
      and initialize it */
@@ -1423,9 +1426,11 @@ void cluster(char * dbname,
   /* allocate memory for full file name of the clusters files */
   FILE * fp_clusters = nullptr;
   char * fn_clusters = nullptr;
+  int fn_clusters_size = 0;
   if (opt_clusters)
     {
-      fn_clusters = (char *) xmalloc(strlen(opt_clusters) + 25);
+      fn_clusters_size += strlen(opt_clusters) + 25;
+      fn_clusters = (char *) xmalloc(fn_clusters_size);
     }
 
   int lastcluster = -1;
@@ -1463,11 +1468,12 @@ void cluster(char * dbname,
               fprintf(fp_uc, "C\t%d\t%" PRId64 "\t*\t*\t*\t*\t*\t",
                       clusterno,
                       cluster_abundance[clusterno]);
-              header_fprint_strip_size_ee(fp_uc,
-                                          db_getheader(seqno),
-                                          db_getheaderlen(seqno),
-                                          opt_xsize,
-                                          opt_xee);
+              header_fprint_strip(fp_uc,
+                                  db_getheader(seqno),
+                                  db_getheaderlen(seqno),
+                                  opt_xsize,
+                                  opt_xee,
+                                  opt_xlength);
               fprintf(fp_uc, "\t*\n");
             }
 
@@ -1480,7 +1486,11 @@ void cluster(char * dbname,
                 }
 
               ordinal = 0;
-              sprintf(fn_clusters, "%s%d", opt_clusters, clusterno);
+              snprintf(fn_clusters,
+                       fn_clusters_size,
+                       "%s%d",
+                       opt_clusters,
+                       clusterno);
               fp_clusters = fopen_output(fn_clusters);
               if (!fp_clusters)
                 {
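
Note on the sprintf to snprintf changes in this file: each call now receives
the size of the buffer that was allocated, so the write is bounded by that
size. The sketch below shows the same pattern in isolation. The function name
label_with_number is hypothetical, and the truncation check on the return
value is an addition for illustration that the diff itself does not perform.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper for illustration; not a vsearch function.
static std::string label_with_number(const char * prefix, int number)
{
  assert(prefix != nullptr);
  // Same sizing idea as relabel_otu: prefix length plus room for
  // the digits of the number and the terminating NUL.
  const std::size_t size = std::strlen(prefix) + 21;
  std::vector<char> buffer(size);
  const int written = std::snprintf(buffer.data(), size, "%s%d", prefix, number);
  // snprintf returns the length it wanted to write;
  // a value >= size would mean the output was truncated.
  assert(written >= 0 && static_cast<std::size_t>(written) < size);
  return std::string(buffer.data());
}
```

Keeping the size in a variable, as the diff does with size and
fn_clusters_size, lets the allocation and the formatting call share one value.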


=====================================
src/derep.cc
=====================================
@@ -222,8 +222,8 @@ void rehash(struct bucket * * hashtableref, int64_t alloc_clusters)
   uint64_t new_hash_mask = new_hashtablesize - 1;
 
   auto * new_hashtable =
-    (struct bucket *) xmalloc(sizeof(bucket) * new_hashtablesize);
-  memset(new_hashtable, 0, sizeof(bucket) * new_hashtablesize);
+    (struct bucket *) xmalloc(sizeof(struct bucket) * new_hashtablesize);
+  memset(new_hashtable, 0, sizeof(struct bucket) * new_hashtablesize);
 
   /* rehash all */
   for(uint64_t i = 0; i < old_hashtablesize; i++)
@@ -381,8 +381,8 @@ void derep(char * input_filename, bool use_header)
   uint64_t hashtablesize = 2 * alloc_clusters;
   uint64_t hash_mask = hashtablesize - 1;
   auto * hashtable =
-    (struct bucket *) xmalloc(sizeof(bucket) * hashtablesize);
-  memset(hashtable, 0, sizeof(bucket) * hashtablesize);
+    (struct bucket *) xmalloc(sizeof(struct bucket) * hashtablesize);
+  memset(hashtable, 0, sizeof(struct bucket) * hashtablesize);
 
   show_rusage();
 
@@ -1110,9 +1110,9 @@ void derep_prefix()
   int hash_mask = hashtablesize - 1;
 
   auto * hashtable =
-    (struct bucket *) xmalloc(sizeof(bucket) * hashtablesize);
+    (struct bucket *) xmalloc(sizeof(struct bucket) * hashtablesize);
 
-  memset(hashtable, 0, sizeof(bucket) * hashtablesize);
+  memset(hashtable, 0, sizeof(struct bucket) * hashtablesize);
 
   int64_t clusters = 0;
   int64_t sumsize = 0;
@@ -1294,7 +1294,7 @@ void derep_prefix()
   show_rusage();
 
   progress_init("Sorting", 1);
-  qsort(hashtable, hashtablesize, sizeof(bucket), derep_compare_prefix);
+  qsort(hashtable, hashtablesize, sizeof(struct bucket), derep_compare_prefix);
   progress_done();
 
   if (clusters > 0)


=====================================
src/derepsmallmem.cc
=====================================
@@ -62,13 +62,13 @@
 
 #define HASH hash_cityhash128
 
-struct bucket
+struct sm_bucket
 {
   uint128 hash;
   uint64_t size;
 };
 
-static struct bucket * hashtable = nullptr;
+static struct sm_bucket * hashtable = nullptr;
 static uint64_t hashtablesize = 0;
 
 double find_median()
@@ -166,7 +166,7 @@ void rehash_smallmem()
   /* allocate new hash table, 50% larger */
   uint64_t new_hashtablesize = 3 * hashtablesize / 2;
   auto * new_hashtable =
-    (struct bucket *) xmalloc(sizeof(bucket) * new_hashtablesize);
+    (struct sm_bucket *) xmalloc(sizeof(struct sm_bucket) * new_hashtablesize);
 
   /* zero new hash table */
   for(uint64_t j = 0; j < new_hashtablesize; j++)
@@ -179,7 +179,7 @@ void rehash_smallmem()
   /* rehash all from old to new */
   for(uint64_t i = 0; i < hashtablesize; i++)
     {
-      struct bucket * old_bp = hashtable + i;
+      struct sm_bucket * old_bp = hashtable + i;
       if (old_bp->size)
         {
           uint64_t k = hash2bucket(old_bp->hash, new_hashtablesize);
@@ -187,7 +187,7 @@ void rehash_smallmem()
             {
               k = next_bucket(k, new_hashtablesize);
             }
-          struct bucket * new_bp = new_hashtable + k;
+          struct sm_bucket * new_bp = new_hashtable + k;
           * new_bp = * old_bp;
         }
     }
@@ -244,7 +244,7 @@ void derep_smallmem(char * input_filename)
   /* allocate initial hashtable with 1024 buckets */
 
   hashtablesize = 1024;
-  hashtable = (struct bucket *) xmalloc(sizeof(bucket) * hashtablesize);
+  hashtable = (struct sm_bucket *) xmalloc(sizeof(struct sm_bucket) * hashtablesize);
 
   /* zero hash table */
   for(uint64_t j = 0; j < hashtablesize; j++)
@@ -346,7 +346,7 @@ void derep_smallmem(char * input_filename)
 
       uint128 hash = HASH(seq_up, seqlen);
       uint64_t j =  hash2bucket(hash, hashtablesize);
-      struct bucket * bp = hashtable + j;
+      struct sm_bucket * bp = hashtable + j;
 
       while ((bp->size) && (hash != bp->hash))
         {
@@ -361,7 +361,7 @@ void derep_smallmem(char * input_filename)
 
           uint128 rc_hash = HASH(rc_seq_up, seqlen);
           uint64_t k =  hash2bucket(rc_hash, hashtablesize);
-          struct bucket * rc_bp = hashtable + k;
+          struct sm_bucket * rc_bp = hashtable + k;
 
           while ((rc_bp->size) && (rc_hash != rc_bp->hash))
             {
@@ -562,7 +562,7 @@ void derep_smallmem(char * input_filename)
 
       uint128 hash = HASH(seq_up, seqlen);
       uint64_t j =  hash2bucket(hash, hashtablesize);
-      struct bucket * bp = hashtable + j;
+      struct sm_bucket * bp = hashtable + j;
 
       while ((bp->size) && (hash != bp->hash))
         {
@@ -577,7 +577,7 @@ void derep_smallmem(char * input_filename)
 
           uint128 rc_hash = HASH(rc_seq_up, seqlen);
           uint64_t k =  hash2bucket(rc_hash, hashtablesize);
-          struct bucket * rc_bp = hashtable + k;
+          struct sm_bucket * rc_bp = hashtable + k;
 
           while ((rc_bp->size) && (rc_hash != rc_bp->hash))
             {


=====================================
src/fasta.cc
=====================================
@@ -389,11 +389,13 @@ void fasta_print_general(FILE * fp,
     {
       bool xsize = opt_xsize || (opt_sizeout && (abundance > 0));
       bool xee = opt_xee || ((opt_eeout || opt_fastq_eeout) && (ee >= 0.0));
-      header_fprint_strip_size_ee(fp,
-                                  header,
-                                  header_len,
-                                  xsize,
-                                  xee);
+      bool xlength = opt_xlength || opt_lengthout;
+      header_fprint_strip(fp,
+                          header,
+                          header_len,
+                          xsize,
+                          xee,
+                          xlength);
     }
 
   if (opt_label_suffix)
@@ -423,7 +425,31 @@ void fasta_print_general(FILE * fp,
 
   if ((opt_eeout || opt_fastq_eeout) && (ee >= 0.0))
     {
-      fprintf(fp, ";ee=%.4lf", ee);
+      if (ee < 0.000000001)
+	fprintf(fp, ";ee=%.13lf", ee);
+      else if (ee < 0.00000001)
+	fprintf(fp, ";ee=%.12lf", ee);
+      else if (ee < 0.0000001)
+	fprintf(fp, ";ee=%.11lf", ee);
+      else if (ee < 0.000001)
+	fprintf(fp, ";ee=%.10lf", ee);
+      else if (ee < 0.00001)
+	fprintf(fp, ";ee=%.9lf", ee);
+      else if (ee < 0.0001)
+	fprintf(fp, ";ee=%.8lf", ee);
+      else if (ee < 0.001)
+	fprintf(fp, ";ee=%.7lf", ee);
+      else if (ee < 0.01)
+	fprintf(fp, ";ee=%.6lf", ee);
+      else if (ee < 0.1)
+	fprintf(fp, ";ee=%.5lf", ee);
+      else
+        fprintf(fp, ";ee=%.4lf", ee);
+    }
+
+  if (opt_lengthout)
+    {
+      fprintf(fp, ";length=%d", len);
     }
 
   if (score_name)


=====================================
src/fastq.cc
=====================================
@@ -545,11 +545,13 @@ void fastq_print_general(FILE * fp,
     {
       bool xsize = opt_xsize || (opt_sizeout && (abundance > 0));
       bool xee = opt_xee || ((opt_eeout || opt_fastq_eeout) && (ee >= 0.0));
-      header_fprint_strip_size_ee(fp,
-                                  header,
-                                  header_len,
-                                  xsize,
-                                  xee);
+      bool xlength = opt_xlength || opt_lengthout;
+      header_fprint_strip(fp,
+                          header,
+                          header_len,
+                          xsize,
+                          xee,
+                          xlength);
     }
 
   if (opt_label_suffix)
@@ -569,7 +571,31 @@ void fastq_print_general(FILE * fp,
 
   if ((opt_eeout || opt_fastq_eeout) && (ee >= 0.0))
     {
-      fprintf(fp, ";ee=%.4lf", ee);
+      if (ee < 0.000000001)
+        fprintf(fp, ";ee=%.13lf", ee);
+      else if (ee < 0.00000001)
+        fprintf(fp, ";ee=%.12lf", ee);
+      else if (ee < 0.0000001)
+        fprintf(fp, ";ee=%.11lf", ee);
+      else if (ee < 0.000001)
+        fprintf(fp, ";ee=%.10lf", ee);
+      else if (ee < 0.00001)
+        fprintf(fp, ";ee=%.9lf", ee);
+      else if (ee < 0.0001)
+        fprintf(fp, ";ee=%.8lf", ee);
+      else if (ee < 0.001)
+        fprintf(fp, ";ee=%.7lf", ee);
+      else if (ee < 0.01)
+        fprintf(fp, ";ee=%.6lf", ee);
+      else if (ee < 0.1)
+        fprintf(fp, ";ee=%.5lf", ee);
+      else
+        fprintf(fp, ";ee=%.4lf", ee);
+    }
+
+  if (opt_lengthout)
+    {
+      fprintf(fp, ";length=%d", len);
     }
 
   if (opt_relabel_keep &&


=====================================
src/getseq.cc
=====================================
@@ -174,7 +174,7 @@ bool test_label_match(fastx_handle h)
           field_buffer_size += labels_longest;
         }
       field_buffer = (char *) xmalloc(field_buffer_size);
-      sprintf(field_buffer, "%s=", opt_label_field);
+      snprintf(field_buffer, field_buffer_size, "%s=", opt_label_field);
     }
 
   if (opt_label)


=====================================
src/mergepairs.cc
=====================================
@@ -1368,238 +1368,166 @@ void pair_all()
   chunks = nullptr;
 }
 
-void fastq_mergepairs()
+void print_stats(FILE * fp)
 {
-  /* fatal error if specified overlap is too small */
-
-  if (opt_fastq_minovlen < 5)
-    {
-      fatal("Overlap specified with --fastq_minovlen must be at least 5");
-    }
-
-  /* relax default parameters in case of short overlaps */
-
-  if (opt_fastq_minovlen < 9)
-    {
-      merge_mindiagcount = opt_fastq_minovlen - 4;
-      merge_minscore = 1.6 * opt_fastq_minovlen;
-    }
-
-  /* open input files */
-
-  fastq_fwd = fastq_open(opt_fastq_mergepairs);
-  fastq_rev = fastq_open(opt_reverse);
-
-  /* open output files */
-
-  if (opt_fastqout)
-    {
-      fp_fastqout = fileopenw(opt_fastqout);
-    }
-  if (opt_fastaout)
-    {
-      fp_fastaout = fileopenw(opt_fastaout);
-    }
-  if (opt_fastqout_notmerged_fwd)
-    {
-      fp_fastqout_notmerged_fwd = fileopenw(opt_fastqout_notmerged_fwd);
-    }
-  if (opt_fastqout_notmerged_rev)
-    {
-      fp_fastqout_notmerged_rev = fileopenw(opt_fastqout_notmerged_rev);
-    }
-  if (opt_fastaout_notmerged_fwd)
-    {
-      fp_fastaout_notmerged_fwd = fileopenw(opt_fastaout_notmerged_fwd);
-    }
-  if (opt_fastaout_notmerged_rev)
-    {
-      fp_fastaout_notmerged_rev = fileopenw(opt_fastaout_notmerged_rev);
-    }
-  if (opt_eetabbedout)
-    {
-      fp_eetabbedout = fileopenw(opt_eetabbedout);
-    }
-
-  /* precompute merged quality values */
-
-  precompute_qual();
-
-  /* main */
-
-  uint64_t filesize = fastq_get_size(fastq_fwd);
-  progress_init("Merging reads", filesize);
-
-  if (! fastq_fwd->is_empty)
-    {
-      pair_all();
-    }
-
-  progress_done();
-
-  if (fastq_next(fastq_rev, true, chrmap_upcase))
-    {
-      fatal("More reverse reads than forward reads");
-    }
-
-  fprintf(stderr,
+  fprintf(fp,
           "%10" PRIu64 "  Pairs\n",
           total);
 
-  fprintf(stderr,
+  fprintf(fp,
           "%10" PRIu64 "  Merged",
           merged);
   if (total > 0)
     {
-      fprintf(stderr,
+      fprintf(fp,
               " (%.1lf%%)",
               100.0 * merged / total);
     }
-  fprintf(stderr, "\n");
+  fprintf(fp, "\n");
 
-  fprintf(stderr,
+  fprintf(fp,
           "%10" PRIu64 "  Not merged",
           notmerged);
   if (total > 0)
     {
-      fprintf(stderr,
+      fprintf(fp,
               " (%.1lf%%)",
               100.0 * notmerged / total);
     }
-  fprintf(stderr, "\n");
+  fprintf(fp, "\n");
 
   if (notmerged > 0)
     {
-      fprintf(stderr, "\nPairs that failed merging due to various reasons:\n");
+      fprintf(fp, "\nPairs that failed merging due to various reasons:\n");
     }
 
   if (failed_undefined)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  undefined reason\n",
               failed_undefined);
     }
 
   if (failed_minlen)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  reads too short (after truncation)\n",
               failed_minlen);
     }
 
   if (failed_maxlen)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  reads too long (after truncation)\n",
               failed_maxlen);
     }
 
   if (failed_maxns)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  too many N's\n",
               failed_maxns);
     }
 
   if (failed_nokmers)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  too few kmers found on same diagonal\n",
               failed_nokmers);
     }
 
   if (failed_repeat)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  multiple potential alignments\n",
               failed_repeat);
     }
 
   if (failed_maxdiffs)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  too many differences\n",
               failed_maxdiffs);
     }
 
   if (failed_maxdiffpct)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  too high percentage of differences\n",
               failed_maxdiffpct);
     }
 
   if (failed_minscore)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  alignment score too low, or score drop too high\n",
               failed_minscore);
     }
 
   if (failed_minovlen)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  overlap too short\n",
               failed_minovlen);
     }
 
   if (failed_maxee)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  expected error too high\n",
               failed_maxee);
     }
 
   if (failed_minmergelen)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  merged fragment too short\n",
               failed_minmergelen);
     }
 
   if (failed_maxmergelen)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  merged fragment too long\n",
               failed_maxmergelen);
     }
 
   if (failed_staggered)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  staggered read pairs\n",
               failed_staggered);
     }
 
   if (failed_indel)
     {
-      fprintf(stderr,
+      fprintf(fp,
               "%10" PRIu64 "  indel errors\n",
               failed_indel);
     }
 
-  fprintf(stderr, "\n");
+  fprintf(fp, "\n");
 
   if (total > 0)
     {
-      fprintf(stderr, "Statistics of all reads:\n");
+      fprintf(fp, "Statistics of all reads:\n");
 
       double mean_read_length = sum_read_length / (2.0 * pairs_read);
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Mean read length\n",
               mean_read_length);
     }
 
   if (merged > 0)
     {
-      fprintf(stderr, "\n");
+      fprintf(fp, "\n");
 
-      fprintf(stderr, "Statistics of merged reads:\n");
+      fprintf(fp, "Statistics of merged reads:\n");
 
       double mean = sum_fragment_length / merged;
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Mean fragment length\n",
               mean);
 
@@ -1608,34 +1536,114 @@ void fastq_mergepairs()
                            + mean * mean * merged)
                           / (merged + 0.0));
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Standard deviation of fragment length\n",
               stdev);
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Mean expected error in forward sequences\n",
               sum_ee_fwd / merged);
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Mean expected error in reverse sequences\n",
               sum_ee_rev / merged);
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Mean expected error in merged sequences\n",
               sum_ee_merged / merged);
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Mean observed errors in merged region of forward sequences\n",
               1.0 * sum_errors_fwd / merged);
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Mean observed errors in merged region of reverse sequences\n",
               1.0 * sum_errors_rev / merged);
 
-      fprintf(stderr,
+      fprintf(fp,
               "%10.2f  Mean observed errors in merged region\n",
               1.0 * (sum_errors_fwd + sum_errors_rev) / merged);
     }
+}
+
+void fastq_mergepairs()
+{
+  /* fatal error if specified overlap is too small */
+
+  if (opt_fastq_minovlen < 5)
+    {
+      fatal("Overlap specified with --fastq_minovlen must be at least 5");
+    }
+
+  /* relax default parameters in case of short overlaps */
+
+  if (opt_fastq_minovlen < 9)
+    {
+      merge_mindiagcount = opt_fastq_minovlen - 4;
+      merge_minscore = 1.6 * opt_fastq_minovlen;
+    }
+
+  /* open input files */
+
+  fastq_fwd = fastq_open(opt_fastq_mergepairs);
+  fastq_rev = fastq_open(opt_reverse);
+
+  /* open output files */
+
+  if (opt_fastqout)
+    {
+      fp_fastqout = fileopenw(opt_fastqout);
+    }
+  if (opt_fastaout)
+    {
+      fp_fastaout = fileopenw(opt_fastaout);
+    }
+  if (opt_fastqout_notmerged_fwd)
+    {
+      fp_fastqout_notmerged_fwd = fileopenw(opt_fastqout_notmerged_fwd);
+    }
+  if (opt_fastqout_notmerged_rev)
+    {
+      fp_fastqout_notmerged_rev = fileopenw(opt_fastqout_notmerged_rev);
+    }
+  if (opt_fastaout_notmerged_fwd)
+    {
+      fp_fastaout_notmerged_fwd = fileopenw(opt_fastaout_notmerged_fwd);
+    }
+  if (opt_fastaout_notmerged_rev)
+    {
+      fp_fastaout_notmerged_rev = fileopenw(opt_fastaout_notmerged_rev);
+    }
+  if (opt_eetabbedout)
+    {
+      fp_eetabbedout = fileopenw(opt_eetabbedout);
+    }
+
+  /* precompute merged quality values */
+
+  precompute_qual();
+
+  /* main */
+
+  uint64_t filesize = fastq_get_size(fastq_fwd);
+  progress_init("Merging reads", filesize);
+
+  if (! fastq_fwd->is_empty)
+    {
+      pair_all();
+    }
+
+  progress_done();
+
+  if (fastq_next(fastq_rev, true, chrmap_upcase))
+    {
+      fatal("More reverse reads than forward reads");
+    }
+
+  if (fp_log)
+    print_stats(fp_log);
+  else
+    print_stats(stderr);
 
   /* clean up */
 


=====================================
src/msa.cc
=====================================
@@ -159,7 +159,8 @@ void msa(FILE * fp_msaout, FILE * fp_consout, FILE * fp_profile,
 
   /* allocate memory for profile (for consensus) and aligned seq */
   profile = (prof_type *) xmalloc(PROFSIZE * sizeof(prof_type) * alnlen);
-  memset(profile, 0, PROFSIZE * sizeof(prof_type) * alnlen);
+  for (int i=0; i < PROFSIZE * alnlen; i++)
+    profile[i] = 0;
   aln = (char *) xmalloc(alnlen+1);
   char * cons = (char *) xmalloc(alnlen+1);
 


=====================================
src/results.cc
=====================================
@@ -288,17 +288,19 @@ void results_show_uc_one(FILE * fp,
               hp->id,
               hp->strand ? '-' : '+',
               perfect ? "=" : hp->nwalignment);
-      header_fprint_strip_size_ee(fp,
-                                  query_head,
-                                  strlen(query_head),
-                                  opt_xsize,
-                                  opt_xee);
+      header_fprint_strip(fp,
+                          query_head,
+                          strlen(query_head),
+                          opt_xsize,
+                          opt_xee,
+                          opt_xlength);
       fprintf(fp, "\t");
-      header_fprint_strip_size_ee(fp,
-                                  db_getheader(hp->target),
-                                  db_getheaderlen(hp->target),
-                                  opt_xsize,
-                                  opt_xee);
+      header_fprint_strip(fp,
+                          db_getheader(hp->target),
+                          db_getheaderlen(hp->target),
+                          opt_xsize,
+                          opt_xee,
+                          opt_xlength);
       fprintf(fp, "\n");
     }
   else


=====================================
src/search.cc
=====================================
@@ -77,7 +77,7 @@ static int qmatches;
 static uint64 qmatches_abundance;
 static int queries;
 static uint64 queries_abundance;
-static int * dbmatched;
+static uint64 * dbmatched;
 static FILE * fp_samout = nullptr;
 static FILE * fp_alnout = nullptr;
 static FILE * fp_userout = nullptr;
@@ -306,7 +306,7 @@ void search_output_results(int hit_count,
     {
       if (hits[i].accepted)
         {
-          dbmatched[hits[i].target]++;
+          dbmatched[hits[i].target] += opt_sizein ? qsize : 1;
         }
     }
 
@@ -693,16 +693,14 @@ void search_prep(char * cmdline, char * progheader)
   if (is_udb)
     {
       udb_read(opt_db, true, true);
+      results_show_samheader(fp_samout, cmdline, opt_db);
+      show_rusage();
+      seqcount = db_getsequencecount();
     }
   else
     {
       db_read(opt_db, 0);
-    }
-
-  results_show_samheader(fp_samout, cmdline, opt_db);
-
-  if (!is_udb)
-    {
+      results_show_samheader(fp_samout, cmdline, opt_db);
       if (opt_dbmask == MASK_DUST)
         {
           dust_all();
@@ -711,14 +709,8 @@ void search_prep(char * cmdline, char * progheader)
         {
           hardmask_all();
         }
-    }
-
-  show_rusage();
-
-  seqcount = db_getsequencecount();
-
-  if (!is_udb)
-    {
+      show_rusage();
+      seqcount = db_getsequencecount();
       dbindex_prepare(1, opt_dbmask);
       dbindex_addallsequences(opt_dbmask);
     }
@@ -819,8 +811,8 @@ void usearch_global(char * cmdline, char * progheader)
         }
     }
 
-  dbmatched = (int*) xmalloc(seqcount * sizeof(int*));
-  memset(dbmatched, 0, seqcount * sizeof(int*));
+  dbmatched = (uint64*) xmalloc(seqcount * sizeof(uint64*));
+  memset(dbmatched, 0, seqcount * sizeof(uint64*));
 
   otutable_init();
 


=====================================
src/searchcore.cc
=====================================
@@ -219,7 +219,7 @@ void search_topscores(struct searchinfo_s * si)
   */
 
   /* count kmer hits in the database sequences */
-  int indexed_count = dbindex_getcount();
+  const int indexed_count = dbindex_getcount();
 
   /* zero counts */
   memset(si->kmers, 0, indexed_count * sizeof(count_t));
@@ -259,7 +259,7 @@ void search_topscores(struct searchinfo_s * si)
         }
     }
 
-  int minmatches = MIN(opt_minwordmatches, si->kmersamplecount);
+  const int minmatches = MIN(opt_minwordmatches, si->kmersamplecount);
 
   for(int i=0; i < indexed_count; i++)
     {
@@ -285,8 +285,8 @@ int seqncmp(char * a, char * b, uint64_t n)
 {
   for(unsigned int i = 0; i<n; i++)
     {
-      int x = chrmap_4bit[(int)(a[i])];
-      int y = chrmap_4bit[(int)(b[i])];
+      const int x = chrmap_4bit[(int)(a[i])];
+      const int y = chrmap_4bit[(int)(b[i])];
       if (x < y)
         {
           return -1;
@@ -423,16 +423,16 @@ void align_trim(struct hit * hit)
     }
 }
 
-int search_acceptable_unaligned(struct searchinfo_s * si,
-                                int target)
+auto search_acceptable_unaligned(struct searchinfo_s * si,
+                                 int target) -> bool
 {
-  /* consider whether a hit satisfy accept criteria before alignment */
+  /* consider whether a hit satisfies accept criteria before alignment */
 
   char * qseq = si->qsequence;
   char * dlabel = db_getheader(target);
   char * dseq = db_getsequence(target);
-  int64_t dseqlen = db_getsequencelen(target);
-  int64_t tsize = db_getabundance(target);
+  const int64_t dseqlen = db_getsequencelen(target);
+  const int64_t tsize = db_getabundance(target);
 
   if (
       /* maxqsize */
@@ -485,17 +485,17 @@ int search_acceptable_unaligned(struct searchinfo_s * si,
       )
     {
       /* needs further consideration */
-      return 1;
+      return true;
     }
   else
     {
       /* reject */
-      return 0;
+      return false;
     }
 }
 
-int search_acceptable_aligned(struct searchinfo_s * si,
-                              struct hit * hit)
+auto search_acceptable_aligned(struct searchinfo_s * si,
+                               struct hit * hit) -> bool
 {
   if (/* weak_id */
       (hit->id >= 100.0 * opt_weak_id) &&
@@ -525,23 +525,23 @@ int search_acceptable_aligned(struct searchinfo_s * si,
     {
       if (opt_cluster_unoise)
         {
-          int d = hit->mismatches;
-          double skew = 1.0 * si->qsize / db_getabundance(hit->target);
-          double beta = 1.0 / pow(2, 1.0 * opt_unoise_alpha * d + 1);
+          const auto mismatches = hit->mismatches;
+          const double skew = 1.0 * si->qsize / db_getabundance(hit->target);
+          const double beta = 1.0 / pow(2, 1.0 * opt_unoise_alpha * mismatches + 1);
 
-          if (skew <= beta || d == 0)
+          if (skew <= beta || mismatches == 0)
             {
               /* accepted */
               hit->accepted = true;
               hit->weak = false;
-              return 1;
+              return true;
             }
           else
             {
               /* rejected, but weak hit */
               hit->rejected = true;
               hit->weak = true;
-              return 0;
+              return false;
             }
         }
       else
@@ -551,14 +551,14 @@ int search_acceptable_aligned(struct searchinfo_s * si,
               /* accepted */
               hit->accepted = true;
               hit->weak = false;
-              return 1;
+              return true;
             }
           else
             {
               /* rejected, but weak hit */
               hit->rejected = true;
               hit->weak = true;
-              return 0;
+              return false;
             }
         }
     }
@@ -567,7 +567,7 @@ int search_acceptable_aligned(struct searchinfo_s * si,
       /* rejected */
       hit->rejected = true;
       hit->weak = false;
-      return 0;
+      return false;
     }
 }
 


=====================================
src/searchcore.h
=====================================
@@ -58,13 +58,13 @@
 
 */
 
-//#define COMPARENONVECTORIZED
+#include <array>
 
 /* the number of alignments that can be delayed */
-#define MAXDELAYED 8
+constexpr auto MAXDELAYED = 8U;
 
 /* Default minimum number of word matches for word lengths 3-15 */
-const int minwordmatches_defaults[] =
+constexpr std::array<int, 16> minwordmatches_defaults =
   { -1, -1, -1, 18, 17, 16, 15, 14, 12, 11, 10,  9,  8,  7,  5,  3 };
 
 struct hit
@@ -152,10 +152,11 @@ struct hit * search_findbest2_byid(struct searchinfo_s * si_p,
 struct hit * search_findbest2_bysize(struct searchinfo_s * si_p,
                                      struct searchinfo_s * si_m);
 
-int search_acceptable_unaligned(struct searchinfo_s * si, int target);
+auto search_acceptable_unaligned(struct searchinfo_s * si,
+                                 int target) -> bool;
 
-int search_acceptable_aligned(struct searchinfo_s * si,
-                              struct hit * hit);
+auto search_acceptable_aligned(struct searchinfo_s * si,
+                               struct hit * hit) -> bool;
 
 void align_trim(struct hit * hit);
 


=====================================
src/searchexact.cc
=====================================
@@ -74,8 +74,10 @@ static fastx_handle query_fasta_h;
 static pthread_mutex_t mutex_input;
 static pthread_mutex_t mutex_output;
 static int qmatches;
+static uint64 qmatches_abundance;
 static int queries;
-static int * dbmatched;
+static uint64 queries_abundance;
+static uint64 * dbmatched;
 static FILE * fp_samout = nullptr;
 static FILE * fp_alnout = nullptr;
 static FILE * fp_userout = nullptr;
@@ -368,7 +370,7 @@ void search_exact_output_results(int hit_count,
     {
       if (hits[i].accepted)
         {
-          dbmatched[hits[i].target]++;
+          dbmatched[hits[i].target] += opt_sizein ? qsize : 1;
         }
     }
 
@@ -493,11 +495,13 @@ void search_exact_thread_run(int64_t t)
 
           /* update stats */
           queries++;
+	  queries_abundance += qsize;
 
           if (match)
             {
               qmatches++;
-            }
+	      qmatches_abundance += qsize;
+	    }
 
           /* show progress */
           progress_update(progress);
@@ -745,8 +749,8 @@ void search_exact_prep(char * cmdline, char * progheader)
   /* tophits = the maximum number of hits we need to store */
   tophits = seqcount;
 
-  dbmatched = (int*) xmalloc(seqcount * sizeof(int*));
-  memset(dbmatched, 0, seqcount * sizeof(int*));
+  dbmatched = (uint64*) xmalloc(seqcount * sizeof(uint64*));
+  memset(dbmatched, 0, seqcount * sizeof(uint64*));
 
   dbhash_open(seqcount);
   dbhash_add_all();
@@ -822,7 +826,9 @@ void search_exact(char * cmdline, char * progheader)
 
   /* prepare reading of queries */
   qmatches = 0;
+  qmatches_abundance = 0;
   queries = 0;
+  queries_abundance = 0;
   query_fasta_h = fasta_open(opt_search_exact);
 
   /* allocate memory for thread info */
@@ -862,22 +868,48 @@ void search_exact(char * cmdline, char * progheader)
 
   if (!opt_quiet)
     {
-      fprintf(stderr, "Matching query sequences: %d of %d", qmatches, queries);
+      fprintf(stderr, "Matching unique query sequences: %d of %d",
+              qmatches, queries);
       if (queries > 0)
         {
           fprintf(stderr, " (%.2f%%)", 100.0 * qmatches / queries);
         }
       fprintf(stderr, "\n");
+      if (opt_sizein)
+	{
+          fprintf(stderr, "Matching total query sequences: %" PRIu64 " of %"
+	          PRIu64,
+                  qmatches_abundance, queries_abundance);
+          if (queries_abundance > 0)
+            {
+              fprintf(stderr, " (%.2f%%)",
+                      100.0 * qmatches_abundance / queries_abundance);
+            }
+          fprintf(stderr, "\n");
+        }
     }
 
   if (opt_log)
     {
-      fprintf(fp_log, "Matching query sequences: %d of %d", qmatches, queries);
+      fprintf(fp_log, "Matching unique query sequences: %d of %d",
+              qmatches, queries);
       if (queries > 0)
         {
           fprintf(fp_log, " (%.2f%%)", 100.0 * qmatches / queries);
         }
       fprintf(fp_log, "\n");
+      if (opt_sizein)
+        {
+          fprintf(fp_log, "Matching total query sequences: %" PRIu64 " of %"
+                  PRIu64,
+                  qmatches_abundance, queries_abundance);
+          if (queries_abundance > 0)
+            {
+              fprintf(fp_log, " (%.2f%%)",
+                      100.0 * qmatches_abundance / queries_abundance);
+            }
+          fprintf(fp_log, "\n");
+        }
     }
 
   if (fp_biomout)


=====================================
src/udb.cc
=====================================
@@ -158,8 +158,8 @@ auto udb_detect_isudb(const char * filename) -> bool
     It must be an uncompressed regular file, not a pipe.
   */
 
-  constexpr uint32_t udb_file_signature {0x55444246};
-  constexpr uint64_t expected_n_bytes {sizeof(uint32_t)};
+  constexpr static uint32_t udb_file_signature {0x55444246};
+  constexpr static uint64_t expected_n_bytes {sizeof(uint32_t)};
 
   xstat_t fs;
 


=====================================
src/vsearch.cc
=====================================
@@ -70,10 +70,11 @@ bool opt_fasta_score;
 bool opt_fastq_allowmergestagger;
 bool opt_fastq_eeout;
 bool opt_fastq_nostagger;
+bool opt_fastq_qout_max;
 bool opt_gzip_decompress;
 bool opt_label_substr_match;
+bool opt_lengthout;
 bool opt_no_progress;
-bool opt_fastq_qout_max;
 bool opt_quiet;
 bool opt_relabel_keep;
 bool opt_relabel_md5;
@@ -81,8 +82,11 @@ bool opt_relabel_self;
 bool opt_relabel_sha1;
 bool opt_samheader;
 bool opt_sff_clip;
+bool opt_sizein;
 bool opt_sizeorder;
+bool opt_sizeout;
 bool opt_xee;
+bool opt_xlength;
 bool opt_xsize;
 char * opt_allpairs_global;
 char * opt_alnout;
@@ -91,6 +95,8 @@ char * opt_blast6out;
 char * opt_borderline;
 char * opt_centroids;
 char * opt_chimeras;
+char * opt_chimeras_alnout;
+char * opt_chimeras_denovo;
 char * opt_cluster_fast;
 char * opt_cluster_size;
 char * opt_cluster_smallmem;
@@ -117,8 +123,8 @@ char * opt_fastaout_rev;
 char * opt_fastapairs;
 char * opt_fastq_chars;
 char * opt_fastq_convert;
-char * opt_fastq_eestats;
 char * opt_fastq_eestats2;
+char * opt_fastq_eestats;
 char * opt_fastq_filter;
 char * opt_fastq_join;
 char * opt_fastq_mergepairs;
@@ -140,11 +146,11 @@ char * opt_fastx_uniques;
 char * opt_join_padgap;
 char * opt_join_padgapq;
 char * opt_label;
-char * opt_labels;
+char * opt_label_field;
 char * opt_label_suffix;
 char * opt_label_word;
 char * opt_label_words;
-char * opt_label_field;
+char * opt_labels;
 char * opt_lcaout;
 char * opt_log;
 char * opt_makeudb_usearch;
@@ -174,16 +180,16 @@ char * opt_sortbylength;
 char * opt_sortbysize;
 char * opt_tabbedout;
 char * opt_tsegout;
-char * opt_udb2fasta;
-char * opt_udbinfo;
-char * opt_udbstats;
 char * opt_uc;
-char * opt_uchime_denovo;
 char * opt_uchime2_denovo;
 char * opt_uchime3_denovo;
+char * opt_uchime_denovo;
 char * opt_uchime_ref;
 char * opt_uchimealns;
 char * opt_uchimeout;
+char * opt_udb2fasta;
+char * opt_udbinfo;
+char * opt_udbstats;
 char * opt_usearch_global;
 char * opt_userout;
 double * opt_ee_cutoffs_values;
@@ -216,6 +222,9 @@ double opt_weak_id;
 double opt_xn;
 int opt_acceptall;
 int opt_alignwidth;
+int opt_chimeras_length_min;
+int opt_chimeras_parents_max;
+int opt_chimeras_parts;
 int opt_cons_truncate;
 int opt_ee_cutoffs_count;
 int opt_gap_extension_query_interior;
@@ -231,9 +240,9 @@ int opt_gap_open_target_interior;
 int opt_gap_open_target_left;
 int opt_gap_open_target_right;
 int opt_help;
-int opt_length_cutoffs_shortest;
-int opt_length_cutoffs_longest;
 int opt_length_cutoffs_increment;
+int opt_length_cutoffs_longest;
+int opt_length_cutoffs_shortest;
 int opt_mindiffs;
 int opt_slots;
 int opt_uchimeout5;
@@ -293,11 +302,9 @@ int64_t opt_rowlen;
 int64_t opt_sample_size;
 int64_t opt_self;
 int64_t opt_selfid;
-int64_t opt_sizein;
-int64_t opt_sizeout;
 int64_t opt_strand;
-int64_t opt_subseq_start;
 int64_t opt_subseq_end;
+int64_t opt_subseq_start;
 int64_t opt_threads;
 int64_t opt_top_hits_only;
 int64_t opt_topn;
@@ -710,7 +717,7 @@ int64_t args_getlong(char * arg)
 {
   int len = 0;
   int64_t temp = 0;
-  int ret = sscanf(arg, "%" PRId64 "%n", &temp, &len);
+  const int ret = sscanf(arg, "%" PRId64 "%n", &temp, &len);
   if ((ret == 0) || (((unsigned int)(len)) < strlen(arg)))
     {
       fatal("Illegal option argument");
@@ -722,7 +729,7 @@ double args_getdouble(char * arg)
 {
   int len = 0;
   double temp = 0;
-  int ret = sscanf(arg, "%lf%n", &temp, &len);
+  const int ret = sscanf(arg, "%lf%n", &temp, &len);
   if ((ret == 0) || (((unsigned int)(len)) < strlen(arg)))
     {
       fatal("Illegal option argument");
@@ -736,17 +743,21 @@ void args_init(int argc, char **argv)
 
   progname = argv[0];
 
-  opt_abskew = -1.0;
+  opt_abskew = 0.0;
   opt_acceptall = 0;
   opt_alignwidth = 80;
   opt_allpairs_global = nullptr;
   opt_alnout = nullptr;
-  opt_blast6out = nullptr;
   opt_biomout = nullptr;
+  opt_blast6out = nullptr;
   opt_borderline = nullptr;
   opt_bzip2_decompress = false;
   opt_centroids = nullptr;
   opt_chimeras = nullptr;
+  opt_chimeras_denovo = nullptr;
+  opt_chimeras_length_min = 10;
+  opt_chimeras_parents_max = 3;
+  opt_chimeras_parts = 0;
   opt_cluster_fast = nullptr;
   opt_cluster_size = nullptr;
   opt_cluster_smallmem = nullptr;
@@ -775,13 +786,13 @@ void args_init(int argc, char **argv)
   opt_eeout = false;
   opt_eetabbedout = nullptr;
   opt_fasta2fastq = nullptr;
-  opt_fastaout_notmerged_fwd = nullptr;
-  opt_fastaout_notmerged_rev = nullptr;
   opt_fasta_score = false;
   opt_fasta_width = 80;
   opt_fastaout = nullptr;
   opt_fastaout_discarded = nullptr;
   opt_fastaout_discarded_rev = nullptr;
+  opt_fastaout_notmerged_fwd = nullptr;
+  opt_fastaout_notmerged_rev = nullptr;
   opt_fastaout_rev = nullptr;
   opt_fastapairs = nullptr;
   opt_fastq_allowmergestagger = false;
@@ -806,8 +817,6 @@ void args_init(int argc, char **argv)
   opt_fastq_minmergelen = 0;
   opt_fastq_minovlen = 10;
   opt_fastq_nostagger = true;
-  opt_fastqout_notmerged_fwd = nullptr;
-  opt_fastqout_notmerged_rev = nullptr;
   opt_fastq_qmax = 41;
   opt_fastq_qmaxout = 41;
   opt_fastq_qmin = 0;
@@ -824,8 +833,13 @@ void args_init(int argc, char **argv)
   opt_fastqout = nullptr;
   opt_fastqout_discarded = nullptr;
   opt_fastqout_discarded_rev = nullptr;
+  opt_fastqout_notmerged_fwd = nullptr;
+  opt_fastqout_notmerged_rev = nullptr;
   opt_fastqout_rev = nullptr;
   opt_fastx_filter = nullptr;
+  opt_fastx_getseq = nullptr;
+  opt_fastx_getseqs = nullptr;
+  opt_fastx_getsubseq = nullptr;
   opt_fastx_mask = nullptr;
   opt_fastx_revcomp = nullptr;
   opt_fastx_subsample = nullptr;
@@ -842,9 +856,6 @@ void args_init(int argc, char **argv)
   opt_gap_open_target_interior=20;
   opt_gap_open_target_left=2;
   opt_gap_open_target_right=2;
-  opt_fastx_getseq = nullptr;
-  opt_fastx_getseqs = nullptr;
-  opt_fastx_getsubseq = nullptr;
   opt_gzip_decompress = false;
   opt_hardmask = 0;
   opt_help = 0;
@@ -855,18 +866,19 @@ void args_init(int argc, char **argv)
   opt_join_padgap = nullptr;
   opt_join_padgapq = nullptr;
   opt_label = nullptr;
+  opt_label_field = nullptr;
   opt_label_substr_match = false;
   opt_label_suffix = nullptr;
-  opt_labels = nullptr;
-  opt_label_field = nullptr;
   opt_label_word = nullptr;
   opt_label_words = nullptr;
+  opt_labels = nullptr;
+  opt_lca_cutoff = 1.0;
+  opt_lcaout = nullptr;
   opt_leftjust = 0;
   opt_length_cutoffs_increment = 50;
   opt_length_cutoffs_longest = INT_MAX;
   opt_length_cutoffs_shortest = 50;
-  opt_lca_cutoff = 1.0;
-  opt_lcaout = nullptr;
+  opt_lengthout = false;
   opt_log = nullptr;
   opt_makeudb_usearch = nullptr;
   opt_maskfasta = nullptr;
@@ -937,38 +949,38 @@ void args_init(int argc, char **argv)
   opt_search_exact = nullptr;
   opt_self = 0;
   opt_selfid = 0;
-  opt_sff_convert = nullptr;
   opt_sff_clip = false;
+  opt_sff_convert = nullptr;
   opt_shuffle = nullptr;
   opt_sintax = nullptr;
   opt_sintax_cutoff = 0.0;
-  opt_sizein = 0;
+  opt_sizein = false;
   opt_sizeorder = false;
-  opt_sizeout = 0;
+  opt_sizeout = false;
   opt_slots = 0;
   opt_sortbylength = nullptr;
   opt_sortbysize = nullptr;
   opt_strand = 1;
-  opt_subseq_start = 1;
   opt_subseq_end = LONG_MAX;
+  opt_subseq_start = 1;
   opt_tabbedout = nullptr;
   opt_target_cov = 0.0;
   opt_threads = 0;
   opt_top_hits_only = 0;
   opt_topn = LONG_MAX;
   opt_tsegout = nullptr;
-  opt_udb2fasta = nullptr;
-  opt_udbinfo = nullptr;
-  opt_udbstats = nullptr;
   opt_uc = nullptr;
   opt_uc_allhits = 0;
-  opt_uchime_denovo = nullptr;
   opt_uchime2_denovo = nullptr;
   opt_uchime3_denovo = nullptr;
+  opt_uchime_denovo = nullptr;
   opt_uchime_ref = nullptr;
   opt_uchimealns = nullptr;
   opt_uchimeout = nullptr;
   opt_uchimeout5 = 0;
+  opt_udb2fasta = nullptr;
+  opt_udbinfo = nullptr;
+  opt_udbstats = nullptr;
   opt_unoise_alpha = 2.0;
   opt_usearch_global = nullptr;
   opt_userout = nullptr;
@@ -976,9 +988,10 @@ void args_init(int argc, char **argv)
   opt_version = 0;
   opt_weak_id = 10.0;
   opt_wordlength = 0;
+  opt_xee = false;
+  opt_xlength = false;
   opt_xn = 8.0;
   opt_xsize = false;
-  opt_xee = false;
 
   opterr = 1;
 
@@ -996,6 +1009,10 @@ void args_init(int argc, char **argv)
       option_bzip2_decompress,
       option_centroids,
       option_chimeras,
+      option_chimeras_denovo,
+      option_chimeras_length_min,
+      option_chimeras_parents_max,
+      option_chimeras_parts,
       option_cluster_fast,
       option_cluster_size,
       option_cluster_smallmem,
@@ -1103,6 +1120,7 @@ void args_init(int argc, char **argv)
       option_lcaout,
       option_leftjust,
       option_length_cutoffs,
+      option_lengthout,
       option_log,
       option_makeudb_usearch,
       option_maskfasta,
@@ -1217,6 +1235,7 @@ void args_init(int argc, char **argv)
       option_wordlength,
       option_xdrop_nw,
       option_xee,
+      option_xlength,
       option_xn,
       option_xsize
     };
@@ -1235,6 +1254,10 @@ void args_init(int argc, char **argv)
       {"bzip2_decompress",      no_argument,       nullptr, 0 },
       {"centroids",             required_argument, nullptr, 0 },
       {"chimeras",              required_argument, nullptr, 0 },
+      {"chimeras_denovo",       required_argument, nullptr, 0 },
+      {"chimeras_length_min",   required_argument, nullptr, 0 },
+      {"chimeras_parents_max",  required_argument, nullptr, 0 },
+      {"chimeras_parts",        required_argument, nullptr, 0 },
       {"cluster_fast",          required_argument, nullptr, 0 },
       {"cluster_size",          required_argument, nullptr, 0 },
       {"cluster_smallmem",      required_argument, nullptr, 0 },
@@ -1342,6 +1365,7 @@ void args_init(int argc, char **argv)
       {"lcaout",                required_argument, nullptr, 0 },
       {"leftjust",              no_argument,       nullptr, 0 },
       {"length_cutoffs",        required_argument, nullptr, 0 },
+      {"lengthout",             no_argument,       nullptr, 0 },
       {"log",                   required_argument, nullptr, 0 },
       {"makeudb_usearch",       required_argument, nullptr, 0 },
       {"maskfasta",             required_argument, nullptr, 0 },
@@ -1456,9 +1480,10 @@ void args_init(int argc, char **argv)
       {"wordlength",            required_argument, nullptr, 0 },
       {"xdrop_nw",              required_argument, nullptr, 0 },
       {"xee",                   no_argument,       nullptr, 0 },
+      {"xlength",               no_argument,       nullptr, 0 },
       {"xn",                    required_argument, nullptr, 0 },
       {"xsize",                 no_argument,       nullptr, 0 },
-      { nullptr,                      0,                 nullptr, 0 }
+      { nullptr,                0,                 nullptr, 0 }
     };
 
   const int options_count = (sizeof(long_options) / sizeof(struct option)) - 1;
@@ -1619,7 +1644,7 @@ void args_init(int argc, char **argv)
           break;
 
         case option_sizeout:
-          opt_sizeout = 1;
+          opt_sizeout = true;
           break;
 
         case option_derep_fulllength:
@@ -1640,6 +1665,10 @@ void args_init(int argc, char **argv)
 
         case option_topn:
           opt_topn = args_getlong(optarg);
+          if (opt_topn == 0)
+            {
+              fatal("The argument to --topn must be greater than zero");
+            }
           break;
 
         case option_maxseqlength:
@@ -1647,7 +1676,7 @@ void args_init(int argc, char **argv)
           break;
 
         case option_sizein:
-          opt_sizein = 1;
+          opt_sizein = true;
           break;
 
         case option_sortbylength:
@@ -2483,6 +2512,30 @@ void args_init(int argc, char **argv)
           opt_derep_smallmem = optarg;
           break;
 
+        case option_lengthout:
+          opt_lengthout = true;
+          break;
+
+        case option_xlength:
+          opt_xlength = true;
+          break;
+
+        case option_chimeras_denovo:
+          opt_chimeras_denovo = optarg;
+          break;
+
+        case option_chimeras_length_min:
+          opt_chimeras_length_min = args_getlong(optarg);
+          break;
+
+        case option_chimeras_parts:
+          opt_chimeras_parts = args_getlong(optarg);
+          break;
+
+        case option_chimeras_parents_max:
+          opt_chimeras_parents_max = args_getlong(optarg);
+          break;
+
         default:
           fatal("Internal error in option parsing");
         }
@@ -2505,6 +2558,7 @@ void args_init(int argc, char **argv)
   int command_options[] =
     {
       option_allpairs_global,
+      option_chimeras_denovo,
       option_cluster_fast,
       option_cluster_size,
       option_cluster_smallmem,
@@ -2562,7 +2616,7 @@ void args_init(int argc, char **argv)
     The first line is the command and the lines below are the valid options.
   */
 
-  const int valid_options[][96] =
+  const int valid_options[][98] =
     {
       {
         option_allpairs_global,
@@ -2585,6 +2639,7 @@ void args_init(int argc, char **argv)
         option_idsuffix,
         option_label_suffix,
         option_leftjust,
+        option_lengthout,
         option_log,
         option_match,
         option_matched,
@@ -2645,6 +2700,43 @@ void args_init(int argc, char **argv)
         option_wordlength,
         option_xdrop_nw,
         option_xee,
+        option_xlength,
+        option_xsize,
+        -1 },
+
+      { option_chimeras_denovo,
+        option_abskew,
+        option_alignwidth,
+        option_alnout,
+        option_chimeras,
+        option_chimeras_length_min,
+        option_chimeras_parents_max,
+        option_chimeras_parts,
+        option_fasta_width,
+        option_gapext,
+        option_gapopen,
+        option_hardmask,
+        option_label_suffix,
+        option_log,
+        option_match,
+        option_mismatch,
+        option_no_progress,
+        option_nonchimeras,
+        option_notrunclabels,
+        option_qmask,
+        option_quiet,
+        option_relabel,
+        option_relabel_keep,
+        option_relabel_md5,
+        option_relabel_self,
+        option_relabel_sha1,
+        option_sample,
+        option_sizein,
+        option_sizeout,
+        option_tabbedout,
+        option_threads,
+        option_xee,
+        option_xn,
         option_xsize,
         -1 },
 
@@ -2674,6 +2766,7 @@ void args_init(int argc, char **argv)
         option_idsuffix,
         option_label_suffix,
         option_leftjust,
+        option_lengthout,
         option_log,
         option_match,
         option_matched,
@@ -2740,6 +2833,7 @@ void args_init(int argc, char **argv)
         option_wordlength,
         option_xdrop_nw,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -2769,6 +2863,7 @@ void args_init(int argc, char **argv)
         option_idsuffix,
         option_label_suffix,
         option_leftjust,
+        option_lengthout,
         option_log,
         option_match,
         option_matched,
@@ -2835,6 +2930,7 @@ void args_init(int argc, char **argv)
         option_wordlength,
         option_xdrop_nw,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -2864,6 +2960,7 @@ void args_init(int argc, char **argv)
         option_idsuffix,
         option_label_suffix,
         option_leftjust,
+        option_lengthout,
         option_log,
         option_match,
         option_matched,
@@ -2931,6 +3028,7 @@ void args_init(int argc, char **argv)
         option_wordlength,
         option_xdrop_nw,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -2960,6 +3058,7 @@ void args_init(int argc, char **argv)
         option_idsuffix,
         option_label_suffix,
         option_leftjust,
+        option_lengthout,
         option_log,
         option_match,
         option_matched,
@@ -3028,6 +3127,7 @@ void args_init(int argc, char **argv)
         option_wordlength,
         option_xdrop_nw,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3041,6 +3141,7 @@ void args_init(int argc, char **argv)
         option_fastaout_rev,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_notrunclabels,
@@ -3054,6 +3155,7 @@ void args_init(int argc, char **argv)
         option_sizein,
         option_sizeout,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3061,6 +3163,7 @@ void args_init(int argc, char **argv)
         option_bzip2_decompress,
         option_fasta_width,
         option_gzip_decompress,
+        option_lengthout,
         option_log,
         option_maxseqlength,
         option_maxuniquesize,
@@ -3083,6 +3186,7 @@ void args_init(int argc, char **argv)
         option_topn,
         option_uc,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3091,6 +3195,7 @@ void args_init(int argc, char **argv)
         option_fasta_width,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxseqlength,
         option_maxuniquesize,
@@ -3113,6 +3218,7 @@ void args_init(int argc, char **argv)
         option_topn,
         option_uc,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3121,6 +3227,7 @@ void args_init(int argc, char **argv)
         option_fasta_width,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxseqlength,
         option_maxuniquesize,
@@ -3143,6 +3250,7 @@ void args_init(int argc, char **argv)
         option_topn,
         option_uc,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3155,6 +3263,7 @@ void args_init(int argc, char **argv)
         option_fastq_qmin,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxseqlength,
         option_maxuniquesize,
@@ -3174,6 +3283,7 @@ void args_init(int argc, char **argv)
         option_strand,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3184,6 +3294,7 @@ void args_init(int argc, char **argv)
         option_fastqout,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_quiet,
@@ -3197,6 +3308,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3221,6 +3333,7 @@ void args_init(int argc, char **argv)
         option_fastqout,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_quiet,
@@ -3234,6 +3347,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3294,6 +3408,7 @@ void args_init(int argc, char **argv)
         option_fastqout_rev,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxsize,
         option_minsize,
@@ -3310,6 +3425,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3325,6 +3441,7 @@ void args_init(int argc, char **argv)
         option_join_padgap,
         option_join_padgapq,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_quiet,
@@ -3338,6 +3455,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3372,6 +3490,7 @@ void args_init(int argc, char **argv)
         option_fastqout_notmerged_rev,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_quiet,
@@ -3386,6 +3505,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3431,6 +3551,7 @@ void args_init(int argc, char **argv)
         option_fastqout_rev,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxsize,
         option_minsize,
@@ -3448,6 +3569,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3463,6 +3585,7 @@ void args_init(int argc, char **argv)
         option_label,
         option_label_substr_match,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_notmatched,
@@ -3479,6 +3602,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3498,6 +3622,7 @@ void args_init(int argc, char **argv)
         option_label_word,
         option_label_words,
         option_labels,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_notmatched,
@@ -3514,6 +3639,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3529,6 +3655,7 @@ void args_init(int argc, char **argv)
         option_label,
         option_label_substr_match,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_notmatched,
@@ -3547,6 +3674,7 @@ void args_init(int argc, char **argv)
         option_subseq_start,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3561,6 +3689,7 @@ void args_init(int argc, char **argv)
         option_gzip_decompress,
         option_hardmask,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_max_unmasked_pct,
         option_min_unmasked_pct,
@@ -3578,6 +3707,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3591,6 +3721,7 @@ void args_init(int argc, char **argv)
         option_fastqout,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_notrunclabels,
@@ -3605,6 +3736,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3620,6 +3752,7 @@ void args_init(int argc, char **argv)
         option_fastqout_discarded,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_notrunclabels,
@@ -3637,6 +3770,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3654,6 +3788,7 @@ void args_init(int argc, char **argv)
         option_fastqout,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxseqlength,
         option_maxuniquesize,
@@ -3676,6 +3811,7 @@ void args_init(int argc, char **argv)
         option_topn,
         option_uc,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3697,6 +3833,7 @@ void args_init(int argc, char **argv)
         option_gzip_decompress,
         option_hardmask,
         option_log,
+        option_maxseqlength,
         option_minseqlength,
         option_no_progress,
         option_notrunclabels,
@@ -3712,6 +3849,7 @@ void args_init(int argc, char **argv)
         option_gzip_decompress,
         option_hardmask,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_max_unmasked_pct,
         option_maxseqlength,
@@ -3732,6 +3870,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3744,6 +3883,7 @@ void args_init(int argc, char **argv)
         option_fastqout,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_notmatched,
@@ -3762,6 +3902,7 @@ void args_init(int argc, char **argv)
         option_threads,
         option_wordlength,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3770,6 +3911,7 @@ void args_init(int argc, char **argv)
         option_fasta_width,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_notrunclabels,
@@ -3785,6 +3927,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3804,6 +3947,7 @@ void args_init(int argc, char **argv)
         option_label_suffix,
         option_lca_cutoff,
         option_lcaout,
+        option_lengthout,
         option_log,
         option_match,
         option_matched,
@@ -3850,6 +3994,7 @@ void args_init(int argc, char **argv)
         option_userfields,
         option_userout,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3859,6 +4004,7 @@ void args_init(int argc, char **argv)
         option_fastq_qminout,
         option_fastqout,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_quiet,
@@ -3881,6 +4027,7 @@ void args_init(int argc, char **argv)
         option_fastq_qmin,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxseqlength,
         option_minseqlength,
@@ -3900,6 +4047,7 @@ void args_init(int argc, char **argv)
         option_threads,
         option_topn,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3934,6 +4082,7 @@ void args_init(int argc, char **argv)
         option_fastq_qmin,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxseqlength,
         option_minseqlength,
@@ -3952,6 +4101,7 @@ void args_init(int argc, char **argv)
         option_threads,
         option_topn,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3963,6 +4113,7 @@ void args_init(int argc, char **argv)
         option_fastq_qmin,
         option_gzip_decompress,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_maxseqlength,
         option_maxsize,
@@ -3983,6 +4134,7 @@ void args_init(int argc, char **argv)
         option_threads,
         option_topn,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -3998,6 +4150,7 @@ void args_init(int argc, char **argv)
         option_gapopen,
         option_hardmask,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_match,
         option_mindiffs,
@@ -4022,6 +4175,7 @@ void args_init(int argc, char **argv)
         option_uchimeout,
         option_uchimeout5,
         option_xee,
+        option_xlength,
         option_xn,
         option_xsize,
         -1 },
@@ -4038,6 +4192,7 @@ void args_init(int argc, char **argv)
         option_gapopen,
         option_hardmask,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_match,
         option_mindiffs,
@@ -4062,6 +4217,7 @@ void args_init(int argc, char **argv)
         option_uchimeout,
         option_uchimeout5,
         option_xee,
+        option_xlength,
         option_xn,
         option_xsize,
         -1 },
@@ -4078,6 +4234,7 @@ void args_init(int argc, char **argv)
         option_gapopen,
         option_hardmask,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_match,
         option_mindiffs,
@@ -4102,6 +4259,7 @@ void args_init(int argc, char **argv)
         option_uchimeout,
         option_uchimeout5,
         option_xee,
+        option_xlength,
         option_xn,
         option_xsize,
         -1 },
@@ -4120,6 +4278,7 @@ void args_init(int argc, char **argv)
         option_gapopen,
         option_hardmask,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_match,
         option_mindiffs,
@@ -4147,6 +4306,7 @@ void args_init(int argc, char **argv)
         option_uchimeout,
         option_uchimeout5,
         option_xee,
+        option_xlength,
         option_xn,
         option_xsize,
         -1 },
@@ -4154,6 +4314,7 @@ void args_init(int argc, char **argv)
       { option_udb2fasta,
         option_fasta_width,
         option_label_suffix,
+        option_lengthout,
         option_log,
         option_no_progress,
         option_output,
@@ -4168,6 +4329,7 @@ void args_init(int argc, char **argv)
         option_sizeout,
         option_threads,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -4210,6 +4372,7 @@ void args_init(int argc, char **argv)
         option_lca_cutoff,
         option_lcaout,
         option_leftjust,
+        option_lengthout,
         option_log,
         option_match,
         option_matched,
@@ -4274,6 +4437,7 @@ void args_init(int argc, char **argv)
         option_wordlength,
         option_xdrop_nw,
         option_xee,
+        option_xlength,
         option_xsize,
         -1 },
 
@@ -4313,8 +4477,7 @@ void args_init(int argc, char **argv)
     {
       /* check if any options are specified */
       bool any_options = false;
-      for (bool i
-             : options_selected)
+      for (bool i: options_selected)
         {
           if (i)
             {
@@ -4405,6 +4568,10 @@ void args_init(int argc, char **argv)
         }
       opt_threads = 1;
     }
+  if (opt_sintax && opt_randseed && (opt_threads > 1))
+    {
+      fprintf(stderr, "WARNING: Using the --sintax command with the --randseed option may not work as intended with multiple threads. Use a single thread (--threads 1) to ensure reproducible results.\n");
+    }
 
   if (opt_cluster_unoise)
     {
@@ -4603,6 +4770,28 @@ void args_init(int argc, char **argv)
       fatal("The argument to maxhits cannot be negative");
     }
 
+  if (opt_chimeras_length_min < 1)
+    {
+      fatal("The argument to chimeras_length_min must be at least 1");
+    }
+
+  if ((opt_chimeras_parents_max < 2) || (opt_chimeras_parents_max > 4))
+    {
+      fatal("The argument to chimeras_parents_max must be in the range 2 to 4");
+    }
+
+  if (options_selected[option_chimeras_parts] &&
+      ((opt_chimeras_parts < 2) || (opt_chimeras_parts > 100)))
+    {
+      fatal("The argument to chimeras_parts must be in the range 2 to 100");
+    }
+
+  if (opt_chimeras_denovo)
+    {
+      if (! options_selected[option_alignwidth])
+        opt_alignwidth = 60;
+    }
+
 
   /* TODO: check valid range of gap penalties */
 
@@ -4657,9 +4846,13 @@ void args_init(int argc, char **argv)
     }
 
   /* set default opt_abskew depending on command */
-  if (opt_abskew < 0.0)
+  if (! options_selected[option_abskew])
     {
-      if (opt_uchime3_denovo)
+      if (opt_chimeras_denovo)
+        {
+          opt_abskew = 1.0;
+        }
+      else if (opt_uchime3_denovo)
         {
           opt_abskew = 16.0;
         }
@@ -4787,7 +4980,29 @@ void cmd_help()
               "  --threads INT               number of threads to use, zero for all cores (0)\n"
               "  --version | -v              display version information\n"
               "\n"
-              "Chimera detection\n"
+              "Chimera detection with new algorithm\n"
+              "  --chimeras_denovo FILENAME  detect chimeras de novo in long exact sequences\n"
+              " Parameters\n"
+              "  --abskew REAL               minimum abundance ratio (1.0)\n"
+              "  --chimeras_length_min       minimum length of each chimeric region (10)\n"
+              "  --chimeras_parents_max      maximum number of parent sequences (3)\n"
+              "  --chimeras_parts            number of parts to divide sequences (length/100)\n"
+              "  --sizein                    propagate abundance annotation from input\n"
+              " Output\n"
+              "  --alignwidth INT            width of alignments in alignment output file (60)\n"
+              "  --alnout FILENAME           output chimera alignments to file\n"
+              "  --chimeras FILENAME         output chimeric sequences to file\n"
+              "  --nonchimeras FILENAME      output non-chimeric sequences to file\n"
+              "  --relabel STRING            relabel nonchimeras with this prefix string\n"
+              "  --relabel_keep              keep the old label after the new when relabelling\n"
+              "  --relabel_md5               relabel with md5 digest of normalized sequence\n"
+              "  --relabel_self              relabel with the sequence itself as label\n"
+              "  --relabel_sha1              relabel with sha1 digest of normalized sequence\n"
+              "  --sizeout                   include abundance information when relabelling\n"
+              "  --tabbedout FILENAME        output chimera info to tab-separated file\n"
+              "  --xsize                     strip abundance information in output\n"
+              "\n"
+              "Chimera detection with UCHIME algorithms\n"
               "  --uchime_denovo FILENAME    detect chimeras de novo\n"
               "  --uchime2_denovo FILENAME   detect chimeras de novo in denoised amplicons\n"
               "  --uchime3_denovo FILENAME   detect chimeras de novo in denoised amplicons\n"
@@ -4808,7 +5023,7 @@ void cmd_help()
               "  --alignwidth INT            width of alignment in uchimealn output (80)\n"
               "  --borderline FILENAME       output borderline chimeric sequences to file\n"
               "  --chimeras FILENAME         output chimeric sequences to file\n"
-              "  --fasta_score               include chimera score in fasta output\n"
+              "  --fasta_score               include chimera score in FASTA output\n"
               "  --nonchimeras FILENAME      output non-chimeric sequences to file\n"
               "  --relabel STRING            relabel nonchimeras with this prefix string\n"
               "  --relabel_keep              keep the old label after the new when relabelling\n"
@@ -5225,10 +5440,10 @@ void cmd_allpairs_global()
 {
   /* check options */
 
-  if ((!opt_alnout) && (!opt_userout) &&
-      (!opt_uc) && (!opt_blast6out) &&
-      (!opt_matched) && (!opt_notmatched) &&
-      (!opt_samout) && (!opt_fastapairs))
+  if ((! opt_alnout) && (! opt_userout) &&
+      (! opt_uc) && (! opt_blast6out) &&
+      (! opt_matched) && (! opt_notmatched) &&
+      (! opt_samout) && (! opt_fastapairs))
     {
       fatal("No output files specified");
     }
@@ -5245,18 +5460,18 @@ void cmd_usearch_global()
 {
   /* check options */
 
-  if ((!opt_alnout) && (!opt_userout) &&
-      (!opt_uc) && (!opt_blast6out) &&
-      (!opt_matched) && (!opt_notmatched) &&
-      (!opt_dbmatched) && (!opt_dbnotmatched) &&
-      (!opt_samout) && (!opt_otutabout) &&
-      (!opt_biomout) && (!opt_mothur_shared_out) &&
-      (!opt_fastapairs) && (!opt_lcaout))
+  if ((! opt_alnout) && (! opt_userout) &&
+      (! opt_uc) && (! opt_blast6out) &&
+      (! opt_matched) && (! opt_notmatched) &&
+      (! opt_dbmatched) && (! opt_dbnotmatched) &&
+      (! opt_samout) && (! opt_otutabout) &&
+      (! opt_biomout) && (! opt_mothur_shared_out) &&
+      (! opt_fastapairs) && (! opt_lcaout))
     {
       fatal("No output files specified");
     }
 
-  if (!opt_db)
+  if (! opt_db)
     {
       fatal("Database filename not specified with --db");
     }
@@ -5273,18 +5488,18 @@ void cmd_search_exact()
 {
   /* check options */
 
-  if ((!opt_alnout) && (!opt_userout) &&
-      (!opt_uc) && (!opt_blast6out) &&
-      (!opt_matched) && (!opt_notmatched) &&
-      (!opt_dbmatched) && (!opt_dbnotmatched) &&
-      (!opt_samout) && (!opt_otutabout) &&
-      (!opt_biomout) && (!opt_mothur_shared_out) &&
-      (!opt_fastapairs) && (!opt_lcaout))
+  if ((! opt_alnout) && (! opt_userout) &&
+      (! opt_uc) && (! opt_blast6out) &&
+      (! opt_matched) && (! opt_notmatched) &&
+      (! opt_dbmatched) && (! opt_dbnotmatched) &&
+      (! opt_samout) && (! opt_otutabout) &&
+      (! opt_biomout) && (! opt_mothur_shared_out) &&
+      (! opt_fastapairs) && (! opt_lcaout))
     {
       fatal("No output files specified");
     }
 
-  if (!opt_db)
+  if (! opt_db)
     {
       fatal("Database filename not specified with --db");
     }
@@ -5294,7 +5509,7 @@ void cmd_search_exact()
 
 void cmd_subsample()
 {
-  if ((!opt_fastaout) && (!opt_fastqout))
+  if ((! opt_fastaout) && (! opt_fastqout))
     {
       fatal("Specify output files for subsampling with --fastaout and/or --fastqout");
     }
@@ -5355,19 +5570,19 @@ void cmd_none()
 
 void cmd_cluster()
 {
-  if ((!opt_alnout) && (!opt_userout) &&
-      (!opt_uc) && (!opt_blast6out) &&
-      (!opt_matched) && (!opt_notmatched) &&
-      (!opt_centroids) && (!opt_clusters) &&
-      (!opt_consout) && (!opt_msaout) &&
-      (!opt_samout) && (!opt_profile) &&
-      (!opt_otutabout) && (!opt_biomout) &&
-      (!opt_mothur_shared_out))
+  if ((! opt_alnout) && (! opt_userout) &&
+      (! opt_uc) && (! opt_blast6out) &&
+      (! opt_matched) && (! opt_notmatched) &&
+      (! opt_centroids) && (! opt_clusters) &&
+      (! opt_consout) && (! opt_msaout) &&
+      (! opt_samout) && (! opt_profile) &&
+      (! opt_otutabout) && (! opt_biomout) &&
+      (! opt_mothur_shared_out))
     {
       fatal("No output files specified");
     }
 
-  if (!opt_cluster_unoise)
+  if (! opt_cluster_unoise)
     {
       if ((opt_id < 0.0) || (opt_id > 1.0))
         {
@@ -5393,10 +5608,10 @@ void cmd_cluster()
     }
 }
 
-void cmd_uchime()
+void cmd_chimera()
 {
-  if ((!opt_chimeras)  && (!opt_nonchimeras) &&
-      (!opt_uchimeout) && (!opt_uchimealns))
+  if ((! opt_chimeras)  && (! opt_nonchimeras) &&
+      (! opt_uchimeout) && (! opt_uchimealns))
     {
       fatal("No output files specified");
     }
@@ -5416,7 +5631,7 @@ void cmd_uchime()
       fatal("Argument to --dn must be > 0");
     }
 
-  if ((!opt_uchime2_denovo) && (!opt_uchime3_denovo))
+  if ((! opt_uchime2_denovo) && (! opt_uchime3_denovo))
     {
       if (opt_mindiffs <= 0)
         {
@@ -5434,27 +5649,22 @@ void cmd_uchime()
         }
     }
 
-#if 0
-  if (opt_abskew <= 1.0)
-    fatal("Argument to --abskew must be > 1");
-#endif
-
   chimera();
 }
 
 void cmd_fastq_mergepairs()
 {
-  if (!opt_reverse)
+  if (! opt_reverse)
     {
       fatal("No reverse reads file specified with --reverse");
     }
-  if ((!opt_fastqout) &&
-      (!opt_fastaout) &&
-      (!opt_fastqout_notmerged_fwd) &&
-      (!opt_fastqout_notmerged_rev) &&
-      (!opt_fastaout_notmerged_fwd) &&
-      (!opt_fastaout_notmerged_rev) &&
-      (!opt_eetabbedout))
+  if ((! opt_fastqout) &&
+      (! opt_fastaout) &&
+      (! opt_fastqout_notmerged_fwd) &&
+      (! opt_fastqout_notmerged_rev) &&
+      (! opt_fastaout_notmerged_fwd) &&
+      (! opt_fastaout_notmerged_rev) &&
+      (! opt_eetabbedout))
     {
       fatal("No output files specified");
     }
@@ -5464,7 +5674,7 @@ void cmd_fastq_mergepairs()
 
 void fillheader()
 {
-  constexpr double one_gigabyte {1024.0 * 1024.0 * 1024.0};
+  constexpr static double one_gigabyte {1024 * 1024 * 1024};
   snprintf(progheader, 80,
            "%s v%s_%s, %.1fGB RAM, %ld cores",
            PROG_NAME, PROG_VERSION, PROG_ARCH,
@@ -5476,7 +5686,7 @@ void fillheader()
 void getentirecommandline(int argc, char** argv)
 {
   int len = 0;
-  for (int i=0; i<argc; i++)
+  for (int i = 0; i < argc; i++)
     {
       len += strlen(argv[i]);
     }
@@ -5484,9 +5694,9 @@ void getentirecommandline(int argc, char** argv)
   cmdline = (char*) xmalloc(len+argc);
   cmdline[0] = 0;
 
-  for (int i=0; i<argc; i++)
+  for (int i = 0; i < argc; i++)
     {
-      if (i>0)
+      if (i > 0)
         {
           strcat(cmdline, " ");
         }
@@ -5517,7 +5727,7 @@ int main(int argc, char** argv)
   if (opt_log)
     {
       fp_log = fopen_output(opt_log);
-      if (!fp_log)
+      if (! fp_log)
         {
           fatal("Unable to open log file for writing");
         }
@@ -5596,9 +5806,9 @@ int main(int argc, char** argv)
     {
       cmd_cluster();
     }
-  else if (opt_uchime_denovo || opt_uchime_ref || opt_uchime2_denovo || opt_uchime3_denovo)
+  else if (opt_uchime_denovo || opt_uchime_ref || opt_uchime2_denovo || opt_uchime3_denovo || opt_chimeras_denovo)
     {
-      cmd_uchime();
+      cmd_chimera();
     }
   else if (opt_fastq_chars)
     {
@@ -5734,7 +5944,7 @@ int main(int argc, char** argv)
         }
       else
         {
-          fprintf(fp_log, "Max memory %.1lfGB\n", maxmem/1024.0);
+          fprintf(fp_log, "Max memory %.1lfGB\n", maxmem / 1024.0);
         }
       fclose(fp_log);
     }


=====================================
src/vsearch.h
=====================================
@@ -270,10 +270,11 @@ extern bool opt_fasta_score;
 extern bool opt_fastq_allowmergestagger;
 extern bool opt_fastq_eeout;
 extern bool opt_fastq_nostagger;
+extern bool opt_fastq_qout_max;
 extern bool opt_gzip_decompress;
 extern bool opt_label_substr_match;
+extern bool opt_lengthout;
 extern bool opt_no_progress;
-extern bool opt_fastq_qout_max;
 extern bool opt_quiet;
 extern bool opt_relabel_keep;
 extern bool opt_relabel_md5;
@@ -281,8 +282,11 @@ extern bool opt_relabel_self;
 extern bool opt_relabel_sha1;
 extern bool opt_samheader;
 extern bool opt_sff_clip;
+extern bool opt_sizein;
 extern bool opt_sizeorder;
+extern bool opt_sizeout;
 extern bool opt_xee;
+extern bool opt_xlength;
 extern bool opt_xsize;
 extern char * opt_allpairs_global;
 extern char * opt_alnout;
@@ -291,6 +295,7 @@ extern char * opt_blast6out;
 extern char * opt_borderline;
 extern char * opt_centroids;
 extern char * opt_chimeras;
+extern char * opt_chimeras_denovo;
 extern char * opt_cluster_fast;
 extern char * opt_cluster_size;
 extern char * opt_cluster_smallmem;
@@ -415,6 +420,9 @@ extern double opt_weak_id;
 extern double opt_xn;
 extern int opt_acceptall;
 extern int opt_alignwidth;
+extern int opt_chimeras_length_min;
+extern int opt_chimeras_parents_max;
+extern int opt_chimeras_parts;
 extern int opt_cons_truncate;
 extern int opt_ee_cutoffs_count;
 extern int opt_gap_extension_query_interior;
@@ -492,8 +500,6 @@ extern int64_t opt_rowlen;
 extern int64_t opt_sample_size;
 extern int64_t opt_self;
 extern int64_t opt_selfid;
-extern int64_t opt_sizein;
-extern int64_t opt_sizeout;
 extern int64_t opt_strand;
 extern int64_t opt_subseq_start;
 extern int64_t opt_subseq_end;


=====================================
src/xstring.h
=====================================
@@ -134,7 +134,7 @@ class xstring
         alloc = length + needed + 1;
         string = (char*) xrealloc(string, alloc);
       }
-    sprintf(string + length, "%d", d);
+    snprintf(string + length, needed + 1, "%d", d);
     length += needed;
   }
 



View it on GitLab: https://salsa.debian.org/med-team/vsearch/-/commit/e00e6bba669d33a3b3ddaa7ea897656b50aa9ec2


