[med-svn] [Git][med-team/diamond-aligner][master] 4 commits: New upstream version 2.0.13
Nilesh Patra (@nilesh)
gitlab at salsa.debian.org
Sun Oct 31 09:57:42 GMT 2021
Nilesh Patra pushed to branch master at Debian Med / diamond-aligner
bfb476e4 by Nilesh Patra at 2021-10-31T15:04:29+05:30
New upstream version 2.0.13
- - - - -
58ec570b by Nilesh Patra at 2021-10-31T15:04:33+05:30
Update upstream source from tag 'upstream/2.0.13'
Update to upstream version '2.0.13'
with Debian dir 8d452a2192dc7e4620359fb4a00b4694926a199d
- - - - -
6655e449 by Nilesh Patra at 2021-10-31T09:46:26+00:00
Update manpage
- - - - -
4f9716c6 by Nilesh Patra at 2021-10-31T09:47:02+00:00
Upload to unstable
- - - - -
9 changed files:
- − .travis.yml
- debian/changelog
- + debian/createmanpages
- debian/diamond.1
- src/ChangeLog
- src/align/gapped_score.cpp
- src/basic/basic.cpp
- src/basic/const.h
- src/dp/swipe/banded_3frame_swipe.cpp
.travis.yml deleted
@@ -1,36 +0,0 @@
- - linux
- - osx
-# - xenial
- - trusty
-language: c++
-# apt:
-# packages:
-# - gdb
-# - libclang-common-6.0-dev
- - amd64
- - arm64
- - ppc64le
- - s390x
- - g++
-# - clang
-install: skip
- - mkdir build
- - cd build
- - cmake -DCMAKE_BUILD_TYPE=Release ..
- - make
-# - gdb --ex=r -return-child-result -batch -ex bt --args ./diamond test
-# - lldb --batch --one-line r --one-line-on-crash bt -- ./diamond test
- - ./diamond test
\ No newline at end of file
debian/changelog
@@ -1,3 +1,11 @@
+diamond-aligner (2.0.13-1) unstable; urgency=medium
+ * Team Upload.
+ * New upstream version 2.0.13
+ * Update manpage
+ -- Nilesh Patra <nilesh at debian.org> Sun, 31 Oct 2021 15:04:59 +0530
diamond-aligner (2.0.12-1) unstable; urgency=medium
* Team upload.
debian/createmanpages
@@ -0,0 +1,57 @@
+set -e
+if [ ! -x /usr/bin/help2man ]; then
+ echo "E: Missing /usr/bin/help2man, please install it from the cognate package."
+ exit 1
+if [ ! -n "$NAME" ]; then
+ NAME=`grep "^Description:" debian/control | sed 's/^Description: *//' | head -n1`
+if [ ! -n "$VERSION" ]; then
+ VERSION=`dpkg-parsechangelog | awk '/^Version:/ {print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//' -e 's/[+~]dfsg$//'`
+if [ ! -n "$PROGNAME" ]; then
+ PROGNAME=`grep "^Package:" debian/control | sed 's/^Package: *//' | head -n1`
+echo "NAME: '$NAME'"
+echo "MANDIR: '$MANDIR'"
+mkdir -p $MANDIR
+This manpage was written by $DEBFULLNAME for the Debian distribution and\n \
+can be used for any other usage of the program.\
+# If program name is different from package name or title should be
+# different from package short description change this here
+help2man --no-info --no-discard-stderr --help-option="$HELPOPTION" \
+ --name="$NAME" \
+ --version-string="$VERSION" ${progname} > $MANDIR/${progname}.1
+echo $AUTHOR >> $MANDIR/${progname}.1
+echo "$MANDIR/*.1" > debian/manpages
+cat <<EOT
+Please enhance the help2man output in '$MANDIR/${progname}.1'.
+To inspect it, try 'nroff -man $MANDIR/${progname}.1'.
+If very unhappy, try passing the HELPOPTION as an environment variable.
+The following web page might be helpful in doing so:
+ http://liw.fi/manpages/
debian/diamond.1
@@ -1,287 +1,192 @@
-.TH DIAMOND "1" "January 2017" "diamond 0.8.31" "User Commands"
+.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.48.5.
+.TH DIAMOND "1" "October 2021" "diamond 2.0.13" "User Commands"
diamond \- accelerated BLAST compatible local sequence aligner
-.B diamond
- DIAMOND is a sequence aligner for protein and translated DNA searches
- and functions as a drop-in replacement for the NCBI BLAST software
- tools. It is suitable for protein-protein search as well as DNA-protein
- search on short reads and longer sequences including contigs and
- assemblies, providing a speedup of BLAST ranging up to x20,000.
-Build DIAMOND database from a FASTA file
-Align amino acid query sequences against a protein reference database
-Align DNA query sequences against a protein reference database
-View DIAMOND alignment archive (DAA) formatted file
-Produce help message
-Display version information
-Retrieve sequences from a DIAMOND database file
-.SS General options:
-\fB\-\-threads\fR (\fB\-p\fR)
-number of CPU threads
-\fB\-\-db\fR (\fB\-d\fR)
-database file
-\fB\-\-out\fR (\fB\-o\fR)
-output file
-\fB\-\-outfmt\fR (\fB\-f\fR)
-output format
-0 = BLAST pairwise
-6 = BLAST tabular
-100 = DIAMOND alignment archive (DAA)
-101 = SAM
+diamond v2.0.13.151 (C) Max Planck Society for the Advancement of Science
+Documentation, support and updates available at http://www.diamondsearch.org
+Please cite: http://dx.doi.org/10.1038/s41592\-021\-01101\-x Nature Methods (2021)
+Syntax: diamond COMMAND [OPTIONS]
+makedb Build DIAMOND database from a FASTA file
+blastp Align amino acid query sequences against a protein reference database
+blastx Align DNA query sequences against a protein reference database
+view View DIAMOND alignment archive (DAA) formatted file
+help Produce help message
+version Display version information
+getseq Retrieve sequences from a DIAMOND database file
+dbinfo Print information about a DIAMOND database file
+test Run regression tests
+makeidx Make database index
+General options:
+\fB\-\-threads\fR (\fB\-p\fR) number of CPU threads
+\fB\-\-db\fR (\fB\-d\fR) database file
+\fB\-\-out\fR (\fB\-o\fR) output file
+\fB\-\-outfmt\fR (\fB\-f\fR) output format
+= BLAST pairwise
+= BLAST tabular
+100 = DIAMOND alignment archive (DAA)
+101 = SAM
Value 6 may be followed by a space\-separated list of these keywords:
-qseqid means Query Seq \- id
-qlen means Query sequence length
-sseqid means Subject Seq \- id
+qseqid means Query Seq \- id
+qlen means Query sequence length
+sseqid means Subject Seq \- id
sallseqid means All subject Seq \- id(s), separated by a ';'
slen means Subject sequence length
qstart means Start of alignment in query
qend means End of alignment in query
sstart means Start of alignment in subject
send means End of alignment in subject
qseq means Aligned part of query sequence
+qseq_translated means Aligned part of query sequence (translated)
+full_qseq means Query sequence
+full_qseq_mate means Query sequence of the mate
sseq means Aligned part of subject sequence
+full_sseq means Subject sequence
evalue means Expect value
bitscore means Bit score
score means Raw score
length means Alignment length
pident means Percentage of identical matches
nident means Number of identical matches
mismatch means Number of mismatches
positive means Number of positive \- scoring matches
gapopen means Number of gap openings
gaps means Total number of gaps
ppos means Percentage of positive \- scoring matches
qframe means Query frame
btop means Blast traceback operations(BTOP)
+cigar means CIGAR string
+staxids means unique Subject Taxonomy ID(s), separated by a ';' (in numerical order)
+sscinames means unique Subject Scientific Name(s), separated by a ';'
+sskingdoms means unique Subject Super Kingdom(s), separated by a ';'
+skingdoms means unique Subject Kingdom(s), separated by a ';'
+sphylums means unique Subject Phylum(s), separated by a ';'
stitle means Subject Title
salltitles means All Subject Title(s), separated by a '<>'
qcovhsp means Query Coverage Per HSP
+scovhsp means Subject Coverage Per HSP
qtitle means Query title
+qqual means Query quality values for the aligned part of the query
+full_qqual means Query quality values
+qstrand means Query strand
Default: qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
-\fB\-\-verbose\fR (\fB\-v\fR)
-verbose console output
-enable debug log
-disable console output
-.SS Makedb options:
-input reference file in FASTA format
-.SS Aligner options:
-\fB\-\-query\fR (\fB\-q\fR)
-input query file
-file for unaligned queries
-report unaligned queries (0=no, 1=yes)
-\fB\-\-max\-target\-seqs\fR (\fB\-k\fR)
-maximum number of target sequences to report alignments for
-report alignments within this percentage range of top alignment score (overrides \fB\-\-max\-target\-seqs\fR)
-compression for output files (0=none, 1=gzip)
-\fB\-\-evalue\fR (\fB\-e\fR)
-maximum e\-value to report alignments
-minimum bit score to report alignments (overrides e\-value setting)
-minimum identity% to report an alignment
-minimum query cover% to report an alignment
-minimum subject cover% to report an alignment
-enable sensitive mode (default: fast)
-enable more sensitive mode (default: fast)
-\fB\-\-block\-size\fR (\fB\-b\fR)
-sequence block size in billions of letters (default=2.0)
-\fB\-\-index\-chunks\fR (\fB\-c\fR)
-number of chunks for index processing
-\fB\-\-tmpdir\fR (\fB\-t\fR)
-directory for temporary files
-gap open penalty (default=11 for protein)
-gap extension penalty (default=1 for protein)
-score matrix for protein alignment (default=BLOSUM62)
-file containing custom scoring matrix
-lambda parameter for custom matrix
-K parameter for custom matrix
-enable composition based statistics (0/1=default)
-enable SEG masking of queries (yes/no)
-genetic code to use to translate query (see user manual)
-print full subject titles in output files
-suppress reporting of identical self hits
-.SS Advanced options:
-\fB\-\-min\-orf\fR (\fB\-l\fR)
-ignore translated sequences without an open reading frame of at least this length
-number of standard deviations for ignoring frequent seeds
-minimum number of identities for stage 1 hit
-\fB\-\-window\fR (\fB\-w\fR)
-window size for local hit search
-\fB\-\-xdrop\fR (\fB\-x\fR)
-xdrop for ungapped alignment
-minimum alignment score to continue local extension
-band for hit verification
-minimum score to keep a tentative alignment
-\fB\-\-gapped\-xdrop\fR (\fB\-X\fR)
-xdrop for gapped alignment in bits
-band for dynamic programming computation
-\fB\-\-shapes\fR (\fB\-s\fR)
-number of seed shapes (0 = all available)
-seed shapes
-index mode (0=4x12, 1=16x9)
-trace point fetch size
-include subjects within this range of max\-target\-seqs
-include subjects within this ratio of last hit
-maximum number of HSPs per subject sequence to save for each query
-effective database size (in letters)
-disable auto appending of DAA and DMND file extensions
-number of target sequences to fetch for seed extension
-.SS View options
-\fB\-\-daa\fR (\fB\-a\fR)
-DIAMOND alignment archive (DAA) file
-only show alignments of forward strand
-.SS Getseq options
-Sequence numbers to display.
+\fB\-\-verbose\fR (\fB\-v\fR) verbose console output
+\fB\-\-log\fR enable debug log
+\fB\-\-quiet\fR disable console output
+\fB\-\-header\fR Write header lines to blast tabular format.
+Makedb options:
+\fB\-\-in\fR input reference file in FASTA format
+\fB\-\-taxonmap\fR protein accession to taxid mapping file
+\fB\-\-taxonnodes\fR taxonomy nodes.dmp from NCBI
+\fB\-\-taxonnames\fR taxonomy names.dmp from NCBI
+Aligner options:
+\fB\-\-query\fR (\fB\-q\fR) input query file
+\fB\-\-strand\fR query strands to search (both/minus/plus)
+\fB\-\-un\fR file for unaligned queries
+\fB\-\-al\fR file or aligned queries
+\fB\-\-unfmt\fR format of unaligned query file (fasta/fastq)
+\fB\-\-alfmt\fR format of aligned query file (fasta/fastq)
+\fB\-\-unal\fR report unaligned queries (0=no, 1=yes)
+\fB\-\-max\-target\-seqs\fR (\fB\-k\fR) maximum number of target sequences to report alignments for (default=25)
+\fB\-\-top\fR report alignments within this percentage range of top alignment score (overrides \fB\-\-max\-target\-seqs\fR)
+\fB\-\-max\-hsps\fR maximum number of HSPs per target sequence to report for each query (default=1)
+\fB\-\-range\-culling\fR restrict hit culling to overlapping query ranges
+\fB\-\-compress\fR compression for output files (0=none, 1=gzip, zstd)
+\fB\-\-evalue\fR (\fB\-e\fR) maximum e\-value to report alignments (default=0.001)
+\fB\-\-min\-score\fR minimum bit score to report alignments (overrides e\-value setting)
+\fB\-\-id\fR minimum identity% to report an alignment
+\fB\-\-query\-cover\fR minimum query cover% to report an alignment
+\fB\-\-subject\-cover\fR minimum subject cover% to report an alignment
+\fB\-\-fast\fR enable fast mode
+\fB\-\-mid\-sensitive\fR enable mid\-sensitive mode
+\fB\-\-sensitive\fR enable sensitive mode)
+\fB\-\-more\-sensitive\fR enable more sensitive mode
+\fB\-\-very\-sensitive\fR enable very sensitive mode
+\fB\-\-ultra\-sensitive\fR enable ultra sensitive mode
+\fB\-\-iterate\fR iterated search with increasing sensitivity
+\fB\-\-global\-ranking\fR (\fB\-g\fR) number of targets for global ranking
+\fB\-\-block\-size\fR (\fB\-b\fR) sequence block size in billions of letters (default=2.0)
+\fB\-\-index\-chunks\fR (\fB\-c\fR) number of chunks for index processing (default=4)
+\fB\-\-tmpdir\fR (\fB\-t\fR) directory for temporary files
+\fB\-\-parallel\-tmpdir\fR directory for temporary files used by multiprocessing
+\fB\-\-gapopen\fR gap open penalty
+\fB\-\-gapextend\fR gap extension penalty
+\fB\-\-frameshift\fR (\fB\-F\fR) frame shift penalty (default=disabled)
+\fB\-\-long\-reads\fR short for \fB\-\-range\-culling\fR \fB\-\-top\fR 10 \fB\-F\fR 15
+\fB\-\-matrix\fR score matrix for protein alignment (default=BLOSUM62)
+\fB\-\-custom\-matrix\fR file containing custom scoring matrix
+\fB\-\-comp\-based\-stats\fR composition based statistics mode (0\-4)
+\fB\-\-masking\fR masking algorithm (none, seg, tantan=default)
+\fB\-\-query\-gencode\fR genetic code to use to translate query (see user manual)
+\fB\-\-salltitles\fR include full subject titles in DAA file
+\fB\-\-sallseqid\fR include all subject ids in DAA file
+\fB\-\-no\-self\-hits\fR suppress reporting of identical self hits
+\fB\-\-taxonlist\fR restrict search to list of taxon ids (comma\-separated)
+\fB\-\-taxon\-exclude\fR exclude list of taxon ids (comma\-separated)
+\fB\-\-seqidlist\fR filter the database by list of accessions
+\fB\-\-skip\-missing\-seqids\fR ignore accessions missing in the database
+Advanced options:
+\fB\-\-algo\fR Seed search algorithm (0=double\-indexed/1=query\-indexed/ctg=contiguous\-seed)
+\fB\-\-bin\fR number of query bins for seed search
+\fB\-\-min\-orf\fR (\fB\-l\fR) ignore translated sequences without an open reading frame of at least this length
+\fB\-\-seed\-cut\fR cutoff for seed complexity
+\fB\-\-freq\-masking\fR mask seeds based on frequency
+\fB\-\-freq\-sd\fR number of standard deviations for ignoring frequent seeds
+\fB\-\-motif\-masking\fR softmask abundant motifs (0/1)
+\fB\-\-id2\fR minimum number of identities for stage 1 hit
+\fB\-\-xdrop\fR (\fB\-x\fR) xdrop for ungapped alignment
+\fB\-\-gapped\-filter\-evalue\fR E\-value threshold for gapped filter (auto)
+\fB\-\-band\fR band for dynamic programming computation
+\fB\-\-shapes\fR (\fB\-s\fR) number of seed shapes (default=all available)
+\fB\-\-shape\-mask\fR seed shapes
+\fB\-\-multiprocessing\fR enable distributed\-memory parallel processing
+\fB\-\-mp\-init\fR initialize multiprocessing run
+\fB\-\-mp\-recover\fR enable continuation of interrupted multiprocessing run
+\fB\-\-mp\-query\-chunk\fR process only a single query chunk as specified
+\fB\-\-ext\-chunk\-size\fR chunk size for adaptive ranking (default=auto)
+\fB\-\-no\-ranking\fR disable ranking heuristic
+\fB\-\-ext\fR Extension mode (banded\-fast/banded\-slow/full)
+\fB\-\-culling\-overlap\fR minimum range overlap with higher scoring hit to delete a hit (default=50%)
+\fB\-\-taxon\-k\fR maximum number of targets to report per species
+\fB\-\-range\-cover\fR percentage of query range to be covered for range culling (default=50%)
+\fB\-\-dbsize\fR effective database size (in letters)
+\fB\-\-no\-auto\-append\fR disable auto appending of DAA and DMND file extensions
+\fB\-\-xml\-blord\-format\fR Use gnl|BL_ORD_ID| style format in XML output
+\fB\-\-stop\-match\-score\fR Set the match score of stop codons against each other.
+\fB\-\-tantan\-minMaskProb\fR minimum repeat probability for masking (default=0.9)
+\fB\-\-file\-buffer\-size\fR file buffer size in bytes (default=67108864)
+\fB\-\-memory\-limit\fR (\fB\-M\fR) Memory limit for extension stage in GB
+\fB\-\-no\-unlink\fR Do not unlink temporary files.
+\fB\-\-target\-indexed\fR Enable target\-indexed mode
+\fB\-\-ignore\-warnings\fR Ignore warnings
+View options:
+\fB\-\-daa\fR (\fB\-a\fR) DIAMOND alignment archive (DAA) file
+\fB\-\-forwardonly\fR only show alignments of forward strand
+Getseq options:
+\fB\-\-seq\fR Space\-separated list of sequence numbers to display.
+Online documentation at http://www.diamondsearch.org
-This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
+ This manpage was written by Nilesh Patra for the Debian distribution and
+ can be used for any other usage of the program.
src/ChangeLog
@@ -1,3 +1,6 @@
+- Fixed a bug that caused invalid bit scores in frameshift alignment mode.
- Fixed an error when using HSP filter settings together with a BLAST database.
- Optimized the performance of alignment traceback.
src/align/gapped_score.cpp
@@ -106,7 +106,7 @@ static void add_dp_targets(const WorkTarget& target, int target_idx, const Seque
if (target.hsp[frame].empty())
- int d0 = INT_MAX, d1 = INT_MIN, j0 = INT_MAX, j1 = INT_MIN, bits = 0;
+ int d0 = INT_MAX, d1 = INT_MIN, score = 0;
for (const Hsp_traits &hsp : target.hsp[frame]) {
const int b0 = std::max(hsp.d_min - band, -(slen - 1)),
@@ -115,24 +115,23 @@ static void add_dp_targets(const WorkTarget& target, int target_idx, const Seque
if (overlap / (d1 - d0) > config.min_band_overlap || overlap / (b1 - b0) > config.min_band_overlap) {
d0 = std::min(d0, b0);
d1 = std::max(d1, b1);
- j0 = std::min(j0, hsp.subject_range.begin_);
- j1 = std::max(j1, hsp.subject_range.end_);
- const int64_t dp_size = (int64_t)DpTarget::banded_cols(qlen, slen, d0, d1) * int64_t(d1 - d0);
- bits = std::max(bits, (int)DP::BandedSwipe::bin(hsp_values, d1 - d0, 0, hsp.score, dp_size, score_width, 0));
+ score = std::max(score, hsp.score);
else {
- if (d0 != INT_MAX)
- dp_targets[frame][bits].emplace_back(target.seq, slen, d0, d1, target_idx, qlen, matrix);
+ if (d0 != INT_MAX) {
+ const int64_t dp_size = (int64_t)DpTarget::banded_cols(qlen, slen, d0, d1) * int64_t(d1 - d0);
+ const auto bin = DP::BandedSwipe::bin(hsp_values, d1 - d0, 0, score, dp_size, score_width, 0);
+ dp_targets[frame][bin].emplace_back(target.seq, slen, d0, d1, target_idx, qlen, matrix);
+ }
d0 = b0;
d1 = b1;
- j0 = hsp.subject_range.begin_;
- j1 = hsp.subject_range.end_;
- const int64_t dp_size = (int64_t)DpTarget::banded_cols(qlen, slen, d0, d1) * int64_t(d1 - d0);
- bits = (int)DP::BandedSwipe::bin(hsp_values, d1 - d0, 0, hsp.score, dp_size, score_width, 0);
+ score = hsp.score;
- dp_targets[frame][bits].emplace_back(target.seq, slen, d0, d1, target_idx, qlen, matrix);
+ const int64_t dp_size = (int64_t)DpTarget::banded_cols(qlen, slen, d0, d1) * int64_t(d1 - d0);
+ const auto bin = DP::BandedSwipe::bin(hsp_values, d1 - d0, 0, score, dp_size, score_width, 0);
+ dp_targets[frame][bin].emplace_back(target.seq, slen, d0, d1, target_idx, qlen, matrix);
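
Note on the hunk above: the patch stops computing the SWIPE bin for every HSP while a band is still growing. It tracks only the best HSP score and calls DP::BandedSwipe::bin once, when the merged band is closed and its final extent is known. The following is a minimal sketch of that "merge first, classify at close" pattern, not the upstream code. HspSketch, MergedBand and merge_bands are hypothetical names, the HSPs are assumed to be sorted by d_min, and the overlap test is simplified to "the padded ranges intersect" instead of the min_band_overlap ratio used upstream.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical stand-in for DIAMOND's Hsp_traits; only the fields used here.
struct HspSketch {
    int d_min, d_max; // diagonal range covered by the HSP
    int score;        // raw alignment score
};

// One merged diagonal band together with the best HSP score inside it.
struct MergedBand {
    int d0, d1;
    int score;
};

// Merge HSPs (assumed sorted by d_min) into diagonal bands.
// Simplified assumption: two padded ranges merge whenever they intersect.
std::vector<MergedBand> merge_bands(const std::vector<HspSketch>& hsps, int band) {
    assert(band >= 0);
    std::vector<MergedBand> out;
    int d0 = INT_MAX, d1 = INT_MIN, score = 0;
    for (const HspSketch& h : hsps) {
        const int b0 = h.d_min - band, b1 = h.d_max + band;
        if (d0 != INT_MAX && b0 <= d1) {
            // Extend the open band and remember the best score seen so far.
            d0 = std::min(d0, b0);
            d1 = std::max(d1, b1);
            score = std::max(score, h.score);
        } else {
            // Close the previous band: its final extent and score are now known,
            // so this is the point where a size/score class would be computed.
            if (d0 != INT_MAX)
                out.push_back({d0, d1, score});
            d0 = b0;
            d1 = b1;
            score = h.score;
        }
    }
    // Flush the last open band, mirroring the trailing emplace_back in the patch.
    if (d0 != INT_MAX)
        out.push_back({d0, d1, score});
    return out;
}
```

Each MergedBand carries everything a later classification step needs (width d1 - d0 and the maximum score), so the classification never sees a partially merged band.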
src/basic/basic.cpp
@@ -29,7 +29,7 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
#include "../util/util.h"
#include "../stats/standard_matrix.h"
-const char* Const::version_string = "2.0.12";
+const char* Const::version_string = "2.0.13";
const char* Const::program_name = "diamond";
Align_mode::Align_mode(unsigned mode) :
src/basic/const.h
@@ -25,7 +25,7 @@ struct Const
enum {
- build_version = 150,
+ build_version = 151,
seedp_bits = 0,
src/dp/swipe/banded_3frame_swipe.cpp
@@ -337,6 +337,7 @@ Hsp traceback(Sequence *query, Strand strand, int dna_len, const Banded3FrameSwi
Hsp out(true);
out.swipe_target = target.target_idx;
out.score = ScoreTraits<_sv>::int_score(max_score) * config.cbs_matrix_scale;
+ out.bit_score = score_matrix.bitscore(out.score);
out.evalue = evalue;
out.transcript.reserve(size_t(out.score * config.transcript_len_estimate));
@@ -378,6 +379,7 @@ Hsp traceback(Sequence *query, Strand strand, int dna_len, const Banded3FrameSwi
const int j0 = i1 - (target.d_end - 1);
out.swipe_target = target.target_idx;
out.score = ScoreTraits<_sv>::int_score(max_score) * config.cbs_matrix_scale;
+ out.bit_score = score_matrix.bitscore(out.score);
out.evalue = evalue;
out.query_range.end_ = std::min(i0 + max_col + (int)dp.band() / 3 / 2, (int)query[0].length());
out.query_range.begin_ = std::max(out.query_range.end_ - (j0 + max_col), 0);
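
Note on the two hunks above: both traceback paths now set out.bit_score from the raw score, which matches the ChangeLog entry about invalid bit scores in frameshift alignment mode. As a rough illustration of what such a conversion usually looks like, here is the standard Karlin-Altschul formula, S' = (lambda * S - ln K) / ln 2. This is a sketch under assumptions and not the body of score_matrix.bitscore, which this diff does not show. The KarlinAltschul struct and the bit_score function are hypothetical names, and the parameter values in the usage note are the commonly cited gapped BLOSUM62 constants for gap open 11 and gap extend 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical holder for the statistical parameters of a scoring system.
struct KarlinAltschul {
    double lambda; // scale parameter of the scoring system
    double K;      // search-space correction constant
};

// Convert a raw alignment score into a bit score:
//   S' = (lambda * S - ln K) / ln 2
double bit_score(double raw_score, const KarlinAltschul& p) {
    assert(p.lambda > 0.0 && p.K > 0.0);
    return (p.lambda * raw_score - std::log(p.K)) / std::log(2.0);
}
```

For example, `bit_score(100, KarlinAltschul{0.267, 0.041})` applies the formula with the gapped BLOSUM62 (11/1) parameters. If the field is never assigned, as in the code before this patch, any output column that reads it reports whatever value the struct happened to hold.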
View it on GitLab: https://salsa.debian.org/med-team/diamond-aligner/-/compare/f68bee9e79c1abb3c34db94ee7fc11ba3b13eb74...4f9716c6d5a0e2676905d5a5f244e390c6773187