[med-svn] [Git][med-team/tantan][master] 5 commits: New upstream version 31
Nilesh Patra (@nilesh)
gitlab at salsa.debian.org
Sat May 28 08:34:06 BST 2022
Nilesh Patra pushed to branch master at Debian Med / tantan
Commits:
8b9fdd19 by Nilesh Patra at 2022-05-28T12:52:56+05:30
New upstream version 31
- - - - -
ba34a247 by Nilesh Patra at 2022-05-28T12:52:57+05:30
Update upstream source from tag 'upstream/31'
Update to upstream version '31'
with Debian dir acdb49353d1c255231926e952a0ebc6d3b1c7c75
- - - - -
6d744e71 by Nilesh Patra at 2022-05-28T12:52:58+05:30
Bump Standards-Version to 4.6.1 (no changes needed)
- - - - -
bdd75ef1 by Nilesh Patra at 2022-05-28T13:00:50+05:30
Re-diff patch
- - - - -
c9b5da30 by Nilesh Patra at 2022-05-28T13:00:50+05:30
Upload to unstable
- - - - -
10 changed files:
- + .gitignore
- Makefile
- README.rst
- debian/changelog
- debian/control
- debian/patches/buildflags.patch
- src/Makefile
- + src/mcf_simd.hh
- src/tantan.cc
- test/tantan_test.sh
Changes:
=====================================
.gitignore
=====================================
@@ -0,0 +1,2 @@
+src/version.hh
+bin/tantan
=====================================
Makefile
=====================================
@@ -1,4 +1,4 @@
-CXXFLAGS = -O3
+CXXFLAGS = -msse4 -O3 -g
all:
@cd src && ${MAKE} CXXFLAGS="${CXXFLAGS}"
@@ -7,7 +7,7 @@ exec_prefix = ${prefix}
bindir = ${exec_prefix}/bin
install: all
mkdir -p ${bindir}
- cp src/tantan ${bindir}
+ cp bin/tantan ${bindir}
clean:
@cd src && ${MAKE} clean
=====================================
README.rst
=====================================
@@ -1,41 +1,40 @@
tantan
======
-Introduction
-------------
-
-tantan is a tool for masking simple regions (low complexity and
-short-period tandem repeats) in biological sequences.
-
-The aim of tantan is to prevent false predictions when searching for
-homologous regions between two sequences. Simple repeats often align
-strongly to each other, causing false homology predictions.
+tantan identifies simple regions / low complexity / tandem repeats in
+DNA or protein sequences. Its main aim is to prevent false homology
+predictions between sequences. Simple repeats often align strongly to
+each other, causing false homology predictions.
Setup
-----
-You need to have a C++ compiler. On Linux, you might need to install
-a package called "g++". On Mac, you might need to install
-command-line developer tools. On Windows, you might need to install
-Cygwin.
-
-Using the command line, go into the tantan directory. To compile it,
-type::
+Please download the highest version number from
+https://gitlab.com/mcfrith/tantan/-/tags. Using the command line, go
+into the downloaded directory and type::
make
-Optionally, copy tantan to a standard "bin" directory (using "sudo" to
-request administrator permissions)::
+This assumes you have a C++ compiler. On Linux, you might need to
+install a package called "g++". On Mac, you might need to install
+command-line developer tools. On Windows, you might need to install
+Cygwin.
+
+This puts ``tantan`` in a ``bin`` directory. For convenient usage,
+set up your computer to find it automatically. Some possible ways:
- sudo make install
+* Copy ``tantan`` to a standard directory: ``sudo make install``
+ (using "sudo" to request administrator permissions).
-Or copy it to your personal bin directory::
+* Copy it to your personal bin directory: ``make install prefix=~``
- make install prefix=~
+* Adjust your `PATH variable`_.
You might need to log out and back in for your computer to recognize
the new program.
+**Alternative:** Install tantan from bioconda_.
+
Normal usage
------------
@@ -133,6 +132,8 @@ repeats, so it's easy to lift the masking after determining homology.
Options
-------
+-h, --help just show a help message, with default option values
+--version just show version information
-p interpret the sequences as proteins
-x letter to use for masking, instead of lowercase
-c preserve uppercase/lowercase in non-masked regions
@@ -149,8 +150,6 @@ Options
-n minimum copy number, affects -f4 only
-f output type: 0=masked sequence, 1=repeat probabilities,
2=repeat counts, 3=BED, 4=tandem repeats
--h, --help show help message, then exit
---version show version information, then exit
Advanced issues
---------------
@@ -176,7 +175,7 @@ align it on the other strand::
Finding straightforward tandem repeats
--------------------------------------
-Option -f4 runs tantan in a different mode, where it finds
+Option ``-f4`` runs tantan in a different mode, where it finds
straightforward tandem repeats only. (Technically, it uses a Viterbi
algorithm instead of a Forward-Backward algorithm.) This is *not*
recommended for avoiding false homologs! But it might be useful for
@@ -187,16 +186,14 @@ studying tandem repeats. The output looks like this::
mySeq 1278353 1278369 3 6.5 TCA TCA,TCA,TCA,TC-,TC,TC,T
mySeq 3616084 3616100 3 5.33333 TGG TGA,TGA,TGG,TGG,TGG,T
-The first 3 columns show the start and end coordinates of the
-repetitive region, in `BED
-<https://genome.ucsc.edu/FAQ/FAQformat.html#format1>`_ format. Column
-4 shows the length of the repeating unit (which might vary due to
-insertions and deletions, so this column shows the most common
-length). Column 5 shows the number of repeat units. Column 6 shows
-the repeating unit (which again might vary, so this is just a
-representative). Column 7 shows the whole repeat: lowercase letters
-are insertions relative to the previous repeat unit, and dashes are
-deletions relative to the previous repeat unit.
+The first 3 columns show the start and end coordinates of the repeat,
+in BED_ format. Column 4 shows the length of the repeating unit
+(which might vary due to insertions and deletions, so this column
+shows the most common length). Column 5 shows the number of repeat
+units. Column 6 shows the repeating unit (which again might vary, so
+this is just a representative). Column 7 shows the whole repeat:
+lowercase letters are insertions relative to the previous repeat unit,
+and dashes are deletions relative to the previous repeat unit.
Miscellaneous
-------------
@@ -208,3 +205,7 @@ details, see COPYING.txt.
If you use tantan in your research, please cite:
"A new repeat-masking method enables specific detection of homologous
sequences", MC Frith, Nucleic Acids Research 2011 39(4):e23.
+
+.. _BED: https://genome.ucsc.edu/FAQ/FAQformat.html#format1
+.. _PATH variable: https://en.wikipedia.org/wiki/PATH_(variable)
+.. _bioconda: https://bioconda.github.io/
=====================================
debian/changelog
=====================================
@@ -1,3 +1,11 @@
+tantan (31-1) unstable; urgency=medium
+
+ * Team upload.
+ * New upstream version 31
+ * Bump Standards-Version to 4.6.1 (no changes needed)
+
+ -- Nilesh Patra <nilesh at debian.org> Sat, 28 May 2022 12:53:07 +0530
+
tantan (26-1) unstable; urgency=medium
* Team Upload.
=====================================
debian/control
=====================================
@@ -4,7 +4,7 @@ Uploaders: Sascha Steinbiss <satta at debian.org>
Section: science
Priority: optional
Build-Depends: debhelper-compat (= 13)
-Standards-Version: 4.5.1
+Standards-Version: 4.6.1
Vcs-Browser: https://salsa.debian.org/med-team/tantan
Vcs-Git: https://salsa.debian.org/med-team/tantan.git
Homepage: https://gitlab.com/mcfrith/tantan
=====================================
debian/patches/buildflags.patch
=====================================
@@ -3,26 +3,27 @@ Description: add buildflags
Author: Sascha Steinbiss <sascha at steinbiss.name>
--- a/src/Makefile
+++ b/src/Makefile
-@@ -1,9 +1,12 @@
--CXXFLAGS = -O3 -Wall
-+#CXXFLAGS = -O3 -Wall
+@@ -1,10 +1,13 @@
+-CXXFLAGS = -msse4 -O3 -Wall -g
++#CXXFLAGS = -O3 -Wall -g
- all: tantan
+ all: ../bin/tantan
--tantan: *.cc *.hh version.hh Makefile
-- $(CXX) $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) -o $@ *.cc
+-../bin/tantan: *.cc *.hh version.hh Makefile
+CCSRCS = $(sort $(wildcard *.cc))
+CCHDRS = $(sort $(wildcard *.hh))
+
-+tantan: $(CCSRCS) $(CCHDRS) version.hh Makefile
++../bin/tantan: $(CCSRCS) $(CCHDRS) version.hh Makefile
+ mkdir -p ../bin
+- $(CXX) $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) -o $@ *.cc
+ $(CXX) $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) -o $@ $(CCSRCS)
clean:
- rm -f tantan
+ rm -f ../bin/tantan
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,5 @@
--CXXFLAGS = -O3
+-CXXFLAGS = -msse4 -O3 -g
all:
- @cd src && ${MAKE} CXXFLAGS="${CXXFLAGS}"
+ @cd src && ${MAKE}
=====================================
src/Makefile
=====================================
@@ -1,15 +1,16 @@
-CXXFLAGS = -O3 -Wall
+CXXFLAGS = -msse4 -O3 -Wall -g
-all: tantan
+all: ../bin/tantan
-tantan: *.cc *.hh version.hh Makefile
+../bin/tantan: *.cc *.hh version.hh Makefile
+ mkdir -p ../bin
$(CXX) $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) -o $@ *.cc
clean:
- rm -f tantan
+ rm -f ../bin/tantan
VERSION1 = git describe --dirty
-VERSION2 = echo ' (HEAD -> main, tag: 26) ' | sed -e 's/.*tag: *//' -e 's/[,) ].*//'
+VERSION2 = echo ' (HEAD -> main, tag: 31) ' | sed -e 's/.*tag: *//' -e 's/[,) ].*//'
VERSION = \"`test -e ../.git && $(VERSION1) || $(VERSION2)`\"
=====================================
src/mcf_simd.hh
=====================================
@@ -0,0 +1,538 @@
+// Author: Martin C. Frith 2019
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef MCF_SIMD_HH
+#define MCF_SIMD_HH
+
+#if defined __SSE4_1__
+#include <immintrin.h>
+#elif defined __ARM_NEON
+#include <arm_neon.h>
+#endif
+
+#include <stddef.h> // size_t
+
+namespace mcf {
+
+#if defined __AVX2__
+
+typedef __m256i SimdInt;
+typedef __m256i SimdUint1;
+typedef __m256d SimdDbl;
+
+const int simdBytes = 32;
+
+static inline SimdInt simdZero() {
+ return _mm256_setzero_si256();
+}
+
+static inline SimdInt simdZero1() {
+ return _mm256_setzero_si256();
+}
+
+static inline SimdDbl simdZeroDbl() {
+ return _mm256_setzero_pd();
+}
+
+static inline SimdInt simdOnes1() {
+ return _mm256_set1_epi32(-1);
+}
+
+static inline SimdInt simdLoad(const void *p) {
+ return _mm256_loadu_si256((const SimdInt *)p);
+}
+
+static inline SimdInt simdLoad1(const void *p) {
+ return _mm256_loadu_si256((const SimdInt *)p);
+}
+
+static inline SimdDbl simdLoadDbl(const double *p) {
+ return _mm256_loadu_pd(p);
+}
+
+static inline void simdStore(void *p, SimdInt x) {
+ _mm256_storeu_si256((SimdInt *)p, x);
+}
+
+static inline void simdStore1(void *p, SimdInt x) {
+ _mm256_storeu_si256((SimdInt *)p, x);
+}
+
+static inline void simdStoreDbl(double *p, SimdDbl x) {
+ _mm256_storeu_pd(p, x);
+}
+
+static inline SimdInt simdOr1(SimdInt x, SimdInt y) {
+ return _mm256_or_si256(x, y);
+}
+
+static inline SimdInt simdBlend(SimdInt x, SimdInt y, SimdInt mask) {
+ return _mm256_blendv_epi8(x, y, mask);
+}
+
+const int simdLen = 8;
+const int simdDblLen = 4;
+
+static inline SimdInt simdSet(int i7, int i6, int i5, int i4,
+ int i3, int i2, int i1, int i0) {
+ return _mm256_set_epi32(i7, i6, i5, i4, i3, i2, i1, i0);
+}
+
+static inline SimdInt simdSet1(char jF, char jE, char jD, char jC,
+ char jB, char jA, char j9, char j8,
+ char j7, char j6, char j5, char j4,
+ char j3, char j2, char j1, char j0,
+ char iF, char iE, char iD, char iC,
+ char iB, char iA, char i9, char i8,
+ char i7, char i6, char i5, char i4,
+ char i3, char i2, char i1, char i0) {
+ return _mm256_set_epi8(jF, jE, jD, jC, jB, jA, j9, j8,
+ j7, j6, j5, j4, j3, j2, j1, j0,
+ iF, iE, iD, iC, iB, iA, i9, i8,
+ i7, i6, i5, i4, i3, i2, i1, i0);
+}
+
+static inline SimdDbl simdSetDbl(double i3, double i2, double i1, double i0) {
+ return _mm256_set_pd(i3, i2, i1, i0);
+}
+
+static inline SimdInt simdFill(int x) {
+ return _mm256_set1_epi32(x);
+}
+
+static inline SimdInt simdFill1(char x) {
+ return _mm256_set1_epi8(x);
+}
+
+static inline SimdDbl simdFillDbl(double x) {
+ return _mm256_set1_pd(x);
+}
+
+static inline SimdInt simdGt(SimdInt x, SimdInt y) {
+ return _mm256_cmpgt_epi32(x, y);
+}
+
+static inline SimdInt simdGe1(SimdInt x, SimdInt y) {
+ return _mm256_cmpeq_epi8(_mm256_min_epu8(x, y), y);
+}
+
+static inline SimdInt simdAdd(SimdInt x, SimdInt y) {
+ return _mm256_add_epi32(x, y);
+}
+
+static inline SimdInt simdAdd1(SimdInt x, SimdInt y) {
+ return _mm256_add_epi8(x, y);
+}
+
+static inline SimdInt simdAdds1(SimdInt x, SimdInt y) {
+ return _mm256_adds_epu8(x, y);
+}
+
+static inline SimdDbl simdAddDbl(SimdDbl x, SimdDbl y) {
+ return _mm256_add_pd(x, y);
+}
+
+static inline SimdInt simdSub(SimdInt x, SimdInt y) {
+ return _mm256_sub_epi32(x, y);
+}
+
+static inline SimdInt simdSub1(SimdInt x, SimdInt y) {
+ return _mm256_sub_epi8(x, y);
+}
+
+static inline SimdDbl simdMulDbl(SimdDbl x, SimdDbl y) {
+ return _mm256_mul_pd(x, y);
+}
+
+static inline SimdInt simdQuadruple1(SimdInt x) {
+ return _mm256_slli_epi32(x, 2);
+}
+
+static inline SimdInt simdMax(SimdInt x, SimdInt y) {
+ return _mm256_max_epi32(x, y);
+}
+
+static inline SimdInt simdMin1(SimdInt x, SimdInt y) {
+ return _mm256_min_epu8(x, y);
+}
+
+static inline int simdHorizontalMax(SimdInt x) {
+ __m128i z = _mm256_castsi256_si128(x);
+ z = _mm_max_epi32(z, _mm256_extracti128_si256(x, 1));
+ z = _mm_max_epi32(z, _mm_shuffle_epi32(z, 0x4E));
+ z = _mm_max_epi32(z, _mm_shuffle_epi32(z, 0xB1));
+ return _mm_cvtsi128_si32(z);
+}
+
+static inline int simdHorizontalMin1(SimdInt x) {
+ __m128i z = _mm256_castsi256_si128(x);
+ z = _mm_min_epu8(z, _mm256_extracti128_si256(x, 1));
+ z = _mm_min_epu8(z, _mm_srli_epi16(z, 8));
+ z = _mm_minpos_epu16(z);
+ return _mm_extract_epi16(z, 0);
+}
+
+static inline double simdHorizontalAddDbl(SimdDbl x) {
+ __m128d z = _mm256_castpd256_pd128(x);
+ z = _mm_add_pd(z, _mm256_extractf128_pd(x, 1));
+ return _mm_cvtsd_f64(_mm_hadd_pd(z, z));
+}
+
+static inline SimdInt simdChoose1(SimdInt items, SimdInt choices) {
+ return _mm256_shuffle_epi8(items, choices);
+}
+
+#elif defined __SSE4_1__
+
+typedef __m128i SimdInt;
+typedef __m128i SimdUint1;
+typedef __m128d SimdDbl;
+
+const int simdBytes = 16;
+
+static inline SimdInt simdZero() {
+ return _mm_setzero_si128();
+}
+
+static inline SimdInt simdZero1() {
+ return _mm_setzero_si128();
+}
+
+static inline SimdDbl simdZeroDbl() {
+ return _mm_setzero_pd();
+}
+
+static inline SimdInt simdOnes1() {
+ return _mm_set1_epi32(-1);
+}
+
+static inline SimdInt simdLoad(const void *p) {
+ return _mm_loadu_si128((const SimdInt *)p);
+}
+
+static inline SimdInt simdLoad1(const void *p) {
+ return _mm_loadu_si128((const SimdInt *)p);
+}
+
+static inline SimdDbl simdLoadDbl(const double *p) {
+ return _mm_loadu_pd(p);
+}
+
+static inline void simdStore(void *p, SimdInt x) {
+ _mm_storeu_si128((SimdInt *)p, x);
+}
+
+static inline void simdStore1(void *p, SimdInt x) {
+ _mm_storeu_si128((SimdInt *)p, x);
+}
+
+static inline void simdStoreDbl(double *p, SimdDbl x) {
+ _mm_storeu_pd(p, x);
+}
+
+static inline SimdInt simdOr1(SimdInt x, SimdInt y) {
+ return _mm_or_si128(x, y);
+}
+
+static inline SimdInt simdBlend(SimdInt x, SimdInt y, SimdInt mask) {
+ return _mm_blendv_epi8(x, y, mask); // SSE4.1
+}
+
+const int simdLen = 4;
+const int simdDblLen = 2;
+
+static inline SimdInt simdSet(int i3, int i2, int i1, int i0) {
+ return _mm_set_epi32(i3, i2, i1, i0);
+}
+
+static inline SimdInt simdSet1(char iF, char iE, char iD, char iC,
+ char iB, char iA, char i9, char i8,
+ char i7, char i6, char i5, char i4,
+ char i3, char i2, char i1, char i0) {
+ return _mm_set_epi8(iF, iE, iD, iC, iB, iA, i9, i8,
+ i7, i6, i5, i4, i3, i2, i1, i0);
+}
+
+static inline SimdDbl simdSetDbl(double i1, double i0) {
+ return _mm_set_pd(i1, i0);
+}
+
+static inline SimdInt simdFill(int x) {
+ return _mm_set1_epi32(x);
+}
+
+static inline SimdInt simdFill1(char x) {
+ return _mm_set1_epi8(x);
+}
+
+static inline SimdDbl simdFillDbl(double x) {
+ return _mm_set1_pd(x);
+}
+
+static inline SimdInt simdGt(SimdInt x, SimdInt y) {
+ return _mm_cmpgt_epi32(x, y);
+}
+
+static inline SimdInt simdGe1(SimdInt x, SimdInt y) {
+ return _mm_cmpeq_epi8(_mm_min_epu8(x, y), y);
+}
+
+static inline SimdInt simdAdd(SimdInt x, SimdInt y) {
+ return _mm_add_epi32(x, y);
+}
+
+static inline SimdInt simdAdd1(SimdInt x, SimdInt y) {
+ return _mm_add_epi8(x, y);
+}
+
+static inline SimdInt simdAdds1(SimdInt x, SimdInt y) {
+ return _mm_adds_epu8(x, y);
+}
+
+static inline SimdDbl simdAddDbl(SimdDbl x, SimdDbl y) {
+ return _mm_add_pd(x, y);
+}
+
+static inline SimdInt simdSub(SimdInt x, SimdInt y) {
+ return _mm_sub_epi32(x, y);
+}
+
+static inline SimdInt simdSub1(SimdInt x, SimdInt y) {
+ return _mm_sub_epi8(x, y);
+}
+
+static inline SimdDbl simdMulDbl(SimdDbl x, SimdDbl y) {
+ return _mm_mul_pd(x, y);
+}
+
+static inline SimdInt simdQuadruple1(SimdInt x) {
+ return _mm_slli_epi32(x, 2);
+}
+
+static inline SimdInt simdMax(SimdInt x, SimdInt y) {
+ return _mm_max_epi32(x, y); // SSE4.1
+}
+
+static inline SimdInt simdMin1(SimdInt x, SimdInt y) {
+ return _mm_min_epu8(x, y);
+}
+
+static inline int simdHorizontalMax(SimdInt x) {
+ x = simdMax(x, _mm_shuffle_epi32(x, 0x4E));
+ x = simdMax(x, _mm_shuffle_epi32(x, 0xB1));
+ return _mm_cvtsi128_si32(x);
+}
+
+static inline int simdHorizontalMin1(SimdInt x) {
+ x = _mm_min_epu8(x, _mm_srli_epi16(x, 8));
+ x = _mm_minpos_epu16(x); // SSE4.1
+ return _mm_extract_epi16(x, 0);
+}
+
+static inline double simdHorizontalAddDbl(SimdDbl x) {
+ return _mm_cvtsd_f64(_mm_hadd_pd(x, x));
+}
+
+static inline SimdInt simdChoose1(SimdInt items, SimdInt choices) {
+ return _mm_shuffle_epi8(items, choices); // SSSE3
+}
+
+#elif defined __ARM_NEON
+
+typedef int32x4_t SimdInt;
+typedef uint32x4_t SimdUint;
+typedef uint8x16_t SimdUint1;
+typedef float64x2_t SimdDbl;
+
+const int simdBytes = 16;
+
+static inline SimdInt simdZero() {
+ return vdupq_n_s32(0);
+}
+
+static inline SimdUint1 simdZero1() {
+ return vdupq_n_u8(0);
+}
+
+static inline SimdDbl simdZeroDbl() {
+ return vdupq_n_f64(0);
+}
+
+static inline SimdUint1 simdOnes1() {
+ return vdupq_n_u8(-1);
+}
+
+static inline SimdInt simdLoad(const int *p) {
+ return vld1q_s32(p);
+}
+
+static inline SimdUint1 simdLoad1(const unsigned char *p) {
+ return vld1q_u8(p);
+}
+
+static inline SimdDbl simdLoadDbl(const double *p) {
+ return vld1q_f64(p);
+}
+
+static inline void simdStore(int *p, SimdInt x) {
+ vst1q_s32(p, x);
+}
+
+static inline void simdStore1(unsigned char *p, SimdUint1 x) {
+ vst1q_u8(p, x);
+}
+
+static inline void simdStoreDbl(double *p, SimdDbl x) {
+ vst1q_f64(p, x);
+}
+
+static inline SimdUint1 simdOr1(SimdUint1 x, SimdUint1 y) {
+ return vorrq_u8(x, y);
+}
+
+static inline SimdInt simdBlend(SimdInt x, SimdInt y, SimdUint mask) {
+ return vbslq_s32(mask, y, x);
+}
+
+const int simdLen = 4;
+const int simdDblLen = 2;
+
+static inline SimdInt simdSet(unsigned i3, unsigned i2,
+ unsigned i1, unsigned i0) {
+ size_t lo = i1;
+ size_t hi = i3;
+ return
+ vcombine_s32(vcreate_s32((lo << 32) | i0), vcreate_s32((hi << 32) | i2));
+}
+
+static inline SimdUint1 simdSet1(unsigned char iF, unsigned char iE,
+ unsigned char iD, unsigned char iC,
+ unsigned char iB, unsigned char iA,
+ unsigned char i9, unsigned char i8,
+ unsigned char i7, unsigned char i6,
+ unsigned char i5, unsigned char i4,
+ unsigned char i3, unsigned char i2,
+ unsigned char i1, unsigned char i0) {
+ size_t lo =
+ (size_t)i0 | (size_t)i1 << 8 | (size_t)i2 << 16 | (size_t)i3 << 24 |
+ (size_t)i4 << 32 | (size_t)i5 << 40 | (size_t)i6 << 48 | (size_t)i7 << 56;
+
+ size_t hi =
+ (size_t)i8 | (size_t)i9 << 8 | (size_t)iA << 16 | (size_t)iB << 24 |
+ (size_t)iC << 32 | (size_t)iD << 40 | (size_t)iE << 48 | (size_t)iF << 56;
+
+ return vcombine_u8(vcreate_u8(lo), vcreate_u8(hi));
+}
+
+static inline SimdDbl simdSetDbl(double i1, double i0) {
+ return vcombine_f64(vdup_n_f64(i0), vdup_n_f64(i1));
+}
+
+static inline SimdInt simdFill(int x) {
+ return vdupq_n_s32(x);
+}
+
+static inline SimdUint1 simdFill1(unsigned char x) {
+ return vdupq_n_u8(x);
+}
+
+static inline SimdDbl simdFillDbl(double x) {
+ return vdupq_n_f64(x);
+}
+
+static inline SimdUint simdGt(SimdInt x, SimdInt y) {
+ return vcgtq_s32(x, y);
+}
+
+static inline SimdUint1 simdGe1(SimdUint1 x, SimdUint1 y) {
+ return vcgeq_u8(x, y);
+}
+
+static inline SimdInt simdAdd(SimdInt x, SimdInt y) {
+ return vaddq_s32(x, y);
+}
+
+static inline SimdUint1 simdAdd1(SimdUint1 x, SimdUint1 y) {
+ return vaddq_u8(x, y);
+}
+
+static inline SimdUint1 simdAdds1(SimdUint1 x, SimdUint1 y) {
+ return vqaddq_u8(x, y);
+}
+
+static inline SimdDbl simdAddDbl(SimdDbl x, SimdDbl y) {
+ return vaddq_f64(x, y);
+}
+
+static inline SimdInt simdSub(SimdInt x, SimdInt y) {
+ return vsubq_s32(x, y);
+}
+
+static inline SimdUint1 simdSub1(SimdUint1 x, SimdUint1 y) {
+ return vsubq_u8(x, y);
+}
+
+static inline SimdDbl simdMulDbl(SimdDbl x, SimdDbl y) {
+ return vmulq_f64(x, y);
+}
+
+static inline SimdUint1 simdQuadruple1(SimdUint1 x) {
+ return vshlq_n_u8(x, 2);
+}
+
+static inline SimdInt simdMax(SimdInt x, SimdInt y) {
+ return vmaxq_s32(x, y);
+}
+
+static inline SimdUint1 simdMin1(SimdUint1 x, SimdUint1 y) {
+ return vminq_u8(x, y);
+}
+
+static inline int simdHorizontalMax(SimdInt x) {
+ return vmaxvq_s32(x);
+}
+
+static inline int simdHorizontalMin1(SimdUint1 x) {
+ return vminvq_u8(x);
+}
+
+static inline double simdHorizontalAddDbl(SimdDbl x) {
+ return vaddvq_f64(x);
+}
+
+static inline SimdUint1 simdChoose1(SimdUint1 items, SimdUint1 choices) {
+ return vqtbl1q_u8(items, choices);
+}
+
+#else
+
+typedef int SimdInt;
+typedef double SimdDbl;
+const int simdBytes = 1;
+const int simdLen = 1;
+const int simdDblLen = 1;
+static inline int simdZero() { return 0; }
+static inline double simdZeroDbl() { return 0; }
+static inline int simdSet(int x) { return x; }
+static inline double simdSetDbl(double x) { return x; }
+static inline int simdFill(int x) { return x; }
+static inline int simdLoad(const int *p) { return *p; }
+static inline double simdLoadDbl(const double *p) { return *p; }
+static inline void simdStore(int *p, int x) { *p = x; }
+static inline void simdStoreDbl(double *p, double x) { *p = x; }
+static inline double simdFillDbl(double x) { return x; }
+static inline int simdGt(int x, int y) { return x > y; }
+static inline int simdAdd(int x, int y) { return x + y; }
+static inline double simdAddDbl(double x, double y) { return x + y; }
+static inline int simdSub(int x, int y) { return x - y; }
+static inline double simdMulDbl(double x, double y) { return x * y; }
+static inline int simdMax(int x, int y) { return x > y ? x : y; }
+static inline int simdBlend(int x, int y, int mask) { return mask ? y : x; }
+static inline int simdHorizontalMax(int a) { return a; }
+static inline double simdHorizontalAddDbl(double x) { return x; }
+
+#endif
+
+}
+
+#endif
=====================================
src/tantan.cc
=====================================
@@ -1,6 +1,7 @@
// Copyright 2010 Martin C. Frith
#include "tantan.hh"
+#include "mcf_simd.hh"
#include <algorithm> // fill, max
#include <cassert>
@@ -14,6 +15,8 @@
namespace tantan {
+using namespace mcf;
+
void multiplyAll(std::vector<double> &v, double factor) {
for (std::vector<double>::iterator i = v.begin(); i < v.end(); ++i)
*i *= factor;
@@ -308,15 +311,37 @@ struct Tantan {
}
double b = backgroundProb;
- double fromForeground = 0;
- double *foregroundBeg = BEG(foregroundProbs);
+ const double *b2f = BEG(b2fProbs);
+ double *fp = BEG(foregroundProbs);
const double *lrRow = likelihoodRatioMatrix[*seqPtr];
int maxOffset = maxOffsetInTheSequence();
-
- for (int i = 0; i < maxOffset; ++i) {
- double f = foregroundBeg[i];
+ const uchar *sp = seqPtr;
+
+ SimdDbl bV = simdFillDbl(b);
+ SimdDbl tV = simdFillDbl(f2f0);
+ SimdDbl sV = simdZeroDbl();
+
+ int i = 0;
+ for (; i <= maxOffset - simdDblLen; i += simdDblLen) {
+ SimdDbl rV = simdSetDbl(
+#if defined __SSE4_1__ || defined __ARM_NEON
+#ifdef __AVX2__
+ lrRow[sp[-i-4]],
+ lrRow[sp[-i-3]],
+#endif
+ lrRow[sp[-i-2]],
+#endif
+ lrRow[sp[-i-1]]);
+ SimdDbl fV = simdLoadDbl(fp+i);
+ sV = simdAddDbl(sV, fV);
+ SimdDbl xV = simdMulDbl(bV, simdLoadDbl(b2f+i));
+ simdStoreDbl(fp+i, simdMulDbl(simdAddDbl(xV, simdMulDbl(fV, tV)), rV));
+ }
+ double fromForeground = simdHorizontalAddDbl(sV);
+ for (; i < maxOffset; ++i) {
+ double f = fp[i];
fromForeground += f;
- foregroundBeg[i] = (b * b2fProbs[i] + f * f2f0) * lrRow[seqPtr[-i-1]];
+ fp[i] = (b * b2f[i] + f * f2f0) * lrRow[sp[-i-1]];
}
backgroundProb = b * b2b + fromForeground * f2b;
@@ -330,15 +355,36 @@ struct Tantan {
}
double toBackground = f2b * backgroundProb;
- double toForeground = 0;
- double *foregroundBeg = BEG(foregroundProbs);
+ const double *b2f = BEG(b2fProbs);
+ double *fp = BEG(foregroundProbs);
const double *lrRow = likelihoodRatioMatrix[*seqPtr];
int maxOffset = maxOffsetInTheSequence();
-
- for (int i = 0; i < maxOffset; ++i) {
- double f = foregroundBeg[i] * lrRow[seqPtr[-i-1]];
- toForeground += b2fProbs[i] * f;
- foregroundBeg[i] = toBackground + f2f0 * f;
+ const uchar *sp = seqPtr;
+
+ SimdDbl bV = simdFillDbl(toBackground);
+ SimdDbl tV = simdFillDbl(f2f0);
+ SimdDbl sV = simdZeroDbl();
+
+ int i = 0;
+ for (; i <= maxOffset - simdDblLen; i += simdDblLen) {
+ SimdDbl rV = simdSetDbl(
+#if defined __SSE4_1__ || defined __ARM_NEON
+#ifdef __AVX2__
+ lrRow[sp[-i-4]],
+ lrRow[sp[-i-3]],
+#endif
+ lrRow[sp[-i-2]],
+#endif
+ lrRow[sp[-i-1]]);
+ SimdDbl fV = simdMulDbl(simdLoadDbl(fp+i), rV);
+ sV = simdAddDbl(sV, simdMulDbl(simdLoadDbl(b2f+i), fV));
+ simdStoreDbl(fp+i, simdAddDbl(bV, simdMulDbl(tV, fV)));
+ }
+ double toForeground = simdHorizontalAddDbl(sV);
+ for (; i < maxOffset; ++i) {
+ double f = fp[i] * lrRow[sp[-i-1]];
+ toForeground += b2f[i] * f;
+ fp[i] = toBackground + f2f0 * f;
}
backgroundProb = b2b * backgroundProb + toForeground;
=====================================
test/tantan_test.sh
=====================================
@@ -5,7 +5,7 @@
cd $(dirname $0)
# Make sure we use this version of tantan:
-PATH=../src:$PATH
+PATH=../bin:$PATH
countLowercaseLetters () {
grep -v '^>' "$@" | tr -cd a-z | wc -c | tr -d ' '
View it on GitLab: https://salsa.debian.org/med-team/tantan/-/compare/9bce2420a9e2f8e322dd685bfe73d82337b2d441...c9b5da3096c0e42186405a47d164108a4c976e72