[med-svn] [Git][med-team/andi][upstream] New upstream version 0.14
Fabian Klötzl (@kloetzl-guest)
gitlab at salsa.debian.org
Thu Dec 30 12:22:13 GMT 2021
Fabian Klötzl pushed to branch upstream at Debian Med / andi
Commits:
70820b09 by Fabian Klötzl at 2021-12-30T13:21:32+01:00
New upstream version 0.14
- - - - -
17 changed files:
- .travis.yml
- README.md
- andi-manual.pdf
- configure.ac
- docs/andi.1.in
- docs/manual/andi-manual.tex
- docs/manual/references.bib
- libs/pfasta.c
- libs/pfasta.h
- scripts/_andi
- src/Makefile.am
- src/andi.c
- src/global.h
- src/io.c
- src/model.c
- src/model.h
- test/Makefile.am
Changes:
=====================================
.travis.yml
=====================================
@@ -2,6 +2,9 @@ language: cpp
compiler:
- gcc
- clang
+arch:
+ - amd64
+ - ppc64le
sudo: false
addons:
apt:
=====================================
README.md
=====================================
@@ -15,7 +15,7 @@ For Debian and Ubuntu:
sudo apt-get install andi
-For OS X with Homebrew:
+For macOS with Homebrew:
brew tap brewsci/bio
brew install andi
@@ -68,7 +68,7 @@ The release of this software is accompanied by a paper from [Haubold et al.](htt
## License
-Copyright © 2014 - 2020 Fabian Klötzl
+Copyright © 2014 - 2021 Fabian Klötzl
License GPLv3+: GNU GPL version 3 or later.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. The full license text is available at <http://gnu.org/licenses/gpl.html>.
=====================================
andi-manual.pdf
=====================================
Binary files a/andi-manual.pdf and b/andi-manual.pdf differ
=====================================
configure.ac
=====================================
@@ -1,4 +1,4 @@
-AC_INIT([andi], [0.13])
+AC_INIT([andi], [0.14])
AM_INIT_AUTOMAKE([-Wall foreign])
AC_CONFIG_MACRO_DIR([m4])
=====================================
docs/andi.1.in
=====================================
@@ -23,7 +23,7 @@ Use this mode if each of your \fIFASTA\fR files represents one assembly with num
In multithreaded mode, \fBandi\fR requires memory linear to the amount of threads. The low memory mode changes this to a constant demand independent from the used number of threads. Unfortunately, this comes at a significant runtime cost.
.TP
\fB\-m\fR \fIMODEL\fR, \fB\-\-model\fR=\fIMODEL\fR
-Set the nucleotide evolution model to one of 'Raw', 'JC', or 'Kimura'. By default the Jukes-Cantor correction is used.
+Set the nucleotide evolution model to one of 'Raw', 'JC', 'Kimura', or 'LogDet'. By default the Jukes-Cantor correction is used.
.TP
\fB\-p\fR \fIFLOAT\fR
Significance of an anchor; default: 0.025.
@@ -48,7 +48,7 @@ Prints the synopsis and an explanation of available options.
\fB\-\-version\fR
Outputs version information and acknowledgments.
.SH COPYRIGHT
-Copyright \(co 2014 - 2020 Fabian Klötzl
+Copyright \(co 2014 - 2021 Fabian Klötzl
License GPLv3+: GNU GPL version 3 or later.
.br
This is free software: you are free to change and redistribute it.
=====================================
docs/manual/andi-manual.tex
=====================================
@@ -18,6 +18,7 @@
\usepackage{caption}
\usepackage{subcaption}
\usepackage{xspace}
+\usepackage{microtype}
\bibliographystyle{alpha}
@@ -134,16 +135,16 @@ To build \andi from source, download the latest \href{https://github.com/EvolBio
Once you have downloaded the package, unzip it and change into the newly created directory.
\begin{lstlisting}
-~ % tar -xzvf andi-0.12.tar.gz
-~ % cd andi-0.12
+~ % tar -xzvf andi-0.14.tar.gz
+~ % cd andi-0.14
\end{lstlisting}
\noindent Now build and install \andi.
\begin{lstlisting}
-~/andi-0.12 % ./configure
-~/andi-0.12 % make
-~/andi-0.12 % sudo make install
+~/andi-0.14 % ./configure
+~/andi-0.14 % make
+~/andi-0.14 % sudo make install
\end{lstlisting}
\noindent This installs \andi for all users on your system. If you do not have root privileges, you will find a working copy of \andi in the \lstinline$src$ subdirectory. For the rest of this documentation, it is assumed, that \andi is in your \textdollar\lstinline!PATH!.
@@ -151,15 +152,16 @@ Once you have downloaded the package, unzip it and change into the newly created
Now \andi should be ready for use. Try invoking the help.
\begin{lstlisting}
-~/andi-0.12 % Usage: andi [OPTIONS...] FILES...
- FILES... can be any sequence of FASTA files.
- Use '-' as file name to read from stdin.
+~/andi-0.14 % ~/andi
+Usage: andi [OPTIONS...] FILES...
+ FILES... can be any sequence of FASTA files.
+ Use '-' as file name to read from stdin.
Options:
-b, --bootstrap=INT Print additional bootstrap matrices
--file-of-filenames=FILE Read additional filenames from FILE; one per line
-j, --join Treat all sequences from one file as a single genome
-l, --low-memory Use less memory at the cost of speed
- -m, --model=MODEL Pick an evolutionary model of 'Raw', 'JC', 'Kimura'; default: JC
+ -m, --model=MODEL Pick an evolutionary model of 'Raw', 'JC', 'Kimura', 'LogDet'; default: JC
-p FLOAT Significance of an anchor; default: 0.025
--progress=WHEN Print a progress bar 'always', 'never', or 'auto'; default: auto
-t, --threads=INT Set the number of threads; by default, all processors are used
@@ -276,7 +278,7 @@ S2 0.0995 0.0000
In the above examples the runtime dropped from \SI{0.613}{\second}, to \SI{0.362}{\second} using two threads. Giving \andi more threads than input genomes leads to no further speed improvement. \, The other important option is \lstinline$--join$ (see Section~\ref{sec:join}).
-By default, the distances computed by \andi are \emph{Jukes-Cantor} corrected \cite{jukescantor}. Other evolutionary models are also implemented (Kimura, raw). The \lstinline$--model$ parameter can be used to switch between them.
+By default, the distances computed by \andi are \emph{Jukes-Cantor} corrected \cite{jukescantor}. Other evolutionary models are also implemented (Kimura \cite{kimura}, LogDet \cite{logdet}, raw). The \lstinline$--model$ parameter can be used to switch between them.
Since version 0.9.4 \andi includes a bootstrapping method. It can be activated via the \lstinline$--bootstrap$ or \lstinline$-b$ switch. This option takes a numeric argument representing the number of matrices to create. The output can then be piped into \algo{phylip}. For more information on computing support values from distance matrices see \cite{afra}.
@@ -363,7 +365,7 @@ Our models of genome evolution (JC, Kimura) only work on the four canonical type
\subsection*{Too Short Sequence}
-\andi was designed for big data sets of whole genomes. On short sequences the distance estimates are inaccurate. Use an multiple sequence alignment instead.
+\andi was designed for big data sets of whole genomes. On short sequences the distance estimates are inaccurate. Use a multiple sequence alignment instead.
\subsection*{Too Long Sequence}
=====================================
docs/manual/references.bib
=====================================
@@ -122,5 +122,25 @@
DOI = {10.3390/life6010011}
}
+ at article{logdet,
+ AUTHOR = {Lockhart, P.J. and M.A. Steel and M.D. Hendy and D. Penny},
+ TITLE = {Recovering Evolutionary Trees under a More Realistic Model of Sequence Evolution},
+ JOURNAL = {Molecular Biology and Evolution},
+ VOLUME = {11},
+ YEAR = {1994},
+ NUMBER = {4},
+ PAGES = {605-612},
+ DOI = {10.1093/oxfordjournals.molbev.a040136}
+}
+ at article{kimura,
+ AUTHOR = {Kimura, M.},
+ TITLE = {A Simple Method for Estimating Evolutionary Rate of Base Substitutions Through Comparative Studies of Nucleotide Sequences},
+ JOURNAL = {Journal of Molecular Evolution},
+ VOLUME = {16},
+ YEAR = {1980},
+ NUMBER = {2},
+ PAGES = {111-120},
+ DOI = {10.1007/BF01731581}
+}
=====================================
libs/pfasta.c
=====================================
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2015-2018, Fabian Klötzl <fabian-pfasta at kloetzl.info>
+ * Copyright (c) 2015-2020, Fabian Klötzl <fabian-pfasta at kloetzl.info>
*
* Permission to use, copy, modify, and/or distribute this software for any
* purpose with or without fee is hereby granted, provided that the above
@@ -27,18 +27,24 @@
#include "pfasta.h"
-#define VERSION "v14"
+#define VERSION "v15"
#ifdef __SSE2__
#include <emmintrin.h>
#endif
-#ifdef __STDC_NO_THREADS__
-#define thread_local
-#else
+#if __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_THREADS__)
#include <threads.h>
+#define PFASTA_THREADSAFE 1
+#else
+#define thread_local
+#define PFASTA_THREADSAFE 0
#endif
+int pfasta_threadsafe() {
+ return PFASTA_THREADSAFE ;
+}
+
/** The following is the maximum length of an error string. It has to be
* carefully chosen, so that all calls to PF_FAIL_STR succeed. For instance,
* the line number can account for up to 20 characters.
@@ -159,7 +165,8 @@ cleanup:
}
int buffer_peek(struct pfasta_parser *pp) {
- return LIKELY(pp->read_ptr < pp->fill_ptr) ? *pp->read_ptr : EOF;
+ return LIKELY(pp->read_ptr < pp->fill_ptr) ? *(unsigned char *)pp->read_ptr
+ : EOF;
}
char *buffer_begin(struct pfasta_parser *pp) { return pp->read_ptr; }
@@ -257,7 +264,8 @@ size_t count_newlines(const char *begin, const char *end) {
static int copy_word(struct pfasta_parser *pp, dynstr *target) {
int return_code = 0;
- while (LIKELY(!my_isspace(buffer_peek(pp)))) {
+ int c;
+ while (c = buffer_peek(pp), c != EOF && LIKELY(!my_isspace(c))) {
char *end_of_word = find_first_space(buffer_begin(pp), buffer_end(pp));
size_t word_length = end_of_word - buffer_begin(pp);
@@ -439,7 +447,9 @@ int pfasta_read_sequence(struct pfasta_parser *pp, struct pfasta_record *pr) {
PF_FAIL_STR(pp, "Empty sequence on line %zu.", pp->line_number);
PF_FAIL_BUBBLE_CHECK(pp, check);
- while (LIKELY(isalpha(buffer_peek(pp)))) {
+ // Assume a line begins only with alpha, -, *, or more spaces
+ char c;
+ while (c = buffer_peek(pp), LIKELY(isalpha(c) || c == '-' || c == '*')) {
int check = copy_word(pp, &sequence);
if (UNLIKELY(check == E_EOF)) break;
PF_FAIL_BUBBLE_CHECK(pp, check);
=====================================
libs/pfasta.h
=====================================
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2015-2018, Fabian Klötzl <fabian-pfasta at kloetzl.info>
+ * Copyright (c) 2015-2020, Fabian Klötzl <fabian-pfasta at kloetzl.info>
*
* Permission to use, copy, modify, and/or distribute this software for any
* purpose with or without fee is hereby granted, provided that the above
@@ -83,11 +83,10 @@ void pfasta_free(struct pfasta_parser *pp);
*/
const char *pfasta_version(void);
-#ifdef __STDC_NO_THREADS__
-/** If the preprocessor macro `PFASTA_NO_THREADS` is defined, the parser is not
- * fully thread safe. */
-#define PFASTA_NO_THREADS
-#endif
+/**
+ * Returns 0 iff pfasta is not threadsafe.
+ */
+int pfasta_threadsafe();
#ifdef __cplusplus
}
=====================================
scripts/_andi
=====================================
@@ -25,6 +25,7 @@ args+=(
Raw\:Uncorrected\ distances
JC\:Jukes\-Cantor\ corrected
Kimura\:Kimura\-two\-parameter
+ LogDet\:Logarithmic\ determinant
))'
"($info)-p+[Significance of an anchor; default\: 0.025]:float:"
"($info)--progress=[Show progress bar]:when:(always auto never)"
=====================================
src/Makefile.am
=====================================
@@ -4,7 +4,7 @@ andi_SOURCES = andi.c esa.c process.c sequence.c io.c global.h esa.h process.h s
model.h model.c
andi_CPPFLAGS = $(OPENMP_CFLAGS) -I$(top_srcdir)/libs -I$(top_srcdir)/opt -std=gnu99
andi_CFLAGS = $(OPENMP_CFLAGS) -Wall -Wextra -Wno-missing-field-initializers
-andi_LDADD = $(PSUFSORT) $(top_builddir)/libs/libpfasta.a $(top_builddir)/opt/libcompat.a
+andi_LDADD = $(top_builddir)/libs/libpfasta.a $(top_builddir)/opt/libcompat.a
.PHONY: perf
perf: CFLAGS+= -g -O3 -ggdb -fno-omit-frame-pointer
=====================================
src/andi.c
=====================================
@@ -142,7 +142,7 @@ int main(int argc, char *argv[]) {
if (prop <= 0.0 || prop >= 1.0) {
soft_errx("A probability should be a value between 0 and "
- "1, exlusive; Ignoring -p %f argument.",
+ "1, exclusive; Ignoring -p %f argument.",
prop);
break;
}
@@ -199,17 +199,17 @@ int main(int argc, char *argv[]) {
break;
}
case 'm': {
- // valid options are 'RAW' and 'JC'
if (strcasecmp(optarg, "RAW") == 0) {
MODEL = M_RAW;
} else if (strcasecmp(optarg, "JC") == 0) {
MODEL = M_JC;
} else if (strcasecmp(optarg, "KIMURA") == 0) {
MODEL = M_KIMURA;
+ } else if (strcasecmp(optarg, "LOGDET") == 0) {
+ MODEL = M_LOGDET;
} else {
- soft_errx(
- "Ignoring argument for --model. Expected Raw, JC or "
- "Kimura");
+ soft_errx("Ignoring argument for --model. Expected Raw, "
+ "JC, Kimura or LogDet");
}
break;
}
@@ -340,14 +340,13 @@ void usage(int status) {
"\tUse '-' as file name to read from stdin.\n"
"Options:\n"
" -b, --bootstrap=INT Print additional bootstrap matrices\n"
- " --file-of-filenames=FILE Read additional filenames from "
- "FILE; one per line\n"
+ " --file-of-filenames=FILE Read additional filenames from FILE; "
+ "one per line\n"
" -j, --join Treat all sequences from one file as a single "
"genome\n"
" -l, --low-memory Use less memory at the cost of speed\n"
" -m, --model=MODEL Pick an evolutionary model of 'Raw', 'JC', "
- "'Kimura'; default: "
- "JC\n"
+ "'Kimura', 'LogDet'; default: JC\n"
" -p FLOAT Significance of an anchor; default: 0.025\n"
" --progress=WHEN Print a progress bar 'always', 'never', or "
"'auto'; default: auto\n"
=====================================
src/global.h
=====================================
@@ -47,7 +47,7 @@ extern gsl_rng *RNG;
*/
extern int MODEL;
-enum { M_RAW, M_JC, M_KIMURA };
+enum { M_RAW, M_JC, M_KIMURA, M_LOGDET };
/**
* This enum contains the available flags. Please note that all
=====================================
src/io.c
=====================================
@@ -227,7 +227,6 @@ void read_fasta(const char *file_name, dsa_t *dsa) {
pfasta_record_free(&pr);
}
-
fail:
pfasta_free(&pp);
close(file_descriptor);
@@ -264,6 +263,7 @@ void print_distances(const struct model *D, const seq_t *sequences, size_t n,
* compiler warnings. The real default is set in andi.c.*/
case M_JC: estimate = &estimate_JC; break;
case M_KIMURA: estimate = &estimate_KIMURA; break;
+ case M_LOGDET: estimate = &estimate_LOGDET; break;
}
for (i = 0; i < n; i++) {
=====================================
src/model.c
=====================================
@@ -80,7 +80,8 @@ double model_coverage(const model *MM) {
*/
double estimate_RAW(const model *MM) {
size_t nucl = model_total(MM);
- size_t SNPs = model_sum(MM, AtoC, AtoG, AtoT, CtoG, CtoT, GtoT);
+ size_t SNPs = model_sum(MM, AtoC, AtoG, AtoT, CtoA, CtoG, CtoT, GtoA, GtoC,
+ GtoT, TtoA, TtoC, TtoG);
// Insignificant results. All abort the fail train.
if (nucl <= 3) {
@@ -111,8 +112,9 @@ double estimate_JC(const model *MM) {
*/
double estimate_KIMURA(const model *MM) {
size_t nucl = model_total(MM);
- size_t transitions = model_sum(MM, AtoG, CtoT);
- size_t transversions = model_sum(MM, AtoC, AtoT, GtoC, GtoT);
+ size_t transitions = model_sum(MM, AtoG, GtoA, CtoT, TtoC);
+ size_t transversions =
+ model_sum(MM, AtoC, CtoA, AtoT, TtoA, GtoC, CtoG, GtoT, TtoG);
double P = (double)transitions / (double)nucl;
double Q = (double)transversions / (double)nucl;
@@ -124,6 +126,77 @@ double estimate_KIMURA(const model *MM) {
return dist <= 0.0 ? 0.0 : dist;
}
+/** @brief computes the evolutionary distance using LogDet.
+ *
+ * The LogDet distance between sequence X and and sequence Y
+ * is given as
+ *
+ * -(1 / K) * (log(det(Fxy)) - 0.5 * log(det(Fxx * Fyy)))
+ *
+ * Where K is the number of character states, Fxy is the site-pattern
+ * frequency matrix, and diagonal matrices Fxx and Fyy give the
+ * frequencies of the different character states in sequences X and Y.
+ *
+ * Each i,j-th entry in Fxy is the proportion of homologous sites
+ * where sequences X and Y have character states i and j, respectively.
+ *
+ * For our purposes, X is the Subject (From) sequence and Y is the
+ * Query (To) sequence and matrix Fxy looks like
+ *
+ * To A C G T
+ * From
+ * A ( )
+ * C ( )
+ * G ( )
+ * T ( )
+ *
+ * @param MM - The mutation matrix.
+ * @returns The LogDet distance.
+ */
+double estimate_LOGDET(const model *MM) {
+
+ double nucl = (double)model_total(MM);
+ double P[MUTCOUNTS];
+ for (int i = 0; i < MUTCOUNTS; i++) {
+ P[i] = MM->counts[i] / nucl;
+ }
+
+ double logDetFxxFyy =
+ // log determinant of diagonal matrix of row sums
+ log(model_sum(MM, AtoA, AtoC, AtoG, AtoT) / nucl) +
+ log(model_sum(MM, CtoA, CtoC, CtoG, CtoT) / nucl) +
+ log(model_sum(MM, GtoA, GtoC, GtoG, GtoT) / nucl) +
+ log(model_sum(MM, TtoA, TtoC, TtoG, TtoT) / nucl) +
+ // log determinant of diagonal matrix of column sums
+ log(model_sum(MM, AtoA, CtoA, GtoA, TtoA) / nucl) +
+ log(model_sum(MM, AtoC, CtoC, GtoC, TtoC) / nucl) +
+ log(model_sum(MM, AtoG, CtoG, GtoG, TtoG) / nucl) +
+ log(model_sum(MM, AtoT, CtoT, GtoT, TtoT) / nucl);
+
+ // determinant of the site-pattern frequency matrix
+ double detFxy =
+ P[AtoA] * P[CtoC] * (P[GtoG] * P[TtoT] - P[TtoG] * P[GtoT]) -
+ P[AtoA] * P[CtoG] * (P[GtoC] * P[TtoT] - P[TtoC] * P[GtoT]) +
+ P[AtoA] * P[CtoT] * (P[GtoC] * P[TtoG] - P[TtoC] * P[GtoG]) -
+
+ P[AtoC] * P[CtoA] * (P[GtoG] * P[TtoT] - P[TtoG] * P[GtoT]) +
+ P[AtoC] * P[CtoG] * (P[GtoA] * P[TtoT] - P[TtoA] * P[GtoT]) -
+ P[AtoC] * P[CtoT] * (P[GtoA] * P[TtoG] - P[TtoA] * P[GtoG]) +
+
+ P[AtoG] * P[CtoA] * (P[GtoC] * P[TtoT] - P[TtoC] * P[GtoT]) -
+ P[AtoG] * P[CtoC] * (P[GtoA] * P[TtoT] - P[TtoA] * P[GtoT]) +
+ P[AtoG] * P[CtoT] * (P[GtoA] * P[TtoC] - P[TtoA] * P[GtoC]) -
+
+ P[AtoT] * P[CtoA] * (P[GtoC] * P[TtoG] - P[TtoC] * P[GtoG]) +
+ P[AtoT] * P[CtoC] * (P[GtoA] * P[TtoG] - P[TtoA] * P[GtoG]) -
+ P[AtoT] * P[CtoG] * (P[GtoA] * P[TtoC] - P[TtoA] * P[GtoC]);
+
+ double dist = -0.25 * (log(detFxy) - 0.5 * logDetFxxFyy);
+
+ // fix negative zero
+ return dist <= 0.0 ? 0.0 : dist;
+}
+
/** @brief Bootstrap a mutation matrix.
*
* The classical bootstrapping process, as described by Felsenstein, resamples
@@ -132,25 +205,17 @@ double estimate_KIMURA(const model *MM) {
* resample the elements of the mutation matrix. See Klötzl & Haubold (2016)
* for details. http://www.mdpi.com/2075-1729/6/1/11/htm
*
- * @param MM - The original mutation matrix.
+ * @param datum - The original mutation matrix.
* @returns A bootstrapped mutation matrix.
*/
-model model_bootstrap(const model MM) {
- model datum = MM;
-
- size_t nucl = model_total(&MM);
+model model_bootstrap(model datum) {
+ size_t nucl = model_total(&datum);
double p[MUTCOUNTS];
for (size_t i = 0; i < MUTCOUNTS; ++i) {
- p[i] = MM.counts[i] / (double)nucl;
+ p[i] = datum.counts[i] / (double)nucl;
}
- unsigned int n[MUTCOUNTS];
-
- gsl_ran_multinomial(RNG, MUTCOUNTS, nucl, p, n);
-
- for (size_t i = 0; i < MUTCOUNTS; ++i) {
- datum.counts[i] = n[i];
- }
+ gsl_ran_multinomial(RNG, MUTCOUNTS, nucl, p, datum.counts);
return datum;
}
@@ -216,7 +281,7 @@ void model_count_equal(model *MM, const char *S, size_t len) {
* @param c - input nucleotide
* @returns 2bit representation.
*/
-char nucl2bit(char c) {
+char nucl2bit(unsigned char c) {
c &= 6;
c ^= c >> 1;
return c >> 1;
@@ -244,25 +309,13 @@ void model_count(model *MM, const char *S, const char *Q, size_t len) {
// Pick the correct two bits representing s and q.
unsigned char foo = nucl2bit(s);
- unsigned char baz = nucl2bit(q);
-
- /*
- * The mutation matrix is symmetric. For convenience we define the
- * mutation XtoY with X > Y.
- */
- if (baz > foo) {
- int temp = foo;
- foo = baz;
- baz = temp;
- }
+ unsigned char bar = nucl2bit(q);
/*
* Finally, we want to map the indices to the correct mutation. This is
* done by utilising the mutation types in model.h.
*/
- static const unsigned int map4 = 0x9740;
- unsigned int base = (map4 >> (4 * baz)) & 0xf;
- unsigned int index = base + (foo - baz);
+ unsigned int index = (foo << 2) + bar;
local_counts[index]++;
}
=====================================
src/model.h
=====================================
@@ -8,8 +8,7 @@
#include <stdlib.h>
/**
- * This enum contains all possible mutations. Some of them are considered
- * symmetric (e.g. AtoT and TtoA) and thus have the same value. The total number
+ * This enum contains all possible mutations. The total number
* of different possible mutations is MUTCOUNTS.
*/
enum {
@@ -17,19 +16,19 @@ enum {
AtoC,
AtoG,
AtoT,
+ CtoA,
CtoC,
CtoG,
CtoT,
+ GtoA,
+ GtoC,
GtoG,
GtoT,
+ TtoA,
+ TtoC,
+ TtoG,
TtoT,
- MUTCOUNTS,
- CtoA = AtoC,
- GtoA = AtoG,
- GtoC = CtoG,
- TtoA = AtoT,
- TtoC = CtoT,
- TtoG = GtoT
+ MUTCOUNTS
};
/** @brief The mutation matrix.
@@ -46,12 +45,15 @@ enum {
*
* The cells are absolute counts. Together with seq_len (the query length),
* we can deduce the substitution rate and coverage.
+ *
+ * As libdivsufsort is 32 bit the sequence length is limited to (INT_MAX-1)/2.
+ * We can thus use the same limit for the counts.
*/
typedef struct model {
/** The absolute counts of mutation types. */
- size_t counts[MUTCOUNTS];
+ unsigned int counts[MUTCOUNTS];
/** The query length. */
- size_t seq_len;
+ unsigned int seq_len;
} model;
void model_count_equal(model *, const char *, size_t);
@@ -61,4 +63,5 @@ double model_coverage(const model *);
double estimate_RAW(const model *);
double estimate_JC(const model *);
double estimate_KIMURA(const model *);
-model model_bootstrap(const model);
+double estimate_LOGDET(const model *);
+model model_bootstrap(model);
=====================================
test/Makefile.am
=====================================
@@ -9,12 +9,12 @@ test_seq_LDADD = $(GLIB_LIBS) $(top_builddir)/opt/libcompat.a
test_process_SOURCES = test_process.c $(top_srcdir)/src/esa.c $(top_srcdir)/src/io.c $(top_srcdir)/src/model.c $(top_srcdir)/src/process.c $(top_srcdir)/src/sequence.c $(top_srcdir)/src/global.h
test_process_CPPFLAGS = $(OPENMP_CFLAGS) -I$(top_srcdir)/src -I$(top_srcdir)/opt -I$(top_srcdir)/libs -DDEBUG -std=gnu99
test_process_CFLAGS = $(OPENMP_CFLAGS) -Wall -Wextra $(GLIB_CFLAGS) -Wno-missing-field-initializers
-test_process_LDADD = $(GLIB_LIBS) $(PSUFSORT) $(top_builddir)/opt/libcompat.a $(top_builddir)/libs/libpfasta.a
+test_process_LDADD = $(GLIB_LIBS) $(top_builddir)/opt/libcompat.a $(top_builddir)/libs/libpfasta.a
test_esa_SOURCES = test_esa.c $(top_srcdir)/src/esa.c $(top_srcdir)/src/sequence.c $(top_srcdir)/src/esa.h
test_esa_CPPFLAGS = $(OPENMP_CFLAGS) -I$(top_srcdir)/libs -I$(top_srcdir)/opt -I$(top_srcdir)/src -DDEBUG -std=gnu99
test_esa_CFLAGS = $(OPENMP_CFLAGS) -Wall -Wextra $(GLIB_CFLAGS) -Wno-missing-field-initializers
-test_esa_LDADD = $(GLIB_LIBS) $(PSUFSORT) $(top_builddir)/opt/libcompat.a
+test_esa_LDADD = $(GLIB_LIBS) $(top_builddir)/opt/libcompat.a
test_fasta_SOURCES = test_fasta.cxx
View it on GitLab: https://salsa.debian.org/med-team/andi/-/commit/70820b092e7429fa2ac3c4a3e4175dca7eab272c
--
View it on GitLab: https://salsa.debian.org/med-team/andi/-/commit/70820b092e7429fa2ac3c4a3e4175dca7eab272c
You're receiving this email because of your account on salsa.debian.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20211230/01bc5b66/attachment-0001.htm>
More information about the debian-med-commit
mailing list