[med-svn] [Git][med-team/last-align][master] 7 commits: New upstream version 963

Fri Jan 11 21:52:19 GMT 2019

Andreas Tille pushed to branch master at Debian Med / last-align


Commits:
173bf5b1 by Andreas Tille at 2019-01-11T21:39:53Z
New upstream version 963
- - - - -
3a1b1cf6 by Andreas Tille at 2019-01-11T21:39:54Z
Update upstream source from tag 'upstream/963'

Update to upstream version '963'
with Debian dir eae83b3125bacd214777180d904bcf68dcfdb81b
- - - - -
b0ef285c by Andreas Tille at 2019-01-11T21:39:54Z
New upstream version

- - - - -
ab8b0745 by Andreas Tille at 2019-01-11T21:39:54Z
debhelper 12

- - - - -
46eee411 by Andreas Tille at 2019-01-11T21:39:56Z
Standards-Version: 4.3.0

- - - - -
cfaeac21 by Andreas Tille at 2019-01-11T21:48:57Z
Replace needs-recommends by explicitly adding recommended package python-pil to test depends

- - - - -
00ab3317 by Andreas Tille at 2019-01-11T21:51:47Z
Upload to unstable

- - - - -


14 changed files:

- ChangeLog.txt
- debian/changelog
- debian/compat
- debian/control
- debian/tests/control
- doc/Makefile
- doc/last-parallel.html
- doc/last-parallel.txt
- doc/last-split.html
- doc/last-split.txt
- doc/last-tutorial.html
- doc/last-tutorial.txt
- src/tantan.cc
- src/version.hh


Changes:

=====================================
ChangeLog.txt
=====================================
@@ -1,8 +1,19 @@
+2018-12-26  Martin C. Frith  <Martin C. Frith>
+
+	* src/tantan.cc:
+	Make tantan repeat-finding faster
+	[44199bff9cef] [tip]
+
+	* doc/Makefile, doc/last-parallel.txt, doc/last-split.txt, doc/last-
+	tutorial.txt:
+	Update the documents a bit
+	[6727ae8b7a12]
+
 2018-12-10  Martin C. Frith  <Martin C. Frith>
 
 	* doc/last-dotplot.txt, scripts/last-dotplot:
 	Add last-dotplot --maxseqs option
-	[ed0fb9b1eb40] [tip]
+	[ed0fb9b1eb40]
 
 	* src/split/last-split.cc:
 	Make last-split -n keep E-values


=====================================
debian/changelog
=====================================
@@ -1,3 +1,13 @@
+last-align (963-1) unstable; urgency=medium
+
+  * New upstream version
+  * debhelper 12
+  * Standards-Version: 4.3.0
+  * debian/tests/control: Replace needs-recommends by explicitly
+    adding recommended package python-pil to test depends
+
+ -- Andreas Tille <tille at debian.org>  Fri, 11 Jan 2019 22:49:18 +0100
+
 last-align (961-1) unstable; urgency=medium
 
   [ Jelmer Vernooĳ ]


=====================================
debian/compat
=====================================
@@ -1 +1 @@
-11
+12


=====================================
debian/control
=====================================
@@ -4,11 +4,11 @@ Uploaders: Charles Plessy <plessy at debian.org>,
            Andreas Tille <tille at debian.org>
 Section: science
 Priority: optional
-Build-Depends: debhelper (>= 11~),
+Build-Depends: debhelper (>= 12~),
                help2man,
                python-pil,
                zlib1g-dev
-Standards-Version: 4.2.1
+Standards-Version: 4.3.0
 Vcs-Browser: https://salsa.debian.org/med-team/last-align
 Vcs-Git: https://salsa.debian.org/med-team/last-align.git
 Homepage: http://last.cbrc.jp/


=====================================
debian/tests/control
=====================================
@@ -1,3 +1,3 @@
 Tests: run-unit-test
-Depends: @
-Restrictions: allow-stderr, needs-recommends
+Depends: @, python-pil
+Restrictions: allow-stderr


=====================================
doc/Makefile
=====================================
@@ -15,6 +15,7 @@ ${DOCS}: last-doc.css Makefile
 
 # Ugh!  Is there a better way?
 RST_CSS = `locate html4css1.css | tail -n1`
+#RST_CSS = html4css1.css
 
 RSTFLAGS = --initial-header-level=2 --no-compact-lists	\
 --no-compact-field-lists --option-limit=0 --no-doc-info


=====================================
doc/last-parallel.html
=====================================
@@ -364,11 +364,11 @@ parallel-fasta "lastal mydb" < queries.fa > myalns.maf
 </pre>
 <p>Instead of this:</p>
 <pre class="literal-block">
-lastal -Q1 -D100 db q.fastq | last-split > out.maf
+lastal -Q1 db q.fastq | last-split > out.maf
 </pre>
 <p>try this:</p>
 <pre class="literal-block">
-parallel-fastq "lastal -Q1 -D100 db | last-split" < q.fastq > out.maf
+parallel-fastq "lastal -Q1 db | last-split" < q.fastq > out.maf
 </pre>
 <p>Instead of this:</p>
 <pre class="literal-block">


=====================================
doc/last-parallel.txt
=====================================
@@ -53,11 +53,11 @@ try this::
 
 Instead of this::
 
-  lastal -Q1 -D100 db q.fastq | last-split > out.maf
+  lastal -Q1 db q.fastq | last-split > out.maf
 
 try this::
 
-  parallel-fastq "lastal -Q1 -D100 db | last-split" < q.fastq > out.maf
+  parallel-fastq "lastal -Q1 db | last-split" < q.fastq > out.maf
 
 Instead of this::
 


=====================================
doc/last-split.html
=====================================
@@ -334,7 +334,7 @@ format), and the genome is in "genome.fasta" (in fasta format).  We
 can do the alignment like this:</p>
 <pre class="literal-block">
 lastdb -uNEAR -R01 db genome.fasta
-lastal -Q1 -D100 db q.fastq | last-split > out.maf
+lastal -Q1 db q.fastq | last-split > out.maf
 </pre>
 </div>
 <div class="section" id="spliced-alignment-of-rna-reads-to-a-genome">
@@ -397,7 +397,7 @@ matches.</p>
 <h2>Going faster by parallelization</h2>
 <p>For example, split alignment of DNA reads to a genome:</p>
 <pre class="literal-block">
-parallel-fastq "lastal -Q1 -D100 db | last-split" < q.fastq > out.maf
+parallel-fastq "lastal -Q1 db | last-split" < q.fastq > out.maf
 </pre>
 <p>This requires GNU parallel to be installed
 (<a class="reference external" href="http://www.gnu.org/software/parallel/">http://www.gnu.org/software/parallel/</a>).</p>
@@ -558,7 +558,7 @@ middle of the query, we can do "spliced" alignment without considering
 splice signals or favouring cis-splices:</p>
 <pre class="literal-block">
 lastdb -uNEAR -R01 db genome.fasta
-lastal -Q1 -D100 db q.fastq | last-split -c0 > out.maf
+lastal -Q1 db q.fastq | last-split -c0 > out.maf
 </pre>
 </div>
 </div>


=====================================
doc/last-split.txt
=====================================
@@ -21,7 +21,7 @@ format), and the genome is in "genome.fasta" (in fasta format).  We
 can do the alignment like this::
 
   lastdb -uNEAR -R01 db genome.fasta
-  lastal -Q1 -D100 db q.fastq | last-split > out.maf
+  lastal -Q1 db q.fastq | last-split > out.maf
 
 Spliced alignment of RNA reads to a genome
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -78,7 +78,7 @@ Going faster by parallelization
 
 For example, split alignment of DNA reads to a genome::
 
-  parallel-fastq "lastal -Q1 -D100 db | last-split" < q.fastq > out.maf
+  parallel-fastq "lastal -Q1 db | last-split" < q.fastq > out.maf
 
 This requires GNU parallel to be installed
 (http://www.gnu.org/software/parallel/).
@@ -173,7 +173,7 @@ middle of the query, we can do "spliced" alignment without considering
 splice signals or favouring cis-splices::
 
   lastdb -uNEAR -R01 db genome.fasta
-  lastal -Q1 -D100 db q.fastq | last-split -c0 > out.maf
+  lastal -Q1 db q.fastq | last-split -c0 > out.maf
 
 Options
 -------


=====================================
doc/last-tutorial.html
=====================================
@@ -379,20 +379,28 @@ lastdb -p -cR01 protdb proteins.fa
 lastal -F15 protdb dnas.fa
 </pre>
 </div>
-<div class="section" id="example-4-find-short-protein-alignments">
-<h2>Example 4: Find short protein alignments</h2>
+<div class="section" id="example-4-find-high-similarity-and-short-protein-alignments">
+<h2>Example 4: Find high-similarity, and short, protein alignments</h2>
 <p>LAST uses a <a class="reference external" href="last-matrices.html">scoring scheme</a> to find
-similarities.  Some scoring schemes are good for long-and-weak
-similarities, others for short-and-strong similarities.  If we seek
-very short similarities, weak ones are hopeless (statistically
-insignificant), so we had better focus on strong ones.  The PAM30
-scoring scheme may work well:</p>
+similarities.  Some scoring schemes are tuned for weak similarities,
+others for strong similarities.  The PAM30 scoring scheme finds strong
+protein similarities:</p>
 <pre class="literal-block">
 lastdb -p -cR01 invdb invertebrate.fa
 lastal -pPAM30 invdb vertebrate.fa
 </pre>
-<p>(How short is "very short"?  It depends on the amount of sequence data
-we are searching, but perhaps roughly less than 40 amino acids.)</p>
+<p>This has two advantages:</p>
+<ul>
+<li><p class="first">It omits weak alignments, or alignment parts (occasionally a strong
+similarity is flanked by a weak similarity).</p>
+</li>
+<li><p class="first">It can find short similarities.  If we seek very short similarities,
+weak ones are hopeless (statistically insignificant), so we had
+better focus on strong ones.  (How short is "very short"?  It
+depends on the amount of sequence data we are searching, but perhaps
+roughly less than 40 amino acids.)</p>
+</li>
+</ul>
 </div>
 <div class="section" id="example-5-align-human-dna-sequences-to-the-human-genome">
 <h2>Example 5: Align human DNA sequences to the human genome</h2>
@@ -446,13 +454,14 @@ for smaller genomes).</p>
 <p>DNA sequences are not always perfectly accurate, and they are
 sometimes provided in fastq format, which indicates the reliability of
 each base.  LAST can use this information to improve alignment
-accuracy.  (It assumes the reliabilities reflect substitution errors,
-not insertion/deletion errors: if that is not true, it may be better
-to use fasta format.)  Option -Q1 indicates fastq-sanger format:</p>
+accuracy.  Option -Q1 indicates fastq-sanger format:</p>
 <pre class="literal-block">
 lastdb -uNEAR -R01 humandb human/chr*.fa
-lastal -Q1 -D100 humandb queries.fastq | last-split > myalns.maf
+lastal -Q1 humandb queries.fastq | last-split > myalns.maf
 </pre>
+<p><strong>Assumption:</strong> LAST assumes the reliabilities reflect substitution
+errors, not insertion/deletion errors.  If that is not true, you can
+tell it to ignore the reliability data with -Q0.</p>
 </div>
 <div class="section" id="fastq-format-confusion">
 <h2>Fastq format confusion</h2>
@@ -490,7 +499,19 @@ the two reads in a pair to match (e.g.) different chromosomes.</p>
 </div>
 <div class="section" id="example-9-compare-the-human-and-chimp-genomes">
 <h2>Example 9: Compare the human and chimp genomes</h2>
-<p>See <a class="reference external" href="https://github.com/mcfrith/last-genome-alignments">here</a>.</p>
+<p>See <a class="reference external" href="https://github.com/mcfrith/last-genome-alignments">here</a>.  But
+that recipe is <em>extremely</em> slow-and-accurate.  You can <a class="reference external" href="last-tuning.html">tune</a> it to compare huge, high-similarity genomes with
+moderate run time and memory use:</p>
+<ul>
+<li><p class="first">Omit the sensitivity-boosting lastal -m option.</p>
+</li>
+<li><p class="first">Add -W99 (or so) to the lastdb options.</p>
+</li>
+<li><p class="first">If the "reference" genome (the one given to lastdb) is > 4 GB, it's
+probably more efficient to use lastdb8 and lastal8, instead of
+lastdb and lastal.</p>
+</li>
+</ul>
 </div>
 <div class="section" id="example-10-ambiguity-of-alignment-columns">
 <h2>Example 10: Ambiguity of alignment columns</h2>


=====================================
doc/last-tutorial.txt
=====================================
@@ -70,21 +70,27 @@ score penalty of 15 for frameshifts::
   lastdb -p -cR01 protdb proteins.fa
   lastal -F15 protdb dnas.fa
 
-Example 4: Find short protein alignments
-----------------------------------------
+Example 4: Find high-similarity, and short, protein alignments
+--------------------------------------------------------------
 
 LAST uses a `scoring scheme <last-matrices.html>`_ to find
-similarities.  Some scoring schemes are good for long-and-weak
-similarities, others for short-and-strong similarities.  If we seek
-very short similarities, weak ones are hopeless (statistically
-insignificant), so we had better focus on strong ones.  The PAM30
-scoring scheme may work well::
+similarities.  Some scoring schemes are tuned for weak similarities,
+others for strong similarities.  The PAM30 scoring scheme finds strong
+protein similarities::
 
   lastdb -p -cR01 invdb invertebrate.fa
   lastal -pPAM30 invdb vertebrate.fa
 
-(How short is "very short"?  It depends on the amount of sequence data
-we are searching, but perhaps roughly less than 40 amino acids.)
+This has two advantages:
+
+* It omits weak alignments, or alignment parts (occasionally a strong
+  similarity is flanked by a weak similarity).
+
+* It can find short similarities.  If we seek very short similarities,
+  weak ones are hopeless (statistically insignificant), so we had
+  better focus on strong ones.  (How short is "very short"?  It
+  depends on the amount of sequence data we are searching, but perhaps
+  roughly less than 40 amino acids.)
 
 Example 5: Align human DNA sequences to the human genome
 --------------------------------------------------------
@@ -138,12 +144,14 @@ Example 7: Align human fastq sequences to the human genome
 DNA sequences are not always perfectly accurate, and they are
 sometimes provided in fastq format, which indicates the reliability of
 each base.  LAST can use this information to improve alignment
-accuracy.  (It assumes the reliabilities reflect substitution errors,
-not insertion/deletion errors: if that is not true, it may be better
-to use fasta format.)  Option -Q1 indicates fastq-sanger format::
+accuracy.  Option -Q1 indicates fastq-sanger format::
 
   lastdb -uNEAR -R01 humandb human/chr*.fa
-  lastal -Q1 -D100 humandb queries.fastq | last-split > myalns.maf
+  lastal -Q1 humandb queries.fastq | last-split > myalns.maf
+
+**Assumption:** LAST assumes the reliabilities reflect substitution
+errors, not insertion/deletion errors.  If that is not true, you can
+tell it to ignore the reliability data with -Q0.
 
 Fastq format confusion
 ----------------------
@@ -183,7 +191,18 @@ See `here <https://github.com/mcfrith/last-genome-alignments>`_.
 Example 9: Compare the human and chimp genomes
 ----------------------------------------------
 
-See `here <https://github.com/mcfrith/last-genome-alignments>`_.
+See `here <https://github.com/mcfrith/last-genome-alignments>`_.  But
+that recipe is *extremely* slow-and-accurate.  You can `tune
+<last-tuning.html>`_ it to compare huge, high-similarity genomes with
+moderate run time and memory use:
+
+* Omit the sensitivity-boosting lastal -m option.
+
+* Add -W99 (or so) to the lastdb options.
+
+* If the "reference" genome (the one given to lastdb) is > 4 GB, it's
+  probably more efficient to use lastdb8 and lastal8, instead of
+  lastdb and lastal.
 
 Example 10: Ambiguity of alignment columns
 ------------------------------------------


=====================================
src/tantan.cc
=====================================
@@ -20,10 +20,10 @@ void multiplyAll(std::vector<double> &v, double factor) {
 }
 
 double firstRepeatOffsetProb(double probMult, int maxRepeatOffset) {
-  if (probMult < 1 || probMult > 1)
+  if (probMult < 1 || probMult > 1) {
     return (1 - probMult) / (1 - std::pow(probMult, maxRepeatOffset));
-  else
-    return 1.0 / maxRepeatOffset;
+  }
+  return 1.0 / maxRepeatOffset;
 }
 
 void checkForwardAndBackwardTotals(double fTot, double bTot) {
@@ -64,6 +64,7 @@ struct Tantan {
   double b2fLast;  // background state to last foreground state
 
   double backgroundProb;
+  std::vector<double> b2fProbs;  // background state to each foreground state
   std::vector<double> foregroundProbs;
   std::vector<double> insertionProbs;
 
@@ -99,7 +100,7 @@ struct Tantan {
     //f2g = firstGapProb;
     //g2f = 1 - otherGapProb;
     oneGapProb = firstGapProb * (1 - otherGapProb);
-    endGapProb = firstGapProb * 1;
+    endGapProb = firstGapProb * (maxRepeatOffset > 1);
     f2f0 = 1 - repeatEndProb;
     f2f1 = 1 - repeatEndProb - firstGapProb;
     f2f2 = 1 - repeatEndProb - firstGapProb * 2;
@@ -110,9 +111,16 @@ struct Tantan {
     b2fFirst = repeatProb * firstRepeatOffsetProb(b2fDecay, maxRepeatOffset);
     b2fLast = repeatProb * firstRepeatOffsetProb(b2fGrowth, maxRepeatOffset);
 
+    b2fProbs.resize(maxRepeatOffset);
     foregroundProbs.resize(maxRepeatOffset);
     insertionProbs.resize(maxRepeatOffset - 1);
 
+    double p = b2fFirst;
+    for (int i = 0; i < maxRepeatOffset; ++i) {
+      b2fProbs[i] = p;
+      p *= b2fDecay;
+    }
+
     scaleFactors.resize((seqEnd - seqBeg) / scaleStepSize);
   }
 
@@ -125,8 +133,7 @@ struct Tantan {
   double forwardTotal() {
     double fromForeground = std::accumulate(foregroundProbs.begin(),
                                             foregroundProbs.end(), 0.0);
-    fromForeground *= f2b;
-    double total = backgroundProb * b2b + fromForeground;
+    double total = backgroundProb * b2b + fromForeground * f2b;
     assert(total > 0);
     return total;
   }
@@ -148,36 +155,31 @@ struct Tantan {
     double f = *foregroundPtr;
     double fromForeground = f;
 
-    if (insertionProbs.empty()) {
-      *foregroundPtr = fromBackground + f * f2f0;
-    } else {
-      double *insertionPtr = &insertionProbs.back();
-      double i = *insertionPtr;
-      *foregroundPtr = fromBackground + f * f2f1 + i * endGapProb;
-      double d = f;
-      --foregroundPtr;
-      fromBackground *= b2fGrowth;
-
-      while (foregroundPtr > &foregroundProbs.front()) {
-        f = *foregroundPtr;
-        fromForeground += f;
-        i = *(insertionPtr - 1);
-        *foregroundPtr = fromBackground + f * f2f2 + (i + d) * oneGapProb;
-        *insertionPtr = f + i * g2g;
-        d = f + d * g2g;
-        --foregroundPtr;
-        --insertionPtr;
-        fromBackground *= b2fGrowth;
-      }
+    double *insertionPtr = &insertionProbs.back();
+    double i = *insertionPtr;
+    *foregroundPtr = fromBackground + f * f2f1 + i * endGapProb;
+    double d = f;
+    --foregroundPtr;
+    fromBackground *= b2fGrowth;
 
+    while (foregroundPtr > &foregroundProbs.front()) {
       f = *foregroundPtr;
       fromForeground += f;
-      *foregroundPtr = fromBackground + f * f2f1 + d * endGapProb;
-      *insertionPtr = f;
+      i = *(insertionPtr - 1);
+      *foregroundPtr = fromBackground + f * f2f2 + (i + d) * oneGapProb;
+      *insertionPtr = f + i * g2g;
+      d = f + d * g2g;
+      --foregroundPtr;
+      --insertionPtr;
+      fromBackground *= b2fGrowth;
     }
 
-    fromForeground *= f2b;
-    backgroundProb = backgroundProb * b2b + fromForeground;
+    f = *foregroundPtr;
+    fromForeground += f;
+    *foregroundPtr = fromBackground + f * f2f1 + d * endGapProb;
+    *insertionPtr = f;
+
+    backgroundProb = backgroundProb * b2b + fromForeground * f2b;
   }
 
   void calcBackwardTransitionProbsWithGaps() {
@@ -186,57 +188,48 @@ struct Tantan {
     double f = *foregroundPtr;
     double toForeground = f;
 
-    if (insertionProbs.empty()) {
-      *foregroundPtr = toBackground + f2f0 * f;
-    } else {
-      double *insertionPtr = &insertionProbs.front();
-      double i = *insertionPtr;
-      *foregroundPtr = toBackground + f2f1 * f + i;
-      double d = endGapProb * f;
-      ++foregroundPtr;
-      toForeground *= b2fGrowth;
-
-      while (foregroundPtr < &foregroundProbs.back()) {
-        f = *foregroundPtr;
-        toForeground += f;
-        i = *(insertionPtr + 1);
-        *foregroundPtr = toBackground + f2f2 * f + (i + d);
-        double oneGapProb_f = oneGapProb * f;
-        *insertionPtr = oneGapProb_f + g2g * i;
-        d = oneGapProb_f + g2g * d;
-        ++foregroundPtr;
-        ++insertionPtr;
-        toForeground *= b2fGrowth;
-      }
+    double *insertionPtr = &insertionProbs.front();
+    double i = *insertionPtr;
+    *foregroundPtr = toBackground + f2f1 * f + i;
+    double d = endGapProb * f;
+    ++foregroundPtr;
+    toForeground *= b2fGrowth;
 
+    while (foregroundPtr < &foregroundProbs.back()) {
       f = *foregroundPtr;
       toForeground += f;
-      *foregroundPtr = toBackground + f2f1 * f + d;
-      *insertionPtr = endGapProb * f;
+      i = *(insertionPtr + 1);
+      *foregroundPtr = toBackground + f2f2 * f + (i + d);
+      double oneGapProb_f = oneGapProb * f;
+      *insertionPtr = oneGapProb_f + g2g * i;
+      d = oneGapProb_f + g2g * d;
+      ++foregroundPtr;
+      ++insertionPtr;
+      toForeground *= b2fGrowth;
     }
 
-    toForeground *= b2fLast;
-    backgroundProb = b2b * backgroundProb + toForeground;
+    f = *foregroundPtr;
+    toForeground += f;
+    *foregroundPtr = toBackground + f2f1 * f + d;
+    *insertionPtr = endGapProb * f;
+
+    backgroundProb = b2b * backgroundProb + b2fLast * toForeground;
   }
 
   void calcForwardTransitionProbs() {
     if (endGapProb > 0) return calcForwardTransitionProbsWithGaps();
 
-    double fromBackground = backgroundProb * b2fLast;
+    double b = backgroundProb;
     double fromForeground = 0;
-    double *foregroundPtr = END(foregroundProbs);
     double *foregroundBeg = BEG(foregroundProbs);
 
-    while (foregroundPtr > foregroundBeg) {
-      --foregroundPtr;
-      double f = *foregroundPtr;
+    for (int i = 0; i < maxRepeatOffset; ++i) {
+      double f = foregroundBeg[i];
       fromForeground += f;
-      *foregroundPtr = fromBackground + f * f2f0;
-      fromBackground *= b2fGrowth;
+      foregroundBeg[i] = b * b2fProbs[i] + f * f2f0;
     }
 
-    fromForeground *= f2b;
-    backgroundProb = backgroundProb * b2b + fromForeground;
+    backgroundProb = b * b2b + fromForeground * f2b;
   }
 
   void calcBackwardTransitionProbs() {
@@ -244,18 +237,14 @@ struct Tantan {
 
     double toBackground = f2b * backgroundProb;
     double toForeground = 0;
-    double *foregroundPtr = BEG(foregroundProbs);
-    double *foregroundEnd = END(foregroundProbs);
+    double *foregroundBeg = BEG(foregroundProbs);
 
-    while (foregroundPtr < foregroundEnd) {
-      toForeground *= b2fGrowth;
-      double f = *foregroundPtr;
-      toForeground += f;
-      *foregroundPtr = toBackground + f2f0 * f;
-      ++foregroundPtr;
+    for (int i = 0; i < maxRepeatOffset; ++i) {
+      double f = foregroundBeg[i];
+      toForeground += b2fProbs[i] * f;
+      foregroundBeg[i] = toBackground + f2f0 * f;
     }
 
-    toForeground *= b2fLast;
     backgroundProb = b2b * backgroundProb + toForeground;
   }
 
@@ -281,12 +270,17 @@ struct Tantan {
     }
   }
 
-  void calcEmissionProbs() {
-    const double *lrRow = likelihoodRatioMatrix[*seqPtr];
+  bool isNearSeqBeg() {
+    return seqPtr - seqBeg < maxRepeatOffset;
+  }
 
-    bool isNearSeqBeg = (seqPtr - seqBeg < maxRepeatOffset);
-    const uchar *seqStop = isNearSeqBeg ? seqBeg : seqPtr - maxRepeatOffset;
+  const uchar *seqFurthestBack() {
+    return isNearSeqBeg() ? seqBeg : seqPtr - maxRepeatOffset;
+  }
 
+  void calcEmissionProbs() {
+    const double *lrRow = likelihoodRatioMatrix[*seqPtr];
+    const uchar *seqStop = seqFurthestBack();
     double *foregroundPtr = BEG(foregroundProbs);
     const uchar *offsetPtr = seqPtr;
 


=====================================
src/version.hh
=====================================
@@ -1 +1 @@
-"961"
+"963"



View it on GitLab: https://salsa.debian.org/med-team/last-align/compare/15f20eb6b0da8fe96d3c318da9ab23f4526b9cdc...00ab331782aca5bc5aa1ce6eac24250f759d3c93

-- 
View it on GitLab: https://salsa.debian.org/med-team/last-align/compare/15f20eb6b0da8fe96d3c318da9ab23f4526b9cdc...00ab331782aca5bc5aa1ce6eac24250f759d3c93
You're receiving this email because of your account on salsa.debian.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20190111/f299043c/attachment-0001.html>