[med-svn] [Git][med-team/last-align][master] 4 commits: New upstream version 938

Andreas Tille gitlab at salsa.debian.org
Wed May 23 06:14:07 BST 2018


Andreas Tille pushed to branch master at Debian Med / last-align


Commits:
93f448c6 by Andreas Tille at 2018-05-23T07:10:05+02:00
New upstream version 938
- - - - -
9ae39417 by Andreas Tille at 2018-05-23T07:10:06+02:00
Update upstream source from tag 'upstream/938'

Update to upstream version '938'
with Debian dir 1811318611cd61e17698e98a2c6e214d330a8432
- - - - -
2307f6c7 by Andreas Tille at 2018-05-23T07:10:06+02:00
New upstream version

- - - - -
54f0b1f0 by Andreas Tille at 2018-05-23T07:13:07+02:00
Upload to unstable

- - - - -


21 changed files:

- ChangeLog.txt
- debian/changelog
- doc/.htaccess
- doc/last-train.html
- doc/last-train.txt
- doc/last-tuning.html
- doc/last-tuning.txt
- doc/lastal.html
- doc/lastal.txt
- doc/lastdb.html
- doc/lastdb.txt
- scripts/last-dotplot
- scripts/last-map-probs
- scripts/last-postmask
- scripts/last-train
- scripts/maf-convert
- scripts/maf-join
- scripts/maf-swap
- src/LastalArguments.cc
- src/LastalArguments.hh
- src/version.hh


Changes:

=====================================
ChangeLog.txt
=====================================
--- a/ChangeLog.txt
+++ b/ChangeLog.txt
@@ -1,8 +1,42 @@
+2018-05-14  Martin C. Frith  <Martin C. Frith>
+
+	* scripts/last-dotplot, scripts/last-postmask, test/102.maf, test
+	/last-postmask-test.out, test/last-postmask-test.sh, test/last-
+	split-test.out, test/last-split-test.sh:
+	postmask: fix bug for unusual alignment headers
+	[65969d4d464d] [tip]
+
+	* scripts/last-dotplot, scripts/last-map-probs, scripts/last-postmask,
+	scripts/last-train, scripts/maf-convert, scripts/maf-swap:
+	Make last-train find LAST programs more robustly
+	[8a4fadbe6080]
+
+2018-05-07  Martin C. Frith  <Martin C. Frith>
+
+	* scripts/last-postmask, scripts/last-train, scripts/maf-convert,
+	scripts/maf-join, scripts/maf-swap:
+	Modernize the Python code
+	[b1c09fdd12fe]
+
+	* doc/last-tuning.txt, doc/lastal.txt, doc/lastdb.txt,
+	src/LastalArguments.cc, src/LastalArguments.hh, test/last-test.out,
+	test/last-test.sh:
+	Change lastal -x & -z options
+	[b09d82479c91]
+
+	* doc/last-train.txt, scripts/last-train:
+	Make last-train use last-postmask
+	[2755c8f0dc79]
+
+	* scripts/last-postmask, test/last-postmask-test.out:
+	postmask: bugfix for strand-asymmetric scores
+	[9caca45429d8]
+
 2018-04-20  Martin C. Frith  <Martin C. Frith>
 
 	* doc/Makefile, doc/lastal.txt, doc/maf-cut.txt, scripts/maf-cut:
 	Add maf-cut utility
-	[482845444bcd] [tip]
+	[482845444bcd]
 
 2018-04-18  Martin C. Frith  <Martin C. Frith>
 


=====================================
debian/changelog
=====================================
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+last-align (938-1) unstable; urgency=medium
+
+  * New upstream version
+
+ -- Andreas Tille <tille at debian.org>  Wed, 23 May 2018 07:10:11 +0200
+
 last-align (932-1) unstable; urgency=medium
 
   * New upstream version


=====================================
doc/.htaccess
=====================================
--- a/doc/.htaccess
+++ b/doc/.htaccess
@@ -5,4 +5,4 @@ IndexIgnore last-map-probs.txt last-matrices.txt last-pair-probs.txt
 IndexIgnore last-papers.txt last-parallel.txt last-postmask.txt
 IndexIgnore last-repeats.txt last-seeds.txt last-split.txt last-train.txt
 IndexIgnore last-tuning.txt last-tutorial.txt last.txt lastal.txt lastdb.txt
-IndexIgnore maf-convert.txt
+IndexIgnore maf-convert.txt maf-cut.txt


=====================================
doc/last-train.html
=====================================
--- a/doc/last-train.html
+++ b/doc/last-train.html
@@ -334,7 +334,7 @@ final score parameters, in a format that can be read by <a class="reference exte
 option</a>.</p>
 <p>You can also pipe sequences into last-train, for example:</p>
 <pre class="literal-block">
-zcat queries.fasta.gz | last-train mydb
+bzcat queries.fasta.bz2 | last-train mydb
 </pre>
 <div class="section" id="options">
 <h2>Options</h2>
@@ -377,6 +377,11 @@ e.g. score(A→G) = score(G→A).</td></tr>
 optimize the parameters for low-similarity alignments,
 similarly to the BLOSUM matrices.</td></tr>
 <tr><td class="option-group">
+<kbd><span class="option">--postmask=<var>NUMBER</var></span></kbd></td>
+<td>By default, last-train ignores alignments of mostly-lowercase
+sequence (by using <a class="reference external" href="last-postmask.html">last-postmask</a>).
+To turn this off, do <tt class="docutils literal"><span class="pre">--postmask=0</span></tt>.</td></tr>
+<tr><td class="option-group">
 <kbd><span class="option">--sample-number=<var>N</var></span></kbd></td>
 <td>Use N randomly-chosen chunks of the query sequences.  The
 queries are chopped into fixed-length chunks (as if they were


=====================================
doc/last-train.txt
=====================================
--- a/doc/last-train.txt
+++ b/doc/last-train.txt
@@ -20,7 +20,7 @@ option <lastal.html#score-options>`_.
 
 You can also pipe sequences into last-train, for example::
 
-  zcat queries.fasta.gz | last-train mydb
+  bzcat queries.fasta.bz2 | last-train mydb
 
 Options
 -------
@@ -46,6 +46,10 @@ Training options
          Ignore alignments with > PID% identity.  This aims to
          optimize the parameters for low-similarity alignments,
          similarly to the BLOSUM matrices.
+  --postmask=NUMBER
+         By default, last-train ignores alignments of mostly-lowercase
+         sequence (by using `last-postmask <last-postmask.html>`_).
+         To turn this off, do ``--postmask=0``.
   --sample-number=N
          Use N randomly-chosen chunks of the query sequences.  The
          queries are chopped into fixed-length chunks (as if they were


=====================================
doc/last-tuning.html
=====================================
--- a/doc/last-tuning.html
+++ b/doc/last-tuning.html
@@ -370,7 +370,7 @@ alphabetically earliest.</p>
 <div class="section" id="lastdb8-lastal8">
 <h2>lastdb8 & lastal8</h2>
 <p>If your reference has more than about 4 billion letters, 8-byte LAST
-<em>may</em> be beneficial.  Ordinary (4-byte) LAST cannot directly handle so
+may be beneficial.  Ordinary (4-byte) LAST cannot directly handle so
 much data, so it splits it into volumes, which is inefficient.  8-byte
 LAST can handle such data without voluming, but it uses more memory.</p>
 <p>8-byte LAST combines well with lastdb option -w or -W, which reduce
@@ -401,15 +401,27 @@ high-identity matches.</p>
 <p>This option (gapless alignment culling) can make lastal <strong>faster</strong> but
 <strong>less sensitive</strong>.  It can also <strong>reduce redundant output</strong>.  For
 example, -C2 makes it discard alignments (before gapped extension)
-whose query coordinates lie in those of 2 or more stronger alignments.</p>
+whose query coordinates lie in those of 2 or more stronger alignments.
+This works well for aligning long, repeat-rich, indel-poor sequences
+(e.g. mammal chromosomes) without repeat-masking.</p>
 </div>
-<div class="section" id="lastal-x">
-<h3>lastal -x</h3>
+<div class="section" id="lastal-z">
+<h3>lastal -z</h3>
 <p>This option can make lastal <strong>faster</strong> but <strong>less sensitive</strong>.  It
 sets the maximum score drop in alignments, in the gapped extension
 phase.  Lower values make it faster, by quitting unpromising
-extensions sooner.  The default aims at best accuracy.  For example,
-use -x50% to specify 50% of the default value.</p>
+extensions sooner.  The default aims at best accuracy.</p>
+<p>You can set this option in several ways: perhaps the most intuitive is
+via maximum gap length.  For example, -z10g sets the maximum score
+drop such that the longest possible gap length is 10.</p>
+</div>
+<div class="section" id="lastal-x">
+<h3>lastal -x</h3>
+<p>This option (preliminary gapped extension) can make lastal <strong>faster</strong>
+but <strong>less sensitive</strong>.  For example, -x2g makes it extend gapped
+alignments with a maximum gap length of 2, discard those with score
+below the gapped score threshold, then redo the survivors with the
+final max score drop (z).</p>
 </div>
 <div class="section" id="id2">
 <h3>lastal -M</h3>


=====================================
doc/last-tuning.txt
=====================================
--- a/doc/last-tuning.txt
+++ b/doc/last-tuning.txt
@@ -65,7 +65,7 @@ lastdb8 & lastal8
 ~~~~~~~~~~~~~~~~~
 
 If your reference has more than about 4 billion letters, 8-byte LAST
-*may* be beneficial.  Ordinary (4-byte) LAST cannot directly handle so
+may be beneficial.  Ordinary (4-byte) LAST cannot directly handle so
 much data, so it splits it into volumes, which is inefficient.  8-byte
 LAST can handle such data without voluming, but it uses more memory.
 
@@ -102,15 +102,29 @@ This option (gapless alignment culling) can make lastal **faster** but
 **less sensitive**.  It can also **reduce redundant output**.  For
 example, -C2 makes it discard alignments (before gapped extension)
 whose query coordinates lie in those of 2 or more stronger alignments.
+This works well for aligning long, repeat-rich, indel-poor sequences
+(e.g. mammal chromosomes) without repeat-masking.
 
-lastal -x
+lastal -z
 ---------
 
 This option can make lastal **faster** but **less sensitive**.  It
 sets the maximum score drop in alignments, in the gapped extension
 phase.  Lower values make it faster, by quitting unpromising
-extensions sooner.  The default aims at best accuracy.  For example,
-use -x50% to specify 50% of the default value.
+extensions sooner.  The default aims at best accuracy.
+
+You can set this option in several ways: perhaps the most intuitive is
+via maximum gap length.  For example, -z10g sets the maximum score
+drop such that the longest possible gap length is 10.
+
+lastal -x
+---------
+
+This option (preliminary gapped extension) can make lastal **faster**
+but **less sensitive**.  For example, -x2g makes it extend gapped
+alignments with a maximum gap length of 2, discard those with score
+below the gapped score threshold, then redo the survivors with the
+final max score drop (z).
 
 lastal -M
 ---------


=====================================
doc/lastal.html
=====================================
--- a/doc/lastal.html
+++ b/doc/lastal.html
@@ -520,25 +520,37 @@ looks like this:</p>
 <p class="last">The "-1" indicates the reverse frameshift.</p>
 </td></tr>
 <tr><td class="option-group">
-<kbd><span class="option">-x <var>DROP</var></span></kbd></td>
+<kbd><span class="option">-z <var>DROP</var></span></kbd></td>
 <td><p class="first">Maximum score drop for gapped alignments.  Gapped alignments are
 forbidden from having any internal region with score < -DROP.
-This serves two purposes: accuracy (avoid spurious internal
-regions in alignments) and speed (the smaller the faster).</p>
-<p class="last">You can specify either a score (e.g. -x20), or a percentage; for
-example, -x50% specifies 50% of the default value (rounded down
-to the nearest integer).</p>
+The default value is e-1, which arguably produces the best
+alignments.  Lower values improve speed, by quitting unpromising
+extensions sooner.  You can specify this parameter in 3 ways:</p>
+<ul class="last">
+<li><p class="first">A score (e.g. -z20).</p>
+</li>
+<li><p class="first">A percentage.  For example, -z50% specifies 50% of the default
+value (rounded down to the nearest integer).</p>
+</li>
+<li><p class="first">A maximum gap length.  For example, -z8g sets the maximum
+score drop to: min[a+8b, A+8B].  However, this never increases
+the value above the default.</p>
+</li>
+</ul>
 </td></tr>
 <tr><td class="option-group">
+<kbd><span class="option">-x <var>DROP</var></span></kbd></td>
+<td>This option makes lastal extend gapped alignments twice.  First,
+it extends gapped alignments with a maximum score drop of x, and
+discards those with score < e.  The surviving alignments are
+redone with a (presumably higher) maximum score drop of z.  This
+aims to improve speed with minimal effect on the final
+alignments.  You can specify -x in the same ways as -z (with the
+default value of x being z).</td></tr>
+<tr><td class="option-group">
 <kbd><span class="option">-y <var>DROP</var></span></kbd></td>
 <td>Maximum score drop for gapless alignments.</td></tr>
 <tr><td class="option-group">
-<kbd><span class="option">-z <var>DROP</var></span></kbd></td>
-<td>Maximum score drop for final gapped alignments.  Setting z
-different from x causes lastal to extend gapless alignments
-twice: first with a maximum score drop of x, and then with a
-(presumably higher) maximum score drop of z.</td></tr>
-<tr><td class="option-group">
 <kbd><span class="option">-d <var>SCORE</var></span></kbd></td>
 <td>Minimum score for gapless alignments.</td></tr>
 <tr><td class="option-group">


=====================================
doc/lastal.txt
=====================================
--- a/doc/lastal.txt
+++ b/doc/lastal.txt
@@ -187,25 +187,34 @@ Score options
 
       The "-1" indicates the reverse frameshift.
 
-  -x DROP
+  -z DROP
       Maximum score drop for gapped alignments.  Gapped alignments are
       forbidden from having any internal region with score < -DROP.
-      This serves two purposes: accuracy (avoid spurious internal
-      regions in alignments) and speed (the smaller the faster).
+      The default value is e-1, which arguably produces the best
+      alignments.  Lower values improve speed, by quitting unpromising
+      extensions sooner.  You can specify this parameter in 3 ways:
+
+      * A score (e.g. -z20).
+
+      * A percentage.  For example, -z50% specifies 50% of the default
+        value (rounded down to the nearest integer).
 
-      You can specify either a score (e.g. -x20), or a percentage; for
-      example, -x50% specifies 50% of the default value (rounded down
-      to the nearest integer).
+      * A maximum gap length.  For example, -z8g sets the maximum
+        score drop to: min[a+8b, A+8B].  However, this never increases
+        the value above the default.
+
+  -x DROP
+      This option makes lastal extend gapped alignments twice.  First,
+      it extends gapped alignments with a maximum score drop of x, and
+      discards those with score < e.  The surviving alignments are
+      redone with a (presumably higher) maximum score drop of z.  This
+      aims to improve speed with minimal effect on the final
+      alignments.  You can specify -x in the same ways as -z (with the
+      default value of x being z).
 
   -y DROP
       Maximum score drop for gapless alignments.
 
-  -z DROP
-      Maximum score drop for final gapped alignments.  Setting z
-      different from x causes lastal to extend gapless alignments
-      twice: first with a maximum score drop of x, and then with a
-      (presumably higher) maximum score drop of z.
-
   -d SCORE
       Minimum score for gapless alignments.
 


=====================================
doc/lastdb.html
=====================================
--- a/doc/lastdb.html
+++ b/doc/lastdb.html
@@ -569,8 +569,7 @@ bytes.</p>
 <p>lastdb can become catastrophically slow for highly redundant
 sequences, e.g. two almost-identical genomes.  It usually processes
 several GB per hour, but if it becomes much slower, try option "-i
-10", which is likely to solve the problem.  (If even that is too slow,
-try "-i 100" or so.)</p>
+10", which is likely to solve the problem.</p>
 </div>
 </div>
 </body>


=====================================
doc/lastdb.txt
=====================================
--- a/doc/lastdb.txt
+++ b/doc/lastdb.txt
@@ -234,5 +234,4 @@ Limitations
 lastdb can become catastrophically slow for highly redundant
 sequences, e.g. two almost-identical genomes.  It usually processes
 several GB per hour, but if it becomes much slower, try option "-i
-10", which is likely to solve the problem.  (If even that is too slow,
-try "-i 100" or so.)
+10", which is likely to solve the problem.


=====================================
scripts/last-dotplot
=====================================
--- a/scripts/last-dotplot
+++ b/scripts/last-dotplot
@@ -18,8 +18,15 @@ import subprocess
 import itertools, optparse, os, re, sys
 
 # Try to make PIL/PILLOW work:
-try: from PIL import Image, ImageDraw, ImageFont, ImageColor
-except ImportError: import Image, ImageDraw, ImageFont, ImageColor
+try:
+    from PIL import Image, ImageDraw, ImageFont, ImageColor
+except ImportError:
+    import Image, ImageDraw, ImageFont, ImageColor
+
+try:
+    from future_builtins import zip
+except ImportError:
+    pass
 
 def myOpen(fileName):  # faster than fileinput
     if fileName is None:
@@ -75,7 +82,7 @@ def tabBlocks(beg1, beg2, blocks):
 def mafBlocks(beg1, beg2, seq1, seq2):
     '''Get the gapless blocks of an alignment, from MAF format.'''
     size = 0
-    for x, y in itertools.izip(seq1, seq2):
+    for x, y in zip(seq1, seq2):
         if x == "-":
             if size:
                 yield beg1, beg2, size
@@ -149,8 +156,8 @@ def updateSeqs(coverDict, seqRanges, seqName, ranges, coveredRange):
 
 def readAlignments(fileName, opts):
     '''Get alignments and sequence limits, from MAF or tabular format.'''
-    seqRequests1 = map(seqRequestFromText, opts.seq1)
-    seqRequests2 = map(seqRequestFromText, opts.seq2)
+    seqRequests1 = [seqRequestFromText(i) for i in opts.seq1]
+    seqRequests2 = [seqRequestFromText(i) for i in opts.seq2]
 
     alignments = []
     seqRanges1 = []


=====================================
scripts/last-map-probs
=====================================
--- a/scripts/last-map-probs
+++ b/scripts/last-map-probs
@@ -120,6 +120,6 @@ if __name__ == "__main__":
 
     try: lastMapProbs(opts, args)
     except KeyboardInterrupt: pass  # avoid silly error message
-    except Exception, e:
+    except Exception as e:
         prog = os.path.basename(sys.argv[0])
         sys.exit(prog + ": error: " + str(e))


=====================================
scripts/last-postmask
=====================================
--- a/scripts/last-postmask
+++ b/scripts/last-postmask
@@ -12,8 +12,15 @@
 # Limitations: doesn't (yet) handle sequence quality data,
 # frameshifts, or generalized affine gaps.
 
+from __future__ import print_function
+
 import gzip
-import itertools, optparse, os, signal, sys
+import optparse, os, signal, sys
+
+try:
+    from future_builtins import zip
+except ImportError:
+    pass
 
 def myOpen(fileName):
     if fileName == "-":
@@ -22,9 +29,17 @@ def myOpen(fileName):
         return gzip.open(fileName)
     return open(fileName)
 
-def getScoreMatrix(rowHeads, colHeads, matrix, deleteCost, insertCost):
+def complement(base):
+    x = "ACGTRYKMBDHV"
+    y = "TGCAYRMKVHDB"
+    i = x.find(base)
+    return y[i] if i >= 0 else base
+
+def fastScoreMatrix(rowHeads, colHeads, matrix, deleteCost, insertCost):
+    matrixLen = 128
     defaultScore = min(map(min, matrix))
-    scoreMatrix = [[defaultScore for i in range(128)] for j in range(128)]
+    fastMatrix = [[defaultScore for i in range(matrixLen)]
+                  for j in range(matrixLen)]
     for i, x in enumerate(rowHeads):
         for j, y in enumerate(colHeads):
             xu = ord(x.upper())
@@ -33,70 +48,90 @@ def getScoreMatrix(rowHeads, colHeads, matrix, deleteCost, insertCost):
             yl = ord(y.lower())
             score = matrix[i][j]
             maskScore = min(score, 0)
-            scoreMatrix[xu][yu] = score
-            scoreMatrix[xu][yl] = maskScore
-            scoreMatrix[xl][yu] = maskScore
-            scoreMatrix[xl][yl] = maskScore
-    for i in range(128):
-        scoreMatrix[i][ord("-")] = -deleteCost
-        scoreMatrix[ord("-")][i] = -insertCost
-    return scoreMatrix
-
-def isGoodAlignment(seqs, scoreMatrix, delOpenCost, insOpenCost, minScore):
+            fastMatrix[xu][yu] = score
+            fastMatrix[xu][yl] = maskScore
+            fastMatrix[xl][yu] = maskScore
+            fastMatrix[xl][yl] = maskScore
+    for i in range(matrixLen):
+        fastMatrix[i][ord("-")] = -deleteCost
+        fastMatrix[ord("-")][i] = -insertCost
+    return fastMatrix
+
+def matrixPerStrand(rowHeads, colHeads, matrix, deleteCost, insertCost):
+    rowComps = [complement(i) for i in rowHeads]
+    colComps = [complement(i) for i in colHeads]
+    fwd = fastScoreMatrix(rowHeads, colHeads, matrix, deleteCost, insertCost)
+    rev = fastScoreMatrix(rowComps, colComps, matrix, deleteCost, insertCost)
+    return fwd, rev
+
+def isGoodAlignment(columns, scoreMatrix, delOpenCost, insOpenCost, minScore):
     """Does the alignment have a segment with score >= minScore?"""
-    r, q = seqs
     score = 0
-    xOld = " "
-    yOld = " "
-    for x, y in itertools.izip(r, q):
+    xOld = yOld = " "
+    for x, y in columns:
         score += scoreMatrix[ord(x)][ord(y)]
-        if score >= minScore: return True
-        if x == "-" and xOld != "-": score -= insOpenCost
-        if y == "-" and yOld != "-": score -= delOpenCost
-        if score < 0: score = 0
+        if score >= minScore:
+            return True
+        if x == "-" and xOld != "-":
+            score -= insOpenCost
+        if y == "-" and yOld != "-":
+            score -= delOpenCost
+        if score < 0:
+            score = 0
         xOld = x
         yOld = y
     return False
 
 def printIfGood(maf, seqs, scoreMatrix, delOpenCost, insOpenCost, minScore):
-    if isGoodAlignment(seqs, scoreMatrix, delOpenCost, insOpenCost, minScore):
+    cols = zip(*seqs)
+    if isGoodAlignment(cols, scoreMatrix, delOpenCost, insOpenCost, minScore):
         for line in maf:
-            print line,
-        print
+            print(line, end="")
+        print()
 
 def doOneFile(lines):
+    aDel = bDel = aIns = bIns = minScore = matrices = None
+    strandParam = 0
     scoreMatrix = []
+    rowHeads = []
+    colHeads = []
     maf = []
     seqs = []
 
     for line in lines:
         if line[0] == "#":
-            print line,
-            w = line.split()
-            for i in w:
+            print(line, end="")
+            fields = line.split()
+            for i in fields:
                 if i.startswith("a="): aDel = int(i[2:])
                 if i.startswith("b="): bDel = int(i[2:])
                 if i.startswith("A="): aIns = int(i[2:])
                 if i.startswith("B="): bIns = int(i[2:])
                 if i.startswith("e="): minScore = int(i[2:])
-            if len(w) > 1 and max(map(len, w)) == 1:
-                colHeads = w[1:]
-                rowHeads = []
-                matrix = []
-            elif len(w) > 2 and len(w[1]) == 1:
-                rowHeads.append(w[1])
-                matrix.append(map(int, w[2:]))
+                if i.startswith("S="): strandParam = int(i[2:])
+            if not colHeads and len(fields) > 1 and max(map(len, fields)) == 1:
+                colHeads = fields[1:]
+            elif len(fields) > 2 and len(fields[1]) == 1:
+                rowHeads.append(fields[1])
+                scoreMatrix.append([int(i) for i in fields[2:]])
         elif line.isspace():
-            if seqs: printIfGood(maf, seqs, scoreMatrix, aDel, aIns, minScore)
+            if seqs: printIfGood(maf, seqs, matrix, aDel, aIns, minScore)
             maf = []
             seqs = []
         else:
-            if not scoreMatrix:
-                scoreMatrix = getScoreMatrix(rowHeads, colHeads, matrix,
-                                             bDel, bIns)
             maf.append(line)
-            if line[0] == "s": seqs.append(line.split()[6])
-    if seqs: printIfGood(maf, seqs, scoreMatrix, aDel, aIns, minScore)
+            if line[0] == "s":
+                if not matrices:
+                    if None in (aDel, bDel, aIns, bIns, minScore):
+                        raise Exception("can't read alignment header")
+                    matrices = matrixPerStrand(rowHeads, colHeads,
+                                               scoreMatrix, bDel, bIns)
+                fields = line.split()
+                if len(seqs) == strandParam:
+                    strand = fields[4]
+                    matrix = matrices[strand == "-"]
+                seqs.append(fields[6])
+    if seqs: printIfGood(maf, seqs, matrix, aDel, aIns, minScore)
 
 def lastPostmask(args):
     if not args:
@@ -116,6 +151,6 @@ if __name__ == "__main__":
 
     try: lastPostmask(args)
     except KeyboardInterrupt: pass  # avoid silly error message
-    except Exception, e:
+    except Exception as e:
         prog = os.path.basename(sys.argv[0])
         sys.exit(prog + ": error: " + str(e))


=====================================
scripts/last-train
=====================================
--- a/scripts/last-train
+++ b/scripts/last-train
@@ -1,6 +1,8 @@
 #! /usr/bin/env python
 # Copyright 2015 Martin C. Frith
 
+from __future__ import print_function
+
 import gzip
 import math, optparse, os, random, signal, subprocess, sys, tempfile
 
@@ -157,7 +159,7 @@ def countsFromLastOutput(lines, opts):
         if line[0] == "s":
             strand = line.split()[4]  # slow?
         if line[0] == "c":
-            c = map(float, line.split()[1:])
+            c = [float(i) for i in line.split()[1:]]
             if not matrix:
                 matrixSize = int(math.sqrt(len(c) - 10))
                 matrix = [[1.0] * matrixSize for i in range(matrixSize)]
@@ -236,14 +238,16 @@ def matProbsFromCounts(counts, opts):
     total = sum(map(sum, counts))
     probs = [[j / total for j in i] for i in counts]
 
-    print "# substitution percent identity: %g" % (100 * identities / total)
-    print
-    print "# count matrix (query letters = columns, reference letters = rows):"
+    print("# substitution percent identity: %g" % (100 * identities / total))
+    print()
+    print("# count matrix "
+          "(query letters = columns, reference letters = rows):")
     writeCountMatrix(sys.stdout, counts, "# ")
-    print
-    print "# probability matrix (query letters = columns, reference letters = rows):"
+    print()
+    print("# probability matrix "
+          "(query letters = columns, reference letters = rows):")
     writeProbMatrix(sys.stdout, probs, "# ")
-    print
+    print()
 
     return probs
 
@@ -264,19 +268,19 @@ def gapProbsFromCounts(counts, opts):
         delExtendProb = (deletes - delOpens) / deletes
         insExtendProb = (inserts - insOpens) / inserts
 
-    print "# aligned letter pairs:", matches
-    print "# deletes:", deletes
-    print "# inserts:", inserts
-    print "# delOpens:", delOpens
-    print "# insOpens:", insOpens
-    print "# alignments:", alignments
-    print "# mean delete size: %g" % (deletes / delOpens)
-    print "# mean insert size: %g" % (inserts / insOpens)
-    print "# delOpenProb: %g" % delOpenProb
-    print "# insOpenProb: %g" % insOpenProb
-    print "# delExtendProb: %g" % delExtendProb
-    print "# insExtendProb: %g" % insExtendProb
-    print
+    print("# aligned letter pairs:", matches)
+    print("# deletes:", deletes)
+    print("# inserts:", inserts)
+    print("# delOpens:", delOpens)
+    print("# insOpens:", insOpens)
+    print("# alignments:", alignments)
+    print("# mean delete size: %g" % (deletes / delOpens))
+    print("# mean insert size: %g" % (inserts / insOpens))
+    print("# delOpenProb: %g" % delOpenProb)
+    print("# insOpenProb: %g" % insOpenProb)
+    print("# delExtendProb: %g" % delExtendProb)
+    print("# insExtendProb: %g" % insExtendProb)
+    print()
 
     delCloseProb = 1 - delExtendProb
     insCloseProb = 1 - insExtendProb
@@ -301,8 +305,8 @@ def scoreFromLetterProbs(scale, pairProb, prob1, prob2):
     return scoreFromProb(scale, probRatio)
 
 def matScoresFromProbs(scale, probs):
-    rowProbs = map(sum, probs)
-    colProbs = map(sum, zip(*probs))
+    rowProbs = [sum(i) for i in probs]
+    colProbs = [sum(i) for i in zip(*probs)]
     return [[scoreFromLetterProbs(scale, j, x, y) for j, y in zip(i, colProbs)]
             for i, x in zip(probs, rowProbs)]
 
@@ -328,17 +332,17 @@ def writeGapCosts(gapCosts, out):
 
 def printGapCosts(gapCosts):
     delExistCost, insExistCost, delExtendCost, insExtendCost = gapCosts
-    print "# delExistCost:", delExistCost
-    print "# insExistCost:", insExistCost
-    print "# delExtendCost:", delExtendCost
-    print "# insExtendCost:", insExtendCost
-    print
+    print("# delExistCost:", delExistCost)
+    print("# insExistCost:", insExistCost)
+    print("# delExtendCost:", delExtendCost)
+    print("# insExtendCost:", insExtendCost)
+    print()
 
 def tryToMakeChildProgramsFindable():
-    myDir = os.path.dirname(__file__)
-    srcDir = os.path.join(myDir, os.pardir, "src")
-    # put srcDir first, to avoid getting older versions of LAST:
-    os.environ["PATH"] = srcDir + os.pathsep + os.environ["PATH"]
+    d = os.path.dirname(__file__)
+    e = os.path.join(d, os.pardir, "src")
+    # put them first, to avoid getting older versions of LAST:
+    os.environ["PATH"] = d + os.pathsep + e + os.pathsep + os.environ["PATH"]
 
 def readLastalProgName(lastdbIndexName):
     bitsPerInt = "32"
@@ -377,8 +381,11 @@ def doTraining(opts, args):
     if opts.B: x.append("-B" + opts.B)
     x += args
     y = ["last-split", "-n"]
+    z = ["last-postmask"]
     p = subprocess.Popen(x, stdout=subprocess.PIPE)
     q = subprocess.Popen(y, stdin=p.stdout, stdout=subprocess.PIPE)
+    if opts.postmask:
+        q = subprocess.Popen(z, stdin=q.stdout, stdout=subprocess.PIPE)
     lastalVersion = versionFromHeader(q.stdout)
     externalScale = scaleFromHeader(q.stdout)
     internalScale = externalScale * scaleIncrease
@@ -387,15 +394,15 @@ def doTraining(opts, args):
         internalMatrix = scaledMatrix(externalMatrix, scaleIncrease)
     oldParameters = []
 
-    print "# lastal version:", lastalVersion
-    print "# maximum percent identity:", opts.pid
-    print "# scale of score parameters:", externalScale
-    print "# scale used while training:", internalScale
-    print
+    print("# lastal version:", lastalVersion)
+    print("# maximum percent identity:", opts.pid)
+    print("# scale of score parameters:", externalScale)
+    print("# scale used while training:", internalScale)
+    print()
 
     while True:
-        print "#", " ".join(x)
-        print
+        print("#", " ".join(x))
+        print()
         sys.stdout.flush()
         matCounts, gapCounts = countsFromLastOutput(q.stdout, opts)
         gapProbs = gapProbsFromCounts(gapCounts, opts)
@@ -407,9 +414,10 @@ def doTraining(opts, args):
         else:
             matProbs = matProbsFromCounts(matCounts, opts)
             matScores = matScoresFromProbs(internalScale, matProbs)
-            print "# score matrix (query letters = columns, reference letters = rows):"
+            print("# score matrix "
+                  "(query letters = columns, reference letters = rows):")
             writeScoreMatrix(sys.stdout, matScores, "# ")
-            print
+            print()
             parameters = gapCosts, matScores
             if parameters in oldParameters: break
             oldParameters.append(parameters)
@@ -423,6 +431,8 @@ def doTraining(opts, args):
         p.stdin.close()
         # in python2.6, the next line must come after p.stdin.close()
         q = subprocess.Popen(y, stdin=p.stdout, stdout=subprocess.PIPE)
+        if opts.postmask:
+            q = subprocess.Popen(z, stdin=q.stdout, stdout=subprocess.PIPE)
 
     gapCosts = gapCostsFromProbs(externalScale, gapProbs)
     writeGapCosts(gapCosts, sys.stdout)
@@ -431,7 +441,8 @@ def doTraining(opts, args):
     if not opts.Q:
         matScores = matScoresFromProbs(externalScale, matProbs)
         externalMatrix = matrixWithLetters(matScores)
-    print "# score matrix (query letters = columns, reference letters = rows):"
+    print("# score matrix "
+          "(query letters = columns, reference letters = rows):")
     writeMatrixWithLetters(sys.stdout, externalMatrix, "")
 
 def lastTrain(opts, args):
@@ -464,6 +475,8 @@ if __name__ == "__main__":
                   help="force insertion/deletion symmetry")
     og.add_option("--pid", type="float", default=100, help=
                   "skip alignments with > PID% identity (default: %default)")
+    og.add_option("--postmask", type="int", metavar="NUMBER", default=1, help=
+                  "skip mostly-lowercase alignments (default=%default)")
     og.add_option("--sample-number", type="int", default=500, metavar="N",
                   help="number of random sequence samples (default: %default)")
     og.add_option("--sample-length", type="int", default=2000, metavar="L",
@@ -518,6 +531,6 @@ if __name__ == "__main__":
 
     try: lastTrain(opts, args)
     except KeyboardInterrupt: pass  # avoid silly error message
-    except Exception, e:
+    except Exception as e:
         prog = os.path.basename(sys.argv[0])
         sys.exit(prog + ": error: " + str(e))


=====================================
scripts/maf-convert
=====================================
--- a/scripts/maf-convert
+++ b/scripts/maf-convert
@@ -6,6 +6,8 @@
 # By "MAF" we mean "multiple alignment format" described in the UCSC
 # Genome FAQ, not e.g. "MIRA assembly format".
 
+from __future__ import print_function
+
 from itertools import *
 import gzip
 import math, optparse, os, signal, sys
@@ -126,7 +128,7 @@ def mafInput(opts, lines):
         elif line[0] == "#":
             updateEvalueParameters(opts, line)
             if opts.isKeepComments:
-                print line,
+                print(line, end="")
     if sLines: yield aLine, sLines, qLines, pLines
 
 def isJoinable(opts, oldMaf, newMaf):
@@ -178,10 +180,10 @@ def writeAxt(maf):
     if score:
         outWords.append(score)
 
-    print " ".join(outWords)
+    print(" ".join(outWords))
     for i in sLines:
-        print i[6]
-    print  # print a blank line at the end
+        print(i[6])
+    print()  # print a blank line at the end
 
 def mafConvertToAxt(opts, lines):
     for maf in mafInput(opts, lines):
@@ -212,7 +214,7 @@ def writeTab(maf):
     gapWord = ",".join(matchAndInsertSizes(alignmentColumns, letterSizes))
     outWords.append(gapWord)
 
-    print "\t".join(outWords + endWords)
+    print("\t".join(outWords + endWords))
 
 def mafConvertToTab(opts, lines):
     for maf in mafInput(opts, lines):
@@ -236,7 +238,7 @@ def writeChain(maf):
 
     outWords.append(str(next(axtCounter) + 1))
 
-    print " ".join(outWords)
+    print(" ".join(outWords))
 
     letterSizes = [i[3] for i in sLines]
     rows = [i[6] for i in sLines]
@@ -244,11 +246,11 @@ def writeChain(maf):
     size = "0"
     for i in matchAndInsertSizes(alignmentColumns, letterSizes):
         if ":" in i:
-            print size + "\t" + i.replace(":", "\t")
+            print(size + "\t" + i.replace(":", "\t"))
             size = "0"
         else:
             size = i
-    print size
+    print(size)
 
 def mafConvertToChain(opts, lines):
     for maf in mafInput(opts, lines):
@@ -358,7 +360,7 @@ def writePsl(opts, mafs):
                 seqNameB, seqLenB, begB, endB, seqNameA, seqLenA, begA, endA,
                 blockCount, blockSizes, blockStartsB, blockStartsA)
 
-    print "\t".join(map(str, outWords))
+    print("\t".join(map(str, outWords)))
 
 def mafConvertToPsl(opts, lines):
     if opts.join:
@@ -409,12 +411,12 @@ def copyDictFile(lines):
             break
 
 def writeSamHeader(opts, fileNames):
-    print "@HD\tVN:1.3\tSO:unsorted"
+    print("@HD\tVN:1.3\tSO:unsorted")
 
     if opts.dictionary:
         sequenceLengths = dict(readSequenceLengths(fileNames))
         for k in sorted(sequenceLengths, key=karyotypicSortKey):
-            print "@SQ\tSN:%s\tLN:%s" % (k, sequenceLengths[k])
+            print("@SQ\tSN:%s\tLN:%s" % (k, sequenceLengths[k]))
 
     if opts.dictfile:
         f = myOpen(opts.dictfile)
@@ -422,7 +424,7 @@ def writeSamHeader(opts, fileNames):
         f.close()
 
     if opts.readgroup:
-        print "@RG\t" + "\t".join(opts.readgroup.split())
+        print("@RG\t" + "\t".join(opts.readgroup.split()))
 
 mapqMissing = "255"
 mapqMaximum = "254"
@@ -537,7 +539,7 @@ def writeSam(readGroup, maf):
     if score: out.append(score)
     if evalue: out.append(evalue)
     if readGroup: out.append(readGroup)
-    print "\t".join(out)
+    print("\t".join(out))
 
 def mafConvertToSam(opts, lines):
     readGroup = ""
@@ -598,13 +600,13 @@ def writeBlast(opts, maf, oldQueryName):
     seqNameB, seqLenB, strandB, letterSizeB, begB, endB, rowB = fieldsB
 
     if seqNameB != oldQueryName:
-        print "Query= " + seqNameB
-        print "         (%s letters)" % seqLenB
-        print
+        print("Query= " + seqNameB)
+        print("         (%s letters)" % seqLenB)
+        print()
 
-    print ">" + seqNameA
-    print "          Length = %s" % seqLenA
-    print
+    print(">" + seqNameA)
+    print("          Length = %s" % seqLenA)
+    print()
 
     score, evalue = scoreAndEvalue(aLine)
 
@@ -617,7 +619,7 @@ def writeBlast(opts, maf, oldQueryName):
     if evalue:
         scoreLine += ", Expect = %s" % evalue
 
-    print scoreLine
+    print(scoreLine)
 
     alignmentColumns = zip(rowA, rowB)
 
@@ -629,19 +631,19 @@ def writeBlast(opts, maf, oldQueryName):
     if gaps:
         gapPercent = 100 * gaps // alnSize  # round down, like BLAST
         identLine += ", Gaps = %s/%s (%s%%)" % (gaps, alnSize, gapPercent)
-    print identLine
+    print(identLine)
 
-    print " Strand = %s / %s" % (strandText(strandB), strandText(strandA))
-    print
+    print(" Strand = %s / %s" % (strandText(strandB), strandText(strandA)))
+    print()
 
     for chunk in blastChunker(sLines, opts.linesize, alignmentColumns):
         cols, rows, begs, ends = chunk
         begWidth = maxlen(begs)
         matchSymbols = ''.join(map(pairwiseMatchSymbol, cols))
-        print "Query: %-*s %s %s" % (begWidth, begs[1], rows[1], ends[1])
-        print "       %-*s %s"    % (begWidth, " ", matchSymbols)
-        print "Sbjct: %-*s %s %s" % (begWidth, begs[0], rows[0], ends[0])
-        print
+        print("Query: %-*s %s %s" % (begWidth, begs[1], rows[1], ends[1]))
+        print("       %-*s %s"    % (begWidth, " ", matchSymbols))
+        print("Sbjct: %-*s %s %s" % (begWidth, begs[0], rows[0], ends[0]))
+        print()
 
 def mafConvertToBlast(opts, lines):
     oldQueryName = ""
@@ -682,7 +684,7 @@ def writeBlastTab(opts, maf):
             bitScore = opts.bitScoreA * float(score) - opts.bitScoreB
             out.append("%.3g" % bitScore)
 
-    print "\t".join(out)
+    print("\t".join(out))
 
 def mafConvertToBlastTab(opts, lines):
     for maf in mafInput(opts, lines):
@@ -691,7 +693,7 @@ def mafConvertToBlastTab(opts, lines):
 ##### Routines for converting to HTML format: #####
 
 def writeHtmlHeader():
-    print '''
+    print('''
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
 <html lang="en"><head>
@@ -718,7 +720,7 @@ pre {font-family: "Courier New", monospace, serif; font-size: 0.8125em}
 <pre class="key"><span class="e">  </span> prob > 0.5  </pre>
 <pre class="key"><span class="f">  </span> prob ≤ 0.5  </pre>
 </div>
-'''
+''')
 
 def probabilityClass(probabilityColumn):
     p = ord(min(probabilityColumn)) - 33
@@ -756,7 +758,7 @@ def writeHtml(opts, maf):
         scoreLine += " score=" + score
         if evalue:
             scoreLine += ", expect=" + evalue
-    print "<h3>%s:</h3>" % scoreLine
+    print("<h3>%s:</h3>" % scoreLine)
 
     if pLines:
         probRows = [i.split()[1] for i in pLines]
@@ -770,7 +772,7 @@ def writeHtml(opts, maf):
     rows = [i[6] for i in sLines]
     alignmentColumns = zip(*rows)
 
-    print '<pre>'
+    print('<pre>')
     for chunk in blastChunker(sLines, opts.linesize, alignmentColumns):
         cols, rows, begs, ends = chunk
         begWidth = maxlen(begs)
@@ -782,10 +784,10 @@ def writeHtml(opts, maf):
             spans = [htmlSpan(r, i) for i in classRuns]
             spans = ''.join(spans)
             formatParams = nameWidth, n, begWidth, b, spans, endWidth, e
-            print '%-*s %*s %s %*s' % formatParams
-        print ' ' * nameWidth, ' ' * begWidth, matchSymbols
-        print
-    print '</pre>'
+            print('%-*s %*s %s %*s' % formatParams)
+        print(' ' * nameWidth, ' ' * begWidth, matchSymbols)
+        print()
+    print('</pre>')
 
 def mafConvertToHtml(opts, lines):
     for maf in mafInput(opts, lines):
@@ -841,7 +843,7 @@ def mafConvert(opts, args):
 
     if not opts.noheader:
         if isFormat(formatName, "html"):
-            print "</body></html>"
+            print("</body></html>")
 
 if __name__ == "__main__":
     signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # avoid silly error message
@@ -882,6 +884,6 @@ if __name__ == "__main__":
         op.error("need file (not pipe) with option -d")
 
     try: mafConvert(opts, args)
-    except Exception, e:
+    except Exception as e:
         prog = os.path.basename(sys.argv[0])
         sys.exit(prog + ": error: " + str(e))


=====================================
scripts/maf-join
=====================================
--- a/scripts/maf-join
+++ b/scripts/maf-join
@@ -10,6 +10,8 @@
 # WARNING: Alignment columns with a gap in the top genome are joined
 # arbitrarily!!!
 
+from __future__ import print_function
+
 import sys, os, fileinput, optparse, signal
 
 signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # stop spurious error message
@@ -70,16 +72,16 @@ class MafBlock:
         seq = [''.join(i) for i in self.seq]
         columns = self.chr, beg, size, self.strand, self.chrSize, seq
         widths = map(maxLen, columns)
-        print 'a'
+        print('a')
         for row in zip(*columns):
             widthsAndFields = zip(widths, row)
             field0 = "%-*s" % widthsAndFields[0]  # left-justify
             fields = ["%*s" % i for i in widthsAndFields[1:]]  # right-justify
-            print 's', field0, ' '.join(fields)
+            print('s', field0, ' '.join(fields))
         pad = ' '.join(' ' * i for i in widths[:-1])
         for i in self.prob:
-            print 'p', pad, i
-        print  # blank line afterwards
+            print('p', pad, i)
+        print()  # blank line afterwards
 
 def topSeqBeg(maf): return maf.beg[0]
 def topSeqEnd(maf): return maf.end[0]


=====================================
scripts/maf-swap
=====================================
--- a/scripts/maf-swap
+++ b/scripts/maf-swap
@@ -9,11 +9,13 @@
 
 # Seems to work with Python 2.x, x>=4
 
+from __future__ import print_function
+
 import fileinput, itertools, optparse, os, signal, string, sys
 
 def filterComments(lines):
     for i in lines:
-        if i.startswith("#"): print i,
+        if i.startswith("#"): print(i, end="")
         else: yield i
 
 def mafInput(lines):
@@ -119,8 +121,8 @@ def mafSwap(opts, args):
         mafLines[1:stop] = mafLines[start:stop] + mafLines[1:start]
         if not isCanonicalStrand(mafLines[1]):
             mafLines = flippedMaf(mafLines)
-        for i in mafLines: print i,
-        print  # blank line after each alignment
+        for i in mafLines: print(i, end="")
+        print()  # blank line after each alignment
 
 if __name__ == "__main__":
     signal.signal(signal.SIGPIPE, signal.SIG_DFL)  # avoid silly error message
@@ -135,6 +137,6 @@ if __name__ == "__main__":
 
     try: mafSwap(opts, args)
     except KeyboardInterrupt: pass  # avoid silly error message
-    except Exception, e:
+    except Exception as e:
         prog = os.path.basename(sys.argv[0])
         sys.exit(prog + ": error: " + str(e))


=====================================
src/LastalArguments.cc
=====================================
--- a/src/LastalArguments.cc
+++ b/src/LastalArguments.cc
@@ -41,15 +41,14 @@ static char parseOutputFormat( const char* text ){
   return 0;
 }
 
-static int parseScoreOrPercent(const std::string &s) {
+static int parseScoreAndSuffix(const std::string &s, char &suffix) {
+  char c = 0;
   int x;
-  std::istringstream iss(s);
-  if(!(iss >> x) || x < 0) ERR("bad value: " + s);
-  char c;
-  if (iss >> c) {
-    if (c != '%' || iss >> c) ERR("bad value: " + s);
-    x = -x;  // negative value indicates percentage (kludge)
+  std::istringstream i(s);
+  if (!(i >> x) || x < 0 || (i >> c && i >> c)) {
+    ERR("bad value: " + s);
   }
+  suffix = c;
   return x;
 }
 
@@ -78,9 +77,11 @@ LastalArguments::LastalArguments() :
   gapPairCost(-1),  // this means: OFF
   frameshiftCost(-1),  // this means: ordinary, non-translated alignment
   matrixFile(""),
-  maxDropGapped(-100),  // depends on minScoreGapped & maxDropGapless
+  maxDropGapped(100),  // depends on maxDropFinal
+  maxDropGappedSuffix('%'),
   maxDropGapless(-1),  // depends on the score matrix
-  maxDropFinal(-1),  // depends on maxDropGapped
+  maxDropFinal(100),  // depends on minScoreGapped
+  maxDropFinalSuffix('%'),
   inputFormat(sequenceFormat::fasta),
   minHitDepth(1),
   maxHitDepth(-1),
@@ -128,9 +129,9 @@ Score options (default settings):\n\
 -B: insertion extension cost (b)\n\
 -c: unaligned residue pair cost (off)\n\
 -F: frameshift cost (off)\n\
--x: maximum score drop for gapped alignments (e-1)\n\
+-x: maximum score drop for preliminary gapped alignments (z)\n\
 -y: maximum score drop for gapless alignments (min[t*10, x])\n\
--z: maximum score drop for final gapped alignments (x)\n\
+-z: maximum score drop for final gapped alignments (e-1)\n\
 -d: minimum score for gapless alignments (min[e, t*ln(1000*refSize/n)])\n\
 -e: minimum score for gapped alignments\n\
 \n\
@@ -234,15 +235,14 @@ LAST home page: http://last.cbrc.jp/\n\
       if( frameshiftCost < 0 ) badopt( c, optarg );
       break;
     case 'x':
-      maxDropGapped = parseScoreOrPercent(optarg);
+      maxDropGapped = parseScoreAndSuffix(optarg, maxDropGappedSuffix);
       break;
     case 'y':
       unstringify( maxDropGapless, optarg );
       if( maxDropGapless < 0 ) badopt( c, optarg );
       break;
     case 'z':
-      unstringify( maxDropFinal, optarg );
-      if( maxDropFinal < 0 ) badopt( c, optarg );
+      maxDropFinal = parseScoreAndSuffix(optarg, maxDropFinalSuffix);
       break;
     case 'd':
       unstringify( minScoreGapless, optarg );
@@ -523,6 +523,10 @@ void LastalArguments::setDefaultsFromAlphabet( bool isDna, bool isProtein,
   }
 }
 
+static int percent(int val, int percentage) {
+  return (percentage == 100) ? val : val * percentage / 100;
+}
+
 void LastalArguments::setDefaultsFromMatrix( double lambda, int minScore ){
   if( outputType < 2 && minScoreGapped < 0 ) minScoreGapped = minScoreGapless;
   if( minScoreGapped < 0 ){
@@ -536,11 +540,19 @@ To proceed without E-values, set a score threshold with option -e.");
 
   if( temperature < 0 ) temperature = 1 / lambda;
 
-  if( maxDropGapped < 0 ){
-    int percent = -maxDropGapped;
-    maxDropGapped = std::max( minScoreGapped - 1, 0 );
-    maxDropGapped = std::max( maxDropGapped, maxDropGapless );
-    if (percent != 100) maxDropGapped = maxDropGapped * percent / 100;
+  int defaultMaxScoreDrop = std::max(minScoreGapped - 1, 0);
+  defaultMaxScoreDrop = std::max(defaultMaxScoreDrop, maxDropGapless);
+
+  if (maxDropFinalSuffix == '%') {
+    maxDropFinal = percent(defaultMaxScoreDrop, maxDropFinal);
+  } else if (maxDropFinalSuffix == 'g') {
+    maxDropFinal = std::min(minGapCost(maxDropFinal), defaultMaxScoreDrop);
+  }
+
+  if (maxDropGappedSuffix == '%') {
+    maxDropGapped = percent(maxDropFinal, maxDropGapped);
+  } else if (maxDropGappedSuffix == 'g') {
+    maxDropGapped = std::min(minGapCost(maxDropGapped), maxDropFinal);
   }
 
   if( maxDropGapless < 0 ){  // should it depend on temperature or lambda?
@@ -548,8 +560,6 @@ To proceed without E-values, set a score threshold with option -e.");
     else                  maxDropGapless = int( 10.0 * temperature + 0.5 );
     maxDropGapless = std::min( maxDropGapless, maxDropGapped );
   }
-
-  if( maxDropFinal < 0 ) maxDropFinal = maxDropGapped;
 }
 
 int LastalArguments::calcMinScoreGapless( double numLettersInReference,


=====================================
src/LastalArguments.hh
=====================================
--- a/src/LastalArguments.hh
+++ b/src/LastalArguments.hh
@@ -56,6 +56,11 @@ struct LastalArguments{
   // how many strands are we scanning (1 or 2)?
   int numOfStrands() const{ return (strand == 2) ? 2 : 1; }
 
+  int minGapCost(int gapLength) const {
+    return std::min(gapExistCost + gapLength * gapExtendCost,
+		    insExistCost + gapLength * insExtendCost);
+  }
+
   // options:
   int outputFormat;
   int outputType;
@@ -80,8 +85,10 @@ struct LastalArguments{
   int frameshiftCost;
   std::string matrixFile;
   int maxDropGapped;
+  char maxDropGappedSuffix;
   int maxDropGapless;
   int maxDropFinal;
+  char maxDropFinalSuffix;
   sequenceFormat::Enum inputFormat;
   size_t minHitDepth;
   size_t maxHitDepth;


=====================================
src/version.hh
=====================================
--- a/src/version.hh
+++ b/src/version.hh
@@ -1 +1 @@
-"932"
+"938"



View it on GitLab: https://salsa.debian.org/med-team/last-align/compare/e2061edc0f126034f89e1bf00ed3aad468447d23...54f0b1f06cf4f118ee23f5da391f6776a40cbc36

-- 
View it on GitLab: https://salsa.debian.org/med-team/last-align/compare/e2061edc0f126034f89e1bf00ed3aad468447d23...54f0b1f06cf4f118ee23f5da391f6776a40cbc36
You're receiving this email because of your account on salsa.debian.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20180523/5057cfd6/attachment-0001.html>


More information about the debian-med-commit mailing list