[med-svn] [Git][med-team/last-align][master] 4 commits: New upstream version 938
Andreas Tille
gitlab at salsa.debian.org
Wed May 23 06:14:07 BST 2018
Andreas Tille pushed to branch master at Debian Med / last-align
Commits:
93f448c6 by Andreas Tille at 2018-05-23T07:10:05+02:00
New upstream version 938
- - - - -
9ae39417 by Andreas Tille at 2018-05-23T07:10:06+02:00
Update upstream source from tag 'upstream/938'
Update to upstream version '938'
with Debian dir 1811318611cd61e17698e98a2c6e214d330a8432
- - - - -
2307f6c7 by Andreas Tille at 2018-05-23T07:10:06+02:00
New upstream version
- - - - -
54f0b1f0 by Andreas Tille at 2018-05-23T07:13:07+02:00
Upload to unstable
- - - - -
21 changed files:
- ChangeLog.txt
- debian/changelog
- doc/.htaccess
- doc/last-train.html
- doc/last-train.txt
- doc/last-tuning.html
- doc/last-tuning.txt
- doc/lastal.html
- doc/lastal.txt
- doc/lastdb.html
- doc/lastdb.txt
- scripts/last-dotplot
- scripts/last-map-probs
- scripts/last-postmask
- scripts/last-train
- scripts/maf-convert
- scripts/maf-join
- scripts/maf-swap
- src/LastalArguments.cc
- src/LastalArguments.hh
- src/version.hh
Changes:
=====================================
ChangeLog.txt
=====================================
--- a/ChangeLog.txt
+++ b/ChangeLog.txt
@@ -1,8 +1,42 @@
+2018-05-14 Martin C. Frith <Martin C. Frith>
+
+ * scripts/last-dotplot, scripts/last-postmask, test/102.maf, test
+ /last-postmask-test.out, test/last-postmask-test.sh, test/last-
+ split-test.out, test/last-split-test.sh:
+ postmask: fix bug for unusual alignment headers
+ [65969d4d464d] [tip]
+
+ * scripts/last-dotplot, scripts/last-map-probs, scripts/last-postmask,
+ scripts/last-train, scripts/maf-convert, scripts/maf-swap:
+ Make last-train find LAST programs more robustly
+ [8a4fadbe6080]
+
+2018-05-07 Martin C. Frith <Martin C. Frith>
+
+ * scripts/last-postmask, scripts/last-train, scripts/maf-convert,
+ scripts/maf-join, scripts/maf-swap:
+ Modernize the Python code
+ [b1c09fdd12fe]
+
+ * doc/last-tuning.txt, doc/lastal.txt, doc/lastdb.txt,
+ src/LastalArguments.cc, src/LastalArguments.hh, test/last-test.out,
+ test/last-test.sh:
+ Change lastal -x & -z options
+ [b09d82479c91]
+
+ * doc/last-train.txt, scripts/last-train:
+ Make last-train use last-postmask
+ [2755c8f0dc79]
+
+ * scripts/last-postmask, test/last-postmask-test.out:
+ postmask: bugfix for strand-asymmetric scores
+ [9caca45429d8]
+
2018-04-20 Martin C. Frith <Martin C. Frith>
* doc/Makefile, doc/lastal.txt, doc/maf-cut.txt, scripts/maf-cut:
Add maf-cut utility
- [482845444bcd] [tip]
+ [482845444bcd]
2018-04-18 Martin C. Frith <Martin C. Frith>
=====================================
debian/changelog
=====================================
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+last-align (938-1) unstable; urgency=medium
+
+ * New upstream version
+
+ -- Andreas Tille <tille at debian.org> Wed, 23 May 2018 07:10:11 +0200
+
last-align (932-1) unstable; urgency=medium
* New upstream version
=====================================
doc/.htaccess
=====================================
--- a/doc/.htaccess
+++ b/doc/.htaccess
@@ -5,4 +5,4 @@ IndexIgnore last-map-probs.txt last-matrices.txt last-pair-probs.txt
IndexIgnore last-papers.txt last-parallel.txt last-postmask.txt
IndexIgnore last-repeats.txt last-seeds.txt last-split.txt last-train.txt
IndexIgnore last-tuning.txt last-tutorial.txt last.txt lastal.txt lastdb.txt
-IndexIgnore maf-convert.txt
+IndexIgnore maf-convert.txt maf-cut.txt
=====================================
doc/last-train.html
=====================================
--- a/doc/last-train.html
+++ b/doc/last-train.html
@@ -334,7 +334,7 @@ final score parameters, in a format that can be read by <a class="reference exte
option</a>.</p>
<p>You can also pipe sequences into last-train, for example:</p>
<pre class="literal-block">
-zcat queries.fasta.gz | last-train mydb
+bzcat queries.fasta.bz2 | last-train mydb
</pre>
<div class="section" id="options">
<h2>Options</h2>
@@ -377,6 +377,11 @@ e.g. score(A→G) = score(G→A).</td></tr>
optimize the parameters for low-similarity alignments,
similarly to the BLOSUM matrices.</td></tr>
<tr><td class="option-group">
+<kbd><span class="option">--postmask=<var>NUMBER</var></span></kbd></td>
+<td>By default, last-train ignores alignments of mostly-lowercase
+sequence (by using <a class="reference external" href="last-postmask.html">last-postmask</a>).
+To turn this off, do <tt class="docutils literal"><span class="pre">--postmask=0</span></tt>.</td></tr>
+<tr><td class="option-group">
<kbd><span class="option">--sample-number=<var>N</var></span></kbd></td>
<td>Use N randomly-chosen chunks of the query sequences. The
queries are chopped into fixed-length chunks (as if they were
=====================================
doc/last-train.txt
=====================================
--- a/doc/last-train.txt
+++ b/doc/last-train.txt
@@ -20,7 +20,7 @@ option <lastal.html#score-options>`_.
You can also pipe sequences into last-train, for example::
- zcat queries.fasta.gz | last-train mydb
+ bzcat queries.fasta.bz2 | last-train mydb
Options
-------
@@ -46,6 +46,10 @@ Training options
Ignore alignments with > PID% identity. This aims to
optimize the parameters for low-similarity alignments,
similarly to the BLOSUM matrices.
+ --postmask=NUMBER
+ By default, last-train ignores alignments of mostly-lowercase
+ sequence (by using `last-postmask <last-postmask.html>`_).
+ To turn this off, do ``--postmask=0``.
--sample-number=N
Use N randomly-chosen chunks of the query sequences. The
queries are chopped into fixed-length chunks (as if they were
=====================================
doc/last-tuning.html
=====================================
--- a/doc/last-tuning.html
+++ b/doc/last-tuning.html
@@ -370,7 +370,7 @@ alphabetically earliest.</p>
<div class="section" id="lastdb8-lastal8">
<h2>lastdb8 & lastal8</h2>
<p>If your reference has more than about 4 billion letters, 8-byte LAST
-<em>may</em> be beneficial. Ordinary (4-byte) LAST cannot directly handle so
+may be beneficial. Ordinary (4-byte) LAST cannot directly handle so
much data, so it splits it into volumes, which is inefficient. 8-byte
LAST can handle such data without voluming, but it uses more memory.</p>
<p>8-byte LAST combines well with lastdb option -w or -W, which reduce
@@ -401,15 +401,27 @@ high-identity matches.</p>
<p>This option (gapless alignment culling) can make lastal <strong>faster</strong> but
<strong>less sensitive</strong>. It can also <strong>reduce redundant output</strong>. For
example, -C2 makes it discard alignments (before gapped extension)
-whose query coordinates lie in those of 2 or more stronger alignments.</p>
+whose query coordinates lie in those of 2 or more stronger alignments.
+This works well for aligning long, repeat-rich, indel-poor sequences
+(e.g. mammal chromosomes) without repeat-masking.</p>
</div>
-<div class="section" id="lastal-x">
-<h3>lastal -x</h3>
+<div class="section" id="lastal-z">
+<h3>lastal -z</h3>
<p>This option can make lastal <strong>faster</strong> but <strong>less sensitive</strong>. It
sets the maximum score drop in alignments, in the gapped extension
phase. Lower values make it faster, by quitting unpromising
-extensions sooner. The default aims at best accuracy. For example,
-use -x50% to specify 50% of the default value.</p>
+extensions sooner. The default aims at best accuracy.</p>
+<p>You can set this option in several ways: perhaps the most intuitive is
+via maximum gap length. For example, -z10g sets the maximum score
+drop such that the longest possible gap length is 10.</p>
+</div>
+<div class="section" id="lastal-x">
+<h3>lastal -x</h3>
+<p>This option (preliminary gapped extension) can make lastal <strong>faster</strong>
+but <strong>less sensitive</strong>. For example, -x2g makes it extend gapped
+alignments with a maximum gap length of 2, discard those with score
+below the gapped score threshold, then redo the survivors with the
+final max score drop (z).</p>
</div>
<div class="section" id="id2">
<h3>lastal -M</h3>
=====================================
doc/last-tuning.txt
=====================================
--- a/doc/last-tuning.txt
+++ b/doc/last-tuning.txt
@@ -65,7 +65,7 @@ lastdb8 & lastal8
~~~~~~~~~~~~~~~~~
If your reference has more than about 4 billion letters, 8-byte LAST
-*may* be beneficial. Ordinary (4-byte) LAST cannot directly handle so
+may be beneficial. Ordinary (4-byte) LAST cannot directly handle so
much data, so it splits it into volumes, which is inefficient. 8-byte
LAST can handle such data without voluming, but it uses more memory.
@@ -102,15 +102,29 @@ This option (gapless alignment culling) can make lastal **faster** but
**less sensitive**. It can also **reduce redundant output**. For
example, -C2 makes it discard alignments (before gapped extension)
whose query coordinates lie in those of 2 or more stronger alignments.
+This works well for aligning long, repeat-rich, indel-poor sequences
+(e.g. mammal chromosomes) without repeat-masking.
-lastal -x
+lastal -z
---------
This option can make lastal **faster** but **less sensitive**. It
sets the maximum score drop in alignments, in the gapped extension
phase. Lower values make it faster, by quitting unpromising
-extensions sooner. The default aims at best accuracy. For example,
-use -x50% to specify 50% of the default value.
+extensions sooner. The default aims at best accuracy.
+
+You can set this option in several ways: perhaps the most intuitive is
+via maximum gap length. For example, -z10g sets the maximum score
+drop such that the longest possible gap length is 10.
+
+lastal -x
+---------
+
+This option (preliminary gapped extension) can make lastal **faster**
+but **less sensitive**. For example, -x2g makes it extend gapped
+alignments with a maximum gap length of 2, discard those with score
+below the gapped score threshold, then redo the survivors with the
+final max score drop (z).
lastal -M
---------
=====================================
doc/lastal.html
=====================================
--- a/doc/lastal.html
+++ b/doc/lastal.html
@@ -520,25 +520,37 @@ looks like this:</p>
<p class="last">The "-1" indicates the reverse frameshift.</p>
</td></tr>
<tr><td class="option-group">
-<kbd><span class="option">-x <var>DROP</var></span></kbd></td>
+<kbd><span class="option">-z <var>DROP</var></span></kbd></td>
<td><p class="first">Maximum score drop for gapped alignments. Gapped alignments are
forbidden from having any internal region with score < -DROP.
-This serves two purposes: accuracy (avoid spurious internal
-regions in alignments) and speed (the smaller the faster).</p>
-<p class="last">You can specify either a score (e.g. -x20), or a percentage; for
-example, -x50% specifies 50% of the default value (rounded down
-to the nearest integer).</p>
+The default value is e-1, which arguably produces the best
+alignments. Lower values improve speed, by quitting unpromising
+extensions sooner. You can specify this parameter in 3 ways:</p>
+<ul class="last">
+<li><p class="first">A score (e.g. -z20).</p>
+</li>
+<li><p class="first">A percentage. For example, -z50% specifies 50% of the default
+value (rounded down to the nearest integer).</p>
+</li>
+<li><p class="first">A maximum gap length. For example, -z8g sets the maximum
+score drop to: min[a+8b, A+8B]. However, this never increases
+the value above the default.</p>
+</li>
+</ul>
</td></tr>
<tr><td class="option-group">
+<kbd><span class="option">-x <var>DROP</var></span></kbd></td>
+<td>This option makes lastal extend gapped alignments twice. First,
+it extends gapped alignments with a maximum score drop of x, and
+discards those with score < e. The surviving alignments are
+redone with a (presumably higher) maximum score drop of z. This
+aims to improve speed with minimal effect on the final
+alignments. You can specify -x in the same ways as -z (with the
+default value of x being z).</td></tr>
+<tr><td class="option-group">
<kbd><span class="option">-y <var>DROP</var></span></kbd></td>
<td>Maximum score drop for gapless alignments.</td></tr>
<tr><td class="option-group">
-<kbd><span class="option">-z <var>DROP</var></span></kbd></td>
-<td>Maximum score drop for final gapped alignments. Setting z
-different from x causes lastal to extend gapless alignments
-twice: first with a maximum score drop of x, and then with a
-(presumably higher) maximum score drop of z.</td></tr>
-<tr><td class="option-group">
<kbd><span class="option">-d <var>SCORE</var></span></kbd></td>
<td>Minimum score for gapless alignments.</td></tr>
<tr><td class="option-group">
=====================================
doc/lastal.txt
=====================================
--- a/doc/lastal.txt
+++ b/doc/lastal.txt
@@ -187,25 +187,34 @@ Score options
The "-1" indicates the reverse frameshift.
- -x DROP
+ -z DROP
Maximum score drop for gapped alignments. Gapped alignments are
forbidden from having any internal region with score < -DROP.
- This serves two purposes: accuracy (avoid spurious internal
- regions in alignments) and speed (the smaller the faster).
+ The default value is e-1, which arguably produces the best
+ alignments. Lower values improve speed, by quitting unpromising
+ extensions sooner. You can specify this parameter in 3 ways:
+
+ * A score (e.g. -z20).
+
+ * A percentage. For example, -z50% specifies 50% of the default
+ value (rounded down to the nearest integer).
- You can specify either a score (e.g. -x20), or a percentage; for
- example, -x50% specifies 50% of the default value (rounded down
- to the nearest integer).
+ * A maximum gap length. For example, -z8g sets the maximum
+ score drop to: min[a+8b, A+8B]. However, this never increases
+ the value above the default.
+
+ -x DROP
+ This option makes lastal extend gapped alignments twice. First,
+ it extends gapped alignments with a maximum score drop of x, and
+ discards those with score < e. The surviving alignments are
+ redone with a (presumably higher) maximum score drop of z. This
+ aims to improve speed with minimal effect on the final
+ alignments. You can specify -x in the same ways as -z (with the
+ default value of x being z).
-y DROP
Maximum score drop for gapless alignments.
- -z DROP
- Maximum score drop for final gapped alignments. Setting z
- different from x causes lastal to extend gapless alignments
- twice: first with a maximum score drop of x, and then with a
- (presumably higher) maximum score drop of z.
-
-d SCORE
Minimum score for gapless alignments.
=====================================
doc/lastdb.html
=====================================
--- a/doc/lastdb.html
+++ b/doc/lastdb.html
@@ -569,8 +569,7 @@ bytes.</p>
<p>lastdb can become catastrophically slow for highly redundant
sequences, e.g. two almost-identical genomes. It usually processes
several GB per hour, but if it becomes much slower, try option "-i
-10", which is likely to solve the problem. (If even that is too slow,
-try "-i 100" or so.)</p>
+10", which is likely to solve the problem.</p>
</div>
</div>
</body>
=====================================
doc/lastdb.txt
=====================================
--- a/doc/lastdb.txt
+++ b/doc/lastdb.txt
@@ -234,5 +234,4 @@ Limitations
lastdb can become catastrophically slow for highly redundant
sequences, e.g. two almost-identical genomes. It usually processes
several GB per hour, but if it becomes much slower, try option "-i
-10", which is likely to solve the problem. (If even that is too slow,
-try "-i 100" or so.)
+10", which is likely to solve the problem.
=====================================
scripts/last-dotplot
=====================================
--- a/scripts/last-dotplot
+++ b/scripts/last-dotplot
@@ -18,8 +18,15 @@ import subprocess
import itertools, optparse, os, re, sys
# Try to make PIL/PILLOW work:
-try: from PIL import Image, ImageDraw, ImageFont, ImageColor
-except ImportError: import Image, ImageDraw, ImageFont, ImageColor
+try:
+ from PIL import Image, ImageDraw, ImageFont, ImageColor
+except ImportError:
+ import Image, ImageDraw, ImageFont, ImageColor
+
+try:
+ from future_builtins import zip
+except ImportError:
+ pass
def myOpen(fileName): # faster than fileinput
if fileName is None:
@@ -75,7 +82,7 @@ def tabBlocks(beg1, beg2, blocks):
def mafBlocks(beg1, beg2, seq1, seq2):
'''Get the gapless blocks of an alignment, from MAF format.'''
size = 0
- for x, y in itertools.izip(seq1, seq2):
+ for x, y in zip(seq1, seq2):
if x == "-":
if size:
yield beg1, beg2, size
@@ -149,8 +156,8 @@ def updateSeqs(coverDict, seqRanges, seqName, ranges, coveredRange):
def readAlignments(fileName, opts):
'''Get alignments and sequence limits, from MAF or tabular format.'''
- seqRequests1 = map(seqRequestFromText, opts.seq1)
- seqRequests2 = map(seqRequestFromText, opts.seq2)
+ seqRequests1 = [seqRequestFromText(i) for i in opts.seq1]
+ seqRequests2 = [seqRequestFromText(i) for i in opts.seq2]
alignments = []
seqRanges1 = []
=====================================
scripts/last-map-probs
=====================================
--- a/scripts/last-map-probs
+++ b/scripts/last-map-probs
@@ -120,6 +120,6 @@ if __name__ == "__main__":
try: lastMapProbs(opts, args)
except KeyboardInterrupt: pass # avoid silly error message
- except Exception, e:
+ except Exception as e:
prog = os.path.basename(sys.argv[0])
sys.exit(prog + ": error: " + str(e))
=====================================
scripts/last-postmask
=====================================
--- a/scripts/last-postmask
+++ b/scripts/last-postmask
@@ -12,8 +12,15 @@
# Limitations: doesn't (yet) handle sequence quality data,
# frameshifts, or generalized affine gaps.
+from __future__ import print_function
+
import gzip
-import itertools, optparse, os, signal, sys
+import optparse, os, signal, sys
+
+try:
+ from future_builtins import zip
+except ImportError:
+ pass
def myOpen(fileName):
if fileName == "-":
@@ -22,9 +29,17 @@ def myOpen(fileName):
return gzip.open(fileName)
return open(fileName)
-def getScoreMatrix(rowHeads, colHeads, matrix, deleteCost, insertCost):
+def complement(base):
+ x = "ACGTRYKMBDHV"
+ y = "TGCAYRMKVHDB"
+ i = x.find(base)
+ return y[i] if i >= 0 else base
+
+def fastScoreMatrix(rowHeads, colHeads, matrix, deleteCost, insertCost):
+ matrixLen = 128
defaultScore = min(map(min, matrix))
- scoreMatrix = [[defaultScore for i in range(128)] for j in range(128)]
+ fastMatrix = [[defaultScore for i in range(matrixLen)]
+ for j in range(matrixLen)]
for i, x in enumerate(rowHeads):
for j, y in enumerate(colHeads):
xu = ord(x.upper())
@@ -33,70 +48,90 @@ def getScoreMatrix(rowHeads, colHeads, matrix, deleteCost, insertCost):
yl = ord(y.lower())
score = matrix[i][j]
maskScore = min(score, 0)
- scoreMatrix[xu][yu] = score
- scoreMatrix[xu][yl] = maskScore
- scoreMatrix[xl][yu] = maskScore
- scoreMatrix[xl][yl] = maskScore
- for i in range(128):
- scoreMatrix[i][ord("-")] = -deleteCost
- scoreMatrix[ord("-")][i] = -insertCost
- return scoreMatrix
-
-def isGoodAlignment(seqs, scoreMatrix, delOpenCost, insOpenCost, minScore):
+ fastMatrix[xu][yu] = score
+ fastMatrix[xu][yl] = maskScore
+ fastMatrix[xl][yu] = maskScore
+ fastMatrix[xl][yl] = maskScore
+ for i in range(matrixLen):
+ fastMatrix[i][ord("-")] = -deleteCost
+ fastMatrix[ord("-")][i] = -insertCost
+ return fastMatrix
+
+def matrixPerStrand(rowHeads, colHeads, matrix, deleteCost, insertCost):
+ rowComps = [complement(i) for i in rowHeads]
+ colComps = [complement(i) for i in colHeads]
+ fwd = fastScoreMatrix(rowHeads, colHeads, matrix, deleteCost, insertCost)
+ rev = fastScoreMatrix(rowComps, colComps, matrix, deleteCost, insertCost)
+ return fwd, rev
+
+def isGoodAlignment(columns, scoreMatrix, delOpenCost, insOpenCost, minScore):
"""Does the alignment have a segment with score >= minScore?"""
- r, q = seqs
score = 0
- xOld = " "
- yOld = " "
- for x, y in itertools.izip(r, q):
+ xOld = yOld = " "
+ for x, y in columns:
score += scoreMatrix[ord(x)][ord(y)]
- if score >= minScore: return True
- if x == "-" and xOld != "-": score -= insOpenCost
- if y == "-" and yOld != "-": score -= delOpenCost
- if score < 0: score = 0
+ if score >= minScore:
+ return True
+ if x == "-" and xOld != "-":
+ score -= insOpenCost
+ if y == "-" and yOld != "-":
+ score -= delOpenCost
+ if score < 0:
+ score = 0
xOld = x
yOld = y
return False
def printIfGood(maf, seqs, scoreMatrix, delOpenCost, insOpenCost, minScore):
- if isGoodAlignment(seqs, scoreMatrix, delOpenCost, insOpenCost, minScore):
+ cols = zip(*seqs)
+ if isGoodAlignment(cols, scoreMatrix, delOpenCost, insOpenCost, minScore):
for line in maf:
- print line,
- print
+ print(line, end="")
+ print()
def doOneFile(lines):
+ aDel = bDel = aIns = bIns = minScore = matrices = None
+ strandParam = 0
scoreMatrix = []
+ rowHeads = []
+ colHeads = []
maf = []
seqs = []
for line in lines:
if line[0] == "#":
- print line,
- w = line.split()
- for i in w:
+ print(line, end="")
+ fields = line.split()
+ for i in fields:
if i.startswith("a="): aDel = int(i[2:])
if i.startswith("b="): bDel = int(i[2:])
if i.startswith("A="): aIns = int(i[2:])
if i.startswith("B="): bIns = int(i[2:])
if i.startswith("e="): minScore = int(i[2:])
- if len(w) > 1 and max(map(len, w)) == 1:
- colHeads = w[1:]
- rowHeads = []
- matrix = []
- elif len(w) > 2 and len(w[1]) == 1:
- rowHeads.append(w[1])
- matrix.append(map(int, w[2:]))
+ if i.startswith("S="): strandParam = int(i[2:])
+ if not colHeads and len(fields) > 1 and max(map(len, fields)) == 1:
+ colHeads = fields[1:]
+ elif len(fields) > 2 and len(fields[1]) == 1:
+ rowHeads.append(fields[1])
+ scoreMatrix.append([int(i) for i in fields[2:]])
elif line.isspace():
- if seqs: printIfGood(maf, seqs, scoreMatrix, aDel, aIns, minScore)
+ if seqs: printIfGood(maf, seqs, matrix, aDel, aIns, minScore)
maf = []
seqs = []
else:
- if not scoreMatrix:
- scoreMatrix = getScoreMatrix(rowHeads, colHeads, matrix,
- bDel, bIns)
maf.append(line)
- if line[0] == "s": seqs.append(line.split()[6])
- if seqs: printIfGood(maf, seqs, scoreMatrix, aDel, aIns, minScore)
+ if line[0] == "s":
+ if not matrices:
+ if None in (aDel, bDel, aIns, bIns, minScore):
+ raise Exception("can't read alignment header")
+ matrices = matrixPerStrand(rowHeads, colHeads,
+ scoreMatrix, bDel, bIns)
+ fields = line.split()
+ if len(seqs) == strandParam:
+ strand = fields[4]
+ matrix = matrices[strand == "-"]
+ seqs.append(fields[6])
+ if seqs: printIfGood(maf, seqs, matrix, aDel, aIns, minScore)
def lastPostmask(args):
if not args:
@@ -116,6 +151,6 @@ if __name__ == "__main__":
try: lastPostmask(args)
except KeyboardInterrupt: pass # avoid silly error message
- except Exception, e:
+ except Exception as e:
prog = os.path.basename(sys.argv[0])
sys.exit(prog + ": error: " + str(e))
=====================================
scripts/last-train
=====================================
--- a/scripts/last-train
+++ b/scripts/last-train
@@ -1,6 +1,8 @@
#! /usr/bin/env python
# Copyright 2015 Martin C. Frith
+from __future__ import print_function
+
import gzip
import math, optparse, os, random, signal, subprocess, sys, tempfile
@@ -157,7 +159,7 @@ def countsFromLastOutput(lines, opts):
if line[0] == "s":
strand = line.split()[4] # slow?
if line[0] == "c":
- c = map(float, line.split()[1:])
+ c = [float(i) for i in line.split()[1:]]
if not matrix:
matrixSize = int(math.sqrt(len(c) - 10))
matrix = [[1.0] * matrixSize for i in range(matrixSize)]
@@ -236,14 +238,16 @@ def matProbsFromCounts(counts, opts):
total = sum(map(sum, counts))
probs = [[j / total for j in i] for i in counts]
- print "# substitution percent identity: %g" % (100 * identities / total)
- print
- print "# count matrix (query letters = columns, reference letters = rows):"
+ print("# substitution percent identity: %g" % (100 * identities / total))
+ print()
+ print("# count matrix "
+ "(query letters = columns, reference letters = rows):")
writeCountMatrix(sys.stdout, counts, "# ")
- print
- print "# probability matrix (query letters = columns, reference letters = rows):"
+ print()
+ print("# probability matrix "
+ "(query letters = columns, reference letters = rows):")
writeProbMatrix(sys.stdout, probs, "# ")
- print
+ print()
return probs
@@ -264,19 +268,19 @@ def gapProbsFromCounts(counts, opts):
delExtendProb = (deletes - delOpens) / deletes
insExtendProb = (inserts - insOpens) / inserts
- print "# aligned letter pairs:", matches
- print "# deletes:", deletes
- print "# inserts:", inserts
- print "# delOpens:", delOpens
- print "# insOpens:", insOpens
- print "# alignments:", alignments
- print "# mean delete size: %g" % (deletes / delOpens)
- print "# mean insert size: %g" % (inserts / insOpens)
- print "# delOpenProb: %g" % delOpenProb
- print "# insOpenProb: %g" % insOpenProb
- print "# delExtendProb: %g" % delExtendProb
- print "# insExtendProb: %g" % insExtendProb
- print
+ print("# aligned letter pairs:", matches)
+ print("# deletes:", deletes)
+ print("# inserts:", inserts)
+ print("# delOpens:", delOpens)
+ print("# insOpens:", insOpens)
+ print("# alignments:", alignments)
+ print("# mean delete size: %g" % (deletes / delOpens))
+ print("# mean insert size: %g" % (inserts / insOpens))
+ print("# delOpenProb: %g" % delOpenProb)
+ print("# insOpenProb: %g" % insOpenProb)
+ print("# delExtendProb: %g" % delExtendProb)
+ print("# insExtendProb: %g" % insExtendProb)
+ print()
delCloseProb = 1 - delExtendProb
insCloseProb = 1 - insExtendProb
@@ -301,8 +305,8 @@ def scoreFromLetterProbs(scale, pairProb, prob1, prob2):
return scoreFromProb(scale, probRatio)
def matScoresFromProbs(scale, probs):
- rowProbs = map(sum, probs)
- colProbs = map(sum, zip(*probs))
+ rowProbs = [sum(i) for i in probs]
+ colProbs = [sum(i) for i in zip(*probs)]
return [[scoreFromLetterProbs(scale, j, x, y) for j, y in zip(i, colProbs)]
for i, x in zip(probs, rowProbs)]
@@ -328,17 +332,17 @@ def writeGapCosts(gapCosts, out):
def printGapCosts(gapCosts):
delExistCost, insExistCost, delExtendCost, insExtendCost = gapCosts
- print "# delExistCost:", delExistCost
- print "# insExistCost:", insExistCost
- print "# delExtendCost:", delExtendCost
- print "# insExtendCost:", insExtendCost
- print
+ print("# delExistCost:", delExistCost)
+ print("# insExistCost:", insExistCost)
+ print("# delExtendCost:", delExtendCost)
+ print("# insExtendCost:", insExtendCost)
+ print()
def tryToMakeChildProgramsFindable():
- myDir = os.path.dirname(__file__)
- srcDir = os.path.join(myDir, os.pardir, "src")
- # put srcDir first, to avoid getting older versions of LAST:
- os.environ["PATH"] = srcDir + os.pathsep + os.environ["PATH"]
+ d = os.path.dirname(__file__)
+ e = os.path.join(d, os.pardir, "src")
+ # put them first, to avoid getting older versions of LAST:
+ os.environ["PATH"] = d + os.pathsep + e + os.pathsep + os.environ["PATH"]
def readLastalProgName(lastdbIndexName):
bitsPerInt = "32"
@@ -377,8 +381,11 @@ def doTraining(opts, args):
if opts.B: x.append("-B" + opts.B)
x += args
y = ["last-split", "-n"]
+ z = ["last-postmask"]
p = subprocess.Popen(x, stdout=subprocess.PIPE)
q = subprocess.Popen(y, stdin=p.stdout, stdout=subprocess.PIPE)
+ if opts.postmask:
+ q = subprocess.Popen(z, stdin=q.stdout, stdout=subprocess.PIPE)
lastalVersion = versionFromHeader(q.stdout)
externalScale = scaleFromHeader(q.stdout)
internalScale = externalScale * scaleIncrease
@@ -387,15 +394,15 @@ def doTraining(opts, args):
internalMatrix = scaledMatrix(externalMatrix, scaleIncrease)
oldParameters = []
- print "# lastal version:", lastalVersion
- print "# maximum percent identity:", opts.pid
- print "# scale of score parameters:", externalScale
- print "# scale used while training:", internalScale
- print
+ print("# lastal version:", lastalVersion)
+ print("# maximum percent identity:", opts.pid)
+ print("# scale of score parameters:", externalScale)
+ print("# scale used while training:", internalScale)
+ print()
while True:
- print "#", " ".join(x)
- print
+ print("#", " ".join(x))
+ print()
sys.stdout.flush()
matCounts, gapCounts = countsFromLastOutput(q.stdout, opts)
gapProbs = gapProbsFromCounts(gapCounts, opts)
@@ -407,9 +414,10 @@ def doTraining(opts, args):
else:
matProbs = matProbsFromCounts(matCounts, opts)
matScores = matScoresFromProbs(internalScale, matProbs)
- print "# score matrix (query letters = columns, reference letters = rows):"
+ print("# score matrix "
+ "(query letters = columns, reference letters = rows):")
writeScoreMatrix(sys.stdout, matScores, "# ")
- print
+ print()
parameters = gapCosts, matScores
if parameters in oldParameters: break
oldParameters.append(parameters)
@@ -423,6 +431,8 @@ def doTraining(opts, args):
p.stdin.close()
# in python2.6, the next line must come after p.stdin.close()
q = subprocess.Popen(y, stdin=p.stdout, stdout=subprocess.PIPE)
+ if opts.postmask:
+ q = subprocess.Popen(z, stdin=q.stdout, stdout=subprocess.PIPE)
gapCosts = gapCostsFromProbs(externalScale, gapProbs)
writeGapCosts(gapCosts, sys.stdout)
@@ -431,7 +441,8 @@ def doTraining(opts, args):
if not opts.Q:
matScores = matScoresFromProbs(externalScale, matProbs)
externalMatrix = matrixWithLetters(matScores)
- print "# score matrix (query letters = columns, reference letters = rows):"
+ print("# score matrix "
+ "(query letters = columns, reference letters = rows):")
writeMatrixWithLetters(sys.stdout, externalMatrix, "")
def lastTrain(opts, args):
@@ -464,6 +475,8 @@ if __name__ == "__main__":
help="force insertion/deletion symmetry")
og.add_option("--pid", type="float", default=100, help=
"skip alignments with > PID% identity (default: %default)")
+ og.add_option("--postmask", type="int", metavar="NUMBER", default=1, help=
+ "skip mostly-lowercase alignments (default=%default)")
og.add_option("--sample-number", type="int", default=500, metavar="N",
help="number of random sequence samples (default: %default)")
og.add_option("--sample-length", type="int", default=2000, metavar="L",
@@ -518,6 +531,6 @@ if __name__ == "__main__":
try: lastTrain(opts, args)
except KeyboardInterrupt: pass # avoid silly error message
- except Exception, e:
+ except Exception as e:
prog = os.path.basename(sys.argv[0])
sys.exit(prog + ": error: " + str(e))
=====================================
scripts/maf-convert
=====================================
--- a/scripts/maf-convert
+++ b/scripts/maf-convert
@@ -6,6 +6,8 @@
# By "MAF" we mean "multiple alignment format" described in the UCSC
# Genome FAQ, not e.g. "MIRA assembly format".
+from __future__ import print_function
+
from itertools import *
import gzip
import math, optparse, os, signal, sys
@@ -126,7 +128,7 @@ def mafInput(opts, lines):
elif line[0] == "#":
updateEvalueParameters(opts, line)
if opts.isKeepComments:
- print line,
+ print(line, end="")
if sLines: yield aLine, sLines, qLines, pLines
def isJoinable(opts, oldMaf, newMaf):
@@ -178,10 +180,10 @@ def writeAxt(maf):
if score:
outWords.append(score)
- print " ".join(outWords)
+ print(" ".join(outWords))
for i in sLines:
- print i[6]
- print # print a blank line at the end
+ print(i[6])
+ print() # print a blank line at the end
def mafConvertToAxt(opts, lines):
for maf in mafInput(opts, lines):
@@ -212,7 +214,7 @@ def writeTab(maf):
gapWord = ",".join(matchAndInsertSizes(alignmentColumns, letterSizes))
outWords.append(gapWord)
- print "\t".join(outWords + endWords)
+ print("\t".join(outWords + endWords))
def mafConvertToTab(opts, lines):
for maf in mafInput(opts, lines):
@@ -236,7 +238,7 @@ def writeChain(maf):
outWords.append(str(next(axtCounter) + 1))
- print " ".join(outWords)
+ print(" ".join(outWords))
letterSizes = [i[3] for i in sLines]
rows = [i[6] for i in sLines]
@@ -244,11 +246,11 @@ def writeChain(maf):
size = "0"
for i in matchAndInsertSizes(alignmentColumns, letterSizes):
if ":" in i:
- print size + "\t" + i.replace(":", "\t")
+ print(size + "\t" + i.replace(":", "\t"))
size = "0"
else:
size = i
- print size
+ print(size)
def mafConvertToChain(opts, lines):
for maf in mafInput(opts, lines):
@@ -358,7 +360,7 @@ def writePsl(opts, mafs):
seqNameB, seqLenB, begB, endB, seqNameA, seqLenA, begA, endA,
blockCount, blockSizes, blockStartsB, blockStartsA)
- print "\t".join(map(str, outWords))
+ print("\t".join(map(str, outWords)))
def mafConvertToPsl(opts, lines):
if opts.join:
@@ -409,12 +411,12 @@ def copyDictFile(lines):
break
def writeSamHeader(opts, fileNames):
- print "@HD\tVN:1.3\tSO:unsorted"
+ print("@HD\tVN:1.3\tSO:unsorted")
if opts.dictionary:
sequenceLengths = dict(readSequenceLengths(fileNames))
for k in sorted(sequenceLengths, key=karyotypicSortKey):
- print "@SQ\tSN:%s\tLN:%s" % (k, sequenceLengths[k])
+ print("@SQ\tSN:%s\tLN:%s" % (k, sequenceLengths[k]))
if opts.dictfile:
f = myOpen(opts.dictfile)
@@ -422,7 +424,7 @@ def writeSamHeader(opts, fileNames):
f.close()
if opts.readgroup:
- print "@RG\t" + "\t".join(opts.readgroup.split())
+ print("@RG\t" + "\t".join(opts.readgroup.split()))
mapqMissing = "255"
mapqMaximum = "254"
@@ -537,7 +539,7 @@ def writeSam(readGroup, maf):
if score: out.append(score)
if evalue: out.append(evalue)
if readGroup: out.append(readGroup)
- print "\t".join(out)
+ print("\t".join(out))
def mafConvertToSam(opts, lines):
readGroup = ""
@@ -598,13 +600,13 @@ def writeBlast(opts, maf, oldQueryName):
seqNameB, seqLenB, strandB, letterSizeB, begB, endB, rowB = fieldsB
if seqNameB != oldQueryName:
- print "Query= " + seqNameB
- print " (%s letters)" % seqLenB
- print
+ print("Query= " + seqNameB)
+ print(" (%s letters)" % seqLenB)
+ print()
- print ">" + seqNameA
- print " Length = %s" % seqLenA
- print
+ print(">" + seqNameA)
+ print(" Length = %s" % seqLenA)
+ print()
score, evalue = scoreAndEvalue(aLine)
@@ -617,7 +619,7 @@ def writeBlast(opts, maf, oldQueryName):
if evalue:
scoreLine += ", Expect = %s" % evalue
- print scoreLine
+ print(scoreLine)
alignmentColumns = zip(rowA, rowB)
@@ -629,19 +631,19 @@ def writeBlast(opts, maf, oldQueryName):
if gaps:
gapPercent = 100 * gaps // alnSize # round down, like BLAST
identLine += ", Gaps = %s/%s (%s%%)" % (gaps, alnSize, gapPercent)
- print identLine
+ print(identLine)
- print " Strand = %s / %s" % (strandText(strandB), strandText(strandA))
- print
+ print(" Strand = %s / %s" % (strandText(strandB), strandText(strandA)))
+ print()
for chunk in blastChunker(sLines, opts.linesize, alignmentColumns):
cols, rows, begs, ends = chunk
begWidth = maxlen(begs)
matchSymbols = ''.join(map(pairwiseMatchSymbol, cols))
- print "Query: %-*s %s %s" % (begWidth, begs[1], rows[1], ends[1])
- print " %-*s %s" % (begWidth, " ", matchSymbols)
- print "Sbjct: %-*s %s %s" % (begWidth, begs[0], rows[0], ends[0])
- print
+ print("Query: %-*s %s %s" % (begWidth, begs[1], rows[1], ends[1]))
+ print(" %-*s %s" % (begWidth, " ", matchSymbols))
+ print("Sbjct: %-*s %s %s" % (begWidth, begs[0], rows[0], ends[0]))
+ print()
def mafConvertToBlast(opts, lines):
oldQueryName = ""
@@ -682,7 +684,7 @@ def writeBlastTab(opts, maf):
bitScore = opts.bitScoreA * float(score) - opts.bitScoreB
out.append("%.3g" % bitScore)
- print "\t".join(out)
+ print("\t".join(out))
def mafConvertToBlastTab(opts, lines):
for maf in mafInput(opts, lines):
@@ -691,7 +693,7 @@ def mafConvertToBlastTab(opts, lines):
##### Routines for converting to HTML format: #####
def writeHtmlHeader():
- print '''
+ print('''
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="en"><head>
@@ -718,7 +720,7 @@ pre {font-family: "Courier New", monospace, serif; font-size: 0.8125em}
<pre class="key"><span class="e"> </span> prob > 0.5 </pre>
<pre class="key"><span class="f"> </span> prob ≤ 0.5 </pre>
</div>
-'''
+''')
def probabilityClass(probabilityColumn):
p = ord(min(probabilityColumn)) - 33
@@ -756,7 +758,7 @@ def writeHtml(opts, maf):
scoreLine += " score=" + score
if evalue:
scoreLine += ", expect=" + evalue
- print "<h3>%s:</h3>" % scoreLine
+ print("<h3>%s:</h3>" % scoreLine)
if pLines:
probRows = [i.split()[1] for i in pLines]
@@ -770,7 +772,7 @@ def writeHtml(opts, maf):
rows = [i[6] for i in sLines]
alignmentColumns = zip(*rows)
- print '<pre>'
+ print('<pre>')
for chunk in blastChunker(sLines, opts.linesize, alignmentColumns):
cols, rows, begs, ends = chunk
begWidth = maxlen(begs)
@@ -782,10 +784,10 @@ def writeHtml(opts, maf):
spans = [htmlSpan(r, i) for i in classRuns]
spans = ''.join(spans)
formatParams = nameWidth, n, begWidth, b, spans, endWidth, e
- print '%-*s %*s %s %*s' % formatParams
- print ' ' * nameWidth, ' ' * begWidth, matchSymbols
- print
- print '</pre>'
+ print('%-*s %*s %s %*s' % formatParams)
+ print(' ' * nameWidth, ' ' * begWidth, matchSymbols)
+ print()
+ print('</pre>')
def mafConvertToHtml(opts, lines):
for maf in mafInput(opts, lines):
@@ -841,7 +843,7 @@ def mafConvert(opts, args):
if not opts.noheader:
if isFormat(formatName, "html"):
- print "</body></html>"
+ print("</body></html>")
if __name__ == "__main__":
signal.signal(signal.SIGPIPE, signal.SIG_DFL) # avoid silly error message
@@ -882,6 +884,6 @@ if __name__ == "__main__":
op.error("need file (not pipe) with option -d")
try: mafConvert(opts, args)
- except Exception, e:
+ except Exception as e:
prog = os.path.basename(sys.argv[0])
sys.exit(prog + ": error: " + str(e))
=====================================
scripts/maf-join
=====================================
--- a/scripts/maf-join
+++ b/scripts/maf-join
@@ -10,6 +10,8 @@
# WARNING: Alignment columns with a gap in the top genome are joined
# arbitrarily!!!
+from __future__ import print_function
+
import sys, os, fileinput, optparse, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL) # stop spurious error message
@@ -70,16 +72,16 @@ class MafBlock:
seq = [''.join(i) for i in self.seq]
columns = self.chr, beg, size, self.strand, self.chrSize, seq
widths = map(maxLen, columns)
- print 'a'
+ print('a')
for row in zip(*columns):
widthsAndFields = zip(widths, row)
field0 = "%-*s" % widthsAndFields[0] # left-justify
fields = ["%*s" % i for i in widthsAndFields[1:]] # right-justify
- print 's', field0, ' '.join(fields)
+ print('s', field0, ' '.join(fields))
pad = ' '.join(' ' * i for i in widths[:-1])
for i in self.prob:
- print 'p', pad, i
- print # blank line afterwards
+ print('p', pad, i)
+ print() # blank line afterwards
def topSeqBeg(maf): return maf.beg[0]
def topSeqEnd(maf): return maf.end[0]
=====================================
scripts/maf-swap
=====================================
--- a/scripts/maf-swap
+++ b/scripts/maf-swap
@@ -9,11 +9,13 @@
# Seems to work with Python 2.x, x>=4
+from __future__ import print_function
+
import fileinput, itertools, optparse, os, signal, string, sys
def filterComments(lines):
for i in lines:
- if i.startswith("#"): print i,
+ if i.startswith("#"): print(i, end="")
else: yield i
def mafInput(lines):
@@ -119,8 +121,8 @@ def mafSwap(opts, args):
mafLines[1:stop] = mafLines[start:stop] + mafLines[1:start]
if not isCanonicalStrand(mafLines[1]):
mafLines = flippedMaf(mafLines)
- for i in mafLines: print i,
- print # blank line after each alignment
+ for i in mafLines: print(i, end="")
+ print() # blank line after each alignment
if __name__ == "__main__":
signal.signal(signal.SIGPIPE, signal.SIG_DFL) # avoid silly error message
@@ -135,6 +137,6 @@ if __name__ == "__main__":
try: mafSwap(opts, args)
except KeyboardInterrupt: pass # avoid silly error message
- except Exception, e:
+ except Exception as e:
prog = os.path.basename(sys.argv[0])
sys.exit(prog + ": error: " + str(e))
=====================================
src/LastalArguments.cc
=====================================
--- a/src/LastalArguments.cc
+++ b/src/LastalArguments.cc
@@ -41,15 +41,14 @@ static char parseOutputFormat( const char* text ){
return 0;
}
-static int parseScoreOrPercent(const std::string &s) {
+static int parseScoreAndSuffix(const std::string &s, char &suffix) {
+ char c = 0;
int x;
- std::istringstream iss(s);
- if(!(iss >> x) || x < 0) ERR("bad value: " + s);
- char c;
- if (iss >> c) {
- if (c != '%' || iss >> c) ERR("bad value: " + s);
- x = -x; // negative value indicates percentage (kludge)
+ std::istringstream i(s);
+ if (!(i >> x) || x < 0 || (i >> c && i >> c)) {
+ ERR("bad value: " + s);
}
+ suffix = c;
return x;
}
@@ -78,9 +77,11 @@ LastalArguments::LastalArguments() :
gapPairCost(-1), // this means: OFF
frameshiftCost(-1), // this means: ordinary, non-translated alignment
matrixFile(""),
- maxDropGapped(-100), // depends on minScoreGapped & maxDropGapless
+ maxDropGapped(100), // depends on maxDropFinal
+ maxDropGappedSuffix('%'),
maxDropGapless(-1), // depends on the score matrix
- maxDropFinal(-1), // depends on maxDropGapped
+ maxDropFinal(100), // depends on minScoreGapped
+ maxDropFinalSuffix('%'),
inputFormat(sequenceFormat::fasta),
minHitDepth(1),
maxHitDepth(-1),
@@ -128,9 +129,9 @@ Score options (default settings):\n\
-B: insertion extension cost (b)\n\
-c: unaligned residue pair cost (off)\n\
-F: frameshift cost (off)\n\
--x: maximum score drop for gapped alignments (e-1)\n\
+-x: maximum score drop for preliminary gapped alignments (z)\n\
-y: maximum score drop for gapless alignments (min[t*10, x])\n\
--z: maximum score drop for final gapped alignments (x)\n\
+-z: maximum score drop for final gapped alignments (e-1)\n\
-d: minimum score for gapless alignments (min[e, t*ln(1000*refSize/n)])\n\
-e: minimum score for gapped alignments\n\
\n\
@@ -234,15 +235,14 @@ LAST home page: http://last.cbrc.jp/\n\
if( frameshiftCost < 0 ) badopt( c, optarg );
break;
case 'x':
- maxDropGapped = parseScoreOrPercent(optarg);
+ maxDropGapped = parseScoreAndSuffix(optarg, maxDropGappedSuffix);
break;
case 'y':
unstringify( maxDropGapless, optarg );
if( maxDropGapless < 0 ) badopt( c, optarg );
break;
case 'z':
- unstringify( maxDropFinal, optarg );
- if( maxDropFinal < 0 ) badopt( c, optarg );
+ maxDropFinal = parseScoreAndSuffix(optarg, maxDropFinalSuffix);
break;
case 'd':
unstringify( minScoreGapless, optarg );
@@ -523,6 +523,10 @@ void LastalArguments::setDefaultsFromAlphabet( bool isDna, bool isProtein,
}
}
+static int percent(int val, int percentage) {
+ return (percentage == 100) ? val : val * percentage / 100;
+}
+
void LastalArguments::setDefaultsFromMatrix( double lambda, int minScore ){
if( outputType < 2 && minScoreGapped < 0 ) minScoreGapped = minScoreGapless;
if( minScoreGapped < 0 ){
@@ -536,11 +540,19 @@ To proceed without E-values, set a score threshold with option -e.");
if( temperature < 0 ) temperature = 1 / lambda;
- if( maxDropGapped < 0 ){
- int percent = -maxDropGapped;
- maxDropGapped = std::max( minScoreGapped - 1, 0 );
- maxDropGapped = std::max( maxDropGapped, maxDropGapless );
- if (percent != 100) maxDropGapped = maxDropGapped * percent / 100;
+ int defaultMaxScoreDrop = std::max(minScoreGapped - 1, 0);
+ defaultMaxScoreDrop = std::max(defaultMaxScoreDrop, maxDropGapless);
+
+ if (maxDropFinalSuffix == '%') {
+ maxDropFinal = percent(defaultMaxScoreDrop, maxDropFinal);
+ } else if (maxDropFinalSuffix == 'g') {
+ maxDropFinal = std::min(minGapCost(maxDropFinal), defaultMaxScoreDrop);
+ }
+
+ if (maxDropGappedSuffix == '%') {
+ maxDropGapped = percent(maxDropFinal, maxDropGapped);
+ } else if (maxDropGappedSuffix == 'g') {
+ maxDropGapped = std::min(minGapCost(maxDropGapped), maxDropFinal);
}
if( maxDropGapless < 0 ){ // should it depend on temperature or lambda?
@@ -548,8 +560,6 @@ To proceed without E-values, set a score threshold with option -e.");
else maxDropGapless = int( 10.0 * temperature + 0.5 );
maxDropGapless = std::min( maxDropGapless, maxDropGapped );
}
-
- if( maxDropFinal < 0 ) maxDropFinal = maxDropGapped;
}
int LastalArguments::calcMinScoreGapless( double numLettersInReference,
=====================================
src/LastalArguments.hh
=====================================
--- a/src/LastalArguments.hh
+++ b/src/LastalArguments.hh
@@ -56,6 +56,11 @@ struct LastalArguments{
// how many strands are we scanning (1 or 2)?
int numOfStrands() const{ return (strand == 2) ? 2 : 1; }
+ int minGapCost(int gapLength) const {
+ return std::min(gapExistCost + gapLength * gapExtendCost,
+ insExistCost + gapLength * insExtendCost);
+ }
+
// options:
int outputFormat;
int outputType;
@@ -80,8 +85,10 @@ struct LastalArguments{
int frameshiftCost;
std::string matrixFile;
int maxDropGapped;
+ char maxDropGappedSuffix;
int maxDropGapless;
int maxDropFinal;
+ char maxDropFinalSuffix;
sequenceFormat::Enum inputFormat;
size_t minHitDepth;
size_t maxHitDepth;
=====================================
src/version.hh
=====================================
--- a/src/version.hh
+++ b/src/version.hh
@@ -1 +1 @@
-"932"
+"938"
View it on GitLab: https://salsa.debian.org/med-team/last-align/compare/e2061edc0f126034f89e1bf00ed3aad468447d23...54f0b1f06cf4f118ee23f5da391f6776a40cbc36
--
View it on GitLab: https://salsa.debian.org/med-team/last-align/compare/e2061edc0f126034f89e1bf00ed3aad468447d23...54f0b1f06cf4f118ee23f5da391f6776a40cbc36
You're receiving this email because of your account on salsa.debian.org.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/debian-med-commit/attachments/20180523/5057cfd6/attachment-0001.html>
More information about the debian-med-commit
mailing list