[med-svn] [Git][med-team/last-align][master] 4 commits: New upstream version 1542
Charles Plessy (@plessy)
gitlab at salsa.debian.org
Tue Feb 20 23:32:44 GMT 2024
Charles Plessy pushed to branch master at Debian Med / last-align
Commits:
3373b227 by Charles Plessy at 2024-02-21T08:04:42+09:00
New upstream version 1542
- - - - -
4a68474d by Charles Plessy at 2024-02-21T08:04:42+09:00
routine-update: New upstream version
- - - - -
9b5762b7 by Charles Plessy at 2024-02-21T08:04:49+09:00
Update upstream source from tag 'upstream/1542'
Update to upstream version '1542'
with Debian dir 693a10483a78a1405bfacbf3fc63e6b23037fc82
- - - - -
4a58fc43 by Charles Plessy at 2024-02-21T08:12:46+09:00
routine-update: Ready to upload to unstable
- - - - -
11 changed files:
- bin/maf-convert
- data/MAM4.seed
- data/MAM8.seed
- data/YASS.seed
- debian/changelog
- doc/last-seeds.rst
- doc/lastdb.rst
- src/makefile
- test/maf-convert-test.out
- test/maf-convert-test.sh
- + test/toprev.maf
Changes:
=====================================
bin/maf-convert
=====================================
@@ -65,6 +65,14 @@ def myOpen(fileName):
def maxlen(s):
return max(map(len, s))
+complementLookup = bytearray(range(128))
+for x, y in zip("ACGTRYKMBDHVU",
+ "TGCAYRMKVHDBA"):
+ complementLookup[ord(x.upper())] = ord(y.upper())
+ complementLookup[ord(x.lower())] = ord(y.lower())
+def revcomp(seq):
+ return "".join(chr(complementLookup[ord(i)]) for i in reversed(seq))
+
def pairOrDie(sLines, formatName):
if len(sLines) != 2:
e = "for %s, each alignment must have 2 sequences" % formatName
@@ -688,7 +696,8 @@ def cigarParts(alignmentColumns):
yield str(size) + "X"
def writeSam(readGroup, mafs):
- seq = qual = cigar = score = evalue = ""
+ seq = qual = score = evalue = ""
+ cigar = []
mismapProb = 2.0
editDistance = 0
for maf in mafs:
@@ -697,11 +706,8 @@ def writeSam(readGroup, mafs):
seqNameA, seqLenA, strandA, letterSizeA, begA, endA, rowA = fieldsA
seqNameB, seqLenB, strandB, letterSizeB, begB, endB, rowB = fieldsB
- if letterSizeA > 1 or letterSizeB > 1:
- raise Exception("this looks like translated DNA - can't convert to SAM format")
-
- if strandA != "+":
- raise Exception("for SAM, the 1st strand in each alignment must be +")
+ if letterSizeA * letterSizeB != 1:
+ raise RuntimeError("looks like DNA-to-protein alignment - can't convert to SAM format")
isSplice = False
for i in aLine.split():
@@ -716,15 +722,15 @@ def writeSam(readGroup, mafs):
if seq:
d = begA - oldEndA
- cigar += str(d) + "DN"[isSplice]
+ cigar.append(str(d) + "DN"[isSplice])
if not isSplice: editDistance += d
else:
- pos = str(begA + 1) # convert to 1-based coordinate
- if begB: cigar += str(begB) + "H"
+ pos = begA
+ if begB: cigar.append(str(begB) + "H")
oldEndA = endA
alignmentColumns = list(zip(rowA.upper(), rowB.upper()))
- cigar += "".join(cigarParts(iter(alignmentColumns)))
+ cigar.extend(cigarParts(iter(alignmentColumns)))
editDistance += sum(x != y for x, y in alignmentColumns)
# no special treatment of ambiguous bases: might be a minor bug
@@ -737,7 +743,13 @@ def writeSam(readGroup, mafs):
qual += ''.join(j for i, j in z if i != "-")
revBegB = seqLenB - endB
- if revBegB: cigar += str(revBegB) + "H"
+ if revBegB: cigar.append(str(revBegB) + "H")
+
+ if strandA == "-":
+ seq = revcomp(seq)
+ qual = qual[::-1]
+ cigar = reversed(cigar)
+ pos = seqLenA - endA
if mismapProb > 1:
mapq = mapqMissing
@@ -752,24 +764,26 @@ def writeSam(readGroup, mafs):
# I'm not sure whether to add 2 and/or 8 to flag.
if seqNameB.endswith("/1"):
seqNameB = seqNameB[:-2]
- if strandB == "+": flag = "99" # 1 + 2 + 32 + 64
- else: flag = "83" # 1 + 2 + 16 + 64
+ if strandB == strandA: flag = "99" # 1 + 2 + 32 + 64
+ else: flag = "83" # 1 + 2 + 16 + 64
elif seqNameB.endswith("/2"):
seqNameB = seqNameB[:-2]
- if strandB == "+": flag = "163" # 1 + 2 + 32 + 128
- else: flag = "147" # 1 + 2 + 16 + 128
+ if strandB == strandA: flag = "163" # 1 + 2 + 32 + 128
+ else: flag = "147" # 1 + 2 + 16 + 128
else:
- if strandB == "+": flag = "0"
- else: flag = "16"
+ if strandB == strandA: flag = "0"
+ else: flag = "16"
if len(qual) < len(seq): qual = "*"
+ pos += 1 # convert to 1-based coordinate
+ cigar = "".join(cigar)
out = [seqNameB, flag, seqNameA, pos, mapq, cigar, "*\t0\t0", seq, qual]
out.append("NM:i:" + str(editDistance))
if len(mafs) < 2:
if score.isdigit(): out.append("AS:i:" + score) # must be an integer
if evalue: out.append("EV:Z:" + evalue)
if readGroup: out.append(readGroup)
- print("\t".join(out))
+ print(*out, sep="\t")
def mafConvertToSam(opts, lines):
readGroup = ""
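Editor's note on the hunks above: when the first (reference) sequence is on the "-" strand, the new code re-expresses the record on the forward strand. It reverse-complements the bases, reverses the qualities and the CIGAR operations, and recomputes the start as seqLenA - endA. A minimal sketch of that transformation follows. It is not the upstream code: the helper name flip_to_forward and the use of str.translate are assumptions for illustration.

```python
# Illustrative sketch only; helper names are hypothetical, not from maf-convert.
_COMPLEMENT = str.maketrans("ACGTRYKMBDHVUacgtrykmbdhvu",
                            "TGCAYRMKVHDBAtgcayrmkvhdba")

def revcomp(seq):
    """Reverse-complement a nucleotide string; other characters pass through."""
    return seq.translate(_COMPLEMENT)[::-1]

def flip_to_forward(seq, qual, cigar_parts, seq_len, end):
    """Re-express a reverse-strand alignment in forward-strand terms.

    Returns (sequence, qualities, CIGAR parts, 0-based start position).
    """
    return revcomp(seq), qual[::-1], cigar_parts[::-1], seq_len - end

if __name__ == "__main__":
    # Toy bases, qualities and CIGAR; the length and end come from toprev.maf.
    seq, qual, cigar, pos0 = flip_to_forward(
        "AAGC", "IIJK", ["2H", "4=", "3H"], seq_len=902, end=436)
    print(seq, qual, "".join(cigar), pos0 + 1)  # SAM positions are 1-based
```

The coordinates in the usage lines are taken from the reverse-strand alignment in test/toprev.maf (start 0, size 436, sequence length 902): 902 - 436 + 1 = 467, which agrees with the POS field of the corresponding flag-16 record added to test/maf-convert-test.out.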
=====================================
data/MAM4.seed
=====================================
@@ -2,6 +2,8 @@
# uses about half as much memory. [From Frith & Noé 2014 NAR 42:e59
# Table S11, row 12.]
+#lastdb -U100
+
1 A C G T
0 ACGT
T AG CT
=====================================
data/MAM8.seed
=====================================
@@ -3,6 +3,8 @@
# mammal genomes). [From Frith & Noé 2014 NAR 42:e59 Table S12, row
# 15.]
+#lastdb -U100
+
1 A C G T
0 ACGT
T AG CT
=====================================
data/YASS.seed
=====================================
@@ -2,6 +2,8 @@
# similarities. It is a good compromise for both protein-coding and
# non protein-coding DNA (L Noé & G Kucherov, NAR 2005 33:W540-W543).
+#lastdb -U100
+
1 A C G T
0 ACGT
T AG CT
=====================================
debian/changelog
=====================================
@@ -1,3 +1,9 @@
+last-align (1542-1) unstable; urgency=medium
+
+ * New upstream version
+
+ -- Charles Plessy <plessy@debian.org> Wed, 21 Feb 2024 08:06:34 +0900
+
last-align (1540-1) unstable; urgency=medium
* New upstream version
=====================================
doc/last-seeds.rst
=====================================
@@ -99,6 +99,9 @@ And these patterns::
11TT010T01TT0001T
11TT10T1T101TT
+It sets this lastdb default:
+-U100
+
MAM8
----
@@ -123,6 +126,9 @@ And these patterns::
111100T011TTT00T0TT01T
1T1T10T1101101
+It sets this lastdb default:
+-U100
+
MURPHY10
--------
@@ -190,6 +196,9 @@ And this pattern::
1T1001100101
+It sets this lastdb default:
+-U100
+
RY4-9 (abbreviation: RY4)
-------------------------
=====================================
doc/lastdb.rst
=====================================
@@ -105,9 +105,14 @@ Advanced Options
repeat-finding slower. The default is 100 for DNA and 50 for
protein, which prevents non-homologous alignments.
- For DNA, however, if you specify -c (and don't specify AT-rich
- tantan), the default is 400. This avoids hugely redundant
- alignments of human centromeric repeats.
+ For DNA, however, the default is 400 if you:
+
+ * specify ``-c`` AND
+ * don't specify AT-rich tantan AND
+ * choose a seeding scheme other than MAM4, MAM8, or the default (YASS).
+
+ This avoids hugely redundant alignments of human centromeric
+ repeats.
-w STEP
Allow initial matches to start only at every STEP-th position in
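Editor's note on the hunk above: the revised wording describes a three-way condition for the DNA default of 400. The option concerned appears to be -U, judging by the "#lastdb -U100" lines added to the seed files. The rule can be sketched as a small decision function. This is only a reading of the documentation text, not lastdb's implementation, and the function and parameter names are hypothetical.

```python
# Sketch of the documented default, assuming the rule reads as written in
# doc/lastdb.rst. All names here are hypothetical, not lastdb internals.
SEEDS_THAT_SET_U100 = {"YASS", "MAM4", "MAM8"}

def default_max_repeat_unit(is_protein, c_option, at_rich_tantan,
                            seed_scheme=None):
    """Return the documented default for the repeat-unit limit.

    seed_scheme=None stands for "no scheme chosen", i.e. the default (YASS).
    """
    if is_protein:
        return 50
    uses_u100_seed = seed_scheme is None or seed_scheme in SEEDS_THAT_SET_U100
    if c_option and not at_rich_tantan and not uses_u100_seed:
        return 400
    return 100
```

For example, a DNA database built with -c, without AT-rich tantan, and with the RY4 seeding scheme would get 400 under this reading, while the same options with MAM8 would stay at 100.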
=====================================
src/makefile
=====================================
@@ -95,7 +95,7 @@ ScoreMatrixData.hh: ../data/*.mat
../build/mat-inc.sh ../data/*.mat > $@
VERSION1 = git describe --dirty
-VERSION2 = echo ' (HEAD -> main, tag: 1540) ' | sed -e 's/.*tag: *//' -e 's/[,) ].*//'
+VERSION2 = echo ' (HEAD -> main, tag: 1542) ' | sed -e 's/.*tag: *//' -e 's/[,) ].*//'
VERSION = \"`test -e ../.git && $(VERSION1) || $(VERSION2)`\"
=====================================
test/maf-convert-test.out
=====================================
@@ -23417,6 +23417,15 @@ SRR359290.10000 16 chr12 110785492 255 27=1X47= * 0 0 AATGTTTGCGCATGTTCGAGATGAGT
102 16 chr6 65786102 255 453H13=1X5=1X3=223H * 0 0 AAGATGGCCCCAGCGCTCGCAAG * NM:i:2 AS:i:110 EV:Z:58
102 16 chr6 170257702 255 191H17=1D1=1D6=484H * 0 0 AGCACCCAGCCTGCCCTGCCAGCC * NM:i:2 AS:i:110 EV:Z:58
102 16 chr15 78230486 255 166H7=1X12=1D4=509H * 0 0 CAGGAGCGGCCACCATGGCCCAAG * NM:i:2 AS:i:110 EV:Z:58
+@HD VN:1.3 SO:unsorted
+@PG ID:lastal PN:lastal VN:1541
+chr15 0 L1P1_5end#LINE/L1 1 100 17013608H1=1X2=2X1=1X2=1X10=1X7=1X2=1X3=1X1=1X14=1X5=1X3=1X7=1X3=2X6=1X8=1X5=1X4=1X13=1X2=1X1=1X4=1X16=1X11=1X5=1X2=1X1=1X1=1X2=1X7=1X1=1X3=1X3=2X6=2X6=1X5=1X17=1X4=1X5=1X7=2X5=1X9=1X5=2X12=1X9=1X5=1X1=1X1=1X8=1X3=1X3=1X19=1X5=1X3=1X13=1X3=1X2=1X1=1X3=1X2=1X20=1X9=1X5=1X5=1X11=1X5=2X2=1D16=2X14=1X2=1X2=1X9=1X1=1X4=1X15=1X1=1X11=1X15=1X31=1X53=1X9=1X2=1X4=1X18=1X25=1X7=7I21=1X6=1X32=2X2=1X15=1X8=1X2=1X6=1X12=1X4=1X2=1X2=1X3=1X12=1X28=1X9=1X5=1X35=1X19=1X37=2X21=1X45=1X13=1X5=1X4=1X16=1X4=1X13=1X63=1X11=2X8=1X3=1X6=1X28=1X1D11=1X17=2X9=1X9=1X23=1X1=1X8=1X3=1X17=1X6=1X1=1X18=1X31=1X3=1X37=1X8=1X14=1X18=1X5=1X25=1X1=1X65=1D7=1X10=1X6=1X12=1X31=1X49=1X34=1X20=1X6=1X7=1X8=1X21=1X18=1X35=1X50=1X1=1X35=1X44=1X1=1X18=1X17=1X3=2X1=1X6=1X15=1X7=1X23=84975467H * 0 0 GAGGTGGCAGTCAAGATGGCCAAATAGGAGCACCTCTGTTCTACAGCTCCCAGTGTGAGTGACACAGAAGATGGGCAATTTCTTCATTTCCATCTGAGATACCTGGTTCATCTCACTAGGAATTGCCAGACAGTGGGCGCAGGatagtgggtgcaGTGCACCATGAGTGGGCAGAAGCAGAGTGAGTCATTTCCTCACTTGGGAAGTGCAAGTGGTCAGGGAGTTCCCTTACCTAATCAAATAAAGGGGCAACAGATGGCACCTGGGAAATCCAGTCACTCCCACCATAATACTGCTCTTTTTCAATGGGCTTAAAAAATGGCACACCAGGAGATTATATCCCACACCTTGCTTGGAGGGTCCTACGTCCATGGTGTCTCACTGATTGCTAGCACAGCAGTCTGTGATCAAACTACAAGGTGGCAGTGAGGCTGGGGGTGGGGCAACCCCATTGCCCAGGCTTGCTTAGGTAAACAAAGCTGCTGGAAAGCTCGAAGTAGGTGTAGCCCACCACAGCTCTACGAGGCCTGCCTTCCTCTGTAGGCTCCATCTCTGGGGGCAGGGCACAGACAAACAAAAAGGCAGCAGTAACCTCTGCAGACTTAAATGTCCCTGTCTGACAGCTTTGAAGAGAGAAGTGGTTCTACCGGCACACAGCTGGAGATCTGAGAATGGGCAGACTGCCTCCTCAAGTGGGTACCTGACCTCTGAACCCCGAGCAGCCTAACTGGGAGACACCCCTCAGTAGGGGCagactgacaccTCACACGGCCGTATAGTCCTCTGAGACAAAATTTCCAGAGCAAAGATCAGACAGCAGCATTCGTGGTTTACGAAAATCTGCTGTTCTGCAGTCACCGCTGCTGATACCCAGGCAAACAGGATCTGGAGTGTACCTCTAGCAAACTCCAACAGACCTGCAGCTGAGGGTCCTGTCTGTTAGAAGGAAAACTAAGAAACAGAAAGGACATCCACACCAAAAACCCATCTGTATATCACCATCATCAAAGACCAAAAGTAGATAAAACCACAAAGATGGGGAAAAAACAGAGCAGAAAAACTAGAAACTCTAAAAAGCAGAGTGCCTTTCCTCCTCCAAAGGAATGCAGTTCCTCACCAGCAATGGAACAAAGCTGGACGGAGAATGACTTTGACGAGTTGAGAGAAGAAGGCTTCAGACGATCAAATTACTCCGAGCTGCA
GGAGGAAATTCAAACCAAAGGCAAAGAAGTTAAAAACTTtgaaaaaattTAGACGAATGTATAACTAGAATAACCAACACAGAGAAGTGCTTAAAGGAGCTGATGGAGCTGAAAGCCAAGGCTCCAGAACTACTTGAAGAATGCAGAAGCCTCAGGAGCCGAGGTGATCAACTGGAAGAAAGGATATCAGTGATGGAAGATGAAATGAATGAAATAAAGTGAGAAGGGAAGTTTAGAGAAAAAAgaataaaaagaaATGAACAAAGACTCCAAGAAATATGAGACTATGTGAAAAGACCATATCTATGTCTGATTGGTGTACCTGAAAGTGATGCGGAGAATGGAACCAAGTTGGAAAACACTCTGCAGGATATTATCCAGGAGAACTTCCCCAATCTAGAAGGCAGACCAACATTCATATTCAGAAAATACAGAGAATGCCACAAAGATACTCCTCGAGAAGAGCAACTTCAAGACACATAATTGTCAGATTCACCAAAGTTGAAATGAAGGAAAAAATATTAAGGGCAGCCAGAGAGAAAGGTCGGGTTACCCTCAAAGGGAAGCCCATCAGACGAACAGCCGATCTCTTGGCAGAAAGTCTACAAGCCAGAAGAGAGTGTGGGCCAATATTCAACATTTTTAAAGAAAAGAATTTTCAACCCAGAATTTCATATTCAGCCAAACTAAGCTTCATAAGTGAAGGAGAAATAAAATACTTTACAGACGATCAAATGCTGAGAGATTTTGTCACCACCAGGCCTGCTCTAAAAGAGCTCCTGAAGGAAGCACTAAACATGGAAAGGAACAATCAGTACCAGCCACTGCAAAATCATGCCAAATTGTAAAGTCCACTGTGGCTAGTAAGAAACTGCATCAAGTAACGAGAAAAATAACCAGCTAACATCATAA * NM:i:188 AS:i:9472
+chr15 0 L1P1_orf2#LINE/L1 3 100 17015723H34=1X23=1X8=1D15=1X54=1X10=1X15=1X2=1X6=1X11=1X24=4D52=1X43=1X6=1X17=1X12=15D5=1X19=2X14=1X11=1X7=1X25=1X21=1X7=1X7=1X33=1X36=1X14=1X27=1X13=1X7=1X6=2X155=1X38=1X35=1X20=1X55=1X44=1X4=1X30=1X37=1X12=1X14=1X1=1X11=1X66=1X10=1D8=1X1=2X8=1X17=1X6=1X16=1X1=1X8=1X4=1X6=1X28=1X76=1X2=1X81=1X21=1X9=1X27=1X38=1X21=1X1=1X66=1X5=1X7=1X39=1X19=1X3=1X13=1X25=1X48=1X28=2X19=1X6=1X34=1X64=1X58=1X2=1X26=1X30=1X10=1X29=1X20=1X47=1X56=1X41=2X22=1X36=1X4=1X47=2X26=1X33=1X8=1X36=1X33=1X2=1X5=1X3=1X14=2X22=2X12=1X12=1X2=1X12=1X3=1X23=1X39=1X2=1X19=1X7=1X8=2X131=1X1=1X28=1X20=1X38=1X7=1X19=1X4=1X59=1X16=2X4=1X9=1X1=1X5=1X9=1X69=2X15=1X9=1X12=1X1=1D21=1D25=84972201H * 0 0 GACAGGATCAAATTCACACATAACAATATTAACTCTAAATGTAAATGGACTAAATGCTTCAATTAAAGACACAGACTGGCAATTTGGATAAAGAGTCAAGACCCATCAGTGTGCTGTATTCAGGAAACCCATCTCACATGCAGAGACAAACATAGGCTCAAAATGAAAGGATGGTGGAAGATCTACAAAGCAAATGGAaaacaaaaaaaggggTTGCAATCCTAGTCTCTGATAAAACAGACTTTAAACCAACAAAGATCAAACGAGACAAAGAAGGCCATTACATAATGGTAAAGGGATCAATTCAGCAAGAAAAGCTAACTATCCTAAATGTATATGCACCCAGATTCTTAAAGCAAGTCCTGAGTGATGTACAAAGAGACTTACACTCCCACACAATAATAATTGGAGACTTTAACACCCCACTGTCAATATTAGACAGATCAACGAGACAAAAAGTTAGCAAGGATCCCCAGGAATTGAACTCAGCTCTGCACCAAGCGGGCCTAATAGACATCTACAGAACTCTCCACCCCAAATCGACAGAATATACATTTTTTTCAGCACCACAccacaCCTATTCCGAAATTGACCACATCGTTGGAAATAAAGCTATCCTCAGCAAATGTAAAAGAACAGAAATTATAACAAACTGTCTCTCAGACCACAGTGCAATCAAACTAGAACTCAGGATTAAGAAACTCACTCAAAACCGCTCAACTACATGGAAACTGAACAACCTGCTCCTGAATGACTACTGGGTACATAACAAAATGAAGGCAGAAATAAAGATGTTCTTTGAAACCAACAAGAACAAAGACACAACATACCAGAATCTCTGGGACACATTCAAAGCAGTGTGTAGAAGGAAATTTATAGCACTAAATGCCCACAAGAGAAAGCAGGAAAGATCCAAAATTGATACCCTAACATCACAATTAAAAGAACTAGAAAAGCAAGAGCAAACGCATTAAAAAGCTAGCAGAAGGCAAGAAATAACTAAAATCAGAGCAGAACTGAAGGAAATAGAGACACaaaaaaaCCTTCAAAAAATTAATGAATCCAGGAGATTGTTTTTTGAAAAGATCAACAAAATTGATAGACCGCTAGCAAGACTAATAAAGAAGAAAAGAGAGAAGAATCAAATAGATGCAATAAAAATGATAAAGTGAGTATCACCATCGATCCCACAGAAATACGAACTACTATCAGAGAATACTACATAAACCTCTACACAAACAAACTACAAAATCTAGAAGAAATGGATAAATTCCTTGACACATACACCCTCCCAAGACT
AAACCAGGAAGAAGTTGAATCTCTGAATAGACCAATAACAGGCTCTGAAATTGTGGTAATAATCAATAGCTTACCAACCAAAAAGAGTCCAGGACCAGATGGATTCACAGCCGAATTCTACCAGAGGTACAAGGAGGAACTGGTACCATTCCTTCTGAAATTATTCCAATGAATAGAAAAAGAGGGAATCCTCCCTAATTCATTTTATGAGGCCAGCATCATCCTGATACCAAAGCCGGGCAGAGACACAACCAAAAaaCAAAATTTTAGACCAATATCCTTGATGAACATTGATGCAAAAATCCTCAATAAAATACTGGCAAACCGATTCCAGAAGCACATTAAAAAGCTTATCCACCATGATCAAGTGGGCTTCATCCCTAGGATGCAAGGCTGGTTCAATATATGCAAATCAATAAATGTAATCCAGCATATAAACAGAACCAGAGACAAAAACCACATGATTATCTCAATAGATGCAGAAAAGGCCTTTGATAAAATTCAACAACCCTTCATGCTAAAAATACTCAATAAATTAGGTATTGTTGGGACATATCTCAAAATAATAAGAGCTATCTATGACAAACACACAGCCAATATCATACTGAATGGGCAAAAACTGGAAGCATTCCCTTTGAAAACTGGCACAAGATAGGGATGCCCTCTCTCACCACTCCTATTCAACATAGTGTTGGAAGTTCTGGCCAGGGCCATTAGGCAGGAGAAGGAAATAAAGGGTATGCAATTAGGAAAAGAGGAAGTCAAATTGTCCTTGTTTGCAGATGACATGATTGTATATCTAGAAAACCCCATTGTCTCAGCCCAAAATCTCCTGAAGCTGATAAGCAACTTCAGCAAAGTCTCAGGATACAAAATCAATGTACAAAAATCACAAGCATTCTTATACACCAATAACAGACAAACAGAGAGCCAAATCATAAGTGAACTCCCATTCACAATTGCTTCAAAGAGAATAAAATATGTAGGAATCCAACTTACAAGGGACGTGAAGGACCTCTTCAAGGAGAACTACAAACCACTGTTCAATGAAATAAAAGAGGATACAAACAAATGGAAGAACATTCCATGCTCATGAATAGGAAGAATCAATATCGTGAAAATGACCATACTGCCCAAGGTAATTTATAGATTCAATGTCATCCCCAACAAGCTACCAATGACTTTCTTCACAGAATTGGAAAAGACTACTTTAAAGTTCATATGGAACCAAAAAAGATCCTGCATCACCATGTCAATCCTAAGCCGCAAGAACAAAGCTGGAGGCATCAGTCTACCTGACTTCGAACTATACTACAGGGATACAGTAACCAAGACATCATGGTACTGGTACCAAAACAGAAATATAGATCAATGGAACAGAACAGAGCCCTCAGAAATAATGCTGCATATCTACAACTATCTGTTCTTTGATAAACCTGAACAAAACAAGCAATGGGGAAAGGATTCCCTATTTAATAAATGGTGCTGGGAAAACTGGCTAGCCATATGTAGAAAGCTGAAACTGGATCCCTTCCTTACACCTTATACAAAAATTAATTCAAGATGGATTAAAAAGTTaaacgttagacctaaaaccatAAAAATCCTAGAAGAAAACCTAGGCATTACCATTCAGGACATAGGCATGGGCAAGGACTTCATGTTTAAAACAACAAAAGCAATGGCAACAAATGCCATAATTGACAAATGGGATCTAATTAAACTAAAGAGCTTCTGCACAGCAAAAGAAACTACCAGCAGAGTGAACAGGCAATGTACAAAATGGGAGATAGTTTTCACAACCTACTTATCTGACAAAGGGCTAATATCCAGAATCTACAATGAACTCAAACAAATTTACAAGAAaaaaacaaacaaaGCCATCAAAAAGTGGGTGAAGGACATCAACAGACACTTCCCAAAGAAGACATTTATGCAGCCAAAAACACATGAAAAAATGCTCATC * NM:i:165 AS:i:15792
+chr15 0 L1PA3_3end#LINE/L1 150 52 17018992H24=1X20=1X14=1X35=1X22=1X4=1X9=1X3=1X4=1X10=1X4=1X35=2X4=1X10=1X27=1X1=2X7=1D51=1X14=1X3=2D9=1X15=1X4=1X1=1X3=1X13=1X11=2X2=1X13=1X7=1I42=1X27=1X5=1X8=1I8=1X4=1X24=1X1=1X48=1X4=1X12=1X15=1X19=1X11=1X10=1X5=1X13=1X18=1X3=1X43=84971448H * 0 0 ACTGGCCATCAGAGAAATGCAAATGAAAACCACAATGAGATACCACCTCACACCAGTTAGGATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGTTGGAGAGGATGTGGAGAAATAGAAACATTTTTACACTTTTGATGGGGCTGTAAACTAATTCAGCCATTGTGGAAGTCAGTGTGGCGATTCCTCAGGGAAGTAGAGCTAGAAATACAATTTGACCCAGCCATCCCATTACTGGGCACGTACCCAAGGACTATAAATCATGCTGCTATAAAGACACATGCACACGTATGTTTATTGCAGCACTATTCACAATTGCAGACTTGGAAACAACCCAAATGTCCAGCAATAACAGAGTGGATTAAGAAAACGTGGCACATATGTACAATGGAATACTATGGAGCCATACAAAAATGATGAGTTCATGTCCTTTGTAGGGACATGGATGAAACTGGAAATCATCATTCTCAGTAAACTATTGCAAGGACAAAAAAACCAAACACTGCATGTTCTCACTCATAGGTGGGAATTGAGCGATGAGAACACATGGACACAGGAAGGGGAACATCACACTCTGGGGACTGATGTGAGGTGgggggaggagggaggGATAGCATTAGGAGATATACCTAATGCTAAATGACGAGTTAATGGGTGCAGCACACCAACATGGCACATGTACACATATGTAACTAACCTGTACAATGTGCACATGTACCCTAAAACTTAAAGTATAataataataaaa * NM:i:53 AS:i:3398
+chr15 16 L1PA3_3end#LINE/L1 467 84 84955187H52=1X10=1X83=1X13=1X3=1X23=1X54=1X47=1X3=1X16=1X16=1X15=1X10=1X41=1X17=1X13=1X4=17035566H * 0 0 TAGCAAAGACTTGGAACCAACCCAAATGTCCAACAATGATAGACTGGATTAAAAAAATGTGGCGCATATACACCATGGAATACTATGCAGCCATAAAAAATGATGAGTTCATGTCCTTTGTAGGGACATGGATGAAATTGGAAATCAACATTCTCAGTAAAATATTGCAAGAACAAAAAACCAAACACCTCATATTCTCACTCATAGGTGGGAATTGAACAATGAGAACACATGGACACAGGAAAGGGAACATCACACTCTGGGGACTGTtgtggggtggggggagggGGGACGGAGAGCATTGGGAGATATATCTAATGCTAGATGACGGGTTAGTGGGTGCAGCACACCAGCATGTCACATGTATACATATGTAACTAACCTGCACATTGTGCACATTTACCCTAAAACTTAaagaataataataataagaaaa * NM:i:16 AS:i:2158
+chr15 0 L1PA3_3end#LINE/L1 40 81 17042713H2=1X3=1X7=1X6=1X12=1X14=1X16=1X1=2X78=1X8=1X7=1X11=1X64=2X2=1X5=1X13=1X10=1X35=1X39=1X59=1X6=1X15=1X35=1X12=1X64=1X41=1X7=1X4=1X27=1X43=2X3=1X23=1X2=1X2=1D2=1X1=1X12=2X18=1X11=1X10=1X7=1X3=1X12=1X58=84947620H * 0 0 aaaaaacaaacaaataaCCCCTTCAAAAAGTGGGTGAAGGACATGAACATACACTTCTCAAAAGAATATGTTTATGCAGCCAAAAAACACATGAAAAAATGCTCACCATCACTGGCCATCAGAGAAATGCAAATCAAAACCACAATGAAATACCATCACACACCAATTAGAATGGCAGTCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGTGGAGAAATAGGAACACTTTTACAAAGTGGGTGGCACTGTAAACTAGTACAACCATTGTAGAAGTCAGTGTGGCGATTCCTCAGGGATCTAGAACCAGAAATACCATTTGACCCAGCCATCCCATTACTGGGTATGTACCCAAAGGACTATAAATCATGCTGCTATAAAGACACATGCACACGTATGTTTATTGCAGCACTAGTCACAATAGCAAAGAGTTGGAACCAACCCAAATGTCCAACAATGATAGACTTGATTAAGAAAATATGGCACATATACACCATGGAATACTATGCAGCCATAAAAAATGATGAGTTCATGTCCTTTGTAGTGACATGGATGAAATTGGAAATCATCATTCTCAGTAAACTATTGCAAGAATAAAAGACCAAACACCGCATATTCTCACTCATAAGTGGGAATTGAACAATGAGAACACATGGACACAGGAAGGGGAATGTCATACTCTGGGGACTGTTGTGGGGTGTGGTGAGGCGCAGGGATAGCATTATGAGATATACCTAATGCTAAATGACGAGTTAATGGGTGCAGCACACCAGCGTGGTACATGTATACATGTGTAACTAACCTGCACATTGTGCACATGTACCCTAAAACTTAAAGTATAataataata * NM:i:47 AS:i:4053
+chr15 16 L1PA2_3end#LINE/L1 41 73 84941003H33=1X112=1X4=1X45=1X115=1X85=1X11=1X50=2X24=1X40=1X19=1X13=1X6=1X12=1X8=1X7=1D21=1X86=1X3=1X15=1X3=1X11=1X19=1X17=1X23=1X45=17049334H * 0 0 AGAaaaaaacaAACAACCCCATCAAAAAGTGGGTGAAGGACATGAACAGACACTTCTCAAAAGAAGACATTTATGCAGCCAAAAAACACATGAAAAAATGCTCATCATCACTGGCCATCAGAGAAATGCAAATCAAAACCACAATGTGATATCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACTACAGGTGCTGGAGAGGATGTGGAGAAATAGGAACACTTTTACACTGTTGGTGGGACTGTAAACTAGTTCAACCATTGTGGAAGTCAGTGTGGCGATTCCTCAGGGATCTAGAACTGGAAATACCATTTGACCCAGCCATCCCATTACTGGGTATATACCCAAAGGACTATAAATCATGCTGCTATAAAGACACATGCACACATATGTTTATTGTGGCATTATTCACAATAGCAAAGACTTGGAACCAACCCAAATGTCCAACAACCATAGACTGGATTAAGAAAATGTGGTACATATACACCATGGAATACTATGCAGCCATAAAAAATGAAGAGTTCATGTCCTTTGTAGAGACATGGATGAAACTGGAAACCATCATTCTCAGCAAACTATCACAAGAACAAAAACCAAACACCGCATATTTTCACTCATAGGTGGGAATTGAACAATGAGATCACATGGACACAGGAAGGGGAATATCACACTCTGGGGACTGTGGTGGGGTGGGGGCAGGAGGGAGGGATAGCATTAGGAAATATACCTAATACTAGATGACGAGTTAGTGGCTGCAGCGCACCAGCATGTCACATGTATACATATGTAACTAATCTGCACAATGTGCACATGTACCCTAAAACTTAAAGTATAATAAAA * NM:i:26 AS:i:4305
+chr15 16 L1P1_orf2#LINE/L1 2609 100 84940427H4=1X17=1X92=1X22=1X25=1X103=1X20=1X45=1X3=1X39=1X28=1X14=1X70=1X16=1X63=17050187H * 0 0 TCACACTACCTGACTTCAAACTGTACTACAAGGCtacagtaaccaaaacagCATGGTACTGGTACCAAAACAGAGATATAGATCAATGGAACAGAACAGAGCCCTCAGAAATAACACCGCATATCTACAACTATCTGACCTTTGACAAACCTGAGAAAAACAAGAAATGGGGAAAGGATTCCCTATTTAATAAATGGTGCTGGGAAAACTGGCTAGCCATATGTAGAAAGCTGAAACTGGATCCCTTCCTTACACCTTATACAAAAATCAATTCAAGATGGATTAAAGATTTAAACGTTAGACCTAAAACCATAAAAACCCTAGAAGAAAACCTAAGCATTACCATTCAGGACATAGGCATGGGCAAGGACTTCATGTCCAAAACACCAAAAGCAATGGCAACAAAAGACAAAATTGACAAATAGGATCTAATTAAACTAAAGAGCTTCTGCACAGCAAAAGAAACTACCATCAGAGTGAACAGGCAACCTACAAAATGGGAGAAAATTTTTGCAACCTACTCATCTGACAAAGGGCTAATATCCAGAATCTACAATGAACTCAAACAAATTTAC * NM:i:14 AS:i:2956
221 chr1 160106735 75 + 249250621 SRR359290.9001 0 75 - 75 75
450 chr1 231468663 75 + 249250621 SRR359290.9002 0 75 + 75 75
120 chr15 50775672 24 + 102531392 SRR359290.9002 22 24 + 75 24
=====================================
test/maf-convert-test.sh
=====================================
@@ -46,6 +46,7 @@ maf2=bs100.maf
head -n999 $maf1 | $r -r 'ID:1 PL:ILLUMINA SM:x' sam
$r -d sam $maf1
$r -j1e9 sam 90089.maf 102.maf
+ $r sam toprev.maf
head -n999 $maf1 | $r -n tab
head -n999 $maf1 | $r tab
$r -n tab frameshift-new.maf
=====================================
test/toprev.maf
=====================================
@@ -0,0 +1,59 @@
+# LAST version 1541
+#
+# a=26 b=1 A=29 B=1 e=138 d=77 x=137 y=46 z=137 D=1e+07 E=23646.3
+# R=01 u=0 s=1 S=1 M=0 T=0 m=100 l=1 n=100 k=1 w=1000 t=4.62434 j=3 Q=0
+# Reference sequences=2716 normal letters=3491686
+# lambda=0.21926 K=0.327014
+#
+# A C G T M S K W R Y B D H V
+# A 5 -10 -5 -11 3 -7 -8 2 3 -11 -8 1 1 1
+# C -10 6 -10 -3 2 3 -5 -5 -10 3 1 -6 1 1
+# G -3 -10 6 -10 -5 3 2 -5 3 -10 1 1 -6 1
+# T -11 -5 -10 5 -8 -7 3 2 -11 3 1 1 1 -8
+# M 2 2 -7 -6 2 0 -6 0 0 -1 -2 -1 1 1
+# S -5 3 3 -5 0 3 0 -5 0 0 1 -1 -1 1
+# K -6 -7 2 2 -6 0 2 0 -1 0 1 1 -1 -2
+# W 2 -7 -7 2 0 -7 0 2 0 0 -1 1 1 -1
+# R 3 -10 3 -11 1 0 -1 0 3 -10 -2 1 -1 1
+# Y -11 3 -10 3 -1 0 1 0 -10 3 1 -1 1 -2
+# B -7 1 1 1 -2 1 1 -1 -2 1 1 -1 0 -1
+# D 1 -8 1 0 -1 -2 1 1 1 -1 -1 1 0 0
+# H 0 1 -8 1 1 -2 -1 1 -1 1 0 0 1 -1
+# V 1 1 1 -7 1 1 -2 -1 1 -2 -1 0 -1 1
+# N 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+#
+# Coordinates are 0-based. For - strand matches, coordinates
+# in the reverse complement of the 2nd sequence are used.
+#
+# name start alnSize strand seqSize alignment
+#
+# m=1 s=138
+#
+a score=9472 mismap=1e-10
+s L1P1_5end#LINE/L1 0 2110 + 2259 GGGGGNGGAGCCAAGATGGCCGAATAGGAACAGCTCCGGTCTACAGCTCCCAGCGTGAGCGACGCAGAAGACGGGTGATTTCTGCATTTCCANCTGAGGTACCGGGTTCATCTCACTGGGGAGTGCCGGACAGTGGgcgcaggacagtgggtgcagcgcaCCGTGCGCGAGCCGAAGCAGGGCGAGGCATCGCCTCACCCGGGAAGCGCAAGGGGTCAGGGAGTTCCCTTTCCTAGTCAAAGAAAGGGGTGACAGACGGCACCTGGAAAATCGGGTCACTCCCACCCTAATACTGCGCTTTTCCGACGGGCTTAANAAACGGCGCACCAGGAGATTATATCCCGCACCTGGCTCGGAGGGTCCTACGCCCACGGAGCCTCGCTCATTGCTAGCACAGCAGTCTGAGATCAAACTGCAAGGCGGCAGCGAGGCTGGGGGAGGGGCGCCCGCCATTGCCCAGGCTTGANTAGGTAAACAAAGCGGCCGGGAAGCTCGAACTGGGTGGAGCCCACCACAGCTCAAGGAGGCCTGCCTGCCTCTGTAGGCTCCACCTCTGGGGGCAGGGCACAGACAAACAAAAAGACAGCAGTAACCTCTGCAGACTTAAATGTCCCTGTCTGACAGCTTTGAAGAGAGCAGTGGTTCTCCCAGCACGCAGCTGGAGATCTGAGAACGGGCAGACTGCCTCCTCAAGTGGGTCCCTGACC-------CCCGAGCAGCCTAACTGGGAGGCACCCCCCAGTAGGGGCAGACTGACACCTCACACGGCCGGGTACTCCTCTGAGACAAAACTTCCAGAGGAACGATCAGGCAGCAGCATTCGCGGTTCACCAANATCCGCTGTTCTGCAGCCACCGCTGCTGATACCCAGGCAAACAGGGTCTGGAGTGGACCTCCAGCAAACTCCAACAGACCTGCAGCTGAGGGTCCTGNCTGTTAGAAGGAAAACTAACAAACAGAAAGGACATCCACACCAAAAACCCATCTGTACGTCACCATCATCAAAGACCAAAGGTAGATAAAACCACAAAGATGGGGAAAAAACAGAGCAGAAAAACTGGAAACTCTAAAAANCAGAGCGCCTCTCCTCCTCCAAAGGAACGCAGCTCCTCACCAGCAACGGAACAAAGCTGGACGGAGAATGACTTTGACGAGTTGAGAGAAGAAGGCTTCAGACGATCAAACTACTCCGAGCTANAGGAGGAAGTTCGAACCAATGGCAAAGAAGTTAaaaactttgaaaaaaaaTTAGACGAATGGATAACTAGAATAACCAATGCAGAGAAGTCCTTAAAGGACCTGATGGAGCTGAAAGCCAAGGCNCGAGAACTACGTGANGAATGCAGAAGCCTCAGNAGCCGATGCGATCAACTGGAAGAAAGGGTATCAGTGATGGAAGATGAAATGAatgaaatgaaGCGAGAAGGGAAGTTTAGAGAAAAAagaataaaaagaaaCGAACAAAGCCTCCAAGAAATATGGGACTATGTGAAAAGACCAAATCTACGTCTGATTGGTGTACCTGAAAGTGACGGGGAGAATGGAACCAAGTTGGAAAACACTCTGCAGGATATTATCCAGGAGAACTTCCCCAATCTAGCAAGGCAGGCCAACATTCAGATTCAGGAAATACAGAGAACGCCACAAAGATACTCCTCGAGAAGAGCAACTCCAAGACACATAATTGTCAGATTCACCAAAGTTGAAATGAAGGAAAAAATGTTAAGGGCAGCCAGAGAGAAAGGTCGGGTTACCCACAAAGGGAAGCCCATCAGACTAACAGCNGATCTCTCGGCAGAAACTCTACAAGCCAGAAGAGAGTGGGGGCCAATATTCAACATTCTTAAAGAAAAGAATTTTCAACCCAGAATTTCATATCCAGCCAAACTAAGCTTCATAAGTGAAGGAGAAATAAAATACTTTACAGACAAGCAAATGCTGAGAGATTTTGTCACCACCAG
GCCTGCCCTAAAAGAGCTCCTGAAGGAAGCACTAAACATGGAAAGGAACAACCGGTACCAGCCACTGCAAAAACATGCCAAATTGTAAAGACCATCGAGGCTAGGAAGAAACTGCATCAACTAACGAGCAAAATAACCAGCTAACATCATAA
+s chr15 17013608 2114 + 101991189 GAGGTGGCAGTCAAGATGGCCAAATAGGAGCACCTCTGTTCTACAGCTCCCAGTGTGAGTGACACAGAAGATGGGCAATTTCTTCATTTCCATCTGAGATACCTGGTTCATCTCACTAGGAATTGCCAGACAGTGGGCGCAGGatagtgggtgcaGTGCACCATGAGTGGGCAGAAGCAGAGTGAGTCATTTCCTCACTTGGGAAGTGCAAGTGGTCAGGGAGTTCCCTTACCTAATCAAATAAAGGGGCAACAGATGGCACCTGGGAAATCCAGTCACTCCCACCATAATACTGCTCTTTTTCAATGGGCTTAAAAAATGGCACACCAGGAGATTATATCCCACACCTTGCTTGGAGGGTCCTACGTCCATGGTGTCTCACTGATTGCTAGCACAGCAGTCTGTGATCAAACTACAAGGTGGCAGTGAGGCTGGGGGTGGGGCAACC-CCATTGCCCAGGCTTGCTTAGGTAAACAAAGCTGCTGGAAAGCTCGAAGTAGGTGTAGCCCACCACAGCTCTACGAGGCCTGCCTTCCTCTGTAGGCTCCATCTCTGGGGGCAGGGCACAGACAAACAAAAAGGCAGCAGTAACCTCTGCAGACTTAAATGTCCCTGTCTGACAGCTTTGAAGAGAGAAGTGGTTCTACCGGCACACAGCTGGAGATCTGAGAATGGGCAGACTGCCTCCTCAAGTGGGTACCTGACCTCTGAACCCCGAGCAGCCTAACTGGGAGACACCCCTCAGTAGGGGCagactgacaccTCACACGGCCGTATAGTCCTCTGAGACAAAATTTCCAGAGCAAAGATCAGACAGCAGCATTCGTGGTTTACGAAAATCTGCTGTTCTGCAGTCACCGCTGCTGATACCCAGGCAAACAGGATCTGGAGTGTACCTCTAGCAAACTCCAACAGACCTGCAGCTGAGGGTCCTGTCTGTTAGAAGGAAAACTAAGAAACAGAAAGGACATCCACACCAAAAACCCATCTGTATATCACCATCATCAAAGACCAAAAGTAGATAAAACCACAAAGATGGGGAAAAAACAGAGCAGAAAAACTAGAAACTCTAAAAAGCAGAGTGCCTTTCCTCCTCCAAAGGAATGCAGTTCCTCACCAGCAATGGAACAAAGCTGGACGGAGAATGACTTTGACGAGTTGAGAGAAGAAGGCTTCAGACGATCAAATTACTCCGAGCTGCAGGAGGAAATTCAAACCAAAGGCAAAGAAGTTAAAAACTTtgaaaaaat-tTAGACGAATGTATAACTAGAATAACCAACACAGAGAAGTGCTTAAAGGAGCTGATGGAGCTGAAAGCCAAGGCTCCAGAACTACTTGAAGAATGCAGAAGCCTCAGGAGCCGAGGTGATCAACTGGAAGAAAGGATATCAGTGATGGAAGATGAAATGAATGAAATAAAGTGAGAAGGGAAGTTTAGAGAAAAAAgaataaaaagaaATGAACAAAGACTCCAAGAAATATGAGACTATGTGAAAAGACCATATCTATGTCTGATTGGTGTACCTGAAAGTGATGCGGAGAATGGAACCAAGTTGGAAAACACTCTGCAGGATATTATCCAGGAGAACTTCCCCAATCTAG-AAGGCAGACCAACATTCATATTCAGAAAATACAGAGAATGCCACAAAGATACTCCTCGAGAAGAGCAACTTCAAGACACATAATTGTCAGATTCACCAAAGTTGAAATGAAGGAAAAAATATTAAGGGCAGCCAGAGAGAAAGGTCGGGTTACCCTCAAAGGGAAGCCCATCAGACGAACAGCCGATCTCTTGGCAGAAAGTCTACAAGCCAGAAGAGAGTGTGGGCCAATATTCAACATTTTTAAAGAAAAGAATTTTCAACCCAGAATTTCATATTCAGCCAAACTAAGCTTCATAAGTGAAGGAGAAATAAAATACTTTACAGACGATCAAATGCTGAGAGATTTTGTCACCACCAG
GCCTGCTCTAAAAGAGCTCCTGAAGGAAGCACTAAACATGGAAAGGAACAATCAGTACCAGCCACTGCAAAATCATGCCAAATTGTAAAGTCCACTGTGGCTAGTAAGAAACTGCATCAAGTAACGAGAAAAATAACCAGCTAACATCATAA
+
+a score=15792 mismap=1e-10
+s L1P1_orf2#LINE/L1 2 3288 + 3294 GACAGGATCAAATTCACACATAACAATATTaactttaaatgtaaatgGACTAAATGCTCCAATTAAAAGACACAGACTGGCAAATTGGATAAAGAGTCAAGACCCATCAGTGTGCTGTATTCAGGAAACCCATCTCACGTGCAGAGACACACATAGGCTCAAAATAAAGGGATGGAGGAAGATCTACCAAGCAAATGGAAAACAaaaaaAGGCAGGGGTTGCAATCCTAGTCTCTGATAAAACAGACTTTAAACCAACAAAGATCAAAAGAGACAAAGAAGGCCATTACATAATGGTAAAGGGATCAATTCAACAAGAAGAGCTAACTATCCTAAATATATATGCACCCAATACAGGAGCACCCAGATTCATAAAGCAAGTCCTGAGTGACCTACAAAGAGACTTAGACTCCCACACANTAATAATGGGAGACTTTAACACCCCACTGTCAACATTAGACAGATCAACGAGACAGAAAGTTAACAAGGATACCCAGGAATTGAACTCAGCTCTGCACCAAGCGGACCTAATAGACATCTACAGAACTCTCCACCCCAAATCAACAGAATATACATTCTTTTCAGCACCACaccacaCCTATTCCAAAATTGACCACATAGTTGGAAGTAAAGCNCTCCTCAGCAAATGTAAAAGAACAGAAATTATAACAAACTGTCTCTCAGACCACAGTGCAATCAAACTAGAACTCAGGATTAAGAAACTCACTCAAAACCGCTCAACTACATGGAAACTGAACAACCTGCTCCTGAATGACTACTGGGTACATAACGAAATGAAGGCAGAAATAAAGATGTTCTTTGAAACCAACGAGAACAAAGACACAACATACCAGAATCTCTGGGACGCATTCAAAGCAGTGTGTAGAGGGAAATTTATAGCACTAAATGCCCACAAGAGAAAGCAGGAAAGATCCAAAATTGACACCCTAACATCACAATTAAAAGAACTAGAAAAGCAAGAGCAAACACATTCAAAAGCTAGCAGAAGGCAAGAAATAACTAAGATCAGAGCAGAACTGAAGGAAATAGAGACACAaaaaaCCCTTCAAAAAATCAATGAATCCAGGAGCTGGTTTTTTGAAAGGATCAACAAAATTGATAGACCGCTAGCAAGACTAATAAAGAAGAAAAGAGAGAAGAATCAAATAGACGCAATAAAAAATGATAAAGGGGATATCACCACCGATCCCACAGAAATACAAACTACCATCAGAGAATACTACAAACACCTCTACGCAAATAAACTAGAAAATCTAGAAGAAATGGATAAATTCCTCGACACATACACCCTCCCAAGACTAAACCAGGAAGAAGTTGAATCTCTGAATAGACCAATAACAGGCTCTGAAATTGNGGCAATAATCAATAGCTTACCAACCAAAAAGAGTCCAGGACCAGATGGATTCACAGCCGAATTCTACCAGAGGTACAAGGAGGAGCTGGTACCATTCCTTCTGAAACTATTCCAATCAATAGAAAAAGAGGGAATCCTCCCTAACTCATTTTATGAGGCCAGCATCATCCTGATACCAAAGCCNGGCAGAGACACAACCAAAAAAGAGAATTTTAGACCAATATCCTTGATGAACATTGATGCAAAAATCCTCAATAAAATACTGGCAAACCGAATCCAGCAGCACATCAAAAAGCTTATCCACCATGATCAAGTGGGCTTCATCCCTGGGATGCAAGGCTGGTTCAACATACGCAAATCAATAAACGTAATCCAGCATATAAACAGAACCAAAGACAAAAACCACATGATTATCTCAATAGATGCAGAAAAGGCCTTTGACAAAATTCAACAACCCTTCATGCTAAAAACTCTCAATAAATTAGGTATTGATGGGACGTATCTCAAAATAATAAGAGCTATCTATGACAAACCCACAGCCAATATCATACTGAATGGGCAAAAACTGGAAGCATTCCCTTTGAAAACTGG
CACAAGACAGGGATGCCCTCTCTCACCACTCCTATTCAACATAGTGTTGGAAGTTCTGGCCAGGGCAATCAGGCAGGAGAAGGAAATAAAGGGTATTCAATTAGGAAAAGAGGAAGTCAAATTGTCCCTGTTTGCAGACGACATGATTGTATATCTAGAAAACCCCATCGTCTCAGCCCAAAATCTCCTTAAGCTGATAAGCAACTTCAGCAAAGTCTCAGGATACAAAATCAATGTGCAAAAATCACAAGCATTCTTATACACCAATAACAGACAAACAGAGAGCCAAATCATGAGTGAACTCCCATTCACAATTGCTTCAAAGAGAATAAAATACCTAGGAATCCAACTTACAAGGGATGTGAAGGACCTCTTCAAGGAGAACTACAAACCACTGCTCAANGAAATAAAAGAGGATACAAACAAATGGAAGAACATTCCATGCTCATGGGTAGGAAGAATCAATATCGTGAAAATGGCCATACTGCCCAAGGTAATTTATAGATTCAATGCCATCCCCATCAAGCTACCAATGACTTTCTTCACAGAATTGGAAAAAACTACTTTAAAGTTCATATGGAACCAAAAAAGAGCCCGCATCGCCAAGTCAATCCTAAGCCAAAAGAACAAAGCTGGAGGCATCACGCTACCTGACTTCAAACTATACTACAAGGCTACAGTAACCAAAACAGCATGGtactggtaccaaaacagAGATATAGATCAATGGAACAGAACAGAGCCCTCAGAAATAACGCCGCATATCTACAACTATCTGATCTTTGACAAACCTGAGAAAAACAAGCAATGGGGAAAGGATTCCCTATTTAATAAATGGTGCTGGGAAAACTGGCTAGCCATATGTAGAAAGCTGAAACTGGATCCCTTCCTTACACCTTATACAAAAATTAATTCAAGATGGATTAAAGACTTAAACGTTAGACCTAAAACCATAAAAACCCTAGAAGAAAACCTAGGCANTACCATTCAGGACATAGGCATGGGCAAGGACTTCATGTCTAAAACACCAAAAGCAATGGCAACAAAAGCCAAAATTGACAAATGGGATCTAATTAAACTAAAGAGCTTCTGCACAGCAAAAGAAACTACCATCAGAGTGAACAGGCAACCTACAGAATGGGAGAAAATTTTCGCAACCTACTCATCTGACAAAGGGCTAATATCCAGAATCTACAATGAACTCAAACAAATTTACAAGAAaaaaacaaacaaCCCCATCAAAAAGTGGGCGAAGGACATGAACAGACACTTCTCAAAAGAAGACATTTATGCAGCCAAAAAACACATGAAAAAATGCTCATC
+s chr15 17015723 3265 + 101991189 GACAGGATCAAATTCACACATAACAATATTAACTCTAAATGTAAATGGACTAAATGCTTCAATTAAA-GACACAGACTGGCAATTTGGATAAAGAGTCAAGACCCATCAGTGTGCTGTATTCAGGAAACCCATCTCACATGCAGAGACAAACATAGGCTCAAAATGAAAGGATGGTGGAAGATCTACAAAGCAAATGGAaaacaaaaaaagg----ggTTGCAATCCTAGTCTCTGATAAAACAGACTTTAAACCAACAAAGATCAAACGAGACAAAGAAGGCCATTACATAATGGTAAAGGGATCAATTCAGCAAGAAAAGCTAACTATCCTAAATGTATATGCACCCA---------------GATTCTTAAAGCAAGTCCTGAGTGATGTACAAAGAGACTTACACTCCCACACAATAATAATTGGAGACTTTAACACCCCACTGTCAATATTAGACAGATCAACGAGACAAAAAGTTAGCAAGGATCCCCAGGAATTGAACTCAGCTCTGCACCAAGCGGGCCTAATAGACATCTACAGAACTCTCCACCCCAAATCGACAGAATATACATTTTTTTCAGCACCACAccacaCCTATTCCGAAATTGACCACATCGTTGGAAATAAAGCTATCCTCAGCAAATGTAAAAGAACAGAAATTATAACAAACTGTCTCTCAGACCACAGTGCAATCAAACTAGAACTCAGGATTAAGAAACTCACTCAAAACCGCTCAACTACATGGAAACTGAACAACCTGCTCCTGAATGACTACTGGGTACATAACAAAATGAAGGCAGAAATAAAGATGTTCTTTGAAACCAACAAGAACAAAGACACAACATACCAGAATCTCTGGGACACATTCAAAGCAGTGTGTAGAAGGAAATTTATAGCACTAAATGCCCACAAGAGAAAGCAGGAAAGATCCAAAATTGATACCCTAACATCACAATTAAAAGAACTAGAAAAGCAAGAGCAAACGCATTAAAAAGCTAGCAGAAGGCAAGAAATAACTAAAATCAGAGCAGAACTGAAGGAAATAGAGACACaaaaaaaCCTTCAAAAAATTAATGAATCCAGGAGATTGTTTTTTGAAAAGATCAACAAAATTGATAGACCGCTAGCAAGACTAATAAAGAAGAAAAGAGAGAAGAATCAAATAGATGCAATAAAAA-TGATAAAGTGAGTATCACCATCGATCCCACAGAAATACGAACTACTATCAGAGAATACTACATAAACCTCTACACAAACAAACTACAAAATCTAGAAGAAATGGATAAATTCCTTGACACATACACCCTCCCAAGACTAAACCAGGAAGAAGTTGAATCTCTGAATAGACCAATAACAGGCTCTGAAATTGTGGTAATAATCAATAGCTTACCAACCAAAAAGAGTCCAGGACCAGATGGATTCACAGCCGAATTCTACCAGAGGTACAAGGAGGAACTGGTACCATTCCTTCTGAAATTATTCCAATGAATAGAAAAAGAGGGAATCCTCCCTAATTCATTTTATGAGGCCAGCATCATCCTGATACCAAAGCCGGGCAGAGACACAACCAAAAaaCAAAATTTTAGACCAATATCCTTGATGAACATTGATGCAAAAATCCTCAATAAAATACTGGCAAACCGATTCCAGAAGCACATTAAAAAGCTTATCCACCATGATCAAGTGGGCTTCATCCCTAGGATGCAAGGCTGGTTCAATATATGCAAATCAATAAATGTAATCCAGCATATAAACAGAACCAGAGACAAAAACCACATGATTATCTCAATAGATGCAGAAAAGGCCTTTGATAAAATTCAACAACCCTTCATGCTAAAAATACTCAATAAATTAGGTATTGTTGGGACATATCTCAAAATAATAAGAGCTATCTATGACAAACACACAGCCAATATCATACTGAATGGGCAAAAACTGGAAGCATTCCCTTTGAAAACTGG
CACAAGATAGGGATGCCCTCTCTCACCACTCCTATTCAACATAGTGTTGGAAGTTCTGGCCAGGGCCATTAGGCAGGAGAAGGAAATAAAGGGTATGCAATTAGGAAAAGAGGAAGTCAAATTGTCCTTGTTTGCAGATGACATGATTGTATATCTAGAAAACCCCATTGTCTCAGCCCAAAATCTCCTGAAGCTGATAAGCAACTTCAGCAAAGTCTCAGGATACAAAATCAATGTACAAAAATCACAAGCATTCTTATACACCAATAACAGACAAACAGAGAGCCAAATCATAAGTGAACTCCCATTCACAATTGCTTCAAAGAGAATAAAATATGTAGGAATCCAACTTACAAGGGACGTGAAGGACCTCTTCAAGGAGAACTACAAACCACTGTTCAATGAAATAAAAGAGGATACAAACAAATGGAAGAACATTCCATGCTCATGAATAGGAAGAATCAATATCGTGAAAATGACCATACTGCCCAAGGTAATTTATAGATTCAATGTCATCCCCAACAAGCTACCAATGACTTTCTTCACAGAATTGGAAAAGACTACTTTAAAGTTCATATGGAACCAAAAAAGATCCTGCATCACCATGTCAATCCTAAGCCGCAAGAACAAAGCTGGAGGCATCAGTCTACCTGACTTCGAACTATACTACAGGGATACAGTAACCAAGACATCATGGTACTGGTACCAAAACAGAAATATAGATCAATGGAACAGAACAGAGCCCTCAGAAATAATGCTGCATATCTACAACTATCTGTTCTTTGATAAACCTGAACAAAACAAGCAATGGGGAAAGGATTCCCTATTTAATAAATGGTGCTGGGAAAACTGGCTAGCCATATGTAGAAAGCTGAAACTGGATCCCTTCCTTACACCTTATACAAAAATTAATTCAAGATGGATTAAAAAGTTaaacgttagacctaaaaccatAAAAATCCTAGAAGAAAACCTAGGCATTACCATTCAGGACATAGGCATGGGCAAGGACTTCATGTTTAAAACAACAAAAGCAATGGCAACAAATGCCATAATTGACAAATGGGATCTAATTAAACTAAAGAGCTTCTGCACAGCAAAAGAAACTACCAGCAGAGTGAACAGGCAATGTACAAAATGGGAGATAGTTTTCACAACCTACTTATCTGACAAAGGGCTAATATCCAGAATCTACAATGAACTCAAACAAATTTACAAGAAaaaaacaaacaaaGCCATCAAAAAGTGGGTGAAGGACATCAACAGACACTTCCC-AAAGAAGACATTTATGCAGCC-AAAAACACATGAAAAAATGCTCATC
+
+a score=3398 mismap=6.76e-06
+s L1PA3_3end#LINE/L1 149 750 + 902 ACTGGCCATCAGAGAAATGCAAATCAAAACCACAATGAGATACCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGTGGAGAAATAGGAACACTTTTACACTGTTGGTGGGACTGTAAACTAGTTCAACCATTGTGGAAGTCAGTGTGGCGATTCCTCAGGGATCTAGAACTAGAAATACCATTTGACCCAGCCATCCCATTACTGGGTATATACCCAAAGGACTATAAATCATGCTGCTATAAAGACACATGCACACGTATGTTTATTGCGGCACTATTCACAATAGCAAAGACTTGGAACCAACCCAAATGTCCAACAATGATAGACTGGATTAAGAAAATGTGGCACATATACACCATGGAATACTATGCAGCCATA-AAAAATGATGAGTTCATGTCCTTTGTAGGGACATGGATGAAATTGGAAATCATCATTCTCAGTAAACTATCGCAAGAACAAAAAA-CCAAACACCGCATATTCTCACTCATAGGTGGGAATTGAACAATGAGAACACATGGACACAGGAAGGGGAACATCACACTCTGGGGACTGTTGTGGGgtggggggaggggggagggATAGCATTGGGAGATATACCTAATGCTAGATGACGAGTTAGTGGGTGCAGCGCACCAGCATGGCACATGTATACATATGTAACTAACCTGCACATTGTGCACATGTACCCTAAAACTTAAAGTATAataataataaaa
+s chr15 17018992 749 + 101991189 ACTGGCCATCAGAGAAATGCAAATGAAAACCACAATGAGATACCACCTCACACCAGTTAGGATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGTTGGAGAGGATGTGGAGAAATAGAAACATTTTTACACTTTTGATGGGGCTGTAAACTAATTCAGCCATTGTGGAAGTCAGTGTGGCGATTCCTCAGGGAAGTAGAGCTAGAAATACAATTTGACCCAGCCATCCCATTACTGGGCACGTACCCAA-GGACTATAAATCATGCTGCTATAAAGACACATGCACACGTATGTTTATTGCAGCACTATTCACAATTGCA--GACTTGGAAACAACCCAAATGTCCAGCAATAACAGAGTGGATTAAGAAAACGTGGCACATATGTACAATGGAATACTATGGAGCCATACAAAAATGATGAGTTCATGTCCTTTGTAGGGACATGGATGAAACTGGAAATCATCATTCTCAGTAAACTATTGCAAGGACAAAAAAACCAAACACTGCATGTTCTCACTCATAGGTGGGAATTGAGCGATGAGAACACATGGACACAGGAAGGGGAACATCACACTCTGGGGACTGATGTGAGGTGgggggaggagggaggGATAGCATTAGGAGATATACCTAATGCTAAATGACGAGTTAATGGGTGCAGCACACCAACATGGCACATGTACACATATGTAACTAACCTGTACAATGTGCACATGTACCCTAAAACTTAAAGTATAataataataaaa
+
+a score=2158 mismap=3.89e-09
+s L1PA3_3end#LINE/L1 0 436 - 902 TTTttttattattattatACTTTAAGTTTTAGGGTACATGTGCACAATGTGCAGGTTAGTTACATATGTATACATGTGCCATGCTGGTGCGCTGCACCCACTAACTCGTCATCTAGCATTAGGTATATCTCCCAATGCTATCCCTCCCccctccccccaccccacaACAGTCCCCAGAGTGTGATGTTCCCCTTCCTGTGTCCATGTGTTCTCATTGTTCAATTCCCACCTATGAGTGAGAATATGCGGTGTTTGGTTTTTTGTTCTTGCGATAGTTTACTGAGAATGATGATTTCCAATTTCATCCATGTCCCTACAAAGGACATGAACTCATCATTTTTTATGGCTGCATAGTATTCCATGGTGTATATGTGCCACATTTTCTTAATCCAGTCTATCATTGTTGGACATTTGGGTTGGTTCCAAGTCTTTGCTA
+s chr15 17035566 436 + 101991189 ttttcttattattattattcttTAAGTTTTAGGGTAAATGTGCACAATGTGCAGGTTAGTTACATATGTATACATGTGACATGCTGGTGTGCTGCACCCACTAACCCGTCATCTAGCATTAGATATATCTCCCAATGCTCTCCGTCCCccctccccccaccccacaACAGTCCCCAGAGTGTGATGTTCCCTTTCCTGTGTCCATGTGTTCTCATTGTTCAATTCCCACCTATGAGTGAGAATATGAGGTGTTTGGTTTTTTGTTCTTGCAATATTTTACTGAGAATGTTGATTTCCAATTTCATCCATGTCCCTACAAAGGACATGAACTCATCATTTTTTATGGCTGCATAGTATTCCATGGTGTATATGCGCCACATTTTTTTAATCCAGTCTATCATTGTTGGACATTTGGGTTGGTTCCAAGTCTTTGCTA
+
+a score=4053 mismap=8.38e-09
+s L1PA3_3end#LINE/L1 39 857 + 902 AAGAAaaaaacaaacaaCCCCATCAAAAAGTGGGCGAAGGACATGAACAGACACTTCTCAAAAGAAGACATTTATGCAGCCAAAAAACACATGAAAAAATGCTCACCATCACTGGCCATCAGAGAAATGCAAATCAAAACCACAATGAGATACCATCTCACACCAGTTAGAATGGCAATCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGTGGAGAAATAGGAACACTTTTACACTGTTGGTGGGACTGTAAACTAGTTCAACCATTGTGGAAGTCAGTGTGGCGATTCCTCAGGGATCTAGAACTAGAAATACCATTTGACCCAGCCATCCCATTACTGGGTATATACCCAAAGGACTATAAATCATGCTGCTATAAAGACACATGCACACGTATGTTTATTGCGGCACTATTCACAATAGCAAAGACTTGGAACCAACCCAAATGTCCAACAATGATAGACTGGATTAAGAAAATGTGGCACATATACACCATGGAATACTATGCAGCCATAAAAAATGATGAGTTCATGTCCTTTGTAGGGACATGGATGAAATTGGAAATCATCATTCTCAGTAAACTATCGCAAGAACAAAAAACCAAACACCGCATATTCTCACTCATAGGTGGGAATTGAACAATGAGAACACATGGACACAGGAAGGGGAACATCACACTCTGGGGACTGTTGTGGGgtggggggaggggggagggATAGCATTGGGAGATATACCTAATGCTAGATGACGAGTTAGTGGGTGCAGCGCACCAGCATGGCACATGTATACATATGTAACTAACCTGCACATTGTGCACATGTACCCTAAAACTTAAAGTATAataataata
+s chr15 17042713 856 + 101991189 aaaaaacaaacaaataaCCCCTTCAAAAAGTGGGTGAAGGACATGAACATACACTTCTCAAAAGAATATGTTTATGCAGCCAAAAAACACATGAAAAAATGCTCACCATCACTGGCCATCAGAGAAATGCAAATCAAAACCACAATGAAATACCATCACACACCAATTAGAATGGCAGTCATTAAAAAGTCAGGAAACAACAGGTGCTGGAGAGGATGTGGAGAAATAGGAACACTTTTACAAAGTGGGTGGCACTGTAAACTAGTACAACCATTGTAGAAGTCAGTGTGGCGATTCCTCAGGGATCTAGAACCAGAAATACCATTTGACCCAGCCATCCCATTACTGGGTATGTACCCAAAGGACTATAAATCATGCTGCTATAAAGACACATGCACACGTATGTTTATTGCAGCACTAGTCACAATAGCAAAGAGTTGGAACCAACCCAAATGTCCAACAATGATAGACTTGATTAAGAAAATATGGCACATATACACCATGGAATACTATGCAGCCATAAAAAATGATGAGTTCATGTCCTTTGTAGTGACATGGATGAAATTGGAAATCATCATTCTCAGTAAACTATTGCAAGAATAAAAGACCAAACACCGCATATTCTCACTCATAAGTGGGAATTGAACAATGAGAACACATGGACACAGGAAGGGGAATGTCATACTCTGGGGACTGTTGTGGGGTGTGGTGA-GGCGCAGGGATAGCATTATGAGATATACCTAATGCTAAATGACGAGTTAATGGGTGCAGCACACCAGCGTGGTACATGTATACATGTGTAACTAACCTGCACATTGTGCACATGTACCCTAAAACTTAAAGTATAataataata
+
+a score=4305 mismap=5.26e-08
+s L1PA2_3end#LINE/L1 9 853 - 902 ttttattatACTTTAAGTTTTAGGGTACATGTGCACATTGTGCAGGTTAGTTACATATGTATACATGTGCCATGCTGGTGCGCTGCACCCACTAACTCGTCATCTAGCATTAGGTATATCTCCCAATGCTATCCCTCCcccctccccccaccccaccaCAGTCCCCAGAGTGTGATATTCCCCTTCCTGTGTCCATGTGATCTCATTGTTCAATTCCCACCTATGAGTGAGAATATGCGGTGTTTGGTTTTTTGTTCTTGCGATAGTTTACTGAGAATGATGATTTCCAATTTCATCCATGTCCCTACAAAGGACATGAACTCATCATTTTTTATGGCTGCATAGTATTCCATGGTGTATATGTGCCACATTTTCTTAATCCAGTCTATCATTGTTGGACATTTGGGTTGGTTCCAAGTCTTTGCTATTGTGAATAATGCCGCAATAAACATACGTGTGCATGTGTCTTTATAGCAGCATGATTTATAGTCCTTTGGGTATATACCCAGTAATGGGATGGCTGGGTCAAATGGTATTTCTAGTTCTAGATCCCTGAGGAATCGCCACACTGACTTCCACAATGGTTGAACTAGTTTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCTCTCCAGCACCTGTTGTTTCCTGACTTTTTAATGATTGCCATTCTAACTGGTGTGAGATGGTATCTCATTGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGATGAGCATTTTTTCATGTGTTTTTTGGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCGCCCACTTTTTGATGGGGTTGTttgtttttttcT
+s chr15 17049334 852 + 101991189 TTTTATTATACTTTAAGTTTTAGGGTACATGTGCACATTGTGCAGATTAGTTACATATGTATACATGTGACATGCTGGTGCGCTGCAGCCACTAACTCGTCATCTAGTATTAGGTATATTTCCTAATGCTATCCCTCCCTCCTGCCCCCACCCCACCACAGTCCCCAGAGTGTGATATTCCCCTTCCTGTGTCCATGTGATCTCATTGTTCAATTCCCACCTATGAGTGAAAATATGCGGTGTTTGGTTTTT-GTTCTTGTGATAGTTTGCTGAGAATGATGGTTTCCAGTTTCATCCATGTCTCTACAAAGGACATGAACTCTTCATTTTTTATGGCTGCATAGTATTCCATGGTGTATATGTACCACATTTTCTTAATCCAGTCTATGGTTGTTGGACATTTGGGTTGGTTCCAAGTCTTTGCTATTGTGAATAATGCCACAATAAACATATGTGTGCATGTGTCTTTATAGCAGCATGATTTATAGTCCTTTGGGTATATACCCAGTAATGGGATGGCTGGGTCAAATGGTATTTCCAGTTCTAGATCCCTGAGGAATCGCCACACTGACTTCCACAATGGTTGAACTAGTTTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCTCTCCAGCACCTGTAGTTTCCTGACTTTTTAATGATTGCCATTCTAACTGGTGTGAGATGATATCACATTGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGATGAGCATTTTTTCATGTGTTTTTTGGCTGCATAAATGTCTTCTTTTGAGAAGTGTCTGTTCATGTCCTTCACCCACTTTTTGATGGGGTTGTTtgttttttTCT
+
+a score=2956 mismap=1e-10
+s L1P1_orf2#LINE/L1 111 575 - 3294 GTAAATTTGTTTGAGTTCATTGTAGATTCTGGATATTAGCCCTTTGTCAGATGAGTAGGTTGCGAAAATTTTCTCCCATTCTGTAGGTTGCCTGTTCACTCTGATGGTAGTTTCTTTTGCTGTGCAGAAGCTCTTTAGTTTAATTAGATCCCATTTGTCAATTTTGGCTTTTGTTGCCATTGCTTTTGGTGTTTTAGACATGAAGTCCTTGCCCATGCCTATGTCCTGAATGGTANTGCCTAGGTTTTCTTCTAGGGTTTTTATGGTTTTAGGTCTAACGTTTAAGTCTTTAATCCATCTTGAATTAATTTTTGTATAAGGTGTAAGGAAGGGATCCAGTTTCAGCTTTCTACATATGGCTAGCCAGTTTTCCCAGCACCATTTATTAAATAGGGAATCCTTTCCCCATTGCTTGTTTTTCTCAGGTTTGTCAAAGATCAGATAGTTGTAGATATGCGGCGTTATTTCTGAGGGCTCTGTTCTGTTCCATTGATCTATATCTCTGTTTTGGTACCAGTACCATGctgttttggttactgtaGCCTTGTAGTATAGTTTGAAGTCAGGTAGCGTGA
+s chr15 17050187 575 + 101991189 GTAAATTTGTTTGAGTTCATTGTAGATTCTGGATATTAGCCCTTTGTCAGATGAGTAGGTTGCAAAAATTTTCTCCCATTTTGTAGGTTGCCTGTTCACTCTGATGGTAGTTTCTTTTGCTGTGCAGAAGCTCTTTAGTTTAATTAGATCCTATTTGTCAATTTTGTCTTTTGTTGCCATTGCTTTTGGTGTTTTGGACATGAAGTCCTTGCCCATGCCTATGTCCTGAATGGTAATGCTTAGGTTTTCTTCTAGGGTTTTTATGGTTTTAGGTCTAACGTTTAAATCTTTAATCCATCTTGAATTGATTTTTGTATAAGGTGTAAGGAAGGGATCCAGTTTCAGCTTTCTACATATGGCTAGCCAGTTTTCCCAGCACCATTTATTAAATAGGGAATCCTTTCCCCATTTCTTGTTTTTCTCAGGTTTGTCAAAGGTCAGATAGTTGTAGATATGCGGTGTTATTTCTGAGGGCTCTGTTCTGTTCCATTGATCTATATCTCTGTTTTGGTACCAGTACCATGctgttttggttactgtaGCCTTGTAGTACAGTTTGAAGTCAGGTAGTGTGA
+
View it on GitLab: https://salsa.debian.org/med-team/last-align/-/compare/52b5297cba13e88371f6fe2844d22f53da75892e...4a58fc4334ab31dd6ff1af182535db0c43e0a5fa