[Debian-med-packaging] Bug#956324: Clustalo bus error on mipsel (Was: Bug#956324: python-biopython: FTBFS on mipsel)
Matthew Fernandez
matthew.fernandez at gmail.com
Sat Apr 18 00:28:23 BST 2020
> On Apr 17, 2020, at 13:18, Andreas Tille <tille at debian.org> wrote:
>
> Hi Matthew,
>
> On Fri, Apr 17, 2020 at 08:18:29AM -0700, Matthew Fernandez wrote:
>>> Thanks for the patch which I applied to packaging Git. I assume you
>>> want to express that while these fixes are definitely good coding
>>> practice the bus error problem is not fixed by it, right?
>>
>> Thanks, Andreas. It may fix the bus error, but I don’t have a MIPS machine
>> to test on. Some of those logging calls had the potential to leave you with
>> a misaligned stack pointer. IIUC unaligned loads on MIPS could cause such a
>> bus error.
>
> I tried with hope ... but failed:
>
> (sid_mipsel-dchroot)tille at eller:~/clustalo$ gdb --args src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
> GNU gdb (Debian 9.1-3) 9.1
> ...
> Reading symbols from src/clustalo...
> (gdb) run
> Starting program: /home/tille/clustalo/src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/mipsel-linux-gnu/libthread_db.so.1".
>
> Program received signal SIGBUS, Bus error.
> 0x5556a1b8 in PairDistances (distmat=0x7fff278c, mseq=0x55692a30, pairdist_type=<optimized out>, bPercID=<optimized out>, istart=0, iend=3, jstart=0, jend=3, fdist_in=0x0,
> fdist_out=0x0) at pair_dist.c:346
> 346 NewProgress(&prProgress, LogGetFP(&rLog, LOG_INFO),
OK, let me try a little harder :)
$ # enable debugging symbols and Address Sanitizer
$ CFLAGS="-g -fsanitize=address" CXXFLAGS="-g -fsanitize=address" ./configure
…
$ make clean && make
…
$ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
=================================================================
==30264==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7ffcfcbf5784 at pc 0x5620f0aa478c bp 0x7ffcfcbf56c0 sp 0x7ffcfcbf56b8
WRITE of size 4 at 0x7ffcfcbf5784 thread T0
#0 0x5620f0aa478b in PairDistances /home/matthew/clustal-omega-1.2.4/src/clustal/pair_dist.c:336
#1 0x5620f0a91d9f in AlignmentOrder /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:835
#2 0x5620f0a95c04 in Align /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:1221
#3 0x5620f0a90d76 in MyMain /home/matthew/clustal-omega-1.2.4/src/mymain.c:1192
#4 0x5620f0a88ca2 in main /home/matthew/clustal-omega-1.2.4/src/main.cpp:469
#5 0x7f3773d9009a in __libc_start_main ../csu/libc-start.c:308
#6 0x5620f0a89ad9 in _start (/home/matthew/clustal-omega-1.2.4/src/clustalo+0x2dad9)
Address 0x7ffcfcbf5784 is located in stack of thread T0
SUMMARY: AddressSanitizer: dynamic-stack-buffer-overflow /home/matthew/clustal-omega-1.2.4/src/clustal/pair_dist.c:336 in PairDistances
Shadow bytes around the buggy address:
0x10001f976aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001f976ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001f976ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001f976ad0: 00 00 00 00 00 00 00 00 00 00 00 00 ca ca ca ca
0x10001f976ae0: 04 cb cb cb cb cb cb cb 00 00 00 00 ca ca ca ca
=>0x10001f976af0:[04]cb cb cb cb cb cb cb 00 00 00 00 00 00 00 00
0x10001f976b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10001f976b10: f1 f1 f1 f1 00 f2 f2 f2 f2 f2 f2 f2 00 f2 f2 f2
0x10001f976b20: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 f2 f2 f2
0x10001f976b30: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
0x10001f976b40: 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 f2 f2
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==30264==ABORTING
Looking at line 336 of pair_dist.c, the bound on the containing loop looks off by one: with `<=` it runs iNumberOfThreads + 1 times. So let’s try adjusting that:
$ vim src/clustal/pair_dist.c
$ git diff src/clustal/pair_dist.c
diff --git a/src/clustal/pair_dist.c b/src/clustal/pair_dist.c
index e6dbdc3..bb79e61 100644
--- a/src/clustal/pair_dist.c
+++ b/src/clustal/pair_dist.c
@@ -321,7 +321,7 @@ PairDistances(symmatrix_t **distmat, mseq_t *mseq, int pairdist_type, bool bPerc
/* FIXME: can get rid of iChunkStart, iChunkEnd now that we're using the arrays */
iChunkStart = iend;
- for(iChunk = 0; iChunk <= iNumberOfThreads; iChunk++)
+ for(iChunk = 0; iChunk < iNumberOfThreads; iChunk++)
{
iChunkEnd = iChunkStart;
if (iChunk == iNumberOfThreads - 1){
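To show the off-by-one in isolation, here is a minimal C sketch. It is not clustalo code: the helper name `out_of_range_iterations` is made up, and it assumes the per-chunk arrays have exactly one slot per thread, which the diff above does not show.

```c
#include <stddef.h>

/* Hypothetical helper, not part of clustal-omega.
 * Counts how many iterations of a loop starting at 0 would index past an
 * array of `slots` elements. With inclusive_bound != 0 the loop condition
 * is `i <= slots`; otherwise it is `i < slots`. */
static size_t out_of_range_iterations(size_t slots, int inclusive_bound)
{
    size_t bad = 0;
    size_t last = inclusive_bound ? slots + 1 : slots;

    for (size_t i = 0; i < last; i++) {
        if (i >= slots) {
            bad++; /* this iteration would write one element past the end */
        }
    }
    return bad;
}
```

Under that assumption, the `<=` form yields exactly one out-of-range iteration for any thread count, which would be consistent with the single 4-byte write ASan flagged; the `<` form yields none.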
$ make
…
$ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
=================================================================
==30601==ERROR: AddressSanitizer: global-buffer-overflow on address 0x561188847864 at pc 0x5611886da6e7 bp 0x7fffe6d77ef0 sp 0x7fffe6d77ee8
READ of size 4 at 0x561188847864 thread T0
#0 0x5611886da6e6 in FullAlignment::Build(HMM&, Hit&, char*) /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:250
#1 0x5611886df3eb in HitList::PrintAlignments(char**, char**, char*, char*, HMM&, char*, char) /home/matthew/clustal-omega-1.2.4/src/hhalign/hhhitlist-C.h:197
#2 0x5611886f379b in hhalign /home/matthew/clustal-omega-1.2.4/src/hhalign/hhalign.cpp:1211
#3 0x56118863f848 in HHalignWrapper /home/matthew/clustal-omega-1.2.4/src/clustal/hhalign_wrapper.c:1342
#4 0x561188637db1 in Align /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:1250
#5 0x561188632d76 in MyMain /home/matthew/clustal-omega-1.2.4/src/mymain.c:1192
#6 0x56118862aca2 in main /home/matthew/clustal-omega-1.2.4/src/main.cpp:469
#7 0x7f6d857f109a in __libc_start_main ../csu/libc-start.c:308
#8 0x56118862bad9 in _start (/home/matthew/clustal-omega-1.2.4/src/clustalo+0x2dad9)
0x561188847864 is located 60 bytes to the left of global variable 'Sim' defined in 'hhdecl-C.h:234:7' (0x5611888478a0) of size 1764
0x561188847864 is located 0 bytes to the right of global variable 'S' defined in 'hhdecl-C.h:235:7' (0x561188847180) of size 1764
SUMMARY: AddressSanitizer: global-buffer-overflow /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:250 in FullAlignment::Build(HMM&, Hit&, char*)
Shadow bytes around the buggy address:
0x0ac2b1100eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac2b1100ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac2b1100ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac2b1100ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac2b1100ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ac2b1100f00: 00 00 00 00 00 00 00 00 00 00 00 00[04]f9 f9 f9
0x0ac2b1100f10: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac2b1100f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac2b1100f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac2b1100f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac2b1100f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==30601==ABORTING
Looking at line 250 of hhfullalignment-C.h, we can see it’s reading the array ‘S’ out of bounds here. Someone has helpfully left a debugging line below this, so let’s shuffle it ahead of the faulting access and remove the part where it is also performing the faulting access:
$ vim src/hhalign/hhfullalignment-C.h
$ git diff src/hhalign/hhfullalignment-C.h
diff --git a/src/hhalign/hhfullalignment-C.h b/src/hhalign/hhfullalignment-C.h
index 8f40fd1..fd759f9 100644
--- a/src/hhalign/hhfullalignment-C.h
+++ b/src/hhalign/hhfullalignment-C.h
@@ -247,8 +247,8 @@ FullAlignment::Build(HMM& q, Hit& hit, char zcError[])
char qc=qa->seq[ q.nfirst][ qa->m[ q.nfirst][hit.i[step]] ];
char tc=ta->seq[hit.nfirst][ ta->m[hit.nfirst][hit.j[step]] ];
if (qc==tc) identities++; // count identical amino acids
+ fprintf(stderr,"%3i %3i %3i %3i %3i %1c %1c %6.2f %6.2f %6.2f \n",step,hit.nsteps,hit.i[step],hit.j[step],int(state),qc,tc,score_sim,hit.P_posterior[step],hit.sum_of_probs); //DEBUG
score_sim += S[(int)aa2i(qc)][(int)aa2i(tc)];
- // fprintf(stderr,"%3i %3i %3i %3i %3i %1c %1c %6.2f %6.2f %6.2f %6.2f \n",step,hit.nsteps,hit.i[step],hit.j[step],int(state),qc,tc,S[(int)aa2i(qc)][(int)aa2i(tc)],score_sim,hit.P_posterior[step],hit.sum_of_probs); //DEBUG
}
}
$ make
…
$ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out temp_test.dnd -o temp_test.aln --outfmt clustal --force
…
28 582 386 559 10 N - 127.25 0.01 2.91
=================================================================
==30936==ERROR: AddressSanitizer: global-buffer-overflow on address 0x563d5b2258a4 at pc 0x563d5b0b79e8 bp 0x7ffd269e0e40 sp 0x7ffd269e0e38
READ of size 4 at 0x563d5b2258a4 thread T0
#0 0x563d5b0b79e7 in FullAlignment::Build(HMM&, Hit&, char*) /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:251
…
So the values of qc and tc at this point are 'N' and '-', respectively. This results in an access to S[20][21], which is indeed out of range as S is a 21x21 array. To go further, I think I need some guidance from a domain expert. Is aa2i() ever expected to be called with a value that maps to GAP or ANY? Maybe S is actually meant to be a 22x22 array? Maybe the loop in hhfullalignment-C.h is meant to skip any iteration for which qc or tc map to GAP?
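For illustration only, here is a minimal C sketch of the last option, skipping any pair whose indices fall outside the matrix. The names `MATRIX_DIM`, `lookup_score` and `sum_scores` are hypothetical and not part of clustal-omega, and the `float` element type is an assumption based on the 4-byte read in the ASan report. Whether skipping is the right behaviour is exactly the open question above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dimension, matching the 21x21 size of S (1764 bytes of
 * 4-byte elements) reported by AddressSanitizer. */
#define MATRIX_DIM 21

/* Store matrix[qi][ti] in *out and return true if both indices are in
 * range; otherwise return false and leave *out untouched. */
static bool lookup_score(float matrix[MATRIX_DIM][MATRIX_DIM],
                         int qi, int ti, float *out)
{
    if (qi < 0 || qi >= MATRIX_DIM || ti < 0 || ti >= MATRIX_DIM) {
        return false;
    }
    *out = matrix[qi][ti];
    return true;
}

/* Sum the scores of n index pairs, skipping any pair that would read
 * outside the matrix instead of performing the access. */
static float sum_scores(float matrix[MATRIX_DIM][MATRIX_DIM],
                        const int *qidx, const int *tidx, size_t n)
{
    float total = 0.0f;

    for (size_t k = 0; k < n; k++) {
        float score;
        if (lookup_score(matrix, qidx[k], tidx[k], &score)) {
            total += score;
        }
    }
    return total;
}
```

With a guard like this, the pair (20, 21) from the run above would be skipped rather than read past the end of the array. If S is really meant to be 22x22, the guard would instead hide a sizing bug, so it is a diagnostic aid rather than a proposed fix.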
By the way, Andreas, I am doing this debugging on the upstream 1.2.4 release on an x86-64 machine so I still have no certainty that this is related to the root cause of your observed problem on MIPS.