[Debian-med-packaging] Bug#1095434: sambamba: rare segmentation fault makes nanosv autopkgtest flaky
Étienne Mollier
emollier at debian.org
Fri Feb 7 20:41:29 GMT 2025
Package: sambamba
Version: 1.0.1+dfsg-2
Severity: wishlist
Greetings,
It seems that nanosv's autopkgtest may fail from time to time
with the following error:
autopkgtest [20:32:10]: test run-unit-test: [-----------------------
Fri Feb 7 19:32:10 2025 Busy with calculating the coverage distribution...
Segmentation fault
Can't calculate coverage distribution. The bed file may be inappropriate for your bam file.
autopkgtest [20:32:10]: test run-unit-test: -----------------------]
I first noticed it while looking at the migration excuses for
python-pysam[1], and reproduced the error after several attempts
at running the test on my machine.
[1]: https://ci.debian.net/packages/n/nanosv/testing/amd64/57524433/
I initially thought the problem was in nanosv, but it is a
Python script, which makes a segmentation fault in nanosv itself
unlikely. The crash more probably comes from a program invoked
from within nanosv. Indeed, searching the nanosv source for the
message printed just before the error leads to this snippet:
sys.stderr.write(time.strftime("%c") + " Busy with calculating the coverage distribution...\n")
if 'sambamba' in NanoSV.opts_sambamba:
    cmd = NanoSV.opts_sambamba + " depth base --min-coverage=0 " + NanoSV.opts_bam + " -L " + NanoSV.opts_bed + " --nthreads=" + str(NanoSV.opts_threads) + " 2> /dev/null | awk '{if (NR!=1) print $3}'"
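For illustration, the same pipeline can be sketched without a shell, so that the exit status of the first stage stays visible (in a plain shell pipeline the reported status is that of the last command, here awk). This is only a minimal sketch, not nanosv code: the function name coverage_column is hypothetical, and cmd is whatever argument list stands in for the sambamba invocation.

```python
import subprocess


def coverage_column(cmd):
    """Run cmd and return (third-column values without the header, return code).

    Mimics `2> /dev/null | awk '{if (NR!=1) print $3}'` while keeping the
    exit status of the command itself. Hypothetical helper, not part of nanosv.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # same effect as 2> /dev/null
        text=True,
    )
    values = []
    for line_number, line in enumerate(proc.stdout.splitlines(), start=1):
        fields = line.split()
        # Skip the header line (NR == 1). awk would print an empty $3 for
        # lines with fewer than three fields; this sketch ignores them.
        if line_number != 1 and len(fields) >= 3:
            values.append(fields[2])
    # On POSIX, a negative return code -N means the process died from signal N.
    return values, proc.returncode
```

With such a helper, a run that prints all the coverage values and then crashes would show up as a complete list together with a negative return code.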
I then derived the corresponding command, using the test files
provided by the samtools-test package, preprocessed the same way
as in the run-unit-test script of the nanosv autopkgtest:
$ sambamba depth base --min-coverage=0 reads_sort.bam \
-L target.bed --nthreads=12 2>/dev/null
which, when run several times, sometimes ends in a segmentation
fault instead of returning normally, while still outputting the
expected results:
$ sambamba depth base --min-coverage=0 reads_sort.bam \
-L target.bed --nthreads=12 2>/dev/null
REF POS COV A C G T DEL REFSKIP SAMPLE
ref 6 1 0 0 0 1 0 0 *
ref 7 1 0 0 0 1 0 0 *
ref 8 3 3 0 0 0 0 0 *
ref 9 3 0 0 3 0 0 0 *
[…]
ref2 32 3 3 0 0 0 0 0 *
ref2 33 3 0 3 0 0 0 0 *
ref2 34 2 0 0 0 2 0 0 *
ref2 35 1 1 0 0 0 0 0 *
Segmentation fault
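Since the crash only happens on some runs, repeating the command and counting the abnormal exits gives a rough idea of its frequency. The following is a minimal sketch under stated assumptions: count_segfaults and SAMBAMBA_CMD are hypothetical names, and the command line simply repeats the one shown above.

```python
import signal
import subprocess

# Same command line as above, as an argument list (no shell involved).
SAMBAMBA_CMD = [
    "sambamba", "depth", "base", "--min-coverage=0", "reads_sort.bam",
    "-L", "target.bed", "--nthreads=12",
]


def count_segfaults(cmd, attempts):
    """Run cmd `attempts` times and return how many runs died from SIGSEGV."""
    crashes = 0
    for _ in range(attempts):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a negative return code -N means the child was killed
        # by signal N.
        if proc.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes
```

For example, count_segfaults(SAMBAMBA_CMD, 200) would report how many of 200 runs were killed by SIGSEGV; the figure 200 is arbitrary.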
Trying to trigger the problem in gdb, I have not been able to
reproduce any crash after several dozen attempts with the
single-threaded variant. After struggling to capture a
backtrace, I eventually ended up with the following failure
mode in the debugger:
Continuing.
Processing reference #1 (ref)
ref 6 1 0 0 0 1 0 0 *
ref 7 1 0 0 0 1 0 0 *
ref 8 3 3 0 0 0 0 0 *
ref 9 3 0 0 3 0 0 0 *
ref 10 3 2 1 0 0 0 0 *
ref 11 3 0 0 0 3 0 0 *
ref 12 3 3 0 0 0 0 0 *
ref 13 3 3 0 0 0 0 0 *
ref 14 2 0 0 2 0 0 0 *
ref 15 3 3 0 0 0 0 0 *
ref 16 3 0 0 0 3 0 0 *
ref 17 3 3 0 0 0 0 0 *
ref 18 2 0 0 1 0 1 0 *
ref 19 2 0 2 0 0 0 0 *
ref 20 2 0 0 0 2 0 0 *
ref 21 2 0 0 1 0 0 1 *
ref 22 1 0 0 0 0 0 1 *
ref 23 1 0 0 0 0 0 1 *
ref 24 1 0 0 0 0 0 1 *
ref 25 1 0 0 0 0 0 1 *
ref 26 1 0 0 0 0 0 1 *
ref 27 1 0 0 0 0 0 1 *
ref 28 2 0 0 0 1 0 1 *
ref 29 2 1 0 0 0 0 1 *
ref 30 2 0 0 1 0 0 1 *
ref 31 2 0 0 1 0 0 1 *
ref 32 2 0 1 0 0 0 1 *
ref 33 1 0 0 0 0 0 1 *
ref 34 1 0 0 0 0 0 1 *
ref 35 1 0 0 0 1 0 0 *
ref 36 2 0 2 0 0 0 0 *
ref 37 2 2 0 0 0 0 0 *
ref 38 2 0 0 2 0 0 0 *
ref 39 2 0 2 0 0 0 0 *
ref 40 1 0 0 1 0 0 0 *
ref 41 1 0 1 0 0 0 0 *
ref 42 1 0 1 0 0 0 0 *
ref 43 1 1 0 0 0 0 0 *
ref 44 1 0 0 0 1 0 0 *
Processing reference #2 (ref2)
ref2 0 1 1 0 0 0 0 0 *
ref2 1 2 0 0 2 0 0 0 *
ref2 2 2 0 0 2 0 0 0 *
ref2 3 2 0 0 0 2 0 0 *
ref2 4 2 0 0 0 2 0 0 *
ref2 5 3 0 0 0 3 0 0 *
ref2 6 3 0 0 0 3 0 0 *
ref2 7 3 3 0 0 0 0 0 *
ref2 8 3 0 0 0 3 0 0 *
ref2 9 4 3 1 0 0 0 0 *
ref2 10 4 4 0 0 0 0 0 *
ref2 11 5 5 0 0 0 0 0 *
ref2 12 5 5 0 0 0 0 0 *
ref2 13 6 0 3 0 3 0 0 *
ref2 14 6 6 0 0 0 0 0 *
ref2 15 6 6 0 0 0 0 0 *
ref2 16 6 2 0 0 4 0 0 *
ref2 17 6 0 0 0 6 0 0 *
Thread 6 "sambamba" received signal SIG35, Real-time event 35.
[Switching to Thread 0x7ffff4ffb6c0 (LWP 931343)]
0x00007ffff76eb051 in __GI___sigsuspend (set=0x7ffff4ff9368) at ../sysdeps/unix/sysv/linux/sigsuspend.c:26
26 in ../sysdeps/unix/sysv/linux/sigsuspend.c
(gdb)
Continuing.
ref2 18 6 6 0 0 0 0 0 *
ref2 19 6 6 0 0 0 0 0 *
ref2 20 5 0 0 4 1 0 0 *
ref2 21 5 0 0 0 5 0 0 *
ref2 22 4 0 4 0 0 0 0 *
ref2 23 4 0 0 0 4 0 0 *
ref2 24 4 4 0 0 0 0 0 *
ref2 25 4 0 4 0 0 0 0 *
ref2 26 4 4 0 0 0 0 0 *
ref2 27 3 0 0 3 0 0 0 *
ref2 28 3 3 0 0 0 0 0 *
ref2 29 3 0 0 3 0 0 0 *
ref2 30 3 0 3 0 0 0 0 *
ref2 31 3 3 0 0 0 0 0 *
ref2 32 3 3 0 0 0 0 0 *
ref2 33 3 0 3 0 0 0 0 *
ref2 34 2 0 0 0 2 0 0 *
ref2 35 1 1 0 0 0 0 0 *
[Thread 0x7ffff6fff6c0 (LWP 931339) exited]
[Thread 0x7ffff67fe6c0 (LWP 931340) exited]
[Thread 0x7ffff5ffd6c0 (LWP 931341) exited]
[Thread 0x7ffff57fc6c0 (LWP 931342) exited]
[Thread 0x7fffdffff6c0 (LWP 931344) exited]
[Thread 0x7fffd75ff6c0 (LWP 931345) exited]
[Thread 0x7fffdf7fe6c0 (LWP 931346) exited]
[Thread 0x7fffdeffd6c0 (LWP 931347) exited]
[Thread 0x7fffde7fc6c0 (LWP 931348) exited]
[Thread 0x7fffddffb6c0 (LWP 931349) exited]
[Thread 0x7fffdd7fa6c0 (LWP 931350) exited]
Thread 6 "sambamba" received signal SIGSEGV, Segmentation fault.
0x00007ffff7a9abe0 in object.ModuleInfo.tlsctor() const () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
(gdb) bt
#0 0x00007ffff7a9abe0 in object.ModuleInfo.tlsctor() const () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#1 0x00007ffff7aa97d1 in ?? () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#2 0x00007ffff7aaa759 in rt.sections_elf_shared.DSO.opApply(scope int(ref rt.sections_elf_shared.DSO) delegate) () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#3 0x00007ffff7a93320 in thread_entryPoint () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#4 0x00007ffff773d083 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#5 0x00007ffff77bb7b8 in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
I'm not sure how to interpret this output overall. As far as I
can tell, there seems to be something like a race condition in
the handling of the multiple threads of execution.
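Because the crash shows up only on some runs, the gdb session could be automated in batch mode and repeated until a SIGSEGV is caught. The sketch below is an assumption-laden illustration, not a tested recipe for this bug: gdb_batch_argv and run_until_segfault are hypothetical helper names, and telling gdb not to stop on SIG35 assumes that this real-time signal is used internally by the process and is harmless, which I have not verified.

```python
import subprocess


def gdb_batch_argv(program_argv):
    """Build a gdb command line that runs the program and dumps backtraces."""
    return [
        "gdb", "-batch",
        # Assumption: SIG35 is internal to the process, so pass it through
        # without stopping, unlike in the interactive session above.
        "-ex", "handle SIG35 nostop noprint pass",
        "-ex", "run",
        # Only meaningful if the program stopped on a signal.
        "-ex", "thread apply all bt",
        "--args", *program_argv,
    ]


def run_until_segfault(program_argv, attempts):
    """Return gdb's output for the first run that hits SIGSEGV, else None."""
    for _ in range(attempts):
        proc = subprocess.run(
            gdb_batch_argv(program_argv),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        if "SIGSEGV" in proc.stdout:
            return proc.stdout
    return None
```

Running under gdb changes thread timing, so a race may become rarer this way; the number of attempts needed is unknown.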
Anyway, I am filing this for information: the issue could very
well have gone unnoticed had debci not flagged a regression. I
set the severity to "wishlist" given how hard the problem is to
reproduce, and filed the bug so that it is not forgotten.
It may be of greater interest upstream.
Hoping this helps,
--
.''`. Étienne Mollier <emollier at debian.org>
: :' : pgp: 8f91 b227 c7d6 f2b1 948c 8236 793c f67e 8f0d 11da
`. `' sent from /dev/pts/3, please excuse my verbosity
`-