[Debian-med-packaging] Bug#1095434: sambamba: rare segmentation fault makes nanosv autopkgtest flaky

Étienne Mollier emollier at debian.org
Fri Feb 7 20:41:29 GMT 2025


Package: sambamba
Version: 1.0.1+dfsg-2
Severity: wishlist

Greetings,

It seems that nanosv's autopkgtest may fail from time to time
with the following error:

	autopkgtest [20:32:10]: test run-unit-test: [-----------------------
	Fri Feb  7 19:32:10 2025 Busy with calculating the coverage distribution...
	Segmentation fault
	Can't calculate coverage distribution. The bed file may be inappropriate for your bam file.
	autopkgtest [20:32:10]: test run-unit-test: -----------------------]

I first noticed it while looking at the python-pysam migration
excuses[1], and reproduced the error after several attempts at
running the test on my machine.

[1]: https://ci.debian.net/packages/n/nanosv/testing/amd64/57524433/

I initially suspected nanosv itself, but it is a Python script,
which is unlikely to segfault on its own.  I eventually figured
that the segmentation fault could actually come from a program
invoked from within nanosv.  Indeed, searching for the message
printed right before the error led me to this snippet:

	    sys.stderr.write(time.strftime("%c") + " Busy with calculating the coverage distribution...\n")
	    if 'sambamba' in NanoSV.opts_sambamba:
	        cmd = NanoSV.opts_sambamba + " depth base --min-coverage=0 " + NanoSV.opts_bam + " -L " + NanoSV.opts_bed + " --nthreads=" + str(NanoSV.opts_threads) + " 2> /dev/null | awk '{if (NR!=1) print $3}'"
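
To make the concatenation easier to read, here is a minimal
sketch of what it evaluates to.  The option values are
assumptions borrowed from the command quoted further down in
this report; they are not nanosv defaults.  Note that stderr of
sambamba is discarded and that only the third column (COV)
reaches nanosv, the header line being skipped by awk.

```python
# Hypothetical option values, borrowed from the command quoted in this
# report; they are assumptions, not nanosv defaults.
opts_sambamba = "sambamba"
opts_bam = "reads_sort.bam"
opts_bed = "target.bed"
opts_threads = 12

# Same concatenation as in the nanosv snippet above.
cmd = (opts_sambamba + " depth base --min-coverage=0 " + opts_bam
       + " -L " + opts_bed + " --nthreads=" + str(opts_threads)
       + " 2> /dev/null | awk '{if (NR!=1) print $3}'")

print(cmd)
```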

I eventually derived the corresponding command, using test files
provided by the samtools-test package, preprocessed the same way
as done by run-unit-test in the nanosv autopkgtest:

	$ sambamba depth base --min-coverage=0 reads_sort.bam \
		-L target.bed --nthreads=12 2>/dev/null

When run several times, this command occasionally ends with a
segmentation fault instead of returning normally, while still
outputting the expected results:

	$ sambamba depth base --min-coverage=0 reads_sort.bam \
		-L target.bed --nthreads=12 2>/dev/null
	REF	POS	COV	A	C	G	T	DEL	REFSKIP	SAMPLE
	ref	6	1	0	0	0	1	0	0	*
	ref	7	1	0	0	0	1	0	0	*
	ref	8	3	3	0	0	0	0	0	*
	ref	9	3	0	0	3	0	0	0	*
	[…]
	ref2	32	3	3	0	0	0	0	0	*
	ref2	33	3	0	3	0	0	0	0	*
	ref2	34	2	0	0	0	2	0	0	*
	ref2	35	1	1	0	0	0	0	0	*
	Segmentation fault
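
Since the crash is rare, a small harness may help to estimate
how often it happens.  The following is only a sketch: the
function name and the run count are made up for illustration,
and it relies on Python's subprocess module reporting death by
signal N as a return code of -N on POSIX systems.

```python
import signal
import subprocess


def count_segfaults(argv, runs):
    """Run argv `runs` times and return how many runs died from SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(argv, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, check=False)
        # On POSIX, subprocess reports death by signal N as returncode -N.
        if proc.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


# Command from this report; the run count of 200 is an arbitrary assumption.
SAMBAMBA_CMD = ["sambamba", "depth", "base", "--min-coverage=0",
                "reads_sort.bam", "-L", "target.bed", "--nthreads=12"]
# Example use: print(count_segfaults(SAMBAMBA_CMD, 200))
```

Running the program directly, rather than through a shell
pipeline, keeps its own exit status visible to the caller.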

Trying to trigger the problem in gdb, I have not been able to
reproduce any crash after several dozen attempts with the
single-threaded variant.  After struggling to capture a
backtrace, I eventually ended up with the following failure mode
in the debugger:

	Continuing.
	Processing reference #1 (ref)
	ref	6	1	0	0	0	1	0	0	*
	ref	7	1	0	0	0	1	0	0	*
	ref	8	3	3	0	0	0	0	0	*
	ref	9	3	0	0	3	0	0	0	*
	ref	10	3	2	1	0	0	0	0	*
	ref	11	3	0	0	0	3	0	0	*
	ref	12	3	3	0	0	0	0	0	*
	ref	13	3	3	0	0	0	0	0	*
	ref	14	2	0	0	2	0	0	0	*
	ref	15	3	3	0	0	0	0	0	*
	ref	16	3	0	0	0	3	0	0	*
	ref	17	3	3	0	0	0	0	0	*
	ref	18	2	0	0	1	0	1	0	*
	ref	19	2	0	2	0	0	0	0	*
	ref	20	2	0	0	0	2	0	0	*
	ref	21	2	0	0	1	0	0	1	*
	ref	22	1	0	0	0	0	0	1	*
	ref	23	1	0	0	0	0	0	1	*
	ref	24	1	0	0	0	0	0	1	*
	ref	25	1	0	0	0	0	0	1	*
	ref	26	1	0	0	0	0	0	1	*
	ref	27	1	0	0	0	0	0	1	*
	ref	28	2	0	0	0	1	0	1	*
	ref	29	2	1	0	0	0	0	1	*
	ref	30	2	0	0	1	0	0	1	*
	ref	31	2	0	0	1	0	0	1	*
	ref	32	2	0	1	0	0	0	1	*
	ref	33	1	0	0	0	0	0	1	*
	ref	34	1	0	0	0	0	0	1	*
	ref	35	1	0	0	0	1	0	0	*
	ref	36	2	0	2	0	0	0	0	*
	ref	37	2	2	0	0	0	0	0	*
	ref	38	2	0	0	2	0	0	0	*
	ref	39	2	0	2	0	0	0	0	*
	ref	40	1	0	0	1	0	0	0	*
	ref	41	1	0	1	0	0	0	0	*
	ref	42	1	0	1	0	0	0	0	*
	ref	43	1	1	0	0	0	0	0	*
	ref	44	1	0	0	0	1	0	0	*
	Processing reference #2 (ref2)
	ref2	0	1	1	0	0	0	0	0	*
	ref2	1	2	0	0	2	0	0	0	*
	ref2	2	2	0	0	2	0	0	0	*
	ref2	3	2	0	0	0	2	0	0	*
	ref2	4	2	0	0	0	2	0	0	*
	ref2	5	3	0	0	0	3	0	0	*
	ref2	6	3	0	0	0	3	0	0	*
	ref2	7	3	3	0	0	0	0	0	*
	ref2	8	3	0	0	0	3	0	0	*
	ref2	9	4	3	1	0	0	0	0	*
	ref2	10	4	4	0	0	0	0	0	*
	ref2	11	5	5	0	0	0	0	0	*
	ref2	12	5	5	0	0	0	0	0	*
	ref2	13	6	0	3	0	3	0	0	*
	ref2	14	6	6	0	0	0	0	0	*
	ref2	15	6	6	0	0	0	0	0	*
	ref2	16	6	2	0	0	4	0	0	*
	ref2	17	6	0	0	0	6	0	0	*
	
	Thread 6 "sambamba" received signal SIG35, Real-time event 35.
	[Switching to Thread 0x7ffff4ffb6c0 (LWP 931343)]
	0x00007ffff76eb051 in __GI___sigsuspend (set=0x7ffff4ff9368) at ../sysdeps/unix/sysv/linux/sigsuspend.c:26
	26	in ../sysdeps/unix/sysv/linux/sigsuspend.c
	(gdb) 
	Continuing.
	ref2	18	6	6	0	0	0	0	0	*
	ref2	19	6	6	0	0	0	0	0	*
	ref2	20	5	0	0	4	1	0	0	*
	ref2	21	5	0	0	0	5	0	0	*
	ref2	22	4	0	4	0	0	0	0	*
	ref2	23	4	0	0	0	4	0	0	*
	ref2	24	4	4	0	0	0	0	0	*
	ref2	25	4	0	4	0	0	0	0	*
	ref2	26	4	4	0	0	0	0	0	*
	ref2	27	3	0	0	3	0	0	0	*
	ref2	28	3	3	0	0	0	0	0	*
	ref2	29	3	0	0	3	0	0	0	*
	ref2	30	3	0	3	0	0	0	0	*
	ref2	31	3	3	0	0	0	0	0	*
	ref2	32	3	3	0	0	0	0	0	*
	ref2	33	3	0	3	0	0	0	0	*
	ref2	34	2	0	0	0	2	0	0	*
	ref2	35	1	1	0	0	0	0	0	*
	[Thread 0x7ffff6fff6c0 (LWP 931339) exited]
	[Thread 0x7ffff67fe6c0 (LWP 931340) exited]
	[Thread 0x7ffff5ffd6c0 (LWP 931341) exited]
	[Thread 0x7ffff57fc6c0 (LWP 931342) exited]
	[Thread 0x7fffdffff6c0 (LWP 931344) exited]
	[Thread 0x7fffd75ff6c0 (LWP 931345) exited]
	[Thread 0x7fffdf7fe6c0 (LWP 931346) exited]
	[Thread 0x7fffdeffd6c0 (LWP 931347) exited]
	[Thread 0x7fffde7fc6c0 (LWP 931348) exited]
	[Thread 0x7fffddffb6c0 (LWP 931349) exited]
	[Thread 0x7fffdd7fa6c0 (LWP 931350) exited]
	
	Thread 6 "sambamba" received signal SIGSEGV, Segmentation fault.
	0x00007ffff7a9abe0 in object.ModuleInfo.tlsctor() const () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
	(gdb) bt
	#0  0x00007ffff7a9abe0 in object.ModuleInfo.tlsctor() const () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
	#1  0x00007ffff7aa97d1 in ?? () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
	#2  0x00007ffff7aaa759 in rt.sections_elf_shared.DSO.opApply(scope int(ref rt.sections_elf_shared.DSO) delegate) () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
	#3  0x00007ffff7a93320 in thread_entryPoint () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
	#4  0x00007ffff773d083 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
	#5  0x00007ffff77bb7b8 in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

I'm not sure how to interpret this output overall.  As far as I
can tell, there seems to be something like a race condition in
the handling of the multiple threads of execution.
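
For anyone who wants to retry the capture, the session above
stopped on SIG35 before reaching the crash.  The sketch below
builds a gdb command line that lets this signal through without
stopping and prints all thread backtraces once the program
stops.  That SIG35 is harmless runtime-internal signalling is an
assumption on my side, based only on the thread sitting in
sigsuspend; the helper name is made up for illustration.

```python
def gdb_batch_argv(program_argv, passthrough_signal="SIG35"):
    """Build a gdb command line that runs the program non-interactively,
    passes the given signal through without stopping, and dumps every
    thread's backtrace once the program stops (for example on SIGSEGV)."""
    return [
        "gdb", "-batch",
        # Assumption: this signal is not related to the crash itself.
        "-ex", "handle " + passthrough_signal + " nostop noprint pass",
        "-ex", "run",
        "-ex", "thread apply all bt",
        "--args", *program_argv,
    ]


argv = gdb_batch_argv(["sambamba", "depth", "base", "--min-coverage=0",
                       "reads_sort.bam", "-L", "target.bed", "--nthreads=12"])
print(" ".join(argv))
```

The resulting list can be handed to a loop that reruns it until
a SIGSEGV backtrace shows up in the output.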

Anyway, for the record: this could very well have gone
unnoticed, had no regression shown up in debci.  I set the
severity to "wishlist" given how hard the problem is to
reproduce, and I am filing it so that it does not get forgotten.
Maybe it would be of greater interest upstream.

In hope this helps,
-- 
  .''`.  Étienne Mollier <emollier at debian.org>
 : :' :  pgp: 8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
 `. `'   sent from /dev/pts/3, please excuse my verbosity
   `-