Bug#852013: Patch to prevent segfaults on signal

Ximin Luo infinity0 at debian.org
Thu Feb 9 19:21:00 UTC 2017


This bug is already closed, but I took a closer look at the issue to understand what was actually going on.

Chris' commit fixed the cleanup issue, but I think Brett's FIFO patch was probably also needed to deal with the segfaults.

Brett Smith:
> [..] While I was debugging, I added the line
> `traceback.print_stack(stack_frame, file=sys.stderr)` to the top of
> sigterm_handler, before the exit.
> 
> When Python *did* segfault, it was always in the middle of trying to
> acquire a threading.Lock object.  [..]
> 
> I bring all this up to say, I'm not shocked that we might see different
> results on different systems.  Python itself and potentially glibc are
> involved here too.  For example, one theory I had is that the glibc call
> underlying the lock.acquire() method is not reentrant, but Python was
> trying to reenter it after handling SIGTERM in the main thread.
> 
> [..]

I'm not sure that your method of adding pdb calls will reliably catch the SEGV exactly where it occurs. It's better to use gdb:


$ diffoscope --version
diffoscope 67
$ apt-get install libc6-dbg python3-dbg
$ apt-get source glibc python3.5
$ diffoscope /usr/bin/pandoc /usr/bin/git-annex &
[..]

$ gdb -q python3 $(pgrep -f diffoscope) -d glibc-2.24/debian -d python3.5-3.5.3/debian
Reading symbols from python3...Reading symbols from /usr/lib/debug/.build-id/db/fc2e1a3c58b6d241b3f9af7b2fb3a24b81b90e.debug...done.
done.
Attaching to program: /usr/bin/python3, process 19406
[New LWP 19444]
[New LWP 19447]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fa48b9f3536 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x564087d107b0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205	  int err = lll_futex_timed_wait_bitset (futex_word, expected, abstime,


# Send INT to python3
# Have to do it this way when gdb is attached, kill(1) doesn't work
(gdb) signal 2
Continuing with signal SIGINT.

Thread 3 "diffoscope" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fa4794ce700 (LWP 19447)]
__memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/../multiarch/memmove-vec-unaligned-erms.S:333
333		VMOVU	%VEC(0), (%rdi)


# Python stack trace
(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python3/dist-packages/diffoscope/difference.py", line 209, in feeder
    out_file.write(out)
  File "/usr/lib/python3/dist-packages/diffoscope/difference.py", line 186, in feeder
    end_nl = make_feeder_from_raw_reader(command.stdout, command.filter)(out_file)
  File "/usr/lib/python3/dist-packages/diffoscope/diff.py", line 197, in feed
    end_nl = feeder(f)
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3/dist-packages/diffoscope/diff.py", line 213, in run
    super().run(*args, **kwargs)
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 882, in _bootstrap
    self._bootstrap_inner()

(gdb) py-locals
out_file = <_io.BufferedWriter at remote 0x7fa47ac099e8>
max_lines = 1048576
line_count = 287187
end_nl = False
h = <_hashlib.HASH at remote 0x7fa47abea328>
buf = b'0000000004295240  0000000000000008 R_X86_64_RELATIVE                         4295202\n'
out = b'0000000004295240  0000000000000008 R_X86_64_RELATIVE                         4295202\n'

(gdb) py-bt-full
#5 Frame 0x7fa47abdea20, for file /usr/lib/python3/dist-packages/diffoscope/difference.py, line 209, in feeder (out_file=<_io.BufferedWriter at remote 0x7fa47ac099e8>, max_lines=1048576, line_count=287187, end_nl=False, h=<_hashlib.HASH at remote 0x7fa47abea328>, buf=b'0000000004295240  0000000000000008 R_X86_64_RELATIVE                         4295202\n', out=b'0000000004295240  0000000000000008 R_X86_64_RELATIVE                         4295202\n')
    out_file.write(out)
#9 Frame 0x7fa474001e88, for file /usr/lib/python3/dist-packages/diffoscope/difference.py, line 186, in feeder (out_file=<_io.BufferedWriter at remote 0x7fa47ac099e8>)
    end_nl = make_feeder_from_raw_reader(command.stdout, command.filter)(out_file)
#13 Frame 0x7fa474001c78, for file /usr/lib/python3/dist-packages/diffoscope/diff.py, line 197, in feed (feeder=<function at remote 0x7fa47a989488>, f=<_io.BufferedWriter at remote 0x7fa47ac099e8>, end_nl_q=<Queue(queue=<collections.deque at remote 0x7fa47a991048>, [.. etc ..] ...(truncated)
    end_nl = feeder(f)
#19 Frame 0x7fa47a98b238, [.. etc ..]


# Machine stack trace
# Frames 0-5 are Python executing `out_file.write(out)`
(gdb) bt
#0  __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/../multiarch/memmove-vec-unaligned-erms.S:333
#1  0x00005640852b72e7 in memcpy () at /usr/include/x86_64-linux-gnu/bits/string3.h:53
#2  _io_BufferedWriter_write_impl.isra.2 (self=0x7fa47ac099e8) at ../Modules/_io/bufferedio.c:1952
#3  _io_BufferedWriter_write () at ../Modules/_io/clinic/bufferedio.c.h:376
#4  0x0000564085316cac in call_function (oparg=<optimized out>, pp_stack=0x7fa4794ccfc0) at ../Python/ceval.c:4705
#5  PyEval_EvalFrameEx () at ../Python/ceval.c:3251
#6  0x000056408531b7bf in _PyEval_EvalCodeWithName () at ../Python/ceval.c:4033
#7  0x00005640853174c9 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=0x7fa4794cd1d0, func=<optimized out>) at ../Python/ceval.c:4828
#8  call_function (oparg=<optimized out>, pp_stack=0x7fa4794cd1d0) at ../Python/ceval.c:4745
#9  PyEval_EvalFrameEx () at ../Python/ceval.c:3251
[.. etc ..]


(gdb) list
328		VMOVU	(VEC_SIZE * 3)(%rsi), %VEC(3)
329		VMOVU	-VEC_SIZE(%rsi,%rdx), %VEC(4)
330		VMOVU	-(VEC_SIZE * 2)(%rsi,%rdx), %VEC(5)
331		VMOVU	-(VEC_SIZE * 3)(%rsi,%rdx), %VEC(6)
332		VMOVU	-(VEC_SIZE * 4)(%rsi,%rdx), %VEC(7)
333		VMOVU	%VEC(0), (%rdi)
334		VMOVU	%VEC(1), VEC_SIZE(%rdi)
335		VMOVU	%VEC(2), (VEC_SIZE * 2)(%rdi)
336		VMOVU	%VEC(3), (VEC_SIZE * 3)(%rdi)
337		VMOVU	%VEC(4), -VEC_SIZE(%rdi,%rdx)
(gdb) up
#1  0x00005640852b72e7 in memcpy () at /usr/include/x86_64-linux-gnu/bits/string3.h:53
53	  return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
(gdb) up
#2  _io_BufferedWriter_write_impl.isra.2 (self=0x7fa47ac099e8) at ../Modules/_io/bufferedio.c:1952
1952	        memcpy(self->buffer + self->pos, buffer->buf, buffer->len);
(gdb) list
1947	        self->pos = 0;
1948	        self->raw_pos = 0;
1949	    }
1950	    avail = Py_SAFE_DOWNCAST(self->buffer_size - self->pos, Py_off_t, Py_ssize_t);
1951	    if (buffer->len <= avail) {
1952	        memcpy(self->buffer + self->pos, buffer->buf, buffer->len);
1953	        if (!VALID_WRITE_BUFFER(self) || self->write_pos > self->pos) {
1954	            self->write_pos = self->pos;
1955	        }
1956	        ADJUST_POSITION(self, self->pos + buffer->len);
(gdb) print self->buffer
$1 = 0x0
(gdb) print self->pos
$2 = 0
(gdb) print buffer->buf
value has been optimized out
(gdb) up
#3  _io_BufferedWriter_write () at ../Modules/_io/clinic/bufferedio.c.h:376
376	    return_value = _io_BufferedWriter_write_impl(self, &buffer);
(gdb) print buffer->buf
$3 = (void *) 0x7fa47ac1bb18
(gdb) print buffer->len
$4 = 85
(gdb) print self
$5 = (buffered *) 0x7fa47ac099e8
(gdb) print self->buffer
$6 = 0x0
(gdb) print self->pos
$7 = 0

In other words, the segfault occurs because Python calls memcpy(NULL, *, *) while trying to write to an _io.BufferedWriter: self->buffer is NULL at bufferedio.c:1952. I dug a little deeper, and the "file objects being misused across thread boundaries" theory seems plausible, but I didn't follow it all the way through to be sure. I'll take a closer look at that part of the code going forward, though.
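
To make the "misused across thread boundaries" theory concrete, here is a minimal sketch of one way to rule it out. It assumes (I have not confirmed this) that the NULL buffer comes from another thread closing the file while the feeder thread is in write(). LockedWriter and run_demo are hypothetical names for illustration only, not anything in diffoscope.

```python
import io
import threading


class LockedWriter:
    """Hypothetical wrapper that serialises write() and close() on one file."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def write(self, data):
        # Hold the lock so close() cannot run between the closed check
        # and the actual write.
        with self._lock:
            if self._file.closed:
                return 0  # Already shut down: drop the data instead of writing.
            return self._file.write(data)

    def close(self):
        with self._lock:
            if not self._file.closed:
                self._file.close()


def run_demo(lines=1000):
    """Close the writer from the main thread while a feeder thread writes."""
    writer = LockedWriter(io.BufferedWriter(io.BytesIO()))
    errors = []

    def feeder():
        try:
            for _ in range(lines):
                writer.write(b"0000000004295240  R_X86_64_RELATIVE\n")
        except Exception as exc:  # Record anything the feeder thread raises.
            errors.append(exc)

    thread = threading.Thread(target=feeder)
    thread.start()
    writer.close()
    thread.join()
    return errors


if __name__ == "__main__":
    print(run_demo())
```

Note that close() here takes a lock, so it should be called from ordinary code and not directly from a signal handler: a handler that runs while the main thread already holds the same lock would deadlock. Having the handler only set a flag avoids that.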

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git


