Bug#852013: Patch to prevent segfaults on signal
Ximin Luo
infinity0 at debian.org
Thu Feb 9 19:21:00 UTC 2017
The bug is closed, but I took a closer look at the issue to learn more about the situation.
Chris' commit fixed the cleanup issue, but I think Brett's FIFO patch was probably also needed to deal with the segfaults.
Brett Smith:
> [..] While I was debugging, I added the line
> `traceback.print_stack(stack_frame, file=sys.stderr)` to the top of
> sigterm_handler, before the exit.
>
> When Python *did* segfault, it was always in the middle of trying to
> acquire a threading.Lock object. [..]
>
> I bring all this up to say, I'm not shocked that we might see different
> results on different systems. Python itself and potentially glibc are
> involved here too. For example, one theory I had is that the glibc call
> underlying the lock.acquire() method is not reentrant, but Python was
> trying to reenter it after handling SIGTERM in the main thread.
>
> [..]
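For reference, the debugging handler described above might look roughly like the sketch below. The names sigterm_handler and stack_frame come from the quoted text; the exit status and the install_handler helper are my own assumptions and not necessarily what diffoscope does. Note that Python-level signal handlers always run in the main thread, so the stack printed here is the main thread's, which is not necessarily the thread that later crashes.

```python
import signal
import sys
import traceback


def sigterm_handler(signo, stack_frame):
    # Print the Python stack of the code that was interrupted. Python runs
    # signal handlers in the main thread only, so this is the main thread's stack.
    traceback.print_stack(stack_frame, file=sys.stderr)
    # Assumed exit status; the real handler may use a different one.
    sys.exit(2)


def install_handler():
    # Hypothetical helper: must be called from the main thread.
    signal.signal(signal.SIGTERM, sigterm_handler)
```

A traceback obtained this way only shows Python frames, and only for one thread, so it cannot by itself locate a crash inside C code running in a worker thread.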
I'm not sure that your method of adding pdb calls will reliably catch the SEGV exactly where it occurs. It's better to use gdb:
$ diffoscope --version
diffoscope 67
$ apt-get install libc6-dbg python3-dbg
$ apt-get source glibc python3.5
$ diffoscope /usr/bin/pandoc /usr/bin/git-annex &
[..]
$ gdb -q python3 $(pgrep -f diffoscope) -d glibc-2.24/debian -d python3.5-3.5.3/debian
Reading symbols from python3...Reading symbols from /usr/lib/debug/.build-id/db/fc2e1a3c58b6d241b3f9af7b2fb3a24b81b90e.debug...done.
done.
Attaching to program: /usr/bin/python3, process 19406
[New LWP 19444]
[New LWP 19447]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fa48b9f3536 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x564087d107b0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205 int err = lll_futex_timed_wait_bitset (futex_word, expected, abstime,
# Send INT to python3
# Have to do it this way when gdb is attached, kill(1) doesn't work
(gdb) signal 2
Continuing with signal SIGINT.
Thread 3 "diffoscope" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fa4794ce700 (LWP 19447)]
__memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/../multiarch/memmove-vec-unaligned-erms.S:333
333 VMOVU %VEC(0), (%rdi)
# Python stack trace
(gdb) py-bt
Traceback (most recent call first):
File "/usr/lib/python3/dist-packages/diffoscope/difference.py", line 209, in feeder
out_file.write(out)
File "/usr/lib/python3/dist-packages/diffoscope/difference.py", line 186, in feeder
end_nl = make_feeder_from_raw_reader(command.stdout, command.filter)(out_file)
File "/usr/lib/python3/dist-packages/diffoscope/diff.py", line 197, in feed
end_nl = feeder(f)
File "/usr/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3/dist-packages/diffoscope/diff.py", line 213, in run
super().run(*args, **kwargs)
File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/usr/lib/python3.5/threading.py", line 882, in _bootstrap
self._bootstrap_inner()
(gdb) py-locals
out_file = <_io.BufferedWriter at remote 0x7fa47ac099e8>
max_lines = 1048576
line_count = 287187
end_nl = False
h = <_hashlib.HASH at remote 0x7fa47abea328>
buf = b'0000000004295240 0000000000000008 R_X86_64_RELATIVE 4295202\n'
out = b'0000000004295240 0000000000000008 R_X86_64_RELATIVE 4295202\n'
(gdb) py-bt-full
#5 Frame 0x7fa47abdea20, for file /usr/lib/python3/dist-packages/diffoscope/difference.py, line 209, in feeder (out_file=<_io.BufferedWriter at remote 0x7fa47ac099e8>, max_lines=1048576, line_count=287187, end_nl=False, h=<_hashlib.HASH at remote 0x7fa47abea328>, buf=b'0000000004295240 0000000000000008 R_X86_64_RELATIVE 4295202\n', out=b'0000000004295240 0000000000000008 R_X86_64_RELATIVE 4295202\n')
out_file.write(out)
#9 Frame 0x7fa474001e88, for file /usr/lib/python3/dist-packages/diffoscope/difference.py, line 186, in feeder (out_file=<_io.BufferedWriter at remote 0x7fa47ac099e8>)
end_nl = make_feeder_from_raw_reader(command.stdout, command.filter)(out_file)
#13 Frame 0x7fa474001c78, for file /usr/lib/python3/dist-packages/diffoscope/diff.py, line 197, in feed (feeder=<function at remote 0x7fa47a989488>, f=<_io.BufferedWriter at remote 0x7fa47ac099e8>, end_nl_q=<Queue(queue=<collections.deque at remote 0x7fa47a991048>, [.. etc ..] ...(truncated)
end_nl = feeder(f)
#19 Frame 0x7fa47a98b238, [.. etc ..]
# Machine stack trace
# Frames 0-5 are Python executing `out_file.write(out)`
(gdb) bt
#0 __memmove_sse2_unaligned_erms () at ../sysdeps/x86_64/multiarch/../multiarch/memmove-vec-unaligned-erms.S:333
#1 0x00005640852b72e7 in memcpy () at /usr/include/x86_64-linux-gnu/bits/string3.h:53
#2 _io_BufferedWriter_write_impl.isra.2 (self=0x7fa47ac099e8) at ../Modules/_io/bufferedio.c:1952
#3 _io_BufferedWriter_write () at ../Modules/_io/clinic/bufferedio.c.h:376
#4 0x0000564085316cac in call_function (oparg=<optimized out>, pp_stack=0x7fa4794ccfc0) at ../Python/ceval.c:4705
#5 PyEval_EvalFrameEx () at ../Python/ceval.c:3251
#6 0x000056408531b7bf in _PyEval_EvalCodeWithName () at ../Python/ceval.c:4033
#7 0x00005640853174c9 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=0x7fa4794cd1d0, func=<optimized out>) at ../Python/ceval.c:4828
#8 call_function (oparg=<optimized out>, pp_stack=0x7fa4794cd1d0) at ../Python/ceval.c:4745
#9 PyEval_EvalFrameEx () at ../Python/ceval.c:3251
[.. etc ..]
(gdb) list
328 VMOVU (VEC_SIZE * 3)(%rsi), %VEC(3)
329 VMOVU -VEC_SIZE(%rsi,%rdx), %VEC(4)
330 VMOVU -(VEC_SIZE * 2)(%rsi,%rdx), %VEC(5)
331 VMOVU -(VEC_SIZE * 3)(%rsi,%rdx), %VEC(6)
332 VMOVU -(VEC_SIZE * 4)(%rsi,%rdx), %VEC(7)
333 VMOVU %VEC(0), (%rdi)
334 VMOVU %VEC(1), VEC_SIZE(%rdi)
335 VMOVU %VEC(2), (VEC_SIZE * 2)(%rdi)
336 VMOVU %VEC(3), (VEC_SIZE * 3)(%rdi)
337 VMOVU %VEC(4), -VEC_SIZE(%rdi,%rdx)
(gdb) up
#1 0x00005640852b72e7 in memcpy () at /usr/include/x86_64-linux-gnu/bits/string3.h:53
53 return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
(gdb) up
#2 _io_BufferedWriter_write_impl.isra.2 (self=0x7fa47ac099e8) at ../Modules/_io/bufferedio.c:1952
1952 memcpy(self->buffer + self->pos, buffer->buf, buffer->len);
(gdb) list
1947 self->pos = 0;
1948 self->raw_pos = 0;
1949 }
1950 avail = Py_SAFE_DOWNCAST(self->buffer_size - self->pos, Py_off_t, Py_ssize_t);
1951 if (buffer->len <= avail) {
1952 memcpy(self->buffer + self->pos, buffer->buf, buffer->len);
1953 if (!VALID_WRITE_BUFFER(self) || self->write_pos > self->pos) {
1954 self->write_pos = self->pos;
1955 }
1956 ADJUST_POSITION(self, self->pos + buffer->len);
(gdb) print self->buffer
$1 = 0x0
(gdb) print self->pos
$2 = 0
(gdb) print buffer->buf
value has been optimized out
(gdb) up
#3 _io_BufferedWriter_write () at ../Modules/_io/clinic/bufferedio.c.h:376
376 return_value = _io_BufferedWriter_write_impl(self, &buffer);
(gdb) print buffer->buf
$3 = (void *) 0x7fa47ac1bb18
(gdb) print buffer->len
$4 = 85
(gdb) print self
$5 = (buffered *) 0x7fa47ac099e8
(gdb) print self->buffer
$6 = 0x0
(gdb) print self->pos
$7 = 0
In other words, the segfault occurs because Python calls memcpy(NULL, *, *) when trying to write to an _io.BufferedWriter whose internal buffer (self->buffer) is NULL. I dug a little deeper, and the "file objects being misused across thread boundaries" theory seems plausible, but I didn't follow it all the way through to be sure. I'll take a closer look at that part of the code going forward, though.
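To make the theory concrete, here is a minimal sketch of the kind of cross-thread misuse I mean: one thread keeps writing to a BufferedWriter while the main thread closes it. This is not diffoscope's code; the function names are made up for illustration, and os.devnull just stands in for the real output file. On an interpreter where close() and write() are properly serialised, I would expect the writer thread to get a ValueError rather than a crash.

```python
import os
import threading


def write_until_closed(out_file, started, errors):
    """Keep writing until the file object is closed underneath us."""
    try:
        while True:
            out_file.write(b"x" * 64)
            started.set()  # signal that at least one write has succeeded
    except ValueError as exc:  # raised for a write to a closed file
        errors.append(exc)


def race_close_against_write():
    out_file = open(os.devnull, "wb")  # binary mode gives a BufferedWriter
    started = threading.Event()
    errors = []
    writer = threading.Thread(
        target=write_until_closed, args=(out_file, started, errors)
    )
    writer.start()
    started.wait(timeout=5)
    # Stands in for cleanup run from the main thread, e.g. after a signal.
    out_file.close()
    writer.join(timeout=5)
    return errors
```

Calling race_close_against_write() should return a list holding the ValueError seen by the writer thread. A crash like the one in the gdb session would instead need the buffer to be freed between the "is it closed?" check and the memcpy, which is the part I have not verified.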
X
--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git