Bug#1014073: llvm-toolchain-14: FTBFS on i386

Simon McVittie smcv at debian.org
Thu Jul 21 00:04:22 BST 2022


On Wed, 29 Jun 2022 at 21:07:37 +0200, Sebastian Ramacher wrote:
> Testing: 0  2  4  6  8  10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70 72 74 76 78 80 82 84 86 88 90 92 94 96 
> E: Build killed with signal TERM after 150 minutes of inactivity

I know basically nothing about LLVM, but this has been making amd64
and i386 packages like Mesa non-co-installable in unstable for several
weeks, so I tried to look into this.

I was able to reproduce the hang on barriere. The hung build seems to
end up with this process hierarchy:

ninja -v -C build-llvm/tools/clang/stage2-bins check-mlir
\- ...
   \- .../tools/clang/stage2-bins/./bin/llvm-lit -sv .../tools/clang/stage2-bins/tools/mlir/test
      \- /bin/bash .../tools/clang/stage2-bins/tools/mlir/test/mlir-cpu-runner/Output/async-value.mlir.script
         \- mlir-cpu-runner
         \- FileCheck

async-value.mlir.script is basically this pipeline, after simplifying
by removing paths and options:

    mlir-opt .../async-value.mlir | mlir-cpu-runner | FileCheck .../async-value.mlir

mlir-cpu-runner seems to be deadlocking while attempting to print a
backtrace. A call to malloc() hits an assertion failure inside libc,
which raises SIGABRT and enters LLVM's SIGABRT handler. The handler
tries to print a backtrace, which calls malloc() again, and that second
call deadlocks because the original malloc() is still holding the lock.
See also <https://github.com/llvm/llvm-project/issues/34066> and
<https://github.com/llvm/llvm-project/issues/43714> upstream.

A *lot* of these "mlir" tests seem to be failing with signs of memory
corruption on i386 (you can see one in the log snippet that
Sebastian Ramacher pasted, with "free(): invalid next size (fast)").
Are they expected to be crashing like this? Previous, successful builds
on i386 also have these tests crashing, but the deadlock that was fatal
to this particular build didn't happen.

I notice that debian/rules runs a lot of tests with "|| true", so if they
fail, the package builds successfully anyway. If some of the tests are
not expected to pass, perhaps they could be skipped, or at least run under
a timeout that is long enough to accommodate slow architectures but short
enough to not trigger the buildd's indefinite-hang detection?
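
A rough sketch of the timeout idea, assuming GNU coreutils timeout(1) is
available in the build environment. The helper name run_with_limit, the
BUILD_DIR placeholder and the limits shown are hypothetical and not taken
from debian/rules; a real limit would need to stay under the buildd's
150 minutes of inactivity while leaving room for slow architectures.

```shell
# Hypothetical helper (not from debian/rules): run a command under a
# wall-clock limit so that a hung test cannot stall the whole build.
run_with_limit() {
    limit="$1"
    shift
    # timeout(1) sends SIGTERM once the limit expires and then exits
    # with status 124. If the command is still alive 10 seconds later,
    # timeout sends SIGKILL and the exit status is 137 instead.
    timeout --kill-after=10 "$limit" "$@"
    status=$?
    case "$status" in
        124|137)
            echo "W: '$*' hit the $limit limit or was killed (status $status)" >&2
            ;;
    esac
    return "$status"
}

# Illustrative use, keeping the existing "|| true" behaviour
# (BUILD_DIR is a placeholder for the stage2 build directory):
#   run_with_limit 2h ninja -C "$BUILD_DIR" check-mlir || true
```

A process that is deadlocked inside a signal handler may not react to
SIGTERM, which is why the sketch falls back to SIGKILL after a grace
period.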

Or if the crashing feature is optional and doesn't work / isn't needed on
i386, perhaps it could be disabled in i386 builds?

The brute-force way to make the build proceed would be to make

        ninja $(VERBOSE) -C $(TARGET_BUILD_STAGE2) check-mlir || true

conditional, and not even try it on i386.
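
The logic of that conditional could look roughly like this. In
debian/rules it would more naturally be written as a make conditional on
the host architecture; the shell form here only illustrates the idea.
The function name, the BUILD_DIR placeholder and the skip list are
hypothetical.

```shell
# Hypothetical helper: decide whether check-mlir should run at all.
# The architecture is passed in as an argument; in a package build it
# would come from DEB_HOST_ARCH (for example via
# "dpkg-architecture -qDEB_HOST_ARCH").
should_run_check_mlir() {
    case "$1" in
        i386) return 1 ;;   # skip: the mlir tests crash and can hang here
        *)    return 0 ;;
    esac
}

# Illustrative use (BUILD_DIR is a placeholder for the stage2 directory):
#   if should_run_check_mlir "$DEB_HOST_ARCH"; then
#       ninja -C "$BUILD_DIR" check-mlir || true
#   fi
```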

    smcv

----

Full backtrace of mlir-cpu-runner:

Thread 5 (Thread 0xeb28bac0 (LWP 8378) "mlir-cpu-runner"):
#0  0xf7f0f069 in __kernel_vsyscall ()
#1  0xf7eefb44 in __futex_abstimed_wait_common64 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xf7ee85a7 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#3  0xf0afa9ee in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#4  0xf1524778 in void* llvm::thread::ThreadProxy<std::tuple<llvm::ThreadPool::grow(int)::$_0> >(void*) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#5  0xf7ee1e6c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#6  0xf082d626 in clone () from /lib/i386-linux-gnu/libc.so.6

Thread 4 (Thread 0xebd5dac0 (LWP 8357) "mlir-cpu-runner"):
#0  0xf7f0f069 in __kernel_vsyscall ()
#1  0xf7eefb44 in __futex_abstimed_wait_common64 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xf7ee85a7 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#3  0xf0afa9ee in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#4  0xf1524778 in void* llvm::thread::ThreadProxy<std::tuple<llvm::ThreadPool::grow(int)::$_0> >(void*) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#5  0xf7ee1e6c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#6  0xf082d626 in clone () from /lib/i386-linux-gnu/libc.so.6

Thread 3 (Thread 0xec55eac0 (LWP 8356) "mlir-cpu-runner"):
#0  0xf7f0f069 in __kernel_vsyscall ()
#1  0xf7eefb44 in __futex_abstimed_wait_common64 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xf7ee85a7 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#3  0xf0afa9ee in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#4  0xf1524778 in void* llvm::thread::ThreadProxy<std::tuple<llvm::ThreadPool::grow(int)::$_0> >(void*) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#5  0xf7ee1e6c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#6  0xf082d626 in clone () from /lib/i386-linux-gnu/libc.so.6

Thread 2 (Thread 0xecd5fac0 (LWP 8355) "mlir-cpu-runner"):
#0  0xf7f0f069 in __kernel_vsyscall ()
#1  0xf7eefb44 in __futex_abstimed_wait_common64 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xf7ee85a7 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#3  0xf0afa9ee in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#4  0xf1524778 in void* llvm::thread::ThreadProxy<std::tuple<llvm::ThreadPool::grow(int)::$_0> >(void*) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#5  0xf7ee1e6c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#6  0xf082d626 in clone () from /lib/i386-linux-gnu/libc.so.6

Thread 1 (Thread 0xecd63280 (LWP 8340) "mlir-cpu-runner"):
#0  0xf7f0f069 in __kernel_vsyscall ()
#1  0xf07a668c in ?? () from /lib/i386-linux-gnu/libc.so.6
#2  0xf07acbaa in malloc () from /lib/i386-linux-gnu/libc.so.6
#3  0xf0ad4b97 in operator new(unsigned int) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#4  0xf0ad4c08 in operator new[](unsigned int) () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#5  0xf15679ab in llvm::raw_ostream::SetBuffered() () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#6  0xf1568bd2 in llvm::raw_ostream::write(char const*, unsigned int) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#7  0xf158fe06 in printSymbolizedStackTrace(llvm::StringRef, void**, int, llvm::raw_ostream&) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#8  0xf15919ef in llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#9  0xf1591be0 in PrintStackTraceSignalHandler(void*) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#10 0xf158f37e in llvm::sys::RunSignalHandlers() () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#11 0xf1591f47 in SignalHandler(int) () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/bin/../lib/libLLVM-14.so.1
#12 <signal handler called>
#13 0xf7f0f069 in __kernel_vsyscall ()
#14 0xf075c8f6 in raise () from /lib/i386-linux-gnu/libc.so.6
#15 0xf074530b in abort () from /lib/i386-linux-gnu/libc.so.6
#16 0xf079f94c in ?? () from /lib/i386-linux-gnu/libc.so.6
#17 0xf07a808d in ?? () from /lib/i386-linux-gnu/libc.so.6
#18 0xf07ab52b in ?? () from /lib/i386-linux-gnu/libc.so.6
#19 0xf07acb20 in malloc () from /lib/i386-linux-gnu/libc.so.6
#20 0xf0794d7b in _IO_file_doallocate () from /lib/i386-linux-gnu/libc.so.6
#21 0xf07a405d in _IO_doallocbuf () from /lib/i386-linux-gnu/libc.so.6
#22 0xf07a30a9 in _IO_file_overflow () from /lib/i386-linux-gnu/libc.so.6
#23 0xf07781a2 in ?? () from /lib/i386-linux-gnu/libc.so.6
#24 0xf07790e7 in __printf_fp () from /lib/i386-linux-gnu/libc.so.6
#25 0xf078d660 in ?? () from /lib/i386-linux-gnu/libc.so.6
#26 0xf077bf07 in fprintf () from /lib/i386-linux-gnu/libc.so.6
#27 0xeb36b922 in printF32 () from /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/lib/libmlir_c_runner_utils.so
#28 0xf7f06097 in main () at /home/smcv/llvm-toolchain-14/build-llvm/tools/clang/stage2-bins/tools/mlir/test/mlir-cpu-runner/<stdin>:25
#29 0xf7f06598 in _mlir_main ()
#30 0x080dae87 in compileAndExecute((anonymous namespace)::Options&, mlir::ModuleOp, llvm::StringRef, (anonymous namespace)::CompileAndExecuteConfig, void**) ()
#31 0x080d7c82 in compileAndExecuteVoidFunction((anonymous namespace)::Options&, mlir::ModuleOp, llvm::StringRef, (anonymous namespace)::CompileAndExecuteConfig) ()
#32 0x080d6723 in mlir::JitRunnerMain(int, char**, mlir::DialectRegistry const&, mlir::JitRunnerConfig) ()
#33 0x080588e5 in main ()

----

Full backtrace of FileCheck (this looks fine; it's just reading from stdin):

Thread 1 (Thread 0xf61e3700 (LWP 8341) "FileCheck"):
#0  0xf7fad069 in __kernel_vsyscall ()
#1  0xf7f8a133 in read () from /lib/i386-linux-gnu/libpthread.so.0
#2  0x080a99f7 in llvm::sys::fs::readNativeFileToEOF(int, llvm::SmallVectorImpl<char>&, int) ()
#3  0x08088902 in getMemoryBufferForStream(int, llvm::Twine const&) ()
#4  0x08087eab in llvm::MemoryBuffer::getFileOrSTDIN(llvm::Twine const&, bool, bool) ()
#5  0x08050ea9 in main ()


