[Pkg-opencl-devel] Bug#997908: pocl: Assertion `td->printf_buffer != NULL' failed on armel/armhf
Andreas Beckmann
anbe at debian.org
Thu Oct 28 18:11:57 BST 2021
On 26/10/2021 23.29, Rebecca N. Palmer wrote:
> (Filing this as something to refer to when I disable those tests; it may
> or may not actually be worth fixing.)
Let me try to figure out why this happens ...
> Since some time in August, all the packages that use pocl for their
> autopkgtests have been failing on armel/armhf (but not arm64) with either
> Assertion `td->printf_buffer != NULL' (libgpuarray, theano, examl)
> or
> Fatal Python error: Aborted (pyopencl, gpyfft).
I'm trying to reproduce this ...
Using theano since it shows the error you reported:
https://ci.debian.net/data/autopkgtest/testing/armhf/t/theano/16143356/log.gz
python3: ./lib/CL/devices/pthread/pthread_scheduler.c:512:
pocl_pthread_driver_thread: Assertion `td->printf_buffer != NULL' failed.
On the armhf porter box abel.d.o in a sid chroot:
* install python3-theano g++ python3-pydot python3-nose
python3-parameterized python3-pkg-resources cython3 graphviz
pocl-opencl-icd python3-pygpu libclblast-dev libgpuarray-dev
* export AUTOPKGTEST_TMP=$(mktemp -d)
* apt-get source theano
* cd theano-1.0.5+dfsg/
* debian/tests/smoketestgpu
usr/lib/python3/dist-packages/theano/gpuarray/__init__.py:86:
UserWarning: Theano's OpenCL support is incomplete and may contain bugs
(some tests fail)
warnings.warn("Theano's OpenCL support is incomplete and may contain
bugs (some tests fail)")
Mapped name None to device opencl0:0: pthread-0x584
WARNING (theano.bin.theano-nose): KnownFailure plugin from NumPy could
not be imported. Use --without-knownfailure to disable this warning.
test_adv_sub_slice
(theano.gpuarray.tests.test_subtensor.G_advancedsubtensor) ... ok
theano.gpuarray.tests.test_rng_mrg.test_GPUA_full_fill ... ok
----------------------------------------------------------------------
Ran 2 tests in 71.174s
OK
I'm similarly unable to reproduce the errors with libgpuarray or
pyopencl; I didn't try any further.
Is there something "special" about the autopkgtest environment that
differs from a plain chroot?
> However, build-time tests using it (pocl itself, pyopencl) pass.
Then we have to add more ;-)
But for that, I would need to be able to reproduce this somehow, so that I
can reduce it to a plain C test in pocl.
Let me look at the code:
td->printf_buffer = pocl_aligned_malloc (MAX_EXTENDED_ALIGNMENT,
scheduler.printf_buf_size);
assert (td->printf_buffer != NULL);
So we are somehow out of memory ...
The default size is 16 MB, per thread. This is also what clinfo reports
in the pocl build-time tests, and it was already the setting in pocl 1.6.
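To put a rough number on "16 MB per thread", here is a minimal sketch. It assumes that the pthread driver starts one worker thread per online CPU (an assumption, not checked against the pocl source) and treats 16 MB as 16 MiB. The variable names are made up for illustration.

```shell
#!/bin/sh
# Assumption: one printf buffer per worker thread, one worker per online CPU.
printf_buf_mib=16
threads=$(getconf _NPROCESSORS_ONLN)

# Memory the asserted allocation needs in total, across all workers.
total_mib=$((printf_buf_mib * threads))

echo "worker threads (assumed): $threads"
echo "printf buffers in total:  ${total_mib} MiB"
```

If the assumption holds, the total grows linearly with the core count of the test machine.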
Is the autopkgtest environment extremely memory constrained?
Is the test so memory hungry that it does not leave anything for pocl?
You could prepend a call to clinfo (and maybe ulimit -a) to the
autopkgtest scripts ...
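Such a preamble could look roughly like the sketch below. The function name diag_preamble is hypothetical, and the check for clinfo is only there so the script does not abort when it is not installed in the test bed.

```shell
#!/bin/sh
# Hypothetical diagnostic preamble for a debian/tests/* script:
# record resource limits and the OpenCL platform before the real test runs.
diag_preamble() {
    echo "=== ulimit -a ==="
    ulimit -a
    echo "=== clinfo ==="
    if command -v clinfo >/dev/null 2>&1; then
        # Do not let a failing clinfo abort the actual test.
        clinfo || echo "clinfo exited with status $?"
    else
        echo "clinfo not installed"
    fi
}

diag_preamble
```

The output ends up in the autopkgtest log, so it can be compared with the same commands run in a plain chroot.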
Do the tests pass with POCL_DEVICES=basic?
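A small sketch for comparing the two drivers side by side. The helper try_driver is hypothetical; "pthread" as a POCL_DEVICES value is assumed from the device name in the log above.

```shell
#!/bin/sh
# try_driver DRIVER CMD...: run CMD with POCL_DEVICES set to DRIVER
# and report the exit status (hypothetical helper).
try_driver() {
    driver=$1
    shift
    status=0
    env POCL_DEVICES="$driver" "$@" || status=$?
    echo "POCL_DEVICES=$driver: exit status $status"
    return "$status"
}

# Example, to be run from the unpacked theano source tree:
#   try_driver pthread debian/tests/smoketestgpu
#   try_driver basic   debian/tests/smoketestgpu
```

If only the pthread run fails, that would point at the per-thread setup in the pthread driver rather than at the test itself.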
I just tried again on abel while reducing the available memory with ulimit -m
and ulimit -v ... at some point I ran into Python backtraces, but not
the exact error you saw.
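For anyone who wants to repeat this, a minimal sketch of the approach follows. The helper run_limited is hypothetical. It only uses ulimit -v (RLIMIT_AS), because according to setrlimit(2) the RSS limit behind ulimit -m is not enforced by current Linux kernels. The limit is set in a subshell so the calling shell keeps its own limits.

```shell
#!/bin/sh
# run_limited KIB CMD...: run CMD with its virtual memory capped at KIB
# kibibytes (hypothetical helper). Exits with 125 if the limit cannot be set.
run_limited() {
    limit_kib=$1
    shift
    (
        ulimit -v "$limit_kib" || exit 125
        exec "$@"
    )
}

# Example, to be run from the unpacked theano source tree:
# halve the cap until the test starts failing.
#   for kib in 1048576 524288 262144 131072; do
#       run_limited "$kib" debian/tests/smoketestgpu || { echo "failed at $kib KiB"; break; }
#   done
```

The cap at which the run first fails, and the message it fails with, can then be compared with the autopkgtest log.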
Andreas