[Pkg-opencl-devel] Bug#997908: pocl: Assertion `td->printf_buffer != NULL' failed on armel/armhf

Andreas Beckmann anbe at debian.org
Thu Oct 28 18:11:57 BST 2021


On 26/10/2021 23.29, Rebecca N. Palmer wrote:
> (Filing this as something to refer to when I disable those tests; it may 
> or may not actually be worth fixing.)

Let me try to figure out why this happens ...

> Since some time in August, all the packages that use pocl for their 
> autopkgtests have been failing on armel/armhf (but not arm64) with either
> Assertion `td->printf_buffer != NULL' (libgpuarray, theano, examl)
> or
> Fatal Python error: Aborted (pyopencl, gpyfft).

I'm trying to reproduce this ...

Using theano since it shows the error you reported:
https://ci.debian.net/data/autopkgtest/testing/armhf/t/theano/16143356/log.gz

python3: ./lib/CL/devices/pthread/pthread_scheduler.c:512: 
pocl_pthread_driver_thread: Assertion `td->printf_buffer != NULL' failed.

On the armhf porter box abel.d.o in a sid chroot:

* install python3-theano g++ python3-pydot python3-nose 
python3-parameterized python3-pkg-resources cython3 graphviz 
pocl-opencl-icd python3-pygpu libclblast-dev libgpuarray-dev
* export AUTOPKGTEST_TMP=$(mktemp -d)
* apt-get source theano
* cd theano-1.0.5+dfsg/
* debian/tests/smoketestgpu
/usr/lib/python3/dist-packages/theano/gpuarray/__init__.py:86: 
UserWarning: Theano's OpenCL support is incomplete and may contain bugs 
(some tests fail)
   warnings.warn("Theano's OpenCL support is incomplete and may contain 
bugs (some tests fail)")
Mapped name None to device opencl0:0: pthread-0x584
WARNING (theano.bin.theano-nose): KnownFailure plugin from NumPy could 
not be imported. Use --without-knownfailure to disable this warning.
test_adv_sub_slice 
(theano.gpuarray.tests.test_subtensor.G_advancedsubtensor) ... ok
theano.gpuarray.tests.test_rng_mrg.test_GPUA_full_fill ... ok

----------------------------------------------------------------------
Ran 2 tests in 71.174s

OK

I'm similarly unable to reproduce the errors with libgpuarray or 
pyopencl; I didn't try any further.

Is there something "special" about the autopkgtest environment that 
differs from a plain chroot?

> However, build-time tests using it (pocl itself, pyopencl) pass.

Then we have to add more ;-)
But for that, I would need to be able to reproduce this somehow, so that 
I can reduce it to a plain C test in pocl.

Let me look at the code:

   td->printf_buffer = pocl_aligned_malloc (MAX_EXTENDED_ALIGNMENT,
                                            scheduler.printf_buf_size);
   assert (td->printf_buffer != NULL);

So pocl_aligned_malloc returned NULL, i.e. we are somehow out of 
memory ...

The default size is 16 MB. Per thread. This is also what clinfo reports 
in the pocl build-time tests. This was also the setting in pocl 1.6.
Is the autopkgtest environment extremely memory constrained?
Is the test so memory hungry that it does not leave anything for pocl?
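
To put a number on "16 MB per thread", here is a minimal sketch of the 
arithmetic. It assumes one driver thread per online CPU (not verified 
against the pocl source here) and takes 16 MB as 16 MiB. The function 
names are hypothetical, and the address-space check is optimistic 
because it ignores everything else the process has mapped.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* 16 MB per driver thread, the default quoted above (taken as 16 MiB). */
#define PRINTF_BUF_BYTES ((uint64_t)16 * 1024 * 1024)

/* Total bytes needed if every driver thread gets its own printf buffer. */
uint64_t total_printf_bytes(unsigned nthreads, uint64_t per_thread)
{
    /* Guard against overflow of the 64-bit product. */
    assert(per_thread == 0 || nthreads <= UINT64_MAX / per_thread);
    return (uint64_t)nthreads * per_thread;
}

/* Does `total` fit into an address space of `addr_bits` bits?
   Optimistic: other mappings of the process are not counted. */
int fits_address_space(uint64_t total, unsigned addr_bits)
{
    if (addr_bits >= 64)
        return 1;
    return total < ((uint64_t)1 << addr_bits);
}

#ifdef DEMO_MAIN
int main(void)
{
    /* Assumption: thread count == number of online CPUs. */
    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
    if (ncpu < 1)
        ncpu = 1;
    uint64_t total = total_printf_bytes((unsigned)ncpu, PRINTF_BUF_BYTES);
    printf("%ld threads need %llu MiB of printf buffers; fits in 32 bits: %s\n",
           ncpu, (unsigned long long)(total >> 20),
           fits_address_space(total, 32) ? "yes" : "no");
    return 0;
}
#endif
```

Built with -DDEMO_MAIN and run inside the autopkgtest environment, this 
would show whether the per-thread buffers alone could be a problem there.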

You could prepend a call to clinfo (and maybe ulimit -a) to the 
autopkgtest scripts ...
Do the tests pass with POCL_DEVICES=basic?

Just tried again on abel while reducing the memory size with ulimit -m 
and ulimit -v ... at some point I run into python backtraces, but not 
the exact error you saw.
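
The same experiment can be sketched as a plain C program without pocl. 
This is only an approximation: posix_memalign stands in for 
pocl_aligned_malloc, the alignment of 128 is an assumed value for 
MAX_EXTENDED_ALIGNMENT, and count_buffers_under_limit is a hypothetical 
helper. It lowers the soft RLIMIT_AS (the limit behind ulimit -v), 
allocates 16 MB buffers until one fails, and restores the limit.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Stand-in for MAX_EXTENDED_ALIGNMENT; the value is an assumption. */
#define ALIGNMENT 128

/* Returns how many buffers of buf_size bytes could be allocated with the
   soft address-space limit set to limit_bytes, or -1 if the limit could
   not be queried or changed. */
int count_buffers_under_limit(rlim_t limit_bytes, size_t buf_size, int max_bufs)
{
    struct rlimit old, lim;
    void **bufs;
    int n = 0;

    assert(buf_size > 0 && max_bufs > 0);
    if (getrlimit(RLIMIT_AS, &old) != 0)
        return -1;

    /* Bookkeeping array, allocated before the limit is lowered. */
    bufs = calloc((size_t)max_bufs, sizeof *bufs);
    if (bufs == NULL)
        return -1;

    lim = old;
    lim.rlim_cur = limit_bytes;          /* soft limit only */
    if (setrlimit(RLIMIT_AS, &lim) != 0) {
        free(bufs);
        return -1;
    }

    /* One buffer per simulated driver thread, until allocation fails. */
    while (n < max_bufs &&
           posix_memalign(&bufs[n], ALIGNMENT, buf_size) == 0)
        n++;

    setrlimit(RLIMIT_AS, &old);          /* restore the previous limit */
    for (int i = 0; i < n; i++)
        free(bufs[i]);
    free(bufs);
    return n;
}
```

For example, count_buffers_under_limit((rlim_t)256 << 20, (size_t)16 << 20, 64) 
can never reach 64, since 64 buffers of 16 MB need 1 GB. The point at 
which the count stops is where the assertion above would fire if the 
thread count were that high.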


Andreas


