Bug#949767: clblas: *gemm wrong answers in out-of-order queues

Rebecca N. Palmer rebecca_palmer at zoho.com
Mon Jan 27 19:37:49 GMT 2020


Control: retitle -1 clblas: *gemm wrong answers in out-of-order queues
Control: reassign -1 src:clblas
Control: found -1 2.12-1

I think I've found the actual bug, in clblas src/library/blas/xgemm.cc: 
clblasGemm (with a single command queue) enqueues up to 4 kernels and 
returns an event that depends on only the last of them, so if the queue 
is out-of-order, waiting on this event doesn't necessarily wait for all 
of them to finish.

This was previously noticed in
<https://github.com/clMathLibraries/clBLAS/issues/269#issuecomment-225453543>,
but not actually reported as a bug.

clblas includes a client/performance tester that creates an out-of-order 
queue (at src/client/clfunc_common.hpp:306), implying that it intends to 
allow such queues.  (We don't run clblas' own tests, possibly because of 
https://github.com/clMathLibraries/clBLAS/issues/338.)

The real fix would be to return an event that depends on all the 
kernels' events (e.g. created with clEnqueueMarkerWithWaitList).
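
To illustrate the reasoning, here is a toy model in plain Python. It makes no OpenCL calls and is not clblas code. All function names and the kernel run times are made up, and it assumes that kernels in an out-of-order queue with no dependencies between them may all start at once. It compares waiting on the last kernel's event with waiting on an event that depends on every kernel, which is roughly what a marker from clEnqueueMarkerWithWaitList would give.

```python
# Toy model only: plain Python, no OpenCL calls. All names are hypothetical.

def finish_times(durations, in_order):
    """Return the finish time of each enqueued kernel.

    in_order=True: each kernel starts when the previous one finishes.
    in_order=False: every kernel may start at time 0 (assumed behaviour of
    an out-of-order queue with no dependencies between the kernels).
    """
    times, clock = [], 0
    for d in durations:
        clock = (clock + d) if in_order else d
        times.append(clock)
    return times


def last_event(times):
    """Completion time of an event tied to the last kernel only."""
    return times[-1]


def marker_event(times):
    """Completion time of an event that waits on every kernel's event."""
    return max(times)


def unfinished_at(times, t):
    """Indices of kernels still running when a wait returns at time t."""
    return [i for i, ft in enumerate(times) if ft > t]


durations = [5, 3, 4, 1]  # made-up run times for four kernels

ooo = finish_times(durations, in_order=False)
print(unfinished_at(ooo, last_event(ooo)))    # [0, 1, 2]: three still running
print(unfinished_at(ooo, marker_event(ooo)))  # []: everything has finished

ino = finish_times(durations, in_order=True)
print(unfinished_at(ino, last_event(ino)))    # []: in-order queue is safe
```

In this model the last-kernel event is only sufficient for the in-order queue, which is why disabling out-of-order queues hides the problem.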

As a workaround for now, I intend to disable out-of-order queues in 
libgpuarray.  (It appears to be the only reverse dependency of clblas 
that also uses out-of-order queues.)
