[debian-mysql] Bug#832931: mariadb built successfully on powerpc

Thadeu Lima de Souza Cascardo cascardo at debian.org
Wed Nov 9 14:05:11 UTC 2016


On Wed, Nov 09, 2016 at 03:17:06PM +0200, Otto Kekäläinen wrote:
> 2016-11-08 11:08 GMT+02:00 Thadeu Lima de Souza Cascardo <cascardo at debian.org>:
> > Hey,
> >
> > I built mariadb on my powerpc G4, it took a while and I got some OOM
> > during some of the tests. So those tests failed, but the package got
> > built anyway. I wonder if a simple rebuild would make it work on the
> > build machine.
> >
> > I will see if I can get it to build on one of the porter machines.
> 
> Thanks for looking into the issue. To me it looks like powerpc status
> is already good
> (https://buildd.debian.org/status/package.php?p=mariadb-10.0) but it
> is good if you can check it and perhaps do some improvements.

So this requires further investigation. I ran a build in a jessie
schroot on partch, and mysqld deadlocked during the tests.

Investigating, I found at least two threads blocked on the same lock
under jemalloc's malloc/free. The cause of the deadlock was a segfault
during the exit of one of the threads, which had taken the lock while
destroying its tcache. That segfault was caught by a signal handler from
mariadb, which ended up calling malloc, which tried to take the same
mutex again, hence the deadlock.

Now, of course a signal handler must restrict itself to
async-signal-safe functions, and malloc is not one of them, so at least
this must be fixed. But the root cause is the segfault, which should not
have happened.

I wrote some tests using jemalloc and pthreads and found a small
reproducer, which causes a crash, though at a different point. Note
that it uses a single worker thread (the main thread only calls
pthread_join and never calls malloc/free directly). I can't reproduce
the segfault on my single-CPU machine, but it reproduces fairly
reliably on partch.

Using a sid schroot, this doesn't reproduce. As jemalloc has not changed
much between jessie and sid (though upstream is fairly different and has
a patch concerning pthread __nptl_deallocate_tsd that does not apply to
3.6), glibc is the only difference I can think of that would explain
it.

It is very possible the root cause here is some odd interaction between
glibc nptl code and jemalloc.

Regards.
Cascardo.

---
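#include <pthread.h>
#include <stdlib.h>

/* Worker thread: allocate and immediately free blocks of growing size. */
void * thread_run(void * arg)
{
	int i;
	for (i = 2; i < 10000; i++) {
		free(malloc(i * 4));
	}
	return NULL;
}

/* The main thread only creates and joins the worker; it never calls
 * malloc/free directly. */
int main(int argc, char **argv)
{
	pthread_t t1;
	pthread_create(&t1, NULL, thread_run, NULL);
	pthread_join(t1, NULL);
	return 0;
}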
---


