[Pkg-openldap-devel] Bug#827135: Bug#827135: slapd won't stop (shutdown) on multi-core system under stress
Ryan Tandy
ryan at nardis.ca
Sat Jul 29 02:58:30 UTC 2017
Control: tag -1 moreinfo unreproducible
Hi Zvika,
My apologies for taking so long to get back to you on this.
On Sun, Jun 12, 2016 at 07:37:36PM +0000, Zvika Ferentz wrote:
>How to reproduce it:
>-------------------------------
>I guess there are a few ways to reproduce it. I managed to reproduce it easily with two terminals: one generating "ldapsearch" stress and the other restarting slapd:
>- Open two terminals.
>- On terminal #1 I'm just manually running "slapd restart" commands:
> # /etc/init.d/slapd status ; /etc/init.d/slapd restart
>- On terminal #2 I'm running infinite loops of simple "ldapsearch" (100
>concurrent processes running loops of ldapsearch). Terminal #2 is trying to
>simulate many concurrent read operations. See "More Information" later for the
>exact scripts that I used.
>
>Incorrect behavior:
>-------------------------
>The "slapd restart" works a few times, and then the "stop" operation fails.
>The stop continues to fail even if I stop all "stress" and terminate all
>ldapsearch/connections (CPU is 99% idle!)
>
>Expected Behavior:
>-------------------------
>All slapd stop/restart operations complete successfully
>
>
>More Information (optional - my exact scripts):
>----------------------------------------------------------------
>On terminal #2 I used a very simple script to generate a "read only" stress:
># cat > ldaploop.sh << EOF
>#!/bin/sh
>while true ; do ldapsearch -x -Z ; done
>EOF
>
># cat > manyloops.sh << "EOF"
>#!/bin/sh
>for i in `seq 1 100` ; do ( ./ldaploop.sh &) ; done
>EOF
>
>As previously mentioned, I ran "manyloops.sh" to generate 100 running
>processes where each one simply runs "ldapsearch" (locally).
Thanks a lot for the detailed steps to reproduce. I got access to a VM
with 16 CPUs where I could try this. It doesn't have a wheezy chroot any
longer, but I tried the jessie version (2.4.40+dfsg-1+deb8u3).
I'm afraid I have not been able to trigger any hangs, even using your
exact scripts and after restarting slapd many times.
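The restart side of the test could be automated along the following lines. This is only a sketch: the function name restart_loop, the default iteration count and the RESTART_CMD override are illustrative assumptions, and the default command is simply the init script invocation from the report. Note that if the stop itself never returns, the loop will block at that iteration rather than report a failure.

```shell
#!/bin/sh
# Restart slapd repeatedly and stop at the first restart that reports failure.
# RESTART_CMD is a hypothetical override; by default the init script from the
# bug report is used.
restart_loop() {
    count=${1:-20}
    cmd=${RESTART_CMD:-/etc/init.d/slapd restart}
    i=1
    while [ "$i" -le "$count" ]; do
        # $cmd is deliberately unquoted so that "script restart" splits
        # into a command and its argument.
        if ! $cmd >/dev/null 2>&1; then
            echo "restart $i failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count restarts succeeded"
}
```

Run as root in terminal #1, for example with "restart_loop 50", while the ldapsearch loops are active in terminal #2.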
I'm testing with the following, very simple, config:
    include /etc/ldap/schema/core.schema
    include /etc/ldap/schema/cosine.schema
    include /etc/ldap/schema/nis.schema
    include /etc/ldap/schema/inetorgperson.schema
    tlscertificatefile ssl-cert-snakeoil.pem
    tlscertificatekeyfile ssl-cert-snakeoil.key
    moduleload back_mdb
    database mdb
    suffix dc=example,dc=com
    directory db
    index objectClass eq
and a database of 1000 entries. I tried both the hdb and mdb backends.
Do you still encounter this bug on jessie or stretch? Is there anything
more in your configuration, beyond the simple config I posted, that
might be relevant?
If you can still reproduce the bug, it would be great if you could
install slapd-dbg and libldap-2.4-2-dbg, cause slapd to hang, and then
capture a backtrace with gdb while it's stuck:
    gdb -p $(pidof slapd)
    (gdb) thread apply all bt
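If an interactive gdb session is inconvenient, the same backtrace can probably be captured non-interactively with gdb's -batch and -ex options. A minimal sketch follows; the function name dump_backtrace, the default output file name and the GDB override variable are illustrative assumptions, not part of any package.

```shell
#!/bin/sh
# Write backtraces of all threads of a running process to a file.
# GDB is a hypothetical override for the debugger binary; the output
# file name defaults to slapd-backtrace.txt.
dump_backtrace() {
    pid=$1
    out=${2:-slapd-backtrace.txt}
    # -batch exits after running the -ex command, detaching from the process.
    ${GDB:-gdb} -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}
```

For example, as root while slapd is stuck: dump_backtrace "$(pidof slapd)", then attach the resulting file to the bug report.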
thanks,
Ryan