[Pkg-samba-maint] Bug#801690: Bug#801690: 'smbstatus -b' leads to broken ctdb cluster
adi at kriegisch.at
Wed Nov 4 09:14:46 UTC 2015
Thanks for getting back to me! :)
> > I recently upgraded a samba cluster from Wheezy (with Kernel, ctdb, samba
> > and glusterfs from backports) to Jessie. The cluster itself is way older
> > and basically always worked. Since the upgrade to Jessie 'smbstatus -b'
> > (almost always) just hangs the whole cluster; I need to interrupt the call
> > with ctrl+c (or run with 'timeout 2') to avoid a complete cluster lockup
> > leading to the other cluster nodes being banned and the node I run smbstatus
> > on to have ctdbd run at 100% load but not being able to recover.
> How do you recover then? KILL-ing ctdbd?
Killing the loaded node is the easiest; manual unbanning of the other nodes
is still required. Combinations of enabling and disabling nodes may fix the
> > Calling 'smbstatus --locks' and 'smbstatus --shares' works just fine.
> Have you tried which of --processes, --notify hangs? Does it hangs
> with "-b --fast"?
Ah, I missed that: '--brief --fast' works just fine. So obviously the
validation does not work...
> > 'strace'ing ctdbd leads to a massive amount of these messages:
> > | write(58,"\240\4\0\0BDTC\1\0\0\0\215U\336\25\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> > | 1184) = -1 EAGAIN (Resource temporarily unavailable)
> fd 58 is probably the ctdb socket. Can you confirm?
> To have more usefull info, can you install gdb, ctdb-dbg and samba-dbg
> and send the stacktrace of ctdbd at the write?
Ok, I will report back the stack traces in a few days (I'm afraid I can
only do these during the weekend).
All the best,
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 827 bytes
Desc: Digital signature
More information about the Pkg-samba-maint