[Debian-ha-maintainers] Bug#962454: Link failures after upgrade to +deb10u1

wferi at niif.hu wferi at niif.hu
Sat Jul 4 12:51:49 BST 2020


Alberto Gonzalez Iniesta <agi at inittab.org> writes:

> Hi, again. After some days with libknet1=1.16-2, I'm still getting
> errors from time to time (less than with buster's libknet1):
> Jun 22 01:16:54 selma corosync[28610]:   [KNET  ] link: host: 1 link: 0 is down
> Jun 22 01:16:54 selma corosync[28610]:   [KNET  ] host: host: 1 (passive) best link: 1 (pri: 1)
> Jun 22 01:16:55 selma corosync[28610]:   [KNET  ] rx: host: 1 link: 0 is up
> Jun 22 01:16:55 selma corosync[28610]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
> Jun 24 06:50:30 selma corosync[28610]:   [KNET  ] link: host: 1 link: 0 is down
> Jun 24 06:50:30 selma corosync[28610]:   [KNET  ] host: host: 1 (passive) best link: 1 (pri: 1)
> Jun 24 06:50:31 selma corosync[28610]:   [KNET  ] rx: host: 1 link: 0 is up
> Jun 24 06:50:31 selma corosync[28610]:   [KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)

Hi,

Is this with corosync 3.0.1-2+deb10u1?
Please enable debug logging and collect full info for a link flap.
Do these correlate with some cluster or host activity?
Does your network link transfer big packets (up to 65536 bytes) reliably?
Can you capture the Kronosnet network traffic (preferably on both nodes)
around the time of a link flap?
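
To make the correlation question easier to answer, here is a minimal
sketch (not a Kronosnet or corosync tool) that pairs each "is down"
message with the next "is up" message for the same host and link, using
the log format quoted above. The function name find_flaps and the year
parameter are my own assumptions: syslog timestamps carry no year, so
one has to be supplied, and month names are parsed with the current
locale.

```python
import re
from datetime import datetime

# Matches the KNET link state lines quoted above, e.g.
#   Jun 22 01:16:54 selma corosync[28610]:   [KNET  ] link: host: 1 link: 0 is down
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+corosync\[\d+\]:\s+"
    r"\[KNET\s*\]\s+(?:link|rx):\s+host:\s+(?P<host>\d+)\s+"
    r"link:\s+(?P<link>\d+)\s+is\s+(?P<state>up|down)"
)


def find_flaps(lines, year=2020):
    """Return (host, link, down_time, seconds_down) for each down/up pair.

    `year` is an assumption: the syslog timestamp format has no year field.
    """
    down_since = {}  # (host, link) -> datetime of the last "is down"
    flaps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a link state message, e.g. "best link" lines
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        key = (int(m["host"]), int(m["link"]))
        if m["state"] == "down":
            down_since[key] = when
        elif key in down_since:
            start = down_since.pop(key)
            flaps.append((key[0], key[1], start, (when - start).total_seconds()))
    return flaps
```

Feeding it the syslog or journal output (without the "> " quoting) gives
a list of flap start times and durations that can be compared against
cron jobs, backups or other host activity.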

I won't be able to interpret such detailed information myself, so I
suggest you open an issue at https://github.com/kronosnet/kronosnet/issues
and attach all of it there.
-- 
Regards,
Feri


