[Debian-ha-maintainers] Bug#690517: corosync daemon vanishing

Andreas Pflug pgadmin at pse-consulting.de
Mon Oct 15 07:04:29 UTC 2012


Package: corosync
Version: 1.4.2-3
Severity: normal

I'm running corosync with COROSYNC_DEFAULT_CONFIG_IFACE="openaisserviceenableexperimental:corosync_parser" (as performed by aisexec) for clvm, with these services:

service {
        ver:    0
        name:   openais_ckpt
}
quorum {
	provider: corosync_votequorum
	expected_votes: 2
}

There are 6 machines in the ring.
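
For reference, a rough sketch of the quorum arithmetic, assuming votequorum uses a simple majority (floor(expected_votes / 2) + 1) and that every node carries one vote. Both the formula and the helper names are assumptions for illustration, not taken from the corosync source:

```python
def quorum_threshold(expected_votes: int) -> int:
    """Simple-majority threshold: strictly more than half of the expected votes."""
    if expected_votes < 1:
        raise ValueError("expected_votes must be positive")
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """True if the votes currently present reach the majority threshold."""
    return votes_present >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    # expected_votes as configured (2) and as it would be with all 6 nodes counted
    for expected in (2, 6):
        print(expected, quorum_threshold(expected), is_quorate(1, expected))
```

Under these assumptions a node that ends up alone in its own ring is below the threshold in either case.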

From time to time (once or twice a week) one machine (here: ...102) leaves the ring and forms a new one:


Oct 14 07:07:44 corosync [CLM   ] CLM CONFIGURATION CHANGE
Oct 14 07:07:44 corosync [CLM   ] New Configuration:
Oct 14 07:07:44 corosync [CLM   ] 	r(0) ip(192.168.171.102) 
Oct 14 07:07:44 corosync [CLM   ] Members Left:
Oct 14 07:07:44 corosync [CLM   ] 	r(0) ip(192.168.171.101) 
Oct 14 07:07:44 corosync [CLM   ] 	r(0) ip(192.168.171.110) 
Oct 14 07:07:44 corosync [CLM   ] 	r(0) ip(192.168.171.112) 
Oct 14 07:07:44 corosync [CLM   ] 	r(0) ip(192.168.171.116) 
Oct 14 07:07:44 corosync [CLM   ] 	r(0) ip(192.168.171.117) 
Oct 14 07:07:44 corosync [CLM   ] Members Joined:
Oct 14 07:07:44 corosync [VOTEQ ] quorum lost, blocking activity
Oct 14 07:07:44 corosync [QUORUM] This node is within the non-primary component and will NOT provide any services.
Oct 14 07:07:44 corosync [QUORUM] Members[1]: 1722525888
Oct 14 07:07:44 corosync [CLM   ] CLM CONFIGURATION CHANGE
Oct 14 07:07:44 corosync [CLM   ] New Configuration:
Oct 14 07:07:44 corosync [CLM   ] 	r(0) ip(192.168.171.102) 
Oct 14 07:07:44 corosync [CLM   ] Members Left:
Oct 14 07:07:44 corosync [CLM   ] Members Joined:
Oct 14 07:07:45 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 14 07:07:45 corosync [CLM   ] CLM CONFIGURATION CHANGE
Oct 14 07:07:45 corosync [CLM   ] New Configuration:
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.102) 
Oct 14 07:07:45 corosync [CLM   ] Members Left:
Oct 14 07:07:45 corosync [CLM   ] Members Joined:
Oct 14 07:07:45 corosync [CLM   ] CLM CONFIGURATION CHANGE
Oct 14 07:07:45 corosync [CLM   ] New Configuration:
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.101) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.102) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.110) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.112) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.116) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.117) 
Oct 14 07:07:45 corosync [CLM   ] Members Left:
Oct 14 07:07:45 corosync [CLM   ] Members Joined:
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.101) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.110) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.112) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.116) 
Oct 14 07:07:45 corosync [CLM   ] 	r(0) ip(192.168.171.117) 
Oct 14 07:07:45 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1705748672 1722525888 1856743616 1890298048 1957406912 1974184\
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1974184128
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1705748672 1722525888 1856743616 1890298048 1957406912 1974184\
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1974184128
Oct 14 07:07:45 corosync [VOTEQ ] quorum regained, resuming activity
Oct 14 07:07:45 corosync [QUORUM] This node is within the primary component and will provide service.
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1705748672 1722525888 1856743616 1890298048 1957406912 1974184\
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1974184128
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1705748672 1722525888 1856743616 1890298048 1957406912 1974184\
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1974184128
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1705748672 1722525888 1856743616 1890298048 1957406912 1974184\
Oct 14 07:07:45 corosync [QUORUM] Members[6]: 1974184128
Oct 14 07:07:45 corosync [CPG   ] chosen downlist: sender r(0) ip(192.168.171.101) ; members(old:5 left:0)
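
The numeric IDs in the QUORUM lines appear to be the ring 0 IPv4 addresses read as 32-bit little-endian integers. This is inferred from the values in this log (no nodeid is set in corosync.conf), so treat it as an assumption. A minimal sketch to map them back to addresses; the helper name nodeid_to_ip is hypothetical:

```python
import socket
import struct


def nodeid_to_ip(nodeid: int) -> str:
    """Interpret a node ID as a little-endian packed IPv4 address."""
    return socket.inet_ntoa(struct.pack("<I", nodeid))


if __name__ == "__main__":
    # Node IDs copied from the Members[...] lines in the log above
    for nodeid in (1705748672, 1722525888, 1856743616,
                   1890298048, 1957406912, 1974184128):
        print(nodeid, nodeid_to_ip(nodeid))
```

Read this way, "Members[1]: 1722525888" is the isolated node 192.168.171.102, and the six IDs in the Members[6] lines correspond to the six addresses listed by CLM.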

After that, the corosync processes on all machines in the ring vanish without leaving any syslog message.


-- System Information:
Debian Release: 6.0.5
  APT prefers stable
  APT policy: (990, 'stable'), (100, 'stable-updates'), (100, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-0.bpo.3-amd64 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/dash

Versions of packages corosync depends on:
ii  adduser                 3.112+nmu2       add and remove users and groups
ii  libc6                   2.13-35          Embedded GNU C Library: Shared lib
ii  libcfg4                 1.4.2-3          Standards-based cluster framework,
ii  libconfdb4              1.4.2-3          Standards-based cluster framework,
ii  libcoroipcc4            1.4.2-3          Standards-based cluster framework,
ii  libcoroipcs4            1.4.2-3          Standards-based cluster framework,
ii  libcpg4                 1.4.2-3          Standards-based cluster framework,
ii  libevs4                 1.4.2-3          Standards-based cluster framework,
ii  liblogsys4              1.4.2-3          Standards-based cluster framework,
ii  libpload4               1.4.2-3          Standards-based cluster framework,
ii  libquorum4              1.4.2-3          Standards-based cluster framework,
ii  libsam4                 1.4.2-3          Standards-based cluster framework,
ii  libtotem-pg4            1.4.2-3          Standards-based cluster framework,
ii  libvotequorum4          1.4.2-3          Standards-based cluster framework,
ii  lsb-base                3.2-23.2squeeze1 Linux Standard Base 3.2 init scrip

corosync recommends no packages.

corosync suggests no packages.

-- Configuration Files:
/etc/corosync/corosync.conf changed:
totem {
	version: 2
	secauth: off
	threads: 0
	interface {
		ringnumber: 0
		bindnetaddr: 192.168.171.0
		mcastaddr: 226.94.1.1
		mcastport: 5171
	}
}
logging {
	fileline: off
	to_stderr: yes
	to_logfile: yes
	to_syslog: yes
	logfile: /var/log/corosync/corosync.log
	debug: off
	timestamp: on
	logger {
		ident: AMF
		debug: off
		tags: enter|leave|trace1|trace2|trace3|trace4|trace6
	}
}
amf {
	mode: disabled
}

/etc/default/corosync changed:
START=yes
export COROSYNC_DEFAULT_CONFIG_IFACE="openaisserviceenableexperimental:corosync_parser"


-- no debconf information


