[Debian-ha-maintainers] Bug#974563: corosync unable to communicate with pacemaker 1.1.16-1+deb9u1 which contains the fix for CVE-2020-25654
Louis Sautier
sautier.louis at gmail.com
Thu Nov 12 10:07:33 GMT 2020
Package: pacemaker
Version: 1.1.16-1+deb9u1
Severity: grave
X-Debbugs-CC: apo at debian.org
Hi,
I am running corosync 2.4.2-3+deb9u1 with pacemaker, and the last run of
unattended-upgrades broke the cluster (downgrading pacemaker to 1.1.16-1
fixed it immediately).
The logs contain a lot of warnings that seem to point to a permission
problem, such as "Rejecting IPC request 'lrmd_rsc_info' from
unprivileged client crmd". I am not using ACLs, so the patch should not
affect my system.
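For reference, "using ACLs" would mean having an <acls> section in the
CIB and enable-acl=true in the cluster options; as far as I can tell we
have neither. An ACL definition would look roughly like the following
(illustrative only, with made-up role and user names, not taken from our
CIB):

<acls>
  <acl_role id="read-only">
    <acl_permission id="read-only-cib" kind="read" xpath="/cib"/>
  </acl_role>
  <acl_target id="someuser">
    <role id="read-only"/>
  </acl_target>
</acls>

Nothing of the sort exists in our configuration.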
Here is an excerpt from the logs after the upgrade:
Nov 12 06:26:05 cluster-1 crmd[20868]: notice: State transition
S_PENDING -> S_NOT_DC
Nov 12 06:26:05 cluster-1 crmd[20868]: notice: State transition
S_NOT_DC -> S_PENDING
Nov 12 06:26:05 cluster-1 attrd[20866]: notice: Defaulting to uname -n
for the local corosync node name
Nov 12 06:26:05 cluster-1 crmd[20868]: notice: State transition
S_PENDING -> S_NOT_DC
Nov 12 06:26:06 cluster-1 lrmd[20865]: warning: Rejecting IPC request
'lrmd_rsc_info' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 lrmd[20865]: warning: Rejecting IPC request
'lrmd_rsc_info' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 lrmd[20865]: warning: Rejecting IPC request
'lrmd_rsc_register' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 lrmd[20865]: warning: Rejecting IPC request
'lrmd_rsc_info' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 crmd[20868]: error: Could not add resource
service to LRM cluster-1
Nov 12 06:26:06 cluster-1 crmd[20868]: error: Invalid resource
definition for service
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: bad input
<create_request_adv origin="te_rsc_command" t="crmd" version="3.0.11"
subt="request" reference="lrm_invoke-tengine-xxx-29"
crm_task="lrm_invoke" crm_sys_to="lrmd" crm_sys_from="tengine"
crm_host_to="cluster-1" src="cluster-2" acl_target="hacluster"
crm_user="hacluster">
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: bad input <crm_xml>
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: bad input <rsc_op
id="5" operation="monitor" operation_key="service:1_monitor_0"
on_node="cluster-1" on_node_uuid="xxx" transition-key="xxx">
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: bad input
<primitive id="service" long-id="service:1" class="systemd" type="service"/>
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: bad input
<attributes CRM_meta_clone="1" CRM_meta_clone_max="2"
CRM_meta_clone_node_max="1" CRM_meta_globally_unique="false"
CRM_meta_notify="false" CRM_meta_op_target_rc="7"
CRM_meta_timeout="15000" crm_feature_set="3.0.11"/>
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: bad input </rsc_op>
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: bad input </crm_xml>
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: bad input
</create_request_adv>
Nov 12 06:26:06 cluster-1 lrmd[20865]: warning: Rejecting IPC request
'lrmd_rsc_info' from unprivileged client crmd
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: Resource service no
longer exists in the lrmd
Nov 12 06:26:06 cluster-1 crmd[20868]: error: Result of probe
operation for service on cluster-1: Error
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: Input I_FAIL received
in state S_NOT_DC from get_lrm_resource
Nov 12 06:26:06 cluster-1 crmd[20868]: notice: State transition
S_NOT_DC -> S_RECOVERY
Nov 12 06:26:06 cluster-1 crmd[20868]: warning: Fast-tracking shutdown
in response to errors
Nov 12 06:26:06 cluster-1 crmd[20868]: error: Input I_TERMINATE
received in state S_RECOVERY from do_recover
Nov 12 06:26:06 cluster-1 crmd[20868]: notice: Disconnected from the LRM
Nov 12 06:26:06 cluster-1 crmd[20868]: notice: Disconnected from Corosync
Nov 12 06:26:06 cluster-1 crmd[20868]: error: Could not recover from
internal error
Nov 12 06:26:06 cluster-1 pacemakerd[20857]: error: The crmd process
(20868) exited: Generic Pacemaker error (201)
Nov 12 06:26:06 cluster-1 pacemakerd[20857]: notice: Respawning failed
child process: crmd
My corosync.conf is quite standard:
totem {
    version: 2
    cluster_name: debian
    token: 0
    token_retransmits_before_loss_const: 10
    clear_node_high_bit: yes
    crypto_cipher: aes256
    crypto_hash: sha256
    interface {
        ringnumber: 0
        bindnetaddr: xxx
        mcastaddr: yyy
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: yes
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
quorum {
    provider: corosync_votequorum
    expected_votes: 2
}
So is my crm configuration:
node xxx: cluster-1 \
    attributes standby=off
node xxx: cluster-2 \
    attributes standby=off
primitive service systemd:service \
    meta failure-timeout=30 \
    op monitor interval=5 on-fail=restart timeout=15s
primitive vip-1 IPaddr2 \
    params ip=xxx cidr_netmask=32 \
    op monitor interval=10s
primitive vip-2 IPaddr2 \
    params ip=xxx cidr_netmask=32 \
    op monitor interval=10s
clone clone_service service
colocation service_vip-1 inf: vip-1 clone_service
colocation service_vip-2 inf: vip-2 clone_service
order kot_before_vip-1 inf: clone_service vip-1
order kot_before_vip-2 inf: clone_service vip-2
location prefer-cluster1-vip-1 vip-1 1: cluster-1
location prefer-cluster2-vip-2 vip-2 1: cluster-2
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.16-94ff4df \
    cluster-infrastructure=corosync \
    cluster-name=debian \
    stonith-enabled=false \
    no-quorum-policy=ignore \
    cluster-recheck-interval=1m \
    last-lrm-refresh=1605159600
rsc_defaults rsc-options: \
    failure-timeout=5m \
    migration-threshold=1
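Note also that enable-acl is not set in cib-bootstrap-options above, so
ACLs should be disabled entirely. If they were enabled, the CIB would
contain an nvpair along these lines (again illustrative, not present on
our cluster; the id is just the conventional naming):

<nvpair id="cib-bootstrap-options-enable-acl" name="enable-acl" value="true"/>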