[Debian-ha-maintainers] Bug#953111: pacemaker: post-start got lost
Lukas Straub
lukasstraub2 at web.de
Wed Mar 4 17:29:24 GMT 2020
Package: pacemaker
Version: 2.0.3-3
Severity: normal
Dear Maintainer,
I have a master-slave resource that relies on the post-start notification to start replication
to the slave. This usually works, but occasionally the post-start notification is not delivered.
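For context, here is a minimal sketch of how a Python resource agent might decode a notify call. It is not the actual colo_test agent. The OCF_RESKEY_CRM_meta_notify_* variables are the ones Pacemaker exports for notify actions; the function names and the replication policy in wants_replication are hypothetical and only assume that the master starts replication once a peer has started.

```python
import os


def parse_notify(env=os.environ):
    """Collect the notification variables Pacemaker exports for a notify action."""
    def names(key):
        # Pacemaker passes node lists as space-separated strings.
        return env.get("OCF_RESKEY_CRM_meta_notify_" + key, "").split()

    ntype = env.get("OCF_RESKEY_CRM_meta_notify_type", "")      # "pre" or "post"
    nop = env.get("OCF_RESKEY_CRM_meta_notify_operation", "")   # e.g. "start"
    return {
        "action": f"{ntype}-{nop}",
        "master_uname": names("master_uname"),
        "start_uname": names("start_uname"),
        "stop_uname": names("stop_uname"),
    }


def wants_replication(info, local_node):
    """Hypothetical policy: the master starts replication after a peer started."""
    return (
        info["action"] == "post-start"
        and local_node in info["master_uname"]
        and any(node != local_node for node in info["start_uname"])
    )
```

With such a design, replication never starts if the post-start notify does not reach the master, which matches the symptom reported here.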
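Note the line
Discarding attempt to perform action notify on colo_test in state S_INTEGRATION
below: tele-clu-01 rejoins the cluster, transition 12 is aborted, and the post-start notify for the local node (tele-clu-03, the master) is discarded while the controller is in state S_INTEGRATION.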
Mar 04 16:28:50 tele-clu-03 python[8351]: (colo_test) DEBUG: notify called: action: pre-start, master_uname: tele-clu-03, start_uname: tele-clu-02, stop_uname: tele-clu-01, shutdown_guest: False
Mar 04 16:28:50 tele-clu-03 pacemaker-controld[538]: notice: Result of notify operation for colo_test on tele-clu-03: 0 (ok)
Mar 04 16:28:50 tele-clu-03 pacemaker-controld[538]: notice: Initiating start operation colo_test_start_0 on tele-clu-02
Mar 04 16:28:54 tele-clu-03 corosync[481]: [KNET ] rx: host: 1 link: 0 is up
Mar 04 16:28:54 tele-clu-03 corosync[481]: [KNET ] host: host: 1 (passive) best link: 0 (pri: 0)
Mar 04 16:28:54 tele-clu-03 corosync[481]: [TOTEM ] A new membership (1.1cc0) was formed. Members joined: 1
Mar 04 16:28:54 tele-clu-03 corosync[481]: [CPG ] downlist left_list: 0 received
Mar 04 16:28:54 tele-clu-03 corosync[481]: [CPG ] downlist left_list: 0 received
Mar 04 16:28:54 tele-clu-03 corosync[481]: [CPG ] downlist left_list: 0 received
Mar 04 16:28:55 tele-clu-03 corosync[481]: [QUORUM] Members[3]: 1 2 3
Mar 04 16:28:55 tele-clu-03 corosync[481]: [MAIN ] Completed service synchronization, ready to provide service.
Mar 04 16:28:55 tele-clu-03 pacemaker-controld[538]: notice: Node tele-clu-01 state is now member
Mar 04 16:28:55 tele-clu-03 pacemakerd[532]: notice: Node tele-clu-01 state is now member
Mar 04 16:28:55 tele-clu-03 pacemaker-based[533]: notice: Node tele-clu-01 state is now member
Mar 04 16:28:55 tele-clu-03 pacemaker-attrd[536]: notice: Node tele-clu-01 state is now member
Mar 04 16:28:55 tele-clu-03 pacemaker-attrd[536]: notice: Setting #attrd-protocol[tele-clu-01]: (unset) -> 2
Mar 04 16:28:55 tele-clu-03 pacemaker-fenced[534]: notice: Node tele-clu-01 state is now member
Mar 04 16:28:55 tele-clu-03 pacemaker-controld[538]: notice: Transition 12 aborted: Peer Halt
Mar 04 16:28:55 tele-clu-03 pacemaker-controld[538]: notice: Initiating notify operation colo_test_post_notify_start_0 locally on tele-clu-03
Mar 04 16:28:55 tele-clu-03 pacemaker-controld[538]: notice: Discarding attempt to perform action notify on colo_test in state S_INTEGRATION (shutdown=false)
Mar 04 16:28:55 tele-clu-03 pacemaker-controld[538]: notice: Initiating notify operation colo_test_post_notify_start_0 on tele-clu-02
Mar 04 16:28:55 tele-clu-03 pacemaker-controld[538]: notice: Transition 12 (Complete=22, Pending=0, Fired=0, Skipped=1, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-warn-93.bz2): Stopped
Mar 04 16:28:57 tele-clu-03 pacemaker-schedulerd[537]: notice: Watchdog will be used via SBD if fencing is required and stonith-watchdog-timeout is nonzero
Mar 04 16:28:57 tele-clu-03 pacemaker-schedulerd[537]: notice: Calculated transition 13, saving inputs in /var/lib/pacemaker/pengine/pe-input-1569.bz2
Mar 04 16:28:57 tele-clu-03 pacemaker-controld[538]: notice: Initiating monitor operation colo_test_monitor_0 on tele-clu-01
Mar 04 16:28:57 tele-clu-03 pacemaker-controld[538]: notice: Initiating monitor operation colo_test_monitor_10000 on tele-clu-02
Mar 04 16:28:58 tele-clu-03 pacemaker-controld[538]: notice: Initiating monitor operation colo_small_test_monitor_0 on tele-clu-01
Mar 04 16:28:58 tele-clu-03 pacemaker-controld[538]: notice: Transition 13 (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1569.bz2): Complete
Mar 04 16:28:58 tele-clu-03 pacemaker-controld[538]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
Mar 04 16:29:08 tele-clu-03 pacemaker-controld[538]: notice: High CPU load detected: 2.750000
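To see how often this occurs, a log scan along the following lines may help. This is only a sketch: the regular expression is derived from the single "Discarding attempt" message in the log above, and the function name is hypothetical.

```python
import re

# Pattern taken from the pacemaker-controld message in the log excerpt.
DISCARD_RE = re.compile(
    r"Discarding attempt to perform action (?P<action>\S+) on (?P<rsc>\S+) "
    r"in state (?P<state>\S+)"
)


def find_discarded(lines):
    """Return (action, resource, controller state) for each discarded action."""
    hits = []
    for line in lines:
        match = DISCARD_RE.search(line)
        if match:
            hits.append((match["action"], match["rsc"], match["state"]))
    return hits
```

It can be fed any iterable of log lines, for example an open log file or the lines of saved journal output.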
-- System Information:
Debian Release: bullseye/sid
APT prefers testing
APT policy: (990, 'testing')
Architecture: amd64 (x86_64)
Kernel: Linux 5.3.0-2-amd64 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages pacemaker depends on:
ii corosync 3.0.2-1+b1
ii dbus 1.12.16-2
ii init-system-helpers 1.57
ii libc6 2.29-10
ii libcfg7 3.0.2-1+b1
ii libcib27 2.0.3-3
ii libcmap4 3.0.2-1+b1
ii libcorosync-common4 3.0.2-1+b1
ii libcrmcluster29 2.0.3-3
ii libcrmcommon34 2.0.3-3
ii libcrmservice28 2.0.3-3
ii libglib2.0-0 2.62.5-1
ii libgnutls30 3.6.12-2
ii liblrmd28 2.0.3-3
ii libpacemaker1 2.0.3-3
ii libpam0g 1.3.1-5
ii libpe-rules26 2.0.3-3
ii libpe-status28 2.0.3-3
ii libqb0 1.0.5-1
ii libstonithd26 2.0.3-3
ii lsb-base 11.1.0
ii pacemaker-common 2.0.3-3
ii pacemaker-resource-agents 2.0.3-3
ii python3 3.7.5-3
Versions of packages pacemaker recommends:
pn fence-agents <none>
ii pacemaker-cli-utils 2.0.3-3
Versions of packages pacemaker suggests:
ii cluster-glue 1.0.12-15
ii crmsh 4.2.0-2
-- no debconf information