[Debian-ha-maintainers] Bug#1095183: pcs: autopkgtest fails in unstable
Lucas Nussbaum
lucas at debian.org
Thu Feb 6 22:40:11 GMT 2025
On 04/02/25 at 22:04 +0100, Lucas Nussbaum wrote:
> Source: pcs
> Version: 0.11.7-2
> Severity: serious
>
> Hi,
>
> See https://ci.debian.net/packages/p/pcs/
> It fails with:
>
> test setup: test run
> 113s autopkgtest [13:34:07]: test setup: [-----------------------
> 113s Warning: Unable to read the known-hosts file: No such file or directory: '/var/lib/pcsd/known-hosts'
> 113s node1: Not authorized
> 114s Error: Unable to communicate with node1
> 114s Nodes to authorize: node1
> 115s passwd: password changed.
> 115s autopkgtest [13:34:09]: test setup: -----------------------]
> test setup: test results
> 115s autopkgtest [13:34:09]: test setup: - - - - - - - - - - results - - - - - - - - - -
> 115s setup FAIL non-zero exit status 1
>
> I can reproduce the failure locally, but for me it also fails on
> testing, while it does not on ci.debian.net. So I'm not sure where to go
> from there.
I don't understand how the 'setup' test is supposed to work.
The test is supposed to interact with a pcsd daemon, but the daemon is
never started. I think something along these lines is needed:
--------------------------------------------------->8
--- debian/tests/setup.orig 2025-02-06 22:31:16.966589481 +0000
+++ debian/tests/setup 2025-02-06 22:32:29.543402182 +0000
@@ -1,26 +1,32 @@
 #!/bin/sh
 set -e
 ulimit -H -l unlimited 2>/dev/null || {
     # https://bugs.launchpad.net/bugs/1828228
     echo "test disabled for unprivileged namespaces"
     exit 77
 }
 cleanup () {
     service pcsd stop
     service pacemaker stop
     service corosync stop
     passwd --delete --lock hacluster
 }
 trap "cleanup" 0 2 3 15
+service corosync start
+service pacemaker start
+service pcsd start
+
+sleep 60
+
 echo hacluster:hacluster | chpasswd
 pcs cluster auth -u hacluster -p hacluster
 pcs cluster setup debian node1 --start --force
 sleep 60
 pcs pcsd status | grep -20 "node1: Online"
--------------------------------------------------->8
With that, the test goes a bit further but still fails:
--------------------------------------------------->8
# debian/tests/setup
Starting corosync daemon: corosyncFeb 06 22:31:52.186 notice [MAIN ] Corosync Cluster Engine 3.1.8 starting up
Feb 06 22:31:52.186 info [MAIN ] Corosync built-in features: dbus monitoring watchdog augeas systemd xmlconf vqsim nozzle snmp pie relro bindnow
.
Starting Pacemaker Cluster Manager[ OK ]
Starting Pacemaker & Corosync configuration daemon: pcsd.
Warning: Unable to read the known-hosts file: No such file or directory: '/var/lib/pcsd/known-hosts'
node1: Not authorized
Nodes to authorize: node1
Error: Unable to communicate with node1
Signaling Pacemaker Cluster Manager to terminate[ OK ]
Waiting for cluster services to unload..................[ OK ]
Stopping corosync daemon: corosync.
passwd: password changed.
--------------------------------------------------->8
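As an aside on the patch above: the fixed 'sleep 60' after starting the
services could be replaced by a polling loop that returns as soon as a
check succeeds. A rough sketch only; 'wait_for' is a hypothetical helper,
not something in the pcs packaging, and using 'service pcsd status' as the
readiness check is an assumption (a running process does not guarantee that
pcsd is already accepting connections):

```shell
#!/bin/sh
# wait_for TIMEOUT CMD...: hypothetical helper, not part of debian/tests/setup.
# Retries CMD once per second until it succeeds or TIMEOUT seconds have
# elapsed. Returns 0 on success, 1 on timeout.
wait_for () {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Possible use in place of 'sleep 60' (assumed readiness check):
#   service pcsd start
#   wait_for 60 service pcsd status
```

Under 'set -e', a timeout would abort the test at that point instead of
letting it continue against a daemon that never came up.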
The other mystery is: how can it work on testing on ci.debian.net?
Hypothesis: when using the lxc backend, services are started on
installation (so pcsd is already running despite the lack of 'service'
calls).
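One way to check this hypothesis would be to record, at the very top of
debian/tests/setup, whether the daemons are already running before the
test touches anything. A minimal sketch, assuming 'service <name> status'
exits 0 only when the daemon is up (usual for LSB-style init scripts, but
not verified here for these three packages); 'svc_state' is a made-up
helper name:

```shell
#!/bin/sh
# svc_state NAME: hypothetical diagnostic helper.
# Prints whether 'service NAME status' reports the daemon as running.
svc_state () {
    if service "$1" status >/dev/null 2>&1; then
        echo "$1: already running"
    else
        echo "$1: not running"
    fi
}

# Report the state of the three daemons the test depends on.
for svc in corosync pacemaker pcsd; do
    svc_state "$svc"
done
```

If the lxc run on ci.debian.net printed "pcsd: already running" while a
local run printed "pcsd: not running", that would be consistent with the
hypothesis.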
Lucas