[Debian-ha-maintainers] Bug#1095183: pcs: autopkgtest fails in unstable

Lucas Nussbaum lucas at debian.org
Thu Feb 6 22:40:11 GMT 2025


On 04/02/25 at 22:04 +0100, Lucas Nussbaum wrote:
> Source: pcs
> Version: 0.11.7-2
> Severity: serious
> 
> Hi,
> 
> See https://ci.debian.net/packages/p/pcs/
> It fails with:
> 
>  test setup: test run
> 113s autopkgtest [13:34:07]: test setup: [-----------------------
> 113s Warning: Unable to read the known-hosts file: No such file or directory: '/var/lib/pcsd/known-hosts'
> 113s node1: Not authorized
> 114s Error: Unable to communicate with node1
> 114s Nodes to authorize: node1
> 115s passwd: password changed.
> 115s autopkgtest [13:34:09]: test setup: -----------------------]
>  test setup: test results
> 115s autopkgtest [13:34:09]: test setup:  - - - - - - - - - - results - - - - - - - - - -
> 115s setup                FAIL non-zero exit status 1
> 
> I can reproduce the failure locally, but for me it also fails on
> testing, while it does not on ci.debian.net. So I'm not sure where to go
> from there.

I don't understand how the 'setup' test is supposed to work.
It needs to talk to a running pcsd daemon, but nothing in the test ever
starts one. I think something along these lines is needed:

--------------------------------------------------->8
--- debian/tests/setup.orig	2025-02-06 22:31:16.966589481 +0000
+++ debian/tests/setup	2025-02-06 22:32:29.543402182 +0000
@@ -1,26 +1,32 @@
 #!/bin/sh

 set -e

 ulimit -H -l unlimited 2>/dev/null || {
     # https://bugs.launchpad.net/bugs/1828228
     echo "test disabled for unprivileged namespaces"
     exit 77
 }

 cleanup () {
   service pcsd stop
   service pacemaker stop
   service corosync stop
   passwd --delete --lock hacluster
 }

 trap "cleanup" 0 2 3 15

+service corosync start
+service pacemaker start
+service pcsd start
+
+sleep 60
+
 echo hacluster:hacluster | chpasswd
 pcs cluster auth -u hacluster -p hacluster
 pcs cluster setup debian node1 --start --force

 sleep 60

 pcs pcsd status | grep -20 "node1: Online"

--------------------------------------------------->8

With that, the test goes a bit further but still fails:

--------------------------------------------------->8
# debian/tests/setup
Starting corosync daemon: corosyncFeb 06 22:31:52.186 notice  [MAIN  ] Corosync Cluster Engine 3.1.8 starting up
Feb 06 22:31:52.186 info    [MAIN  ] Corosync built-in features: dbus monitoring watchdog augeas systemd xmlconf vqsim nozzle snmp pie relro bindnow
.
Starting Pacemaker Cluster Manager[  OK  ]
Starting Pacemaker & Corosync configuration daemon: pcsd.
Warning: Unable to read the known-hosts file: No such file or directory: '/var/lib/pcsd/known-hosts'
node1: Not authorized
Nodes to authorize: node1
Error: Unable to communicate with node1
Signaling Pacemaker Cluster Manager to terminate[  OK  ]
Waiting for cluster services to unload..................[  OK  ]
Stopping corosync daemon: corosync.
passwd: password changed.
--------------------------------------------------->8
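
Regarding the 'sleep 60' added in the patch above: a fixed delay is
either too long or not long enough. A rough sketch of polling instead is
below. Note that wait_for is a hypothetical helper I made up for
illustration; it is not part of pcs or of the existing test.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs or of debian/tests/setup):
# retry a command once per second until it succeeds or the timeout expires.
# usage: wait_for TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for () {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

In the test this could be called as 'wait_for 60 service pcsd status',
assuming that the status action of the init script exits 0 only once the
daemon is up, which I have not verified for pcsd.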

The other mystery is: how can it pass on testing on ci.debian.net?
My hypothesis: with the lxc backend, services are started at package
installation, so pcsd is already running even though the test never
calls 'service'.
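
One way to check that hypothesis might be to log the state of the three
services at the very top of the test, before anything is started. The
sketch below is only an illustration: service_state is a hypothetical
helper, and it assumes that 'service NAME status' exits 0 only when the
daemon is running.

```shell
#!/bin/sh
# Hypothetical helper: report whether a service claims to be running.
# Assumption: "service NAME status" exits 0 only when the daemon is up.
service_state () {
    if service "$1" status >/dev/null 2>&1; then
        echo "$1: running"
    else
        echo "$1: not running"
    fi
}
```

Calling 'service_state corosync', 'service_state pacemaker' and
'service_state pcsd' before the first 'service ... start', once under
the lxc backend and once locally, should show whether pcsd is already
up in the lxc case.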

Lucas


