Bug#1114459: pmix breaks slurm-wlm autopkgtest: causes test to time out

Paul Gevers elbrus at debian.org
Fri Sep 5 20:11:38 BST 2025


Source: pmix, slurm-wlm
Control: found -1 pmix/6.0.0-3
Control: found -1 slurm-wlm/24.11.5-4
Severity: serious
Tags: sid trixie
User: debian-ci at lists.debian.org
Usertags: breaks needs-update

Dear maintainer(s),

With a recent upload of pmix, the autopkgtest of slurm-wlm fails in
testing when it is run with the binary packages of pmix from unstable.
It times out after 2:47h, whereas it normally takes only minutes. It
passes when run with only packages from testing. In tabular form:

                        pass            fail
pmix                   from testing    6.0.0-3
slurm-wlm              from testing    24.11.5-4
all others             from testing    from testing

I copied some of the output at the bottom of this report.

Currently this regression is blocking the migration of pmix to testing 
[1]. Due to the nature of this issue, I filed this bug report against 
both packages. Can you please investigate the situation and reassign the 
bug to the right package?

More information about this bug and the reason for filing it can be found at
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation

Paul

[1] https://qa.debian.org/excuses.php?package=pmix

https://ci.debian.net/data/autopkgtest/testing/amd64/s/slurm-wlm/64130829/log.gz

383s ● slurmctld.service - Slurm controller daemon
383s      Loaded: loaded (/usr/lib/systemd/system/slurmctld.service; 
enabled; preset: enabled)
383s      Active: active (running) since Fri 2025-09-05 03:51:07 UTC; 
10s ago
383s  Invocation: 612aa5cddd6f46faaa0671f23b1f95eb
383s        Docs: man:slurmctld(8)
383s    Main PID: 3312 (slurmctld)
383s       Tasks: 88
383s      Memory: 5.2M (peak: 9M)
383s         CPU: 84ms
383s      CGroup: /system.slice/slurmctld.service
383s              ├─3312 /usr/sbin/slurmctld --systemd
383s              └─3379 "slurmctld: slurmscriptd"
383s
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: No 
job state file (/var/lib/slurm/slurmctld/job_state.old) to recover
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: error: 
Could not open reservation state file 
/var/lib/slurm/slurmctld/resv_state: No such file or directory
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: error: 
NOTE: Trying backup state save file. Reservations may be lost
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: No 
reservation state file (/var/lib/slurm/slurmctld/resv_state.old) to recover
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: error: 
Could not open trigger state file 
/var/lib/slurm/slurmctld/trigger_state: No such file or directory
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: error: 
NOTE: Trying backup state save file. Triggers may be lost!
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: No 
trigger state file (/var/lib/slurm/slurmctld/trigger_state.old) to recover
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: 
read_slurm_conf: backup_controller not specified
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: 
Reinitializing job accounting state
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmctld[3312]: slurmctld: Running 
as primary controller
383s ● slurmd.service - Slurm node daemon
383s      Loaded: loaded (/usr/lib/systemd/system/slurmd.service; 
enabled; preset: enabled)
383s      Active: active (running) since Fri 2025-09-05 03:51:07 UTC; 
10s ago
383s  Invocation: 91f78f38727e43a1b6b612ea9ff72296
383s        Docs: man:slurmd(8)
383s    Main PID: 3406 (slurmd)
383s       Tasks: 12
383s      Memory: 2.2M (peak: 3.8M)
383s         CPU: 62ms
383s      CGroup: /system.slice/slurmd.service
383s              └─3406 /usr/sbin/slurmd --systemd
383s
383s Sep 05 03:51:07 ci-248-6c8bbe56 systemd[1]: Starting 
slurmd.service - Slurm node daemon...
383s Sep 05 03:51:07 ci-248-6c8bbe56 (slurmd)[3406]: slurmd.service: 
Referenced but unset environment variable evaluates to an empty string: 
SLURMD_OPTIONS
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmd[3406]: slurmd: 
_read_slurm_cgroup_conf: No cgroup.conf file (/etc/slurm/cgroup.conf), 
using defaults
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmd[3406]: 
_read_slurm_cgroup_conf: No cgroup.conf file (/etc/slurm/cgroup.conf), 
using defaults
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmd[3406]: slurmd: error: Node 
configuration differs from hardware: CPUs=1:64(hw) Boards=1:1(hw) 
SocketsPerBoard=1:1(hw) CoresPerSocket=1:32(hw) ThreadsPerCore=1:2(hw)
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmd[3406]: slurmd: CPU frequency 
setting not configured for this node
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmd[3406]: slurmd: slurmd 
version 24.11.5 started
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmd[3406]: slurmd: slurmd 
started on Fri, 05 Sep 2025 03:51:07 +0000
383s Sep 05 03:51:07 ci-248-6c8bbe56 systemd[1]: Started slurmd.service 
- Slurm node daemon.
383s Sep 05 03:51:07 ci-248-6c8bbe56 slurmd[3406]: slurmd: CPUs=1 
Boards=1 Sockets=1 Cores=1 Threads=1 Memory=257333 TmpDisk=256000 
Uptime=1279 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
383s PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
383s test*        up   infinite      1   idle localhost
383s NODELIST   NODES PARTITION STATE
383s localhost      1     test* idle
10374s autopkgtest [06:37:48]: ERROR: timed out on command "su -s 
/bin/bash root -c set -e; exec 
/tmp/autopkgtest-lxc.a9795u61/downtmp/wrapper.sh 
--artifacts=/tmp/autopkgtest-lxc.a9795u61/downtmp/mpi-artifacts 
--chdir=/tmp/autopkgtest-lxc.a9795u61/downtmp/build.CW7/src 
--env=AUTOPKGTEST_TESTBED_ARCH=amd64 --env=AUTOPKGTEST_TEST_ARCH=amd64 
--env=DEB_BUILD_OPTIONS=parallel=64 --env=DEBIAN_FRONTEND=noninteractive 
--env=LANG=C.UTF-8 --unset-env=LANGUAGE --unset-env=LC_ADDRESS 
--unset-env=LC_ALL --unset-env=LC_COLLATE --unset-env=LC_CTYPE 
--unset-env=LC_IDENTIFICATION --unset-env=LC_MEASUREMENT 
--unset-env=LC_MESSAGES --unset-env=LC_MONETARY --unset-env=LC_NAME 
--unset-env=LC_NUMERIC --unset-env=LC_PAPER --unset-env=LC_TELEPHONE 
--unset-env=LC_TIME --script-pid-file=/tmp/autopkgtest_script_pid 
--source-profile 
--stderr=/tmp/autopkgtest-lxc.a9795u61/downtmp/mpi-stderr 
--stdout=/tmp/autopkgtest-lxc.a9795u61/downtmp/mpi-stdout 
--tmp=/tmp/autopkgtest-lxc.a9795u61/downtmp/autopkgtest_tmp 
--env=AUTOPKGTEST_NORMAL_USER=debci --env=ADT_NORMAL_USER=debci 
--make-executable=/tmp/autopkgtest-lxc.a9795u61/downtmp/build.CW7/src/debian/tests/mpi 
-- /tmp/autopkgtest-lxc.a9795u61/downtmp/build.CW7/src/debian/tests/mpi" 
(kind: test)
10374s autopkgtest [06:37:48]: test mpi
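
The 2:47h figure quoted above can be cross-checked against the excerpt itself. The sketch below is only an illustration, not part of the test suite: it assumes the leading "NNNs" on each log line is the number of seconds since the start of the run, and the names LOG, largest_gap and hms are made up for this example. It looks for the longest silence between two consecutive prefixed lines.

```python
import re

# A few lines copied from the excerpt above (the long command line is cut short).
LOG = """\
383s PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
383s test*        up   infinite      1   idle localhost
10374s autopkgtest [06:37:48]: ERROR: timed out on command "su -s
10374s autopkgtest [06:37:48]: test mpi
"""

# Assumption: the "NNNs " prefix is elapsed seconds since the run started.
PREFIX = re.compile(r"^(\d+)s ")


def largest_gap(log_text):
    """Return (gap_seconds, before, after) for the longest silence in the log."""
    stamps = [int(m.group(1)) for line in log_text.splitlines()
              if (m := PREFIX.match(line))]
    # Pair each timestamp with its successor and keep the widest interval.
    gaps = [(b - a, a, b) for a, b in zip(stamps, stamps[1:])]
    return max(gaps)


def hms(seconds):
    """Format a duration in seconds as h:mm:ss."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"


gap, before, after = largest_gap(LOG)
print(f"silent for {gap}s ({hms(gap)}) between {before}s and {after}s")
# prints: silent for 9991s (2:46:31) between 383s and 10374s
```

On these lines the log is silent for 9991 seconds after the sinfo output, which rounds to the 2:47h mentioned in the report and is close to the 10000s that, as far as I know, autopkgtest uses as its default test timeout. In other words, the mpi test appears to produce no further output after the daemons come up, and it runs until the harness kills it.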
