Bug#980033: libucx0: UCX ERROR rdma_create_event_channel failed: No such device
Drew Parsons
dparsons at debian.org
Wed Jan 13 09:16:05 GMT 2021
Package: libucx0
Version: 1.10.0~rc1-2
Severity: serious
Justification: debci
Our next round of whack-a-mole comes from the new UCX.
pmix 4.0.0-3 seems to have fixed the pmix error from bug#979744.
The debci tests now report a problem with UCX, using the following package versions:
openmpi 4.1.0-5
pmix 4.0.0-3
ucx 1.10.0~rc1-2
The openmpi debci test at
https://ci.debian.net/data/autopkgtest/testing/arm64/o/openmpi/9650495/log.gz
reports:
autopkgtest [15:16:16]: test hello4: [-----------------------
[1610522176.588740] [ci-013-36a60f22:1417 :0] rdmacm_cm.c:638 UCX ERROR rdma_create_event_channel failed: No such device
[1610522176.588779] [ci-013-36a60f22:1417 :0] ucp_worker.c:1432 UCX ERROR failed to open CM on component rdmacm with status Input/output error
[ci-013-36a60f22:01417] ../../../../../../ompi/mca/pml/ucx/pml_ucx.c:273 Error: Failed to create UCP worker
node 0 : Hello world
autopkgtest [15:16:17]: test hello4: -----------------------]
autopkgtest [15:16:18]: test hello4: - - - - - - - - - - results - - - - - - - - - -
hello4 FAIL stderr: [ci-013-36a60f22:01417] ../../../../../../ompi/mca/pml/ucx/pml_ucx.c:273 Error: Failed to create UCP worker
autopkgtest [15:16:18]: test hello4: - - - - - - - - - - stderr - - - - - - - - - -
[ci-013-36a60f22:01417] ../../../../../../ompi/mca/pml/ucx/pml_ucx.c:273 Error: Failed to create UCP worker
autopkgtest [15:16:18]: @@@@@@@@@@@@@@@@@@@@ summary
hello1 FAIL stderr: [ci-013-36a60f22:01292] ../../../../../../ompi/mca/pml/ucx/pml_ucx.c:273 Error: Failed to create UCP worker
hello2 FAIL stderr: [ci-013-36a60f22:01218] ../../../../../../ompi/mca/pml/ucx/pml_ucx.c:273 Error: Failed to create UCP worker
hello4 FAIL stderr: [ci-013-36a60f22:01417] ../../../../../../ompi/mca/pml/ucx/pml_ucx.c:273 Error: Failed to create UCP worker
Other client applications fail with the same error.
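For triaging logs like the one above, the per-test result lines in the summary can be pulled out with a short script. This is only a minimal sketch: the line shape (`<test> <STATUS> <detail>`) is assumed from the excerpt quoted in this report, and the names `SUMMARY_RE` and `summarize` are hypothetical helpers, not part of autopkgtest or debci.

```python
import re

# Assumed shape of a result line, taken from the excerpt in this report:
#   "hello4 FAIL stderr: <message>"
# Lines that do not have a status word as their second field are ignored.
SUMMARY_RE = re.compile(
    r"^(?P<test>\S+)\s+(?P<status>PASS|FAIL|SKIP)\b\s*(?P<detail>.*)$"
)


def summarize(log_text):
    """Return {test_name: (status, detail)} for each result line found."""
    results = {}
    for line in log_text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match:
            results[match.group("test")] = (
                match.group("status"),
                match.group("detail"),
            )
    return results


def failed_tests(log_text):
    """Return the sorted names of tests whose status is FAIL."""
    return sorted(
        name
        for name, (status, _detail) in summarize(log_text).items()
        if status == "FAIL"
    )
```

Run against the log quoted in this report, `failed_tests` would be expected to list hello1, hello2 and hello4, all with the same "Failed to create UCP worker" detail.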
-- System Information:
Debian Release: bullseye/sid
APT prefers unstable
APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 5.10.0-1-amd64 (SMP w/8 CPU threads)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8), LANGUAGE=en_AU:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
Versions of packages libucx0 depends on:
ii ibverbs-providers 33.0-1
ii libbinutils 2.35.1-7
ii libc6 2.31-9
ii libibverbs1 33.0-1
ii libnuma1 2.0.12-1+b1
ii librdmacm1 33.0-1
libucx0 recommends no packages.
libucx0 suggests no packages.
-- no debconf information