[Pkg-libvirt-maintainers] Bug#910857: Race condition when rapidly starting many KVMs: binding socket to 127.0.0.1:5900 failed
Andreas Krüger
andreas.krueger at famsik.de
Wed Nov 14 19:03:54 GMT 2018
Package: libvirt-daemon
Version: 3.0.0-4+deb9u3
Followup-For: Bug #910857
Hello,
I ran into a race condition today which may be the same one discussed
in this bug.
What I did: I started several KVM VMs in parallel. This rather
reliably triggers an error. The error message I see (split into
shorter lines for readability):
ERROR internal error: process exited while connecting to monitor:
((null):24104): Spice-Warning **:
reds.c:2524:reds_init_socket:
reds_init_socket: binding socket to 127.0.0.1:5900 failed
Workaround: Start the machines one at a time.
How to reproduce the error:
# for i in 1 2 3; do virt-install --name node$i \
    --network=bridge:docker0,mac=52:54:00:a1:9c:0$i --boot=hd,network --memory=2048 \
    --vcpus=1 --disk pool=default,size=10 --os-type=linux --os-variant=generic \
    --noautoconsole --events on_poweroff=preserve & done
Replacing the "&" with a ";" cures the problem.
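If the VMs still need to be launched from parallel jobs, the same serialization could be sketched with flock(1) from util-linux. This is only an illustration: the helper name `serialized` and the default lock file path are assumptions, and it has not been verified against the libvirt race itself.

```shell
# Hypothetical helper (not part of libvirt or virt-install): run a command
# while holding an exclusive lock, so parallel callers execute it one at a time.
# The default lock path is an assumption; override it via LOCKFILE.
serialized() {
    flock "${LOCKFILE:-/tmp/virt-install.lock}" "$@"
}

# Usage sketch, with the virt-install arguments from the report elided:
#   for i in 1 2 3; do serialized virt-install --name node$i ... & done; wait
```

The jobs are still started in the background, but each wrapped command only runs once the previous one has released the lock, which should have the same effect as the ";" variant for that step.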
Background information: I'm experimenting with a DHCP server that is
running in a docker container. I expect my newly-born KVM nodes to
interact with that DHCP server.
The error message as quoted seems to come from a qemu-system-x86_64
process. I conclude this from looking at what listens on port 5900 after
a successful startup. With three KVMs running, I have three
qemu-system-x86_64 processes, listening on ports 5900, 5901, 5902.
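The check described above can be sketched as a small filter. It assumes `ss -ltn` from iproute2 as the source of listening sockets (the report does not say which tool was used); the function name `spice_ports` and the 5900-5999 range are illustrative choices.

```shell
# Hypothetical filter: read `ss -ltn` output on stdin and print the listening
# TCP ports in the range 5900-5999, one per line, sorted numerically.
spice_ports() {
    awk 'NR > 1 {
        n = split($4, a, ":")      # $4 is "Local Address:Port"
        p = a[n]                   # last ":"-separated piece is the port
        if (p >= 5900 && p <= 5999) print p
    }' | sort -n
}

# Usage: ss -ltn | spice_ports
```

Adding `-p` to `ss` (as root) additionally shows the owning process for each socket, which is one way to tie a port back to a qemu-system-x86_64 instance.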
Checking the command line of those instances, I find that the port
number is not their choice. They are given command line arguments
such as
-spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on
(among many others).
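A minimal sketch of pulling the SPICE port out of a process command line, assuming the NUL-separated format of /proc/<pid>/cmdline on Linux. The function name `spice_port_of` is hypothetical.

```shell
# Hypothetical helper: print the port= value from the argument that follows
# "-spice" in a NUL-separated command line file such as /proc/<pid>/cmdline.
spice_port_of() {
    tr '\0' '\n' < "$1" | awk '
        prev == "-spice" {
            n = split($0, kv, ",")
            for (i = 1; i <= n; i++)
                if (kv[i] ~ /^port=/) { sub(/^port=/, "", kv[i]); print kv[i] }
        }
        { prev = $0 }'
}

# Usage: spice_port_of /proc/<pid>/cmdline
```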
After a bit of poking around, I noticed that on my system, the present
pid of the libvirtd process is 1227. I started to trace that process
via
strace -ff -e trace=process -e abbrev=none -o /home/user/junk/strace.log -p 1227
and started one more KVM instance. In one of the files written, I find
(abbreviated):
execve("/usr/bin/qemu-system-x86_64",
["qemu-system-x86_64",
(many lines omitted)
"-spice",
"port=5900,addr=127.0.0.1,disable"...,
So I speculate as follows: The problem may be caused by libvirtd
assigning the same port to different KVMs when they are started in
rapid succession. All affected qemu processes then try to bind that
port on the host, and all but one fail.
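The kind of race speculated about here can be illustrated with a toy model. This is not libvirtd's actual allocator, and nothing in this report confirms that it works this way. The function `pick_port` and the "claimed ports" file are made up for the illustration. Two starters each look for the lowest unclaimed port before either has recorded its claim, so both arrive at the same answer.

```shell
# Toy model only: pick the lowest port >= 5900 not listed in the file "$1".
pick_port() {
    p=5900
    while grep -qx "$p" "$1"; do p=$((p + 1)); done
    echo "$p"
}

claimed=$(mktemp)            # starts empty: no ports claimed yet

a=$(pick_port "$claimed")    # first starter checks for a free port
b=$(pick_port "$claimed")    # second starter checks before the first has claimed
echo "$a" >> "$claimed"      # both claims are recorded only afterwards
echo "$b" >> "$claimed"

echo "$a $b"                 # prints "5900 5900"
```

If each starter recorded its claim before the next one checked (check and claim as one atomic step), the second call would return 5901 instead.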
Regards,
and thank you for providing fine software
Andreas
-- System Information:
Debian Release: 9.5
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 4.18.0-0.bpo.1-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8),
LANGUAGE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages libvirt-daemon depends on:
ii libapparmor1 2.11.0-3+deb9u2
ii libaudit1 1:2.6.7-2
ii libavahi-client3 0.6.32-2
ii libavahi-common3 0.6.32-2
ii libblkid1 2.29.2-1+deb9u1
ii libc6 2.24-11+deb9u3
ii libcap-ng0 0.7.7-3+b1
ii libdbus-1-3 1.10.26-0+deb9u1
ii libdevmapper1.02.1 2:1.02.137-2
ii libfuse2 2.9.7-1+deb9u1
ii libgnutls30 3.5.8-5+deb9u3
ii libnetcf1 1:0.2.8-1+b2
ii libnl-3-200 3.2.27-2
ii libnl-route-3-200 3.2.27-2
ii libnuma1 2.0.11-2.1
ii libparted2 3.2-17
ii libpcap0.8 1.8.1-3
ii libpciaccess0 0.13.4-1+b2
ii librados2 10.2.11-1
ii librbd1 10.2.11-1
ii libsasl2-2 2.1.27~101-g0780600+dfsg-3
ii libselinux1 2.6-3+b3
ii libssh2-1 1.7.0-1
ii libudev1 232-25+deb9u4
ii libvirt0 3.0.0-4+deb9u3
ii libxen-4.8 4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10
ii libxenstore3.0 4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10
ii libxml2 2.9.4+dfsg1-2.2+deb9u2
ii libyajl2 2.1.0-2+b3
Versions of packages libvirt-daemon recommends:
ii libxml2-utils 2.9.4+dfsg1-2.2+deb9u2
ii netcat-openbsd 1.130-3
ii qemu-kvm 1:2.8+dfsg-6+deb9u5
Versions of packages libvirt-daemon suggests:
ii libvirt-daemon-system 3.0.0-4+deb9u3
pn numad <none>
-- no debconf information