[Pkg-libvirt-maintainers] Bug#848317: not arch dependent but racy

Christian Ehrhardt christian.ehrhardt at canonical.com
Mon Dec 19 13:59:04 UTC 2016


Hi,
I think we can forget my former suggestion for isolation, at least for now.
Thanks for closing 848319, by the way; that explanation gave me the
confidence to continue debugging this case.

I have now found that I can run into this issue on x86 as well, so it is
not arch dependent at all.
Still, it is weird: in a dep8 environment I seem to run into this 100% of
the time, while I never do when I redo the same steps on my own system.

I logged into the dep8 KVM guest and checked how reproducible it is.
It turns out that after the FIRST restart of libvirtd the lxc guest is
killed and then listed as shut off.
libvirtd logs an error like this:
Dec 19 14:40:19 autopkgtest libvirtd[4500]: internal error: No valid cgroup for machine sl
Dec 19 14:40:19 autopkgtest libvirtd[4500]: End of file while reading data: Input/output error
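
A minimal sketch for spotting this message in a larger log. The helper name cgroup_error_machines is made up for illustration, and it assumes each message sits on a single line, as it would in the journal itself rather than in a wrapped mail:

```shell
# Hypothetical helper: reads log lines on stdin and prints, once each,
# the names of machines for which libvirtd logged
# "No valid cgroup for machine <name>".
cgroup_error_machines() {
    sed -n 's/.*internal error: No valid cgroup for machine \([^ ]*\).*/\1/p' |
        sort -u
}
```

It could be fed from the journal, for example with `journalctl -u libvirtd | cgroup_error_machines`, which in the case above would be expected to print just `sl`.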

That is somewhat familiar if you know
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=774237

But then we are clearly above the systemd level discussed there.
What makes this even more interesting is that AFTER the issue has happened
once, everything seems to be OK.

So afterwards doing
export ...
virsh start sl
# now I can restart libvirt without affecting guest "sl"

The following cleanup and re-define gets me back to the situation where the
next restart destroys the guest and writes the error listed above to the
log:
virsh destroy sl
virsh undefine sl
rm -rf /etc/libvirt
apt-get remove --purge libvirt-daemon-system libvirt-clients libxml2-utils
apt-get install libvirt-daemon-system libvirt-clients libxml2-utils
virsh define smoke-lxc.xml
virsh start sl
virsh list --all
# now trigger the fail with
/etc/init.d/libvirtd restart
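
To count how often a restart actually takes the guest down, the restart and the check afterwards could be wrapped as below. This is only a sketch: check_restart, RESTART_CMD and SETTLE_SECS are names invented here, and the two second settle time is an arbitrary guess, not something measured.

```shell
# Sketch only: restart libvirtd once and report the state of guest "$1".
# RESTART_CMD and SETTLE_SECS are hypothetical override hooks; by default
# the init script call from this mail and a 2 second wait are used.
check_restart() {
    guest=$1
    # Restart the daemon; its output is discarded, only the guest matters.
    ${RESTART_CMD:-/etc/init.d/libvirtd restart} >/dev/null 2>&1
    # Give libvirtd a moment to reconnect to running guests.
    sleep "${SETTLE_SECS:-2}"
    # virsh domstate prints e.g. "running" or "shut off" and fails if the
    # daemon does not know the domain, which is reported as "missing".
    state=$(virsh domstate "$guest" 2>/dev/null) || state="missing"
    echo "$state"
    # Succeed only if the guest survived the restart.
    [ "$state" = "running" ]
}
```

Called as `check_restart sl`, it prints the state after the restart and returns non-zero whenever the guest did not survive.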

After this libvirt is running again, but not really:
systemctl status libvirtd
● libvirtd.service - Virtualization daemon
  Loaded: loaded (/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
  Active: active (running) since Mon 2016-12-19 14:52:51 CET; 14s ago
    Docs: man:libvirtd(8)
          http://libvirt.org
Main PID: 7608 (libvirtd)
   Tasks: 18
  CGroup: /system.slice/libvirtd.service
          ├─2035 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leases
          ├─2036 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leases
          └─7608 /usr/sbin/libvirtd

Dec 19 14:52:51 autopkgtest systemd[1]: Starting Virtualization daemon...
Dec 19 14:52:51 autopkgtest systemd[1]: Started Virtualization daemon.
Dec 19 14:52:52 autopkgtest dnsmasq[2035]: read /etc/hosts - 8 addresses
Dec 19 14:52:52 autopkgtest dnsmasq[2035]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Dec 19 14:52:52 autopkgtest dnsmasq-dhcp[2035]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Dec 19 14:52:52 autopkgtest libvirtd[7608]: libvirt version: 2.5.0, package: 1ubuntu1~ppa3 (Christian Ehrhardt <christian.ehrhardt at canonical.com>
Dec 19 14:52:52 autopkgtest libvirtd[7608]: hostname: autopkgtest.localdomain
Dec 19 14:52:52 autopkgtest libvirtd[7608]: internal error: No valid cgroup for machine sl
Dec 19 14:52:52 autopkgtest libvirtd[7608]: End of file while reading data: Input/output error

I say "not really", even though systemd reports the service as active,
because virsh list now looks like this:
$ virsh list --all
Id    Name                           State
----------------------------------------------------

That is it; the guest is completely gone from the list.
Restarting the service once more lets it start normally and the guest
returns, although in a stopped state:

$ /etc/init.d/libvirtd restart
$ virsh list --all
Id    Name                           State
----------------------------------------------------
-     sl                             shut off

I can start the guest again now, and from this point on it is resilient
against restarts.
$ virsh start sl
$ /etc/init.d/libvirtd restart
$ virsh list --all
Id    Name                           State
----------------------------------------------------
9020  sl                             running
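
The three outcomes seen above (guest gone from the list, shut off, running) can be told apart in a script by parsing the table. A rough sketch, where guest_state is a made-up helper name and the two-line header of `virsh list --all` is assumed to look as in the listings in this mail:

```shell
# Hypothetical helper: reads `virsh list --all` output on stdin and
# prints the state of the guest named "$1", or "missing" if the guest
# is not listed at all.
guest_state() {
    awk -v name="$1" '
        # Skip the header and separator lines, then match on the Name column.
        NR > 2 && $2 == name {
            # Drop the Id and Name fields; what remains is the state,
            # which may contain a space ("shut off").
            $1 = ""; $2 = ""
            sub(/^ +/, "")
            print
            found = 1
        }
        END { if (!found) print "missing" }
    '
}
```

Used as `virsh list --all | guest_state sl`, this would print `missing` for the empty list, `shut off` after the second restart, and `running` once the guest has been started again.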


So much for now; I will try to gather some extra debug data next ...