[Pkg-libvirt-maintainers] Bug#848317: Bug#848317: not arch dependent but racy

Guido Günther agx at sigxcpu.org
Mon Dec 19 17:37:11 UTC 2016


Hi,
On Mon, Dec 19, 2016 at 02:59:04PM +0100, Christian Ehrhardt wrote:
> Hi,
> I think we can forget my earlier suggestion about isolation, at least for now.
> Thanks for closing 848319, by the way; that explanation gave me the
> confidence to continue debugging this case.
> 
> I have now found that I can run into this issue on x86 as well,
> so it is not arch dependent at all.
> Still, it is weird: in a dep8 environment I seem to run into this 100% of
> the time, while I never do when I redo the same steps on my own system.
> 
> I logged into the dep8 KVM guest and checked how reproducible it is.
> It turns out that after the FIRST restart the lxc-guest is killed and
> listed as shut-down then.
> It lists an error like this:
> Dec 19 14:40:19 autopkgtest libvirtd[4500]: internal error: No valid cgroup for machine sl
> Dec 19 14:40:19 autopkgtest libvirtd[4500]: End of file while reading data: Input/output error
> 
> That is somewhat familiar if you know
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=774237
> 
> But then we are clearly above that systemd level.
> Now what makes this even more interesting is that AFTER the issue happened
> it seems to be ok.
> 
> So afterwards doing
> export ...
> virsh start sl
> # now I can restart libvirt without affecting guest "sl"
> 
> The following cleanup and re-define gets me back to the situation where
> the next restart destroys the guest and writes the error listed above
> to the log:
> virsh destroy sl; virsh undefine sl; rm -rf /etc/libvirt;
> apt-get remove --purge libvirt-daemon-system libvirt-clients libxml2-utils;
> apt-get install libvirt-daemon-system libvirt-clients libxml2-utils;
> virsh define smoke-lxc.xml; virsh start sl; virsh list --all
> # now trigger the fail with
> /etc/init.d/libvirtd restart
> 
> After this I have again libvirt running, but not really:
> systemctl status libvirtd
> ● libvirtd.service - Virtualization daemon
>   Loaded: loaded (/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
>   Active: active (running) since Mon 2016-12-19 14:52:51 CET; 14s ago
>     Docs: man:libvirtd(8)
>           http://libvirt.org
> Main PID: 7608 (libvirtd)
>    Tasks: 18
>   CGroup: /system.slice/libvirtd.service
>           ├─2035 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leases
>           ├─2036 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leases
>           └─7608 /usr/sbin/libvirtd
> 
> Dec 19 14:52:51 autopkgtest systemd[1]: Starting Virtualization daemon...
> Dec 19 14:52:51 autopkgtest systemd[1]: Started Virtualization daemon.
> Dec 19 14:52:52 autopkgtest dnsmasq[2035]: read /etc/hosts - 8 addresses
> Dec 19 14:52:52 autopkgtest dnsmasq[2035]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
> Dec 19 14:52:52 autopkgtest dnsmasq-dhcp[2035]: read /var/lib/libvirt/dnsmasq/default.hostsfile
> Dec 19 14:52:52 autopkgtest libvirtd[7608]: libvirt version: 2.5.0,
> package: 1ubuntu1~ppa3 (Christian Ehrhardt <christian.ehrhardt at canonical.com>
> 
> Dec 19 14:52:52 autopkgtest libvirtd[7608]: hostname: autopkgtest.localdomain
> Dec 19 14:52:52 autopkgtest libvirtd[7608]: internal error: No valid cgroup for machine sl
> Dec 19 14:52:52 autopkgtest libvirtd[7608]: End of file while reading data: Input/output error
> 
> I say "not really", even though systemd reports the service as active,
> because virsh list now looks like this:
> $ virsh list --all
> Id    Name                           State
> ----------------------------------------------------
> 
> That is it, the guest is completely gone.
> Restarting the service again lets it start normally and the guest returns -
> although in stopped state:
> 
> $ /etc/init.d/libvirtd restart
> $ virsh list --all
> Id    Name                           State
> ----------------------------------------------------
> -     sl                             shut off
> 
> I can start the guest again now, and from this point on it is resilient
> against restarts.
> $ virsh start sl
> $ /etc/init.d/libvirtd restart
> $ virsh list --all
> Id    Name                           State
> ----------------------------------------------------
> 9020  sl                             running
> 
> 
> So much for now; I will try to gather some extra debug data next ...

Great. Now that you have found a way to reproduce it, it should be possible
to cook up a patch for upstream that fixes this!
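
When scripting the reproduction, the three outcomes seen above (guest
running, guest "shut off", guest missing from the list) can be told apart
by parsing the `virsh list --all` output. The following is only a minimal
sketch: `guest_state` is a hypothetical helper name, not part of libvirt,
and it assumes the three-column Id/Name/State layout shown in the quoted
output.

```shell
#!/bin/sh
# guest_state NAME (hypothetical helper, not shipped with libvirt):
# reads `virsh list --all` output on stdin and prints the State column
# for NAME, or "missing" if the guest is not listed at all.
guest_state() {
    awk -v name="$1" '
        # Lines 1 and 2 are the column header and the dashed separator.
        NR > 2 && $2 == name {
            # The state may contain a space ("shut off"), so join the rest.
            state = $3
            for (i = 4; i <= NF; i++) state = state " " $i
            print state
            found = 1
            exit
        }
        END { if (!found) print "missing" }
    '
}

# Possible usage on the affected system, after the same connection setup
# ("export ...") used in the quoted steps:
#   /etc/init.d/libvirtd restart
#   virsh list --all | guest_state sl
```

A result of "missing" right after the first restart would correspond to the
broken state described above, while "running" would match the behaviour
seen once the guest has been started again.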
Cheers,
 -- Guido


