[Pkg-xen-devel] Bug#701744: [xen] Update to hypervisor 4.0.1-5.6 or linux-image-2.6.32-5-xen-amd64 2.6.32-48 causes networking (VIF) failures
Ingo Juergensmann
ij at 2013.bluespice.org
Tue Feb 26 17:42:11 UTC 2013
Package: xen
Version: 4.0.1-5.5
Severity: critical
--- Please enter the report below this line. ---
Hi!
Since the update last weekend in stable/squeeze I'm experiencing
problems with running Xen on amd64 and multiple domUs losing their
network connection/VIFs.
From
http://blog.windfluechter.net/content/blog/2013/02/26/1597-xen-problems-vms-2632-5-xen-amd64
Unfortunately this update appears to be problematic on my Xen hosting
server. This night it happened the second time that some of the virtual
network interfaces disappeared or turned out to be non-working. For
example I have two VMs: one running the webserver and one running the
databases. Between these two VMs there's a bridge on the dom0 and both
VMs have a VIF to that (internal) bridge. What happens is that this
bridge becomes inaccessible from within the webserver VM.
Sadly there's not much to see in the log files. I just spotted this on
dom0:
Feb 26 01:01:29 gate kernel: [12697.907512] vif3.1: Frag is bigger than frame.
Feb 26 01:01:29 gate kernel: [12697.907550] vif3.1: fatal error; disabling device
Feb 26 01:01:29 gate kernel: [12697.919921] xenbr1: port 3(vif3.1) entering disabled state
Feb 26 01:22:00 gate kernel: [13928.644888] vif2.1: Frag is bigger than frame.
Feb 26 01:22:00 gate kernel: [13928.644920] vif2.1: fatal error; disabling device
Feb 26 01:22:00 gate kernel: [13928.663571] xenbr1: port 2(vif2.1) entering disabled state
Feb 26 01:40:44 gate kernel: [15052.629280] vif7.1: Frag is bigger than frame.
Feb 26 01:40:44 gate kernel: [15052.629314] vif7.1: fatal error; disabling device
Feb 26 01:40:44 gate kernel: [15052.641725] xenbr1: port 6(vif7.1) entering disabled state
This corresponds to the number of VMs having lost their internal
connection to the bridge. On the webserver VM I see this output:
Feb 26 01:59:01 vserv1 kernel: [16113.539767] IPv6: sending pkt_too_big to self
Feb 26 01:59:01 vserv1 kernel: [16113.539794] IPv6: sending pkt_too_big to self
Feb 26 02:30:54 vserv1 kernel: [18026.407517] IPv6: sending pkt_too_big to self
Feb 26 02:30:54 vserv1 kernel: [18026.407546] IPv6: sending pkt_too_big to self
Feb 26 02:30:54 vserv1 kernel: [18026.434761] IPv6: sending pkt_too_big to self
Feb 26 02:30:54 vserv1 kernel: [18026.434787] IPv6: sending pkt_too_big to self
Feb 26 03:39:16 vserv1 kernel: [22128.768214] IPv6: sending pkt_too_big to self
Feb 26 03:39:16 vserv1 kernel: [22128.768240] IPv6: sending pkt_too_big to self
Feb 26 04:39:51 vserv1 kernel: [25764.250170] IPv6: sending pkt_too_big to self
Feb 26 04:39:51 vserv1 kernel: [25764.250196] IPv6: sending pkt_too_big to self
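The dom0 messages above name the affected backend interfaces as
vif<domid>.<devid>, so they can be tallied per domain. What follows is
only a minimal Python 3 sketch and not something from my setup: the
function name disabled_vifs and the sample lines are made up for
illustration, and it assumes the log lines are unwrapped, as they are
in the kernel log on disk.

```python
import re
from collections import defaultdict

# Matches the dom0 message quoted in this report, e.g.
# "... kernel: [12697.907550] vif3.1: fatal error; disabling device"
FATAL_RE = re.compile(
    r"\bvif(?P<domid>\d+)\.(?P<devid>\d+): fatal error; disabling device"
)


def disabled_vifs(lines):
    """Return {domid: [devid, ...]} for every VIF reported as disabled."""
    hits = defaultdict(set)
    for line in lines:
        match = FATAL_RE.search(line)
        if match:
            hits[int(match.group("domid"))].add(int(match.group("devid")))
    # Sort by domain id so the summary is stable and easy to read.
    return {domid: sorted(devs) for domid, devs in sorted(hits.items())}


# Illustrative input: three lines copied from the dom0 log in this report.
sample = [
    "Feb 26 01:01:29 gate kernel: [12697.907550] vif3.1: fatal error; disabling device",
    "Feb 26 01:22:00 gate kernel: [13928.644920] vif2.1: fatal error; disabling device",
    "Feb 26 01:40:44 gate kernel: [15052.629314] vif7.1: fatal error; disabling device",
]

for domid, devs in disabled_vifs(sample).items():
    print("domain %d: disabled vif devices %s" % (domid, devs))
```

Passing an open log file instead of the sample list works the same way,
since the function only iterates over lines. Which file holds the kernel
messages depends on the local syslog configuration.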
Rebooting an affected VM leaves it non-working: it gets paused on
creation, the Xen tools complain about hotplug scripts not working,
and the Xen log shows this:
[2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:101)
XendDomainInfo.create(['vm', ['name', 'vserv1'], ['memory', '2048'],
['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['on_crash',
'restart'], ['on_xend_start', 'ignore'], ['on_xend_stop', 'ignore'],
['vcpus', '2'], ['oos', 1], ['bootloader',
'/usr/lib/xen-4.0/bin/pygrub'],
['bootloader_args', ''], ['image', ['linux', ['root', '/dev/xvdb '],
['videoram', 4], ['tsc_mode', 0], ['nomigrate', 0]]],
['s3_integrity', 1],
['device', ['vbd', ['uname', 'phy:/dev/lv/vserv1-boot'], ['dev',
'xvda'],
['mode', 'w']]], ['device', ['vbd', ['uname',
'phy:/dev/lv/vserv1-disk'],
['dev', 'xvdb'], ['mode', 'w']]], ['device', ['vbd', ['uname',
'phy:/dev/lv/vserv1-swap'], ['dev', 'xvdc'], ['mode', 'w']]],
['device',
['vbd', ['uname', 'phy:/dev/lv/vserv1mirror'], ['dev', 'xvdd'],
['mode',
'w']]]])
[2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:2508)
XendDomainInfo.constructDomain
[2013-02-25 13:06:34 5470] DEBUG (balloon:220) Balloon: 2100000 KiB
free;
need 16384; done.
[2013-02-25 13:06:34 5470] DEBUG (XendDomain:464) Adding Domain: 39
[2013-02-25 13:06:34 5470] DEBUG (XendDomainInfo:2818)
XendDomainInfo.initDomain: 39 256
[2013-02-25 13:06:34 5781] DEBUG (XendBootloader:113) Launching
bootloader
as ['/usr/lib/xen-4.0/bin/pygrub', '--args=root=/dev/xvdb ',
'--output=/var/run/xend/boot/xenbl.6040', '/dev/lv/vserv1-boot'].
[2013-02-25 13:06:39 5470] DEBUG (XendDomainInfo:2845)
_initDomain:shadow_memory=0x0, memory_static_max=0x80000000,
memory_static_min=0x0.
[2013-02-25 13:06:39 5470] INFO (image:182) buildDomain os=linux dom=39
vcpus=2
[2013-02-25 13:06:39 5470] DEBUG (image:721) domid = 39
[2013-02-25 13:06:39 5470] DEBUG (image:722) memsize = 2048
[2013-02-25 13:06:39 5470] DEBUG (image:723) image =
/var/run/xend/boot/boot_kernel.xj7W_t
[2013-02-25 13:06:39 5470] DEBUG (image:724) store_evtchn = 1
[2013-02-25 13:06:39 5470] DEBUG (image:725) console_evtchn = 2
[2013-02-25 13:06:39 5470] DEBUG (image:726) cmdline =
root=UUID=ed71a39f-fd2e-4035-8557-493686baa151 ro root=/dev/xvdb
[2013-02-25 13:06:39 5470] DEBUG (image:727) ramdisk =
/var/run/xend/boot/boot_ramdisk.QavuAo
[2013-02-25 13:06:39 5470] DEBUG (image:728) vcpus = 2
[2013-02-25 13:06:39 5470] DEBUG (image:729) features =
[2013-02-25 13:06:39 5470] DEBUG (image:730) flags = 0
[2013-02-25 13:06:39 5470] DEBUG (image:731) superpages = 0
[2013-02-25 13:06:40 5470] INFO (XendDomainInfo:2367) createDevice:
vbd :
{'uuid': '04d99772-cf27-aecf-2d1b-c73eaf657410', 'bootable': 1,
'driver':
'paravirtualised', 'dev': 'xvda', 'uname': 'phy:/dev/lv/vserv1-boot',
'mode': 'w'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController:
writing
{'virtual-device': '51712', 'device-type': 'disk', 'protocol':
'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend':
'/local/domain/0/backend/vbd/39/51712'} to
/local/domain/39/device/vbd/51712.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController:
writing
{'domain': 'vserv1', 'frontend': '/local/domain/39/device/vbd/51712',
'uuid': '04d99772-cf27-aecf-2d1b-c73eaf657410', 'bootable': '1', 'dev':
'xvda', 'state': '1', 'params': '/dev/lv/vserv1-boot', 'mode': 'w',
'online': '1', 'frontend-id': '39', 'type': 'phy'} to
/local/domain/0/backend/vbd/39/51712.
[2013-02-25 13:06:40 5470] INFO (XendDomainInfo:2367) createDevice:
vbd :
{'uuid': 'e46cb89f-3e54-41d2-53bd-759ed6c690d2', 'bootable': 0,
'driver':
'paravirtualised', 'dev': 'xvdb', 'uname': 'phy:/dev/lv/vserv1-disk',
'mode': 'w'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController:
writing
{'virtual-device': '51728', 'device-type': 'disk', 'protocol':
'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend':
'/local/domain/0/backend/vbd/39/51728'} to
/local/domain/39/device/vbd/51728.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController:
writing
{'domain': 'vserv1', 'frontend': '/local/domain/39/device/vbd/51728',
'uuid': 'e46cb89f-3e54-41d2-53bd-759ed6c690d2', 'bootable': '0', 'dev':
'xvdb', 'state': '1', 'params': '/dev/lv/vserv1-disk', 'mode': 'w',
'online': '1', 'frontend-id': '39', 'type': 'phy'} to
/local/domain/0/backend/vbd/39/51728.
[2013-02-25 13:06:40 5470] INFO (XendDomainInfo:2367) createDevice:
vbd :
{'uuid': 'e2d61860-7448-1843-3935-6b63c5d2878e', 'bootable': 0,
'driver':
'paravirtualised', 'dev': 'xvdc', 'uname': 'phy:/dev/lv/vserv1-swap',
'mode': 'w'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController:
writing
{'virtual-device': '51744', 'device-type': 'disk', 'protocol':
'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend':
'/local/domain/0/backend/vbd/39/51744'} to
/local/domain/39/device/vbd/51744.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController:
writing
{'domain': 'vserv1', 'frontend': '/local/domain/39/device/vbd/51744',
'uuid': 'e2d61860-7448-1843-3935-6b63c5d2878e', 'bootable': '0', 'dev':
'xvdc', 'state': '1', 'params': '/dev/lv/vserv1-swap', 'mode': 'w',
'online': '1', 'frontend-id': '39', 'type': 'phy'} to
/local/domain/0/backend/vbd/39/51744.
[2013-02-25 13:06:40 5470] INFO (XendDomainInfo:2367) createDevice:
vbd :
{'uuid': 'd314a46e-1ce9-0e8d-b009-3f08e29735f5', 'bootable': 0,
'driver':
'paravirtualised', 'dev': 'xvdd', 'uname': 'phy:/dev/lv/vserv1mirror',
'mode': 'w'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController:
writing
{'virtual-device': '51760', 'device-type': 'disk', 'protocol':
'x86_64-abi', 'backend-id': '0', 'state': '1', 'backend':
'/local/domain/0/backend/vbd/39/51760'} to
/local/domain/39/device/vbd/51760.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController:
writing
{'domain': 'vserv1', 'frontend': '/local/domain/39/device/vbd/51760',
'uuid': 'd314a46e-1ce9-0e8d-b009-3f08e29735f5', 'bootable': '0', 'dev':
'xvdd', 'state': '1', 'params': '/dev/lv/vserv1mirror', 'mode': 'w',
'online': '1', 'frontend-id': '39', 'type': 'phy'} to
/local/domain/0/backend/vbd/39/51760.
[2013-02-25 13:06:40 5470] DEBUG (XendDomainInfo:3400) Storing VM
details:
{'on_xend_stop': 'ignore', 'shadow_memory': '0', 'uuid':
'04541225-6d3c-3cae-a4c4-0b6d4ccfac7a', 'on_reboot': 'restart',
'start_time': '1361794000.37', 'on_poweroff': 'destroy',
'bootloader_args':
'', 'on_xend_start': 'ignore', 'on_crash': 'restart',
'xend/restart_count':
'0', 'vcpus': '2', 'vcpu_avail': '3', 'bootloader':
'/usr/lib/xen-4.0/bin/pygrub', 'image': "(linux (kernel ) (args
'root=/dev/xvdb ') (superpages 0) (tsc_mode 0) (videoram 4) (pci ())
(nomigrate 0) (notes (HV_START_LOW 18446603336221196288) (FEATURES
'!writable_page_tables|pae_pgdir_above_4gb') (VIRT_BASE
18446744071562067968) (GUEST_VERSION 2.6) (PADDR_OFFSET 0)
(GUEST_OS linux)
(HYPERCALL_PAGE 18446744071578882048) (LOADER generic)
(SUSPEND_CANCEL 1)
(PAE_MODE yes) (ENTRY 18446744071584289280) (XEN_VERSION xen-3.0)))",
'name': 'vserv1'}
[2013-02-25 13:06:40 5470] DEBUG (XendDomainInfo:1804) Storing domain
details: {'console/ring-ref': '2143834', 'image/entry':
'18446744071584289280', 'console/port': '2', 'store/ring-ref':
'2143835',
'image/loader': 'generic', 'vm':
'/vm/04541225-6d3c-3cae-a4c4-0b6d4ccfac7a',
'control/platform-feature-multiprocessor-suspend': '1',
'image/hv-start-low': '18446603336221196288', 'image/guest-os':
'linux',
'cpu/1/availability': 'online', 'image/virt-base':
'18446744071562067968',
'memory/target': '2097152', 'image/guest-version': '2.6',
'image/pae-mode':
'yes', 'description': '', 'console/limit': '1048576',
'image/paddr-offset':
'0', 'image/hypercall-page': '18446744071578882048',
'image/suspend-cancel': '1', 'cpu/0/availability': 'online',
'image/features/pae-pgdir-above-4gb': '1',
'image/features/writable-page-tables': '0', 'console/type':
'xenconsoled',
'name': 'vserv1', 'domid': '39', 'image/xen-version': 'xen-3.0',
'store/port': '1'}
[2013-02-25 13:06:40 5470] DEBUG (DevController:95) DevController:
writing
{'protocol': 'x86_64-abi', 'state': '1', 'backend-id': '0', 'backend':
'/local/domain/0/backend/console/39/0'} to
/local/domain/39/device/console/0.
[2013-02-25 13:06:40 5470] DEBUG (DevController:97) DevController:
writing
{'domain': 'vserv1', 'frontend': '/local/domain/39/device/console/0',
'uuid': 'c8819aed-c78f-02b8-0ef7-1600abd15add', 'frontend-id': '39',
'state': '1', 'location': '2', 'online': '1', 'protocol': 'vt100'} to
/local/domain/0/backend/console/39/0.
[2013-02-25 13:06:40 5470] DEBUG (XendDomainInfo:1891)
XendDomainInfo.handleShutdownWatch
[2013-02-25 13:06:40 5470] DEBUG (DevController:139) Waiting for
devices
vif2.
[2013-02-25 13:06:40 5470] DEBUG (DevController:139) Waiting for
devices
vif.
[2013-02-25 13:06:40 5470] DEBUG (DevController:139) Waiting for
devices
vscsi.
[2013-02-25 13:06:40 5470] DEBUG (DevController:139) Waiting for
devices
vbd.
[2013-02-25 13:06:40 5470] DEBUG (DevController:144) Waiting for 51712.
[2013-02-25 13:06:40 5470] DEBUG (DevController:628)
hotplugStatusCallback
/local/domain/0/backend/vbd/39/51712/hotplug-status.
From my point of view, either the Xen hypervisor or the kernel seems to
be broken, but it's hard for me to tell.
I suspect the problem lies in the VIF code of the Xen kernel, as a
reboot of the dom0 solves this problem temporarily without touching the
domUs. But within some hours (<6 hrs) the issue reappears.
Although I assume that xend is responsible for adding/removing VIFs, a
restart of xend doesn't help at all. That's why I assume a kernel
problem within the dom0.
I'm running 8 domUs at the moment. Each of them is connected to the
outside world through xenbr0 and to the internal network through
xenbr1 with RFC1918 addresses. I'm running a mixed setup of routed and
bridged config:
(vif-script vif-bridge)
(network-script network-route)
But the server has run for several years with that setup without any
problems, so I don't think that's the issue.
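To see from dom0 which ports of the internal bridge are still usable,
the bridge port state can be read from sysfs. This is a hedged Python 3
sketch, not a tested tool: it assumes the kernel exposes
/sys/class/net/<bridge>/brif/<port>/state, the function names are
hypothetical, and the numeric state codes are the ones I know from the
kernel's if_bridge.h (0 = disabled, 3 = forwarding).

```python
import os

# Bridge port states as numbered in the Linux kernel's if_bridge.h.
BR_STATES = {
    0: "disabled",
    1: "listening",
    2: "learning",
    3: "forwarding",
    4: "blocking",
}


def bridge_port_states(bridge, sysfs_root="/sys/class/net"):
    """Map each port of `bridge` to its state name, read from sysfs."""
    brif = os.path.join(sysfs_root, bridge, "brif")
    states = {}
    for port in sorted(os.listdir(brif)):
        with open(os.path.join(brif, port, "state")) as fh:
            code = int(fh.read().strip())
        states[port] = BR_STATES.get(code, "unknown(%d)" % code)
    return states


def disabled_ports(bridge, sysfs_root="/sys/class/net"):
    """Return the ports of `bridge` that are in the disabled state."""
    states = bridge_port_states(bridge, sysfs_root)
    return [port for port, state in states.items() if state == "disabled"]
```

On a dom0 like the one described here, disabled_ports("xenbr1") would
be expected to list the VIFs that the kernel log reports as "entering
disabled state". The sysfs_root parameter exists only so the functions
can be tried against a test directory.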
For now I'm forced to go back to a working kernel as I need to keep the
server up and running.
--- System information. ---
Architecture: amd64
Kernel: Linux 2.6.32-5-xen-amd64
gate:~# dpkg -l | grep xen
ii  libxenstore3.0                       4.0.1-5.6  Xenstore communications library for Xen
ii  linux-image-2.6.32-5-xen-amd64       2.6.32-48  Linux 2.6.32 for 64-bit PCs, Xen dom0 support
ii  xen-hypervisor-4.0-amd64             4.0.1-5.6  The Xen Hypervisor on AMD64
ii  xen-linux-system-2.6-xen-amd64       2.6.32+29  Xen system with Linux 2.6 for 64-bit PCs (meta-package)
ii  xen-linux-system-2.6.32-5-xen-amd64  2.6.32-48  Xen system with Linux 2.6.32 on 64-bit PCs (meta-package)
ii  xen-tools                            4.2-1      Tools to manage Xen virtual servers
ii  xen-utils-4.0                        4.0.1-5.6  XEN administrative tools
ii  xen-utils-common                     4.0.0-1    XEN administrative tools - common files
ii  xenstore-utils                       4.0.1-5.6  Xenstore utilities for Xen
ii  xenwatch                             0.5.4-2    Virtualization utilities, mostly for Xen
--
Ciao... // Fon: 0381-2744150
Ingo \X/ http://blog.windfluechter.net
Please don't share this address with Facebook or Google!
gpg pubkey: http://www.juergensmann.de/ij_public_key.asc