[Pkg-xen-devel] Bug#1020787: linux-image-5.19.0-2-amd64: After updating to 5.19 kernel the VMs are started without XSAVE CPU flags
Ps Ps
mailto_ps at gmx.net
Tue Sep 27 08:08:54 BST 2022
On Tuesday, 27.09.2022 at 01:39 +0200, Diederik de Haas wrote:
> Which version of Xen are you using?
>
It's the current state of Debian sid:
dpkg -l | grep xen
ii grub-xen-bin 2.06-4 amd64 GRand Unified Bootloader, version 2 (Xen modules)
ii grub-xen-host 2.06-4 amd64 GRand Unified Bootloader, version 2 (Xen host version)
ii libxencall1:amd64 4.16.2-1 amd64 Xen runtime library - libxencall
ii libxendevicemodel1:amd64 4.16.2-1 amd64 Xen runtime libraries - libxendevicemodel
ii libxenevtchn1:amd64 4.16.2-1 amd64 Xen runtime libraries - libxenevtchn
ii libxenforeignmemory1:amd64 4.16.2-1 amd64 Xen runtime libraries - libxenforeignmemory
ii libxengnttab1:amd64 4.16.2-1 amd64 Xen runtime libraries - libxengnttab
ii libxenhypfs1:amd64 4.16.2-1 amd64 Xen runtime library - libxenhypfs
ii libxenmisc4.16:amd64 4.16.2-1 amd64 Xen runtime libraries - miscellaneous, versioned ABI
ii libxenstore4:amd64 4.16.2-1 amd64 Xen runtime libraries - libxenstore
ii libxentoolcore1:amd64 4.16.2-1 amd64 Xen runtime libraries - libxentoolcore
ii libxentoollog1:amd64 4.16.2-1 amd64 Xen runtime libraries - libxentoollog
ii qemu-system-xen 1:7.1+dfsg-2 amd64 QEMU full system emulation (Xen helper package)
ii xen-hypervisor-4.16-amd64 4.16.2-1 amd64 Xen Hypervisor on AMD64
ii xen-hypervisor-common 4.16.2-1 amd64 Xen Hypervisor - common files
ii xen-system-amd64 4.16.2-1 amd64 Xen System on AMD64 (metapackage)
ii xen-tools 4.9.1-1 all Tools to manage Xen virtual servers
ii xen-utils-4.16 4.16.2-1 amd64 Xen administrative tools
ii xen-utils-common 4.16.2-1 amd64 Xen administrative tools - common files
ii xenstore-utils 4.16.2-1 amd64 Xenstore command line utilities for Xen
>
> Is this all about the dom0 kernel or is it all/some about using 5.19 as
> domU kernel? Are the issues happening on dom0 or inside domU?
>
It seems to happen on both dom0 and domU. It also seems to affect everything that uses gnutls, which, as far as I understood the linked ticket, may be evaluating the CPU flags.
> If the issues happen inside (a) domU, can you share a (minimal) domU
> configuration file so it becomes easier to replicate?
>
Sure. (The old 5.18 kernel is currently configured.)
name="mail"
on_xend_stop="shutdown"
memory=12288
maxmem=12288
vcpus=4
cpus="2-5"
kernel="/etc/xen/vm/boot/vmlinuz-5.18.0-4-amd64"
ramdisk="/etc/xen/vm/boot/initrd.img-5.18.0-4-amd64"
root="/dev/xvda1"
disk=[ '/dev/mapper/mail,,xvda1' ]
vif=[ 'mac=00:16:3e:00:00:05, bridge=xenbr0, vifname=mail.0', 'mac=00:16:3e:00:ff:05, bridge=xenbrlo, vifname=mail.lo' ]
extra="lockd.nlm_tcpport=61053 lockd.nlm_udpport=61053 ipv6.disable=1 net.ifnames=0 xen_blkfront.max_queues=3"
> > And indeed there is some difference in /proc/cpuinfo:
> > The flags for "fma xsave avx2 bmi2 xsaveopt xsavec xgetbv1 md_clear" are
> > missing, which might result in gnutls failures.
>
> In kernel 5.19 the following commits were added under ``arch/x86/kernel/fpu/``:
>
> b91c0922bf1ed15b67a6faa404bc64e3ed532ec2 x86/fpu: Cleanup variable shadowing
> 8ad7e8f696951f192c6629a0cbda9ac94c773159 x86/fpu/xsave: Support XSAVEC in the kernel
> f5c0b4f30416c670408a77be94703d04d22b57df x86/prctl: Remove pointless task argument
>
> Of these, the first 2 seem like possible candidates that caused the issue.
> https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s4.2.2
> describes a way to apply a simple patch to a kernel.
> What you could try, is creating a patch from reverting one of the earlier
> mentioned commits and use that with 'test-patches'.
>
OK, that sounds like a reasonable test. I will report the result in the next mail.
> It's probably also useful to know what CPU(s) are in the machine (dom0).
>
This is the output of one entry in /proc/cpuinfo (with kernel 5.18):
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 165
model name : Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
stepping : 5
microcode : 0xe2
cpu MHz : 2903.996
cache size : 16384 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor est ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase bmi1 avx2 bmi2 erms rdseed adx clflushopt xsaveopt xsavec xgetbv1 md_clear arch_capabilities
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit srbds mmio_stale_data retbleed eibrs_pbrsb
bogomips : 5807.99
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
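To make the comparison between the two kernels less error-prone, a small script along the following lines could report which of the flags in question are absent from a given /proc/cpuinfo. This is only a sketch: the names EXPECTED_FLAGS, parse_flags and missing_flags are made up for illustration, and the flag list is simply the one quoted above as missing under 5.19.

```python
# Flags reported as missing under 5.19 (copied from the quoted list in this
# thread; adjust as needed). All names below are illustrative only.
EXPECTED_FLAGS = ("fma", "xsave", "avx2", "bmi2",
                  "xsaveopt", "xsavec", "xgetbv1", "md_clear")


def parse_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' entry in cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so that e.g. 'vmx flags' is not picked up.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, expected=EXPECTED_FLAGS):
    """Return the expected flags that are absent, in the order given."""
    present = parse_flags(cpuinfo_text)
    return [flag for flag in expected if flag not in present]


# Example use on a running system (dom0 or domU):
#   with open("/proc/cpuinfo") as f:
#       print(missing_flags(f.read()))
```

Running it once under 5.18 and once under 5.19, on dom0 and inside a domU, would give four short lists that can be compared directly.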
> On Monday, 26 September 2022 20:31:17 CEST Ps Ps wrote:
> > On Xen Hypervisor I just found this logs:
>
> So this is on dom0? In which log file did you find it?
>
Yes, it's dom0.
> Generally: be as specific as you can be and describe *exactly* what you did
> and the exact results (if any). Also try to make it as easy as possible for
> others to reproduce what you're experiencing.
>
> FTR: I did not see this issue on my dom0 (Xen 4.16.2-1; kernel 5.19.11-1):
>
> root@dom0:~# dmesg
> [ 0.000000] Linux version 5.19.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-11 (Debian 11.3.0-6) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP PREEMPT_DYNAMIC Debian 5.19.11-1 (2022-09-24)
> [ 0.000000] Command line: placeholder root=UUID=8008723b-668f-43f6-b432-8c56ed53f48a ro quiet net.ifnames=0
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
> [ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
> [ 0.000000] signal: max sigframe size: 1776
> [ 0.000000] Released 0 page(s)
>
> root@dom0:~# grep flag /proc/cpuinfo | uniq
> flags : fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor est ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault ssbd ibrs ibpb stibp fsgsbase bmi1 avx2 bmi2 erms rtm rdseed adx xsaveopt md_clear
>
>
> Also found this patch which should make the error msg more informative ...
> https://lore.kernel.org/all/20220810221909.12768-1-andrew.cooper3@citrix.com/
> Even though I haven't experienced it (yet?), the language of this patch
> seems to indicate you're not alone with it.
I will add this patch too; maybe it provides some more information.
Thanks for your hints!
Regards
Patrick