[Pkg-xen-devel] Bug#988333: libxenmisc4.16: libxl fails to grant necessary I/O memory access for gfx_passthru of Intel IGD
Chuck Zmudzinski
brchuckz at netscape.net
Mon Mar 7 14:59:26 GMT 2022
The bug's title has been renamed to focus on a single problem that needs
to be fixed.
The bug is marked as affecting the Linux kernel because it causes the
i915.ko module to crash in some configurations.
The bug, defined as the failure to grant the necessary I/O memory access
to a Linux HVM domU for gfx_passthru of the Intel IGD, is fixed by the
patch at the end of this message, so I added the tag patch.
With that patch applied, passthrough of a Haswell Intel IGD to a
bullseye HVM domU works as expected when the qemu-xen-traditional device
model is used. However, the patch alone does not make Intel IGD
passthrough to Linux available on Debian, because Debian does not
provide the traditional Qemu device model. Its source code is available
from the upstream Xen project, and it can be built for Debian as it was
when Wheezy was released. So please note: because this patch can only be
verified by testing with the Qemu traditional device model, which the
upstream Xen project provides but Debian does not, the fix is somewhat
difficult to reproduce and verify on Debian until the Qemu traditional
device model has been custom-built for Debian.
Added tag moreinfo because a complete solution to this problem requires
further investigation into why fixing this bug does not prevent a Call
Trace and failure to boot when using the Qemu upstream device model
instead of the Qemu traditional device model. Most likely there is
another distinct bug yet to be clearly identified and defined.
Summary of my recent tests with a Haswell Intel IGD:
Working configurations:
1. Sid/Xen-4.16 with the patch at the end of this
message/Qemu-xen-traditional as dom0 and a bullseye HVM.
2. Bullseye/Xen-4.14 with the patch at the end of this message adapted
for Xen-4.14/Qemu-xen-traditional as dom0 and a bullseye HVM.
In both of these cases, the only binary package that needs to be
installed is libxenmisc4.14 if Bullseye is the dom0 and libxenmisc4.16
if Sid is the dom0.
Broken configuration (presumably caused by a yet to be identified bug):
1. Sid/Xen-4.16 with the patch at the end of this message/Debian's
qemu-6.2 as dom0 and a bullseye HVM.
It behaves similarly to the original bug report: the boot process is
very slow and never completes, after a while the dom0 console reports
that IRQ #16 is being disabled, and there is a Call Trace in the dmesg
of the dom0:
[ 842.446490] Call Trace:
[ 842.446496] <IRQ>
[ 842.446503] dump_stack_lvl+0x48/0x5e
[ 842.446517] __report_bad_irq+0x35/0xa7
[ 842.446530] note_interrupt.cold+0xb/0x61
[ 842.446540] handle_irq_event+0xa3/0xb0
[ 842.446551] handle_fasteoi_irq+0x90/0x1e0
[ 842.446562] handle_irq_desc+0x36/0x40
[ 842.446569] __evtchn_fifo_handle_events+0x195/0x1b0
[ 842.446582] __xen_evtchn_do_upcall+0x72/0xc0
[ 842.446595] __xen_pv_evtchn_do_upcall+0x39/0x60
[ 842.446606] xen_pv_evtchn_do_upcall+0xd7/0x100
[ 842.446619] </IRQ>
[ 842.446622] <TASK>
[ 842.446625] exc_xen_hypervisor_callback+0x8/0x10
[ 842.446638] RIP: e030:xen_hypercall_sched_op+0xa/0x20
[ 842.446651] Code: 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[ 842.446657] RSP: e02b:ffffffff82803d58 EFLAGS: 00000246
[ 842.446665] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8193a3aa
[ 842.446669] RDX: ffffffff82819940 RSI: 0000000000000000 RDI: 0000000000000001
[ 842.446673] RBP: ffffffff82819940 R08: 00000066a1713428 R09: 0000000000000000
[ 842.446677] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[ 842.446681] R13: 0000000000000000 R14: ffffffff82819110 R15: 0000000000000000
[ 842.446687] ? xen_hypercall_sched_op+0xa/0x20
[ 842.446701] ? xen_safe_halt+0xc/0x20
[ 842.446710] ? default_idle+0xa/0x10
[ 842.446717] ? default_idle_call+0x33/0xe0
[ 842.446724] ? do_idle+0x215/0x2a0
[ 842.446732] ? cpu_startup_entry+0x19/0x20
[ 842.446738] ? start_kernel+0x6b7/0x6dc
[ 842.446750] ? xen_start_kernel+0x6a4/0x6b1
[ 842.446762] ? startup_xen+0x3e/0x3e
[ 842.446773] </TASK>
[ 842.446776] handlers:
[ 842.446784] [<0000000074c02061>] usb_hcd_irq [usbcore]
[ 842.446843] [<00000000c81c8287>] ath_isr [ath9k]
[ 842.446870] Disabling IRQ #16
Not tested: Bullseye/Xen-4.14 with the patch at the end of this message
adapted for Xen 4.14/Debian's Qemu 5.2 - no need to test this until Sid
is working as the dom0 with Debian's Qemu 6.2 for Sid and Intel IGD
passthrough to a Bullseye HVM domU.
More information is needed to determine the exact nature of the bug that
causes the Call Trace listed above, which occurs with Qemu 6.2 and Xen
4.16 on Sid but not with the traditional Qemu device model on Sid. Most
likely it is a separate bug that is related to this one.
I will try to investigate the cause of this Call Trace by comparing the
code in Qemu 6.2 with the code in Qemu xen-traditional provided by the
Xen project, and I also may take a look at or try the upstream qemu-xen
build that is provided by the upstream Xen project.
I will also try again to contact the Xen users/developers on the Xen
mailing lists and see if they can provide some insight.
In a subsequent message, I will provide a detailed description of how I
developed the patch that fixes the bug and enables passthrough of the
Intel IGD to a Bullseye HVM when using the Qemu traditional device
model.
Here is the patch for the current Xen 4.16 package (version
4.16.0+51-g0941d6cb-1) for Sid:
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -2502,6 +2502,7 @@
     for (i = 0 ; i < d_config->num_pcidevs ; i++) {
         uint64_t vga_iomem_start = 0xa0000 >> XC_PAGE_SHIFT;
+        uint64_t vga_iomem2_start = 0xcc490; /* Probably IRQ data, nr = 0x2 */
         uint32_t stubdom_domid;
         libxl_device_pci *pci = &d_config->pcidevs[i];
         unsigned long pci_device_class;
@@ -2531,6 +2532,25 @@
                   domid, vga_iomem_start, (vga_iomem_start + 0x20 - 1));
             return ret;
         }
+        ret = xc_domain_iomem_permission(CTX->xch, stubdom_domid,
+                                         vga_iomem2_start, 0x2, 1);
+        if (ret < 0) {
+            LOGED(ERROR, domid,
+                  "failed to give stubdom%d access to iomem range "
+                  "%"PRIx64"-%"PRIx64" for VGA passthru",
+                  stubdom_domid,
+                  vga_iomem2_start, (vga_iomem2_start + 0x2 - 1));
+            return ret;
+        }
+        ret = xc_domain_iomem_permission(CTX->xch, domid,
+                                         vga_iomem2_start, 0x2, 1);
+        if (ret < 0) {
+            LOGED(ERROR, domid,
+                  "failed to give dom%d access to iomem range "
+                  "%"PRIx64"-%"PRIx64" for VGA passthru",
+                  domid, vga_iomem2_start, (vga_iomem2_start + 0x2 - 1));
+            return ret;
+        }
         break;
     }