[Pkg-xen-devel] Bug#988333: libxenmisc4.16: libxl fails to grant necessary I/O memory access for gfx_passthru of Intel IGD

Chuck Zmudzinski brchuckz at netscape.net
Mon Mar 7 14:59:26 GMT 2022


The bug has been retitled to focus on a single problem that needs to be 
fixed.

The bug is marked as affecting the Linux kernel because it causes the 
i915.ko module to crash in some configurations.

The bug, defined as the failure to grant the necessary I/O memory access 
to a Linux HVM domU for gfx_passthru of the Intel IGD, is fixed by the 
patch at the end of this message, so I added the tag patch.

With that patch applied, passthrough of a Haswell Intel IGD to a 
bullseye HVM domU works as expected when the qemu-xen-traditional device 
model is used. However, the patch alone does not make Intel IGD 
passthrough to Linux available on Debian, because Debian does not 
provide the traditional Qemu device model. Source code for it is 
available from the upstream Xen project, and it can be built for Debian 
as it was when Wheezy was released. So please note: because this patch 
can only be verified with the Qemu traditional device model, which the 
upstream Xen project provides but Debian does not, the fix is somewhat 
difficult to reproduce and verify on Debian until the Qemu traditional 
device model has been custom-built for Debian.

Added tag moreinfo because a complete solution to this problem requires 
further investigation into why fixing this bug does not prevent a Call 
Trace and failure to boot when using the Qemu upstream device model 
instead of the Qemu traditional device model. Most likely there is 
another distinct bug yet to be clearly identified and defined.

Summary of my recent tests with a Haswell Intel IGD:

Working configurations:

1. Sid/Xen-4.16 with the patch at the end of this 
message/Qemu-xen-traditional as dom0 and a bullseye HVM.
2. Bullseye/Xen-4.14 with the patch at the end of this message adapted 
for Xen-4.14/Qemu-xen-traditional as dom0 and a bullseye HVM.

In both of these cases, the only binary package that needs to be 
installed is libxenmisc4.14 if Bullseye is the dom0 and libxenmisc4.16 
if Sid is the dom0.

Broken configuration (presumably caused by a yet to be identified bug):

1. Sid/Xen-4.16 with the patch at the end of this message/Debian's 
qemu-6.2 as dom0 and a bullseye HVM.

It behaves similarly to the original bug report: the boot process is 
very slow and never completes, after a while a message on the dom0 
console states that IRQ #16 is being disabled, and there is a Call 
Trace in the dmesg of the dom0:

[  842.446490] Call Trace:
[  842.446496]  <IRQ>
[  842.446503]  dump_stack_lvl+0x48/0x5e
[  842.446517]  __report_bad_irq+0x35/0xa7
[  842.446530]  note_interrupt.cold+0xb/0x61
[  842.446540]  handle_irq_event+0xa3/0xb0
[  842.446551]  handle_fasteoi_irq+0x90/0x1e0
[  842.446562]  handle_irq_desc+0x36/0x40
[  842.446569]  __evtchn_fifo_handle_events+0x195/0x1b0
[  842.446582]  __xen_evtchn_do_upcall+0x72/0xc0
[  842.446595]  __xen_pv_evtchn_do_upcall+0x39/0x60
[  842.446606]  xen_pv_evtchn_do_upcall+0xd7/0x100
[  842.446619]  </IRQ>
[  842.446622]  <TASK>
[  842.446625]  exc_xen_hypervisor_callback+0x8/0x10
[  842.446638] RIP: e030:xen_hypercall_sched_op+0xa/0x20
[  842.446651] Code: 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[  842.446657] RSP: e02b:ffffffff82803d58 EFLAGS: 00000246
[  842.446665] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8193a3aa
[  842.446669] RDX: ffffffff82819940 RSI: 0000000000000000 RDI: 0000000000000001
[  842.446673] RBP: ffffffff82819940 R08: 00000066a1713428 R09: 0000000000000000
[  842.446677] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[  842.446681] R13: 0000000000000000 R14: ffffffff82819110 R15: 0000000000000000
[  842.446687]  ? xen_hypercall_sched_op+0xa/0x20
[  842.446701]  ? xen_safe_halt+0xc/0x20
[  842.446710]  ? default_idle+0xa/0x10
[  842.446717]  ? default_idle_call+0x33/0xe0
[  842.446724]  ? do_idle+0x215/0x2a0
[  842.446732]  ? cpu_startup_entry+0x19/0x20
[  842.446738]  ? start_kernel+0x6b7/0x6dc
[  842.446750]  ? xen_start_kernel+0x6a4/0x6b1
[  842.446762]  ? startup_xen+0x3e/0x3e
[  842.446773]  </TASK>
[  842.446776] handlers:
[  842.446784] [<0000000074c02061>] usb_hcd_irq [usbcore]
[  842.446843] [<00000000c81c8287>] ath_isr [ath9k]
[  842.446870] Disabling IRQ #16
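
For readers unfamiliar with the "Disabling IRQ #16" message: it comes 
from the kernel's spurious interrupt detection (note_interrupt and 
__report_bad_irq in the trace above), which gives up on an interrupt 
line when almost none of the interrupts on it are claimed by a 
registered handler. The following is only a simplified model of that 
bookkeeping, not kernel code. The struct, the function name and the 
two thresholds are assumptions made for illustration, and the model 
leaves out the kernel's time-based reset of the unhandled counter. It 
says nothing about which device is raising the interrupts.

```c
#include <stdbool.h>

/* Assumed thresholds: out of every IRQ_WINDOW interrupts, more than
 * UNHANDLED_THRESHOLD unhandled ones cause the line to be disabled. */
#define IRQ_WINDOW          100000u
#define UNHANDLED_THRESHOLD  99900u

/* Hypothetical per-IRQ bookkeeping, for illustration only. */
struct irq_model {
    unsigned int irq_count;      /* interrupts seen in the current window */
    unsigned int irqs_unhandled; /* of those, how many no handler claimed */
    bool disabled;               /* set once the line has been given up on */
};

/* Record one interrupt. Returns true only on the call that disables
 * the line, which is the point where "Disabling IRQ #N" would appear. */
static bool model_note_interrupt(struct irq_model *m, bool handled)
{
    if (m->disabled)
        return false;

    if (!handled)
        m->irqs_unhandled++;
    m->irq_count++;

    /* Only evaluate once a full window of interrupts has been seen. */
    if (m->irq_count < IRQ_WINDOW)
        return false;

    m->irq_count = 0;
    if (m->irqs_unhandled > UNHANDLED_THRESHOLD) {
        m->disabled = true;
        return true;
    }

    /* Healthy window: start counting again from zero. */
    m->irqs_unhandled = 0;
    return false;
}
```

In this model, a line on which every interrupt goes unclaimed is 
disabled at the end of the first window, while a line whose interrupts 
are all handled is never disabled. The handlers listed in the trace 
(usb_hcd_irq and ath_isr) are the ones registered on IRQ 16 in the 
dom0 that did not claim the interrupts.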

Not tested: Bullseye/Xen-4.14 with the patch at the end of this message 
adapted for Xen 4.14/Debian's Qemu 5.2 - no need to test this until Sid 
is working as the dom0 with Debian's Qemu 6.2 for Sid and Intel IGD 
passthrough to a Bullseye HVM domU.

More information is needed to determine the exact nature of the bug that 
causes the Call Trace listed above, which occurs with Qemu 6.2 and Xen 
4.16 on Sid but not with the traditional Qemu device model on Sid. Most 
likely it is a separate bug that is related to this one.

I will try to investigate the cause of this Call Trace by comparing the 
code in Qemu 6.2 with the code in Qemu xen-traditional provided by the 
Xen project, and I may also look at or try the upstream qemu-xen build 
that is provided by the upstream Xen project.

I will also try again to contact the Xen users/developers on the Xen 
mailing lists and see if they can provide some insight.

In a subsequent message, I will provide a detailed description of how I 
developed the patch that fixes the bug and enables passthrough of the 
Intel IGD to a Bullseye HVM when using the Qemu traditional device 
model.

Here is the patch for the current Xen 4.16 package (version 
4.16.0+51-g0941d6cb-1) for Sid:

--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -2502,6 +2502,7 @@

      for (i = 0 ; i < d_config->num_pcidevs ; i++) {
          uint64_t vga_iomem_start = 0xa0000 >> XC_PAGE_SHIFT;
+        uint64_t vga_iomem2_start = 0xcc490; /* Probably IRQ data, nr = 0x2 */
          uint32_t stubdom_domid;
          libxl_device_pci *pci = &d_config->pcidevs[i];
          unsigned long pci_device_class;
@@ -2531,6 +2532,25 @@
                    domid, vga_iomem_start, (vga_iomem_start + 0x20 - 1));
              return ret;
          }
+        ret = xc_domain_iomem_permission(CTX->xch, stubdom_domid,
+                                         vga_iomem2_start, 0x2, 1);
+        if (ret < 0) {
+            LOGED(ERROR, domid,
+                  "failed to give stubdom%d access to iomem range "
+                  "%"PRIx64"-%"PRIx64" for VGA passthru",
+                  stubdom_domid,
+                  vga_iomem2_start, (vga_iomem2_start + 0x2 - 1));
+            return ret;
+        }
+        ret = xc_domain_iomem_permission(CTX->xch, domid,
+                                         vga_iomem2_start, 0x2, 1);
+        if (ret < 0) {
+            LOGED(ERROR, domid,
+                  "failed to give dom%d access to iomem range "
+                  "%"PRIx64"-%"PRIx64" for VGA passthru",
+                  domid, vga_iomem2_start, (vga_iomem2_start + 0x2 - 1));
+            return ret;
+        }
          break;
      }
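
A note on reading the numbers in the patch: xc_domain_iomem_permission 
takes a starting page frame number and a count of frames, not byte 
addresses, which is why the existing code shifts 0xa0000 right by 
XC_PAGE_SHIFT. The sketch below converts a frame range back to byte 
addresses. It assumes 4 KiB pages (a shift of 12, as on x86), and the 
macro and helper names are made up for illustration and are not part 
of libxl or libxc.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. XC_PAGE_SHIFT == 12 as on x86. */
#define PAGE_SHIFT_ASSUMED 12

/* First byte address covered by a range that starts at frame `pfn`. */
static uint64_t range_first_addr(uint64_t pfn)
{
    return pfn << PAGE_SHIFT_ASSUMED;
}

/* Last byte address covered by `nr` frames starting at frame `pfn`. */
static uint64_t range_last_addr(uint64_t pfn, uint64_t nr)
{
    return ((pfn + nr) << PAGE_SHIFT_ASSUMED) - 1;
}
```

Under that assumption, the existing VGA range (frame 0xa0, 0x20 frames) 
covers bytes 0xa0000 through 0xbffff, and the range added by the patch 
(frame 0xcc490, 0x2 frames) covers bytes 0xcc490000 through 0xcc491fff. 
The frame number 0xcc490 was found on the Haswell system tested here, 
and the sketch does not show whether it is the same on other hardware.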


