[Pkg-xen-devel] [Xen-devel] [admin] [BUG] task jbd2/xvda4-8:174 blocked for more than 120 seconds.

Hans van Kranenburg hans at knorrie.org
Sat Feb 9 16:01:55 GMT 2019


Hi,

On 2/9/19 12:16 AM, Samuel Thibault wrote:
> 
> Hans van Kranenburg, on Fri 08 Feb 2019 20:18:44 +0100, wrote:
>> [...]
>>
>> On 2/8/19 6:13 PM, Samuel Thibault wrote:
>>>
>>> Sacha, on Fri 08 Feb 2019 18:00:22 +0100, wrote:
>>>> On Debian GNU/Linux 9.7 (stretch) amd64, we have a bug with the latest
>>>> Xen hypervisor version:
>>>>
>>>>     xen-hypervisor-4.8-amd64 4.8.5+shim4.10.2+xsa282
>>>
>>> (Read: 4.8.5+shim4.10.2+xsa282-1+deb9u11)
>>>
>>>> Rolling back to the previous package version fixed the problem:
>>>>
>>>>     xen-hypervisor-4.8-amd64 4.8.4+xsa273+shim4.10.1+xsa273-1+deb9u10
>>
>> Since this is the first message arriving about this in my inbox, can you
>> explain what "the problem" is?
> 
> I have forwarded the original mail: all VM I/O gets stuck, and thus the
> VM becomes unusable.

These are in many cases the symptoms of running out of "grant frames".
So let's verify first if this is the case or not.

Your xen-utils-4.8 package contains a program at
/usr/lib/xen-4.8/bin/xen-diag that you can use in the dom0 to gather
information.

e.g.

  -# ./xen-diag  gnttab_query_size 5
  domid=5: nr_frames=11, max_nr_frames=32

If nr_frames hits the maximum allowed, things will stall at random.
This does not have to happen directly after domU boot; it more likely
happens later, when disks/cpus are actually used. There is no useful
message/hint at all in the domU kernel (yet) about this when it happens.

Can you verify if this is happening?
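
To check this across several domUs without reading the numbers by eye,
something like the following could be used. It is only a minimal sketch:
it assumes the output line looks exactly like the example above, and the
function names are made up for illustration, they are not part of any
Xen tool.

```python
import re

# Matches one line of `xen-diag gnttab_query_size <domid>` output, assuming
# the format shown above: "domid=5: nr_frames=11, max_nr_frames=32"
_GNTTAB_LINE = re.compile(r"domid=(\d+): nr_frames=(\d+), max_nr_frames=(\d+)")


def parse_gnttab_line(line):
    """Return (domid, nr_frames, max_nr_frames) parsed from one output line."""
    match = _GNTTAB_LINE.search(line)
    if match is None:
        raise ValueError("unrecognised xen-diag output: %r" % line)
    domid, used, limit = (int(value) for value in match.groups())
    return domid, used, limit


def at_limit(line):
    """True if the domain has used up all the grant frames it is allowed."""
    _, used, limit = parse_gnttab_line(line)
    return used >= limit


if __name__ == "__main__":
    sample = "domid=5: nr_frames=11, max_nr_frames=32"
    print(parse_gnttab_line(sample), "at limit:", at_limit(sample))
```

Feeding it the xen-diag output of each running domU shows which ones sit
at their limit.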

With Xen 4.8, you can add gnttab_max_frames=64 (or another number, but
higher than the default 32) to the xen hypervisor command line and reboot.
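
Where the hypervisor command line is stored depends on your boot loader
setup (e.g. somewhere in the GRUB configuration), so the sketch below
only shows the string manipulation: set the option once and replace any
earlier value. The helper name is hypothetical.

```python
def with_gnttab_max_frames(cmdline, frames=64):
    """Return cmdline with gnttab_max_frames set, replacing any earlier value."""
    # Drop any existing gnttab_max_frames=... so the option appears only once.
    kept = [opt for opt in cmdline.split()
            if not opt.startswith("gnttab_max_frames=")]
    kept.append("gnttab_max_frames=%d" % frames)
    return " ".join(kept)


if __name__ == "__main__":
    # "dom0_mem=2G" is just a placeholder for whatever options are already set.
    print(with_gnttab_max_frames("dom0_mem=2G"))
```

The hypervisor only reads its command line at boot, hence the reboot.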

For Xen 4.11, which will be in Buster, the default is 64 and the way to
configure higher values/limits for dom0 and domU has changed. There
will be some text about this recurring problem in the README.Debian
under known issues.

>>> (Only the hypervisor needed to be downgraded to fix the issue)
>>>
>>>> The errors on the domU are a frozen file system, ending in a kernel panic.
>>
>> Do you have a reproducible case that shows success with the previous Xen
>> hypervisor package and failure with the new one, while keeping all other
>> things the same?
> 
> We have a production system which hangs within about a day. We
> don't know what exactly triggers the issue.
> 
>> This seems like an upstream thing, because for 4.8, the Debian package
>> updates are almost exclusively shipping upstream stable updates.
> 
> Ok.

Related:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=880554

Hans


