Bug#971343: Animated background/wallpaper (changing over time) causes system freeze on Nouveau

Simon McVittie smcv at debian.org
Wed Sep 30 00:14:02 BST 2020


Control: found -1 3.36.6-1
Control: found -1 3.38.0-2

This is on a GT218M GPU, which is a Tesla 2.0 device from 2007.
https://www.notebookcheck.net/NVIDIA-GeForce-310M.22439.0.html suggests
that this is comparable in performance to Intel integrated graphics from
the Ivybridge (2012) generation, but probably with a much higher power
consumption.

I don't think we can necessarily treat GNOME freezing on
more-than-10-year-old hardware as release-critical, particularly since
there's a workaround (not using the animated background).

Some other questions I should have asked:

* Are you using any GNOME Shell extensions? (If yes, please try
  disabling them all and see whether the problem persists.)

* Is this a new installation, or have you been using this hardware with
  Linux for a while? If you've been using it previously, have you had
  other graphics- or freeze-related issues with it?

* Has the animated/changing background worked in earlier versions of
  Debian and/or GNOME, or is 3.36 the oldest version you've tried?

* Are you able to do an installation of the Debian 10 'buster' stable
  release on this hardware, if you haven't already tried that? That would
  give us a baseline for whether this is a situation that has been there
  for a while.

On Tue, 29 Sep 2020 at 16:48:21 -0300, Leandro Cunha wrote:
> Graphics:
>   Device-1: NVIDIA GT218M [GeForce 310M] driver: nouveau v: kernel
>   Device-2: Suyin type: USB driver: uvcvideo

Is this Suyin USB device an input (camera), or an output (display)?
Is it part of the laptop, or a removable device?

If it's a removable output device, please check whether this issue still
occurs with the USB device disconnected and just the NVIDIA graphics
device, to keep things as simple as possible.

Also, if you have external screens attached, please check whether it
still occurs with just the laptop's built-in screen, again to try to
keep things as simple as possible.

> Sep 29 10:50:17 debian-pc kernel: [   60.247471] nouveau 0000:01:00.0: firmware: failed to load nouveau/nva8_fuc084 (-2)
> Sep 29 10:50:17 debian-pc kernel: [   60.247475] firmware_class: See https://wiki.debian.org/Firmware for information about missing firmware
> Sep 29 10:50:17 debian-pc kernel: [   60.247478] nouveau 0000:01:00.0: Direct firmware load for nouveau/nva8_fuc084 failed with error -2
> Sep 29 10:50:17 debian-pc kernel: [   60.247492] nouveau 0000:01:00.0: firmware: failed to load nouveau/nva8_fuc084d (-2)
> Sep 29 10:50:17 debian-pc kernel: [   60.247494] nouveau 0000:01:00.0: Direct firmware load for nouveau/nva8_fuc084d failed with error -2
> Sep 29 10:50:17 debian-pc kernel: [   60.247497] nouveau 0000:01:00.0: msvld: unable to load firmware data
> Sep 29 10:50:17 debian-pc kernel: [   60.247499] nouveau 0000:01:00.0: msvld: init failed, -19

This firmware blob is not available in Debian (it seems we cannot legally
distribute it, even in non-free, although it can be extracted from
proprietary NVIDIA drivers) but apparently it's for 2D video encoder/decoder
acceleration (VDPAU) rather than anything GNOME Shell would need, so this
warning is probably harmless?
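
For reading the numeric codes in those messages: assuming they follow the usual kernel convention of negated errno values, -2 would be ENOENT (file not found) and -19 would be ENODEV. Below is a minimal sketch that pulls the failed firmware names out of a log and decodes the codes. The function name and the regular expression are illustrative, not part of any existing tool.

```python
import errno
import os
import re

# Matches kernel lines such as:
#   nouveau 0000:01:00.0: firmware: failed to load nouveau/nva8_fuc084 (-2)
FIRMWARE_FAILURE = re.compile(r"firmware: failed to load (\S+) \((-?\d+)\)")


def describe_firmware_failures(log_text):
    """Return (firmware_name, errno_symbol, message) for each failed load.

    Assumes the number in parentheses is a negated errno value.
    """
    results = []
    for match in FIRMWARE_FAILURE.finditer(log_text):
        name = match.group(1)
        code = abs(int(match.group(2)))
        symbol = errno.errorcode.get(code, "unknown")
        results.append((name, symbol, os.strerror(code)))
    return results
```

An ENOENT result here would be consistent with the firmware file simply not being installed, rather than with a driver malfunction.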

> Sep 29 10:54:53 debian-pc gnome-shell[1633]: 0xa000f6: frame_complete callback never occurred for frame 3729
> Sep 29 10:54:54 debian-pc gnome-shell[1633]: 0xa000f6: frame_complete callback never occurred for frame 3735
> Sep 29 10:55:01 debian-pc gnome-shell[1633]: 0xa000f6: frame_complete callback never occurred for frame 3766

(and lots more)

This might be interesting. If the Shell isn't reliably getting frame
completion notifications back from the driver or hardware, that might be
related to the display freezing.
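
To see whether these warnings cluster around the time of a freeze, it might help to count them per minute. This is a rough sketch that assumes the syslog-style line format quoted above. The function name is hypothetical, and the log path in the usage note is an assumption.

```python
import re
from collections import Counter

# Matches lines such as:
#   Sep 29 10:54:53 debian-pc gnome-shell[1633]: 0xa000f6: frame_complete callback never occurred for frame 3729
# Group 1 is the timestamp truncated to the minute.
MISSED_FRAME = re.compile(
    r"^(\w{3} +\d+ \d{2}:\d{2}):\d{2} \S+ gnome-shell\[\d+\]: "
    r".*frame_complete callback never occurred for frame \d+"
)


def missed_frames_per_minute(lines):
    """Count 'frame_complete callback never occurred' warnings per minute."""
    counts = Counter()
    for line in lines:
        match = MISSED_FRAME.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It could be fed an open log file, for example `missed_frames_per_minute(open("/var/log/syslog"))`, if that is where the messages end up on the affected system.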

> With Wayland it completely crashed and got no response to force the
> boot and had to use the on and off button. In Xorg I was able to
> restart with keyboard via tty, but after locking it would only start
> if I restored the original system settings and this is necessary for
> both. Now I am with wayland.

What do you mean by "restored the original system settings"?

What do you mean by "it would only start if [...]"? GNOME would only
start if you did that? The laptop would only boot up if you did that?
Something else?

> > If Ctrl+Alt+Delete doesn't work, does the system respond to
> > the "magic sysrq key" sequences, in particular AltGr+SysRq+o
> > (immediate power off) and AltGr+SysRq+b (immediate reboot)? (See
> > https://en.wikipedia.org/wiki/Magic_SysRq_key for more details)
>
> I manage to force the shutdown that way.

OK, so the kernel is still working to at least some extent, otherwise
those key sequences wouldn't work. The problem could be in (from highest
to lowest level) GNOME Shell; libmutter or some other library it uses;
the Mesa user-space graphics driver; or the Nouveau kernel-side graphics
driver.

> Log attached, after the problem occurs this log.

You opened a gnome-terminal at 12:44 and the Nautilus file manager at
12:46, a block of zero bytes was written at 12:47 (possibly corruption
caused by a system reset to recover from a freeze), and then the machine
rebooted at 12:49. Does 12:47 sound about the right time for when you
reproduced the bug?

Unfortunately there aren't any particularly obvious warnings near there.
I did notice this:

Sep 29 12:46:21 debian-pc gnome-shell[1633]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed

which might be close enough in time to be relevant? However, an upstream
bug report about the same assertion warning,
https://gitlab.gnome.org/GNOME/mutter/-/issues/930, suggests that it is
usually harmless.

Using AltGr+SysRq+s to synchronize disk writes, followed by AltGr+SysRq+o
to power off or AltGr+SysRq+b to reboot, might be one way to force more
information about the freeze to be written out to the log. I don't know
whether that'll work.
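
Whether the sync, power-off and reboot sequences are accepted depends on the kernel's SysRq setting in /proc/sys/kernel/sysrq. The following is a small sketch for checking that setting, based on the bitmask values given in the kernel's SysRq documentation (0 disables everything, 1 enables everything, 16 allows sync, 128 allows reboot/power-off). The helper names are made up for illustration.

```python
# Bit values from the kernel's documentation for /proc/sys/kernel/sysrq.
SYSRQ_SYNC = 16              # allows the "s" (sync) command
SYSRQ_REBOOT_POWEROFF = 128  # allows "b" (reboot) and "o" (power off)


def sysrq_allows(mask, bit):
    """Return True if the given sysrq setting permits the function 'bit'."""
    if mask == 0:   # SysRq disabled entirely
        return False
    if mask == 1:   # all SysRq functions enabled
        return True
    return bool(mask & bit)


def read_sysrq_mask(path="/proc/sys/kernel/sysrq"):
    """Read the current setting; only works on a Linux system."""
    with open(path) as f:
        return int(f.read().strip())
```

For example, `sysrq_allows(read_sysrq_mask(), SYSRQ_SYNC)` would indicate whether the sync sequence should be honoured on the running system.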

    smcv


