Bug#642497: Bug#644601: [xserver-xorg-core] hard lock-up: [mi] EQ overflowing
JS
jshaio at yahoo.com
Wed Oct 19 22:32:48 UTC 2011
Although the changes mentioned in the earlier mail (below) significantly improved
OpenGL handling, they were not a perfect solution.
I found the crash below today while running /usr/lib/xscreensaver/circuit:
[ 93.931] (II) No input driver/identifier specified (ignoring)
[ 7923.634] nvLock: client timed out, taking the lock
[ 8068.591] nvLock: client timed out, taking the lock
[ 8097.178] nvLock: client timed out, taking the lock
[ 8224.735] nvLock: client timed out, taking the lock
[ 8352.518] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
[ 8352.518]
Backtrace:
[ 8352.533] 0: /usr/bin/X (xorg_backtrace+0x37) [0x80a7f47]
[ 8352.533] 1: /usr/bin/X (mieqEnqueue+0x1d1) [0x80a2231]
[ 8352.533] 2: /usr/bin/X (xf86PostMotionEventM+0xb0) [0x80c9130]
[ 8352.533] 3: /usr/bin/X (xf86PostMotionEventP+0x6f) [0x80c927f]
[ 8352.533] 4: /usr/lib/xorg/modules/input/evdev_drv.so (0xb4ad5000+0x2cfe) [0xb4ad7cfe]
[ 8352.533] 5: /usr/lib/xorg/modules/input/evdev_drv.so (0xb4ad5000+0x3e1d) [0xb4ad8e1d]
[ 8352.533] 6: /usr/bin/X (0x8048000+0x6e081) [0x80b6081]
[ 8352.533] 7: /usr/bin/X (0x8048000+0x1282ef) [0x81702ef]
[ 8352.533] 8: (vdso) (__kernel_sigreturn+0x0) [0xb786c400]
[ 8352.533] 9: (vdso) (__kernel_vsyscall+0x10) [0xb786c424]
[ 8352.533] 10: /lib/i386-linux-gnu/i686/cmov/libc.so.6 (nanosleep+0x20) [0xb75b23e0]
[ 8352.533] 11: /lib/i386-linux-gnu/i686/cmov/libc.so.6 (usleep+0x3c) [0xb75e166c]
[ 8352.533] 12: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0xb501d000+0x4540e1) [0xb54710e1]
[ 8352.533] 13: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0xb501d000+0x455871) [0xb5472871]
[ 8352.533] 14: /usr/lib/xorg/modules/drivers/nvidia_drv.so (0xb501d000+0x45bdc9) [0xb5478dc9]
[ 8353.519] nvLock: client timed out, taking the lock
[227640.543] (WW) NVIDIA(0): WAIT (2, 6, 0x8000, 0x00004fe4, 0x000051c4)
This one did freeze the X server and required a reboot.
Three days before I had noticed the same "EQ overflowing" message but the
X server did not crash, only the app, and the system continued to work fine.
With the package versions below I can run applications like googleearth
and fgfs without encountering problems in the first minute or two of
operation, but it appears the problem has not completely gone away.
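
For anyone who wants to check their own log for the same pattern, here is a
minimal Python sketch that picks out the two messages seen in the excerpt
above. The function name scan_xorg_log and the MARKERS tuple are made up for
this illustration, and the marker strings are simply copied from my log; other
driver versions may word them differently.

```python
import re

# Matches the "[  8352.518] " uptime prefix that Xorg puts on each log line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

# Substrings copied from the log excerpt above (assumed to be stable wording).
MARKERS = ("EQ overflowing", "nvLock: client timed out")


def scan_xorg_log(lines):
    """Return (seconds_since_server_start, message) for lines with a marker."""
    hits = []
    for line in lines:
        match = TIMESTAMP.match(line.strip())
        if not match:
            # Lines without a timestamp (e.g. "Backtrace:") are skipped.
            continue
        message = match.group(2)
        if any(marker in message for marker in MARKERS):
            hits.append((float(match.group(1)), message))
    return hits
```

It can be fed an open file, e.g. scan_xorg_log(open("/var/log/Xorg.0.log",
errors="replace")), assuming the log is in the usual default location. A run
of nvLock timeouts shortly before an "EQ overflowing" entry would match what I
saw here.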
--- On Sun, 10/16/11, JS <jshaio at yahoo.com> wrote:
> From: JS <jshaio at yahoo.com>
> Subject: Bug#642497: Bug#644601: [xserver-xorg-core] hard lock-up: [mi] EQ overflowing
> To: 642497 at bugs.debian.org
> Date: Sunday, October 16, 2011, 5:52 PM
> I'm not yet too familiar with the deeper issues you mention regarding
> adding conflicts (having only recently changed from an rpm-based system).
>
> However, trying to install the drivers directly from the NVIDIA .run file
> does bring up an explicit warning about the nouveau driver (and no
> warnings regarding other drivers, just this one). I just reviewed this
> warning again and it regards the nouveau driver being in use even when X
> is not running, and potentially also being present in the initrd.
> NVIDIA: "If you have an initrd which loads the Nouveau driver, you will
> additionally need to ensure that Nouveau is disabled in the initrd. If
> your initrd understands the rdblacklist parameter, you can add the option
> rdblacklist=nouveau to your kernel's boot parameters."
> [I was mistaken when I said it was the nouveau shared libs that were the
> issue.]
>
> I do have this version of libdrm-nouveau1a installed (with no problems
> at all):
> ii  libdrm-nouveau1a  2.4.26-1
>
> The problem I reported was purely X server; I could not use the keyboard
> to switch to another console. But there was never any problem getting in
> with ssh from another machine, examining logs and initiating a graceful
> restart.
>
> The set of packages related to this issue that I'm currently using
> (and are now pinned) is:
> ii  glx-alternative-mesa        0.1.94
> ii  glx-alternative-nvidia      0.1.94
> ii  glx-diversions              0.1.94
> ii  libdrm-nouveau1a            2.4.26-1
> ii  libegl1-mesa                7.11-6
> ii  libegl1-mesa-drivers        7.11-6
> ii  libgl1-mesa-dri             7.11-6
> ii  libgl1-mesa-glx             7.11-6
> ii  libgl1-nvidia-alternatives  280.13.really.275.28-1
> ii  libgl1-nvidia-glx           280.13.really.275.28-1
> ii  libglapi-mesa               7.11-6
> ii  libgles2-mesa               7.11-6
> ii  libglu1-mesa                7.11-6
> ii  libglw1-mesa                7.11-6
> ii  libglx-nvidia-alternatives  280.13.really.275.28-1
> ii  libopenvg1-mesa             7.11-6
> ii  libosmesa6                  7.11-6
> ii  libva-glx1                  1.0.12-2
> ii  libxcb-glx0                 1.7-3
> ii  libxcb-glx0-dev             1.7-3
> ii  mesa-common-dev             7.11-6
> ii  mesa-utils                  8.0.1-2+b1
> ii  nvidia-alternative          280.13.really.275.28-1
> ii  nvidia-detect               280.13.really.275.28-1
> ii  nvidia-glx                  280.13.really.275.28-1
> ii  nvidia-installer-cleanup    20110729+2
> ii  nvidia-kernel-common        20110729+2
> ii  nvidia-kernel-dkms          280.13.really.275.28-1
> ii  nvidia-settings             280.13-1
> ii  nvidia-support              20110729+2
> ii  nvidia-vdpau-driver         280.13.really.275.28-1
> ii  nvidia-xconfig              280.13-1
> ii  xserver-xorg-core           2:1.10.2.902-1
> ii  xserver-xorg-video-nvidia   280.13.really.275.28-1
>
> [in addition, xserver-xorg-video-nouveau now has Pin-Priority=-1]
>
> --- On Sun, 10/16/11, Andreas Beckmann <debian at abeckmann.de> wrote:
>
> > From: Andreas Beckmann <debian at abeckmann.de>
> > Subject: Bug#642497: Bug#644601: [xserver-xorg-core] hard lock-up: [mi] EQ overflowing
> > To: "JS" <jshaio at yahoo.com>, 642497 at bugs.debian.org
> > Date: Sunday, October 16, 2011, 3:03 PM
> > [moving discussion back to your report #642497]
> >
> > On 2011-10-13 12:44, JS wrote:
> > > Perhaps a conflict with nouveau should be added to the nvidia package
> > > to avoid this possibility, based on the warnings from NVIDIA.
> >
> > Adding conflicts is not a good solution as we want to allow switching
> > between free and non-free drivers without requiring installing or
> > removing packages. Furthermore adding such a conflict may render a lot
> > of unrelated packages uninstallable. (Think of a live CD that has all
> > sorts of hardware support installed and some clever piece of hardware
> > detection that enables/disables the right things during boot. E.g.
> > nvidia and fglrx proprietary drivers can now be installed in parallel,
> > even if a "normal" system will use at most one of them.)
> > But eventually the problematic nouveau files can be diverted (like MESA
> > libGL) and be reenabled depending on the setting of the glx alternative
> > ... but first we have to find out what's causing the problems.
> >
> > On 2011-10-16 16:01, JS wrote:
> > > I had a similar problem with the nvidia driver which resulted in
> > > easy-to-reproduce X server lockups. The Xorg.0.log shows the "EQ overflowing"
> > > message followed by a message that the X server was in an infinite loop.
> > >
> > > The bug is 642497:
> > > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=642497
> > >
> > > It was fixed by purging xserver-xorg-video-nouveau, followed by a reinstall
> > > of the nvidia drivers and xserver.
> >
> > Do you have libdrm-nouveau1a and/or libdrm-nouveau1 still installed?
> >
> > > If one tries to install the nvidia driver directly from the NVIDIA blog,
> > > there is a warning that the presence of shared libs from nouveau may cause
> > > problems. After doing this reinstall I've tested extensively and had no problems.
> >
> > Purging and reinstalling should not be necessary; a restart of the X
> > server following the package installation/removal should be sufficient.
> > Eventually a system reboot could be necessary to return the GPU into a
> > defined state (in case both nvidia-driver and nouveau-whatever tried to
> > initialize the card in "their" way).
> >
> > Since you had an easily reproducible way to trigger the problem ...
> > could you test something more? At the point where X hangs, is the
> > machine still usable? E.g. can you get into a console (Ctrl-Alt-F1 etc)
> > or SSH into the machine?
> >
> > If libdrm-nouveau1a is not installed - install it (but not
> > xserver-xorg-video-nouveau) and test again.
> > If this does not trigger the problem, add xserver-xorg-video-nouveau and test.
> > You should get back to "working" state by just uninstalling
> > these two packages.
> >
> > Once the xserver got stuck, run
> >
> > lsof -n -P | grep nouveau
> >
> > from a console/ssh to see whether something is currently using a nouveau
> > file.
> >
> > Luckily there are only two nouveau specific libraries:
> > /usr/lib/xorg/modules/drivers/nouveau_drv.so
> > /usr/lib/x86_64-linux-gnu/libdrm_nouveau.so.1
> >
> >
> > Thanks.
> >
> > Andreas
> >
> >
> >
> > --
> > To unsubscribe, send mail to 642497-unsubscribe at bugs.debian.org.
> >
More information about the pkg-nvidia-devel mailing list