[Bug 933354] Re: GPU not found with ATI Hdxxxx series cards - discussion

MestreLion launchpad at rodrigosilva.com
Fri Apr 12 12:17:53 UTC 2013


I'm sorry it took so long for me to reply.

Well, I had to make some small changes to those lines for the test to work: normally, if the first call to fglrxinfo comes too soon and fails, and that failure is not handled, the entire init script is aborted and boinc does not start. So I changed the snippet to this:

start()
{
        # added by @@ rodrigo
        testlogfile=/var/log/boinc-fglrx.log
        echo "\nstart\n" > "$testlogfile"
        sleep 5
        /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
        echo "\n5sec passed\n" >> "$testlogfile"
        sleep 5
        /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
        echo "\n10sec passed\n" >> "$testlogfile"
        sleep 5
        /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
        echo "\n15sec passed\n" >> "$testlogfile"
        sleep 5
        /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
        echo "\n20sec passed\n" >> "$testlogfile"
        sleep 5
        /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
        echo "\n25sec passed\n" >> "$testlogfile"
        sleep 5
        /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
        echo "\n30sec passed\n" >> "$testlogfile"

  log_begin_msg "Starting $DESC: $NAME"

...


And the results, identical and consistent on every boot for the last week (about a dozen or so), are, unfortunately:

start

Error: unable to open display (null)

5sec passed

Error: unable to open display (null)

10sec passed

Error: unable to open display (null)

15sec passed

Error: unable to open display (null)

20sec passed

Error: unable to open display (null)

25sec passed

Error: unable to open display (null)

30sec passed


The same script, when called in a terminal after my Desktop session starts, gives:

start

display: :0  screen: 0
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD Radeon HD 7700 Series
OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002


5sec passed

display: :0  screen: 0
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD Radeon HD 7700 Series
OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002


10sec passed

display: :0  screen: 0
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD Radeon HD 7700 Series
OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002


15sec passed

display: :0  screen: 0
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD Radeon HD 7700 Series
OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002


20sec passed

display: :0  screen: 0
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD Radeon HD 7700 Series
OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002


25sec passed

display: :0  screen: 0
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD Radeon HD 7700 Series
OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002


30sec passed


Given these results, and since my boot takes less than 30 seconds, I think this is not a matter of timing: the script is unable to find a working fglrx no matter how long it sleeps. My suspicion is that fglrx is only started with X, and boinc-client, being a SysV init script, is executed before X.
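
To illustrate the suspicion: the "(null)" in the error suggests that no DISPLAY is set in the init script's environment, so fglrxinfo has no X server to talk to. A minimal sketch of a check for that follows. The function name x_display_status is hypothetical, and the socket directory /tmp/.X11-unix is the conventional location for local X server sockets, which I am assuming applies here.

```shell
#!/bin/sh
# Hypothetical helper (not part of boinc or fglrx): report whether an
# X display looks reachable from the current environment.
# $1 = display string (e.g. ":0"), $2 = socket directory (assumed default).
x_display_status() {
    display="$1"
    socket_dir="${2:-/tmp/.X11-unix}"
    if [ -z "$display" ]; then
        echo "no DISPLAY set"
        return 1
    fi
    # Reduce ":0" or ":0.0" to the display number "0".
    num="${display#*:}"
    num="${num%%.*}"
    if [ -S "$socket_dir/X$num" ]; then
        echo "X socket found for $display"
        return 0
    fi
    echo "no X socket for $display"
    return 1
}

# Report on the current environment; never abort the calling script.
x_display_status "${DISPLAY:-}" || true
```

Called from the init script at boot, I would expect this to print "no DISPLAY set", which would match the fglrxinfo errors above; from a terminal in the desktop session it should find the socket for :0.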

So boinc-client must be moved from SysV to another initialization "chain", for example Upstart, or gdm/lightdm/whatever starts X.

Another approach would be to keep it there but restart it at certain "checkpoints" if the environment by then offers more capabilities. So no GPU for X-less servers (as now), but it could test for a GPU again after the X session starts, and restart the client if it finds one.
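
A rough sketch of that "checkpoint" idea, under some assumptions: the function name wait_for_gpu is hypothetical, the probe is assumed to exit non-zero on failure (as fglrxinfo did in the test above), and the commented wiring assumes it runs after the X session is up, with enough privileges to restart the service.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command until it succeeds.
# $1 = probe command, $2 = maximum number of tries, $3 = seconds between tries.
# Returns 0 as soon as the probe succeeds, 1 if every try fails.
wait_for_gpu() {
    probe="$1"
    tries="$2"
    delay="$3"
    i=0
    while [ "$i" -lt "$tries" ]; do
        if $probe >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example wiring (assumption, left commented out on purpose):
# if wait_for_gpu /usr/bin/fglrxinfo 12 5; then
#     service boinc-client restart
# fi
```

The client is restarted only when the probe succeeds, so machines without a usable GPU would never see a restart.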

ML


At 01:36 PM 3/10/2013, Christian Beer wrote:
>I'm sorry I mixed up the bugs. I meant 933354 GPU not found with ATI Hdxxxx series cards in my post. I did some more research and think that we may debug this with some
>
>sleep 5
>fglrxinfo >> /home/user/boinc-fglrx.log
>sleep 5
>fglrxinfo >> /home/user/boinc-fglrx.log
>
>calls in /etc/init.d/boinc-client. In the end we need to know which script needs to be started before boinc. I once read an article about bootchart (http://www.bootchart.org/) which may help us in finding the culprit on an affected system.
>
>What do you think about this?
>
>Regards
>Christian
>On 10.03.2013 14:01, Christian Beer wrote:
>>Hi,
>>
>>I think this bug (1140597) is really a timing issue at boot time. As this seems to only affect ATI/AMD hardware I think that the drivers for these cards either get loaded after boinc or they take too long to load so they are not present when boinc tries to detect GPUs.
>>
>>My first try to see if this is correct would be to insert a time delay in the boinc init script and see if this helps. If boinc can successfully find the GPU after the delay we need to find the script that is loading the driver and instruct the boinc init script to load after this.
>>
>>If MestreLion would be so kind as to try this out, it would be great. You just need to make some local changes to your init script and restart your machine. If this is okay for you I will send you a line to add to your init script.
>>
>>@Gianfranco: does the boinc Ubuntu package use the same init script as the Debian package? I only have a working Debian installation.
>>
>>Regards
>>Christian
>>
>>On 10.03.2013 12:02, Gianfranco Costamagna wrote:
>>>
>>>Hi MestreLion, I think this is the right place to discuss this.
>>>
>>>Unfortunately your script behaves badly for people who DON'T have a usable GPU...
>>>
>>>I don't like restarting boinc for everybody just because they might suffer from bug #1140597
>>>
>>>What do you think?
>>>
>>>My opinion is that there should be a way to give the right order to the init scripts, but at the moment I don't know how to change the script...
>>>
>>>Gianfranco
>>>
>>>
>>>From: MestreLion <launchpad at rodrigosilva.com>
>>>To: <pkg-boinc-devel at lists.alioth.debian.org>
>>>Subject: [Bug 1140597] Re: "New version is available" notification should be disabled 
>>>Sent: Sun, Mar 10, 2013 12:05:22 AM 
>>>
>>>Whatever mailing list you guys use to continue this discussion, sign me
>>>in! :)
>>>
>>>My C++ skills are a bit rusty, but I may be able to help, if not in
>>>programming, at least in discussing strategies, approaches and
>>>packaging.
>>>
>>>Also, I've created a script (wrapper for boincmgr) as a workaround for
>>>some of the annoying boinc bugs: not detecting a GPU at boot and not
>>>restarting boinc client after shutting its tasks, so maybe it will be
>>>useful for you guys:
>>>https://github.com/MestreLion/scripts/blob/master/boinc-manager