[Bug 933354] Re: GPU not found with ATI Hdxxxx series cards - discussion

Christian Beer djangofett at gmx.net
Fri Apr 12 12:40:06 UTC 2013


Hi,

that seems consistent with what I heard from a friend not long ago. It
seems that ATI/AMD drivers need an established X session (a logged-in
user) in order to function properly. I guess there is no general way to
circumvent this driver restriction.

I also think that it would be good to recheck for fglrx after client
startup and see whether a user has logged in and GPU capability is
available. From the client's perspective this is very annoying, because
the user might have disconnected and/or replaced the GPU while the
client still has work for this resource. How long should the client wait
for the GPU to come back online? So in the end it is a workaround for a
problem that the driver manufacturer has.
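
To make the recheck idea a bit more concrete, here is a minimal sketch
of a polling helper. It is only an illustration under assumptions: the
function name wait_for_gpu, the timeout values and the restart command
in the usage comment are hypothetical and not part of the boinc package,
and the probe would have to run in a context where it can actually reach
the X display (which is the whole problem for a system service).

```sh
#!/bin/sh
# Hypothetical helper, not part of the boinc package: poll for a usable GPU
# after boinc-client has started and report whether one showed up in time.

# wait_for_gpu PROBE_CMD TIMEOUT_SECS INTERVAL_SECS
# Returns 0 as soon as PROBE_CMD succeeds, 1 if TIMEOUT_SECS elapse first.
wait_for_gpu() {
    probe=$1
    timeout=$2
    interval=$3
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if "$probe" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1
}

# Assumed usage (path and service name may differ on a given system):
# if wait_for_gpu /usr/bin/fglrxinfo 600 10; then
#     service boinc-client restart
# fi
```

The explicit timeout is one possible answer to the "how long should the
client wait" question: after it expires the helper simply gives up and
the client keeps running without the GPU.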

Thanks for helping in this matter. I wonder if someone with an Nvidia
card who also has this problem could run the same test.
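
For anyone who wants to repeat it, the timing test quoted below could be
written as a loop with the probe command passed in as an argument. This
is just a sketch: the function name gpu_probe_test and the log file path
in the usage comment are made up for illustration, and using
"nvidia-smi -L" (which lists the GPUs the Nvidia driver can see) as the
probe is my assumption about what a comparable check would be.

```sh
#!/bin/sh
# Sketch only: loop version of the timing test from the quoted message,
# with the probe command passed in so it can be reused for other drivers.

# gpu_probe_test LOGFILE ROUNDS INTERVAL_SECS PROBE_CMD [ARGS...]
gpu_probe_test() {
    logfile=$1
    rounds=$2
    interval=$3
    shift 3
    printf '\nstart\n\n' > "$logfile"
    i=1
    while [ "$i" -le "$rounds" ]; do
        sleep "$interval"
        # "|| true" keeps a failing probe from aborting a "set -e" init script.
        "$@" >> "$logfile" 2>&1 || true
        printf '\n%ssec passed\n\n' "$((i * interval))" >> "$logfile"
        i=$((i + 1))
    done
}

# Assumed invocation on an Nvidia system (hypothetical log path):
# gpu_probe_test /var/log/boinc-nvidia.log 6 5 nvidia-smi -L
```

Called near the top of start() in the init script, this should produce
the same kind of log as the hand-written version: one probe result every
INTERVAL_SECS seconds, ROUNDS times.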

Regards
Christian

On 12.04.2013 14:17, MestreLion wrote:
> I'm sorry it took so long for me to reply.
>
> Well, I had to make some small changes to those lines for the test to
> work: normally, if the first call to fglrxinfo is too soon and fails,
> and that failure is not handled, the entire init script will be
> aborted and boinc will not start. So I changed the snippet to this:
>
> start()
> {
>         # added by @@ rodrigo
>         testlogfile=/var/log/boinc-fglrx.log
>         echo "\nstart\n" > "$testlogfile"
>         sleep 5
>         /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
>         echo "\n5sec passed\n" >> "$testlogfile"
>         sleep 5
>         /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
>         echo "\n10sec passed\n" >> "$testlogfile"
>         sleep 5
>         /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
>         echo "\n15sec passed\n" >> "$testlogfile"
>         sleep 5
>         /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
>         echo "\n20sec passed\n" >> "$testlogfile"
>         sleep 5
>         /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
>         echo "\n25sec passed\n" >> "$testlogfile"
>         sleep 5
>         /usr/bin/fglrxinfo >> "$testlogfile" 2>&1 || true
>         echo "\n30sec passed\n" >> "$testlogfile"
>
>         log_begin_msg "Starting $DESC: $NAME"
>
> ...
>
>
> And the results, identical and consistent in every boot for the last
> week (about a dozen or so), are, unfortunately:
>
> start
>
> Error: unable to open display (null)
>
> 5sec passed
>
> Error: unable to open display (null)
>
> 10sec passed
>
> Error: unable to open display (null)
>
> 15sec passed
>
> Error: unable to open display (null)
>
> 20sec passed
>
> Error: unable to open display (null)
>
> 25sec passed
>
> Error: unable to open display (null)
>
> 30sec passed
>
>
> The same script, when called in a terminal after my Desktop session
> starts:
>
> start
>
> display: :0  screen: 0
> OpenGL vendor string: Advanced Micro Devices, Inc.
> OpenGL renderer string: AMD Radeon HD 7700 Series
> OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002
>
>
> 5sec passed
>
> display: :0  screen: 0
> OpenGL vendor string: Advanced Micro Devices, Inc.
> OpenGL renderer string: AMD Radeon HD 7700 Series
> OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002
>
>
> 10sec passed
>
> display: :0  screen: 0
> OpenGL vendor string: Advanced Micro Devices, Inc.
> OpenGL renderer string: AMD Radeon HD 7700 Series
> OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002
>
>
> 15sec passed
>
> display: :0  screen: 0
> OpenGL vendor string: Advanced Micro Devices, Inc.
> OpenGL renderer string: AMD Radeon HD 7700 Series
> OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002
>
>
> 20sec passed
>
> display: :0  screen: 0
> OpenGL vendor string: Advanced Micro Devices, Inc.
> OpenGL renderer string: AMD Radeon HD 7700 Series
> OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002
>
>
> 25sec passed
>
> display: :0  screen: 0
> OpenGL vendor string: Advanced Micro Devices, Inc.
> OpenGL renderer string: AMD Radeon HD 7700 Series
> OpenGL version string: 4.2.12002 Compatibility Profile Context 9.002
>
>
> 30sec passed
>
>
> Given these results, and since my boot is faster than 30 seconds, I
> think this is not a matter of timing: the script is unable to find a
> working fglrx no matter how long it sleeps waiting. And my suspicion
> is: fglrx is only started with X, and boinc-client, being a SysV init
> script, is executed before X.
>
> So boinc-client must be moved from SysV to another initialization
> "chain": for example Upstart, or gdm/lightdm/whatever starts X.
>
> Or, another approach would be to keep it there but restart it at
> certain "checkpoints" if the environment allows more capabilities. So
> no GPU for X-less servers (like now), but it could test for a GPU again
> after the X session starts, and restart the client if it finds one.
>
> ML
>
>
> At 01:36 PM 3/10/2013, Christian Beer wrote:
>> I'm sorry I mixed up the bugs. I meant 933354 GPU not found with ATI
>> Hdxxxx series cards in my post. I did some more research and think
>> that we may debug this with some
>>
>> sleep 5
>> fglrxinfo >> /home/user/boinc-fglrx.log
>> sleep 5
>> fglrxinfo >> /home/user/boinc-fglrx.log
>>
>> calls in /etc/init.d/boinc-client. In the end we need to know which
>> script needs to be started before boinc. I once read an article about
>> bootchart (http://www.bootchart.org/) which may help us in finding
>> the culprit on an affected system.
>>
>> What do you think about this?
>>
>> Regards
>> Christian
>> On 10.03.2013 14:01, Christian Beer wrote:
>>> Hi,
>>>
>>> I think this bug (1140597) is really a timing issue at boot time. As
>>> this seems to only affect ATI/AMD hardware I think that the drivers
>>> for these cards either get loaded after boinc or they take too long
>>> to load so they are not present when boinc tries to detect GPUs.
>>>
>>> My first try to see if this is correct would be to insert a time
>>> delay in the boinc init script and see if this helps. If boinc can
>>> successfully find the GPU after the delay we need to find the script
>>> that is loading the driver and instruct the boinc init script to
>>> load after this.
>>>
>>> If MestreLion would be so kind as to try this out, it would be
>>> great. You just need to make some local changes to your init script
>>> and restart your machine. If this is okay for you I will send you a
>>> line to add to your init script.
>>>
>>> @Gianfranco: does the boinc Ubuntu package use the same init script
>>> as the Debian package? I only have a working Debian installation.
>>>
>>> Regards
>>> Christian
>>>
>>> On 10.03.2013 12:02, Gianfranco Costamagna wrote:
>>>>
>>>> Hi MestreLion, I think this is the right place to discuss this.
>>>>
>>>> Unfortunately your script behaves badly for people who DON'T
>>>> have a usable GPU...
>>>>
>>>> I don't like restarting boinc for everybody just because they might
>>>> suffer from bug #1140597
>>>>
>>>> What do you think?
>>>>
>>>> My opinion is that there should be a way to give the right order to
>>>> init scripts, but at the moment I don't know how to change the
>>>> script...
>>>>
>>>> Gianfranco
>>>>
>>>>
>>>> From: MestreLion <launchpad at rodrigosilva.com>
>>>> To: pkg-boinc-devel at lists.alioth.debian.org
>>>> Subject: [Bug 1140597] Re: "New version is available"
>>>> notification should be disabled
>>>> Sent: Sun, Mar 10, 2013 12:05:22 AM
>>>>
>>>> Whatever mailing list you guys use to continue this discussion, sign me
>>>> in! :)
>>>>
>>>> My C++ skills are a bit rusty, but I may be able to help, if not in
>>>> programming, at least in discussing strategies, approaches and
>>>> packaging.
>>>>
>>>> Also, I've created a script (wrapper for boincmgr) as a workaround for
>>>> some of the annoying boinc bugs: not detecting a GPU at boot and not
>>>> restarting boinc client after shutting its tasks, so maybe it will be
>>>> useful for you guys:
>>>> https://github.com/MestreLion/scripts/blob/master/boinc-manager
>>>>
>>>> -- 
>>>> You received this bug notification because you are a member of Debian
>>>> BOINC Maintainers, which is subscribed to boinc in Ubuntu.
>>>> https://bugs.launchpad.net/bugs/1140597
>>>>
>>>> Title:
>>>>   "New version is available" notification should be disabled
>>>>
>>>> To manage notifications about this bug go to:
>>>> https://bugs.launchpad.net/ubuntu/+source/boinc/+bug/1140597/+subscriptions
>>>>
>>>>
>>>> -- 
>>>> pkg-boinc-devel mailing list
>>>> pkg-boinc-devel at lists.alioth.debian.org
>>>> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-boinc-devel
>>>>
>>>>
>>>
>>>
