Lapack compilation flags
Victor Liu
vkl at stanford.edu
Wed Nov 14 07:59:40 UTC 2012
On 11/13/2012 11:37 PM, Sylvestre Ledru wrote:
> Hello,
>
> On 14/11/2012 03:08, Victor Liu wrote:
>> Dear Science Maintainers,
>>
>> About two years ago, I reported a problem with Lapack and
>> multithreading, which has been accepted into the current errata:
>>
>> http://www.netlib.org/lapack/Errata/index2.html#_strong_span_class_red_bug0061_span_strong_zgehrd_f_is_overflowing
>>
>>
>> Since the developers have not addressed the issue yet, a relatively
>> simple fix for it on the packaging side is to simply add the
>> "-frecursive" flag to gfortran, forcing all local arrays to be allocated
>> on the stack (even if they are huge). This would make the compiled
>> Lapack library truly thread-safe, and save me much frustration. The only
>> adverse effect of this flag that I can think of is that potentially more
>> stack space is required. However, the largest single array size is
>> around 64 KB, while the typical stack limit is several MB, so this
>> should not pose a problem (there are already large arrays just under the
>> 32 KB default gfortran limit that slip through). With the current
>> repository libraries, it is impossible to perform certain computations
>> in a multithreaded environment due to memory corruption from variables
>> that get statically allocated by the compiler.
>>
>> Please let me know if you have any concerns, since I seem to be the only
>> one actively seeking a solution to this problem. I have submitted a
>> partial code fix to the developers, but I don't know if it will be
>> accepted or how long it would take to make it into a release.
>>
>
> I am not against adding this option but, first, could you share the
> argument of lapack dev for not doing it by default ?
>
> BTW, could you open a bug report on this subject ?
>
> Thanks,
> Sylvestre
>
Hi,

Basically, there are two ways to fix the bug: specify compiler flags, or
modify the codebase to avoid using huge local arrays. I am advocating
for the former here since it appears the latter solution would require
an API change.

The fix I described in my previous message would be a change that you
would implement in the makefile (more precisely, make.inc), which I
would think is at the package maintainer's discretion. I imagine that
the devs would prefer not to include compiler-specific build flags in
their default make.inc.
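
As a rough sketch of what I mean (the variable names below are the ones
used in the example gfortran make.inc shipped with the Lapack sources; I
am assuming the Debian build honors them, and the flags may in fact be
set elsewhere in the packaging):

```make
# Sketch only, not the actual Debian make.inc.
# OPTS and NOOPT are the names used in Lapack's example make.inc.
FORTRAN  = gfortran
# -frecursive makes gfortran put all local arrays on the stack
# instead of in static storage shared between threads.
OPTS     = -O2 -frecursive
DRVOPTS  = $(OPTS)
# NOOPT is used for the few routines built without optimization;
# they would need the flag as well.
NOOPT    = -O0 -frecursive
```

As far as I know, nothing else in the build would need to change, since
the flag only affects where the compiler places local arrays.
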
I only recently found time to submit a patch to the Lapack forums,
despite the bug being open for more than two years now, so they haven't
gotten back to me about it yet. My fix does not resolve all the issues
(it only fixes what are in my opinion the more commonly used routines),
and indeed a real code fix would likely break API backwards
compatibility, so I can see why no progress has been made in the past
two years.

The compilation flag method would completely fix the problem for the
official Debian libraries, but still leave it open for other platforms.
This could potentially be confusing if people find that their
multithreaded code works on a Debian-based distribution, but then breaks
when they move to another platform, which is a common thing to do in
scientific computing.

I would be happy to open a bug report, but I am unsure of which package
I should file it under (liblapack-dev, liblapack3, liblapack3gf?). All
these different package names are very confusing to me.

-Victor
More information about the debian-science-maintainers mailing list