[Debichem-devel] Bug#644397: Bug#644397: gromacs: pdb2gmx, version 4.5.5, fails with 'Illegal instruction' on i686, 3.0.0.1-686-pae (testing)

groenedraeck groenedraeck at gmail.com
Sat Oct 8 18:26:37 UTC 2011


On 10/6/2011 11:18, Steffen Möller wrote:
> Hello,
>
> On 10/05/2011 03:37 PM, groenedraeck wrote:
>> Package: gromacs
>> Version: 4.5.5-1
>> Severity: important
>>
>> It was found that the pdb2gmx (v4.5.5) from the GROMACS (gromacs v4.5.5-1) fails with an 'Illegal instruction' after the 'Making bonds...' operation (see below).
>>
>> The bug was first noted in gromacs 4.5.4-2 (bug report not filed, prior versions of gromacs not tested at this workstation), and persists in the recent gromacs 4.5.5-1. In both cases the gromacs packages and all dependencies were installed through aptitude. Although reproducible on a virtual machine (see below), the bug first appeared on a physical computer, equipped with a single 1 GHz AMD Duron processor, 514340k RAM, md0 raid1 controlled by mdadm.
> This could be a problem with some optimisation that goes beyond what the
> Duron can do. The folks from debichem may have a different opinion, but
> I suggest to compile it yourself to see what happens.  You may want to
> substitute the compiler with gcc-4.4, but I think, given how simple it
> seems to reproduce your problem on your platform, I would just start
> with the default:
>
>      apt-get source --build gromacs
>
> should be doing that. You may need to install some extra build
> dependencies. Then use dpkg -i to install the packages. If this solved
> the issue, then please report with your compiler version. If the error
> still persists, I see two things to do:
>   * use another compiler (clang, gcc-snapshot  or gcc-4.4)
>   * add a debug package for gromacs and start pdb2gmx from within gdb
>
> I have pasted a respective patch for the debug package below and have
> just built gromacs with it. Worked here. @Debichem: It would be nice to
> see a future version of gromacs with something like it. For the problem
> at hand I do not think the -dbg package to be helping too much since
> illegal instructions just should not happen and may IMHO more be pointing
> to a compiler issue.
>
> --- debian/control    (revision 3036)
> +++ debian/control    (working copy)
> @@ -112,3 +112,12 @@
>    This package contains only the core simulation engine with parallel
>    support using the OpenMPI interface.  It is suitable for nodes of a
>    processing cluster, or for multiprocessor machines.
> +
> +Package: gromacs-dbg
> +Architecture: any
> +Depends: gromacs (= ${binary:Version})
> +Description: debug symbols for gromacs
> + This package contains information to identify problems with the executed
> + code. Those may be platform-specific issues or programming errors. You
> + may be asked to install the package prior to reproducing the issue you
> + were reporting.
> Index: debian/rules
> ===================================================================
> --- debian/rules    (revision 3036)
> +++ debian/rules    (working copy)
> @@ -239,7 +239,7 @@
>       dh_testroot -s
>       dh_installchangelogs -s
>       dh_installdocs -s
> -    dh_strip -A
> +    dh_strip -A --dbg-package=gromacs-dbg
>       dh_link -s
>       dh_compress -s
>       dh_fixperms -s
>
> Hope to have somewhat helped
>
> Steffen
>

Dear Steffen,

thank you for your quick and detailed response. I have little experience
with non-trivial compiling issues. However, I have tried some of the
things you suggested; the results are reported below.

On the workstation for which the bug was reported, the installed gromacs
and gromacs-data packages were purged. Next, the following packages
(including dependencies) were installed: dpkg-dev, debhelper, dpatch,
libmpich2-dev, libopenmpi-dev, lesstif2-dev, libgsl0-dev, libxml2-dev,
cmake. The system was then restarted and the following command
was issued:

$ apt-get source --build gromacs > log.20111007 2>&1

When the build had finished, the log was inspected for apparent errors.
The log file repeatedly contained the following CMake warning about MPI
implementations:
 > -- Using manually set binary suffix: "_mpi.mpich"
 > -- Using manually set library suffix: "_mpi.mpich"
 > CMake Warning at CMakeLists.txt:192 (MESSAGE):
 >
 >
 >             There are known problems with some MPI implementations:
 >                        OpenMPI version < 1.4.1
 >                        MVAPICH2 version <= 1.4.1
 >
 >
 > -- Checking for MPI_IN_PLACE
 > -- Performing Test MPI_IN_PLACE_COMPILE_OK
 > -- Performing Test MPI_IN_PLACE_COMPILE_OK - Success
 > -- Checking for MPI_IN_PLACE - yes

The built packages were installed using:

$ dpkg -i gromacs_4.5.5-1_i386.deb gromacs-data_4.5.5-1_all.deb

The lysozyme (pdb: 1AKI) tutorial (http://goo.gl/dxzex) was followed and
the bug reported above occurred again:

$ pdb2gmx -f 1AKI_waterless.pdb -o 1AKI_processed.gro -water spce
[.. truncated log ..]
 > Linking CYS-76 SG-601 and CYS-94 SG-724...
 > Start terminus LYS-1: NH3+
 > End terminus LEU-129: COO-
 > Checking for duplicate atoms....
 > Now there are 129 residues with 1960 atoms
 > Making bonds...
 > Illegal instruction
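
Steffen's suggestion to start pdb2gmx from within gdb could look roughly
like the sketch below. The file name sigill.gdb is an arbitrary choice and
not something from this thread; run, x/i $pc and bt are standard gdb
commands, and -x and --args are standard gdb options. Without the -dbg
package the backtrace will be sparse, but x/i $pc should still print the
instruction the CPU rejected.

```shell
# Write a small gdb command file (the name sigill.gdb is hypothetical).
cat > sigill.gdb <<'EOF'
run
x/i $pc
bt
EOF

# On the affected machine, one would then start pdb2gmx under gdb with
# the same arguments as in the failing run above (not executed here):
#   gdb -x sigill.gdb --args pdb2gmx -f 1AKI_waterless.pdb \
#       -o 1AKI_processed.gro -water spce
# After the SIGILL, "x/i $pc" shows the faulting instruction and "bt"
# shows where in the program it was reached.
```

If the printed instruction belongs to an instruction set the processor
does not implement, that would support the optimisation theory.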

The self-built and installed gromacs packages were purged, the official 
gromacs source package was obtained, extracted and the following two 
compilers were used:

== gcc-4.4 ==
$ ./configure CC=gcc-4.4 --prefix=/homedir --with-fft=fftw3
$ make CC=gcc-4.4
$ make install

The pdb2gmx program failed with the same error, at seemingly the same
point. The installed files were then removed, as were the build files
($ make distclean).

== clang == (all dependencies were installed first)
$ ./configure CC=clang --prefix=/home/not/gromacs/install \
    --with-fft=fftw3 && make CC=clang && make install

From the start, this produced problems in the make phase: a lot of
'clang: warning: argument unused during compilation' warnings were
reported, e.g.:

 > /bin/bash ../../../../libtool   --mode=compile clang   -O3 -fomit-frame-pointer -finline-functions -Wall -Wno-unused -msse2 -std=gnu99 -pthread -I./include -c -o nb_kernel_ia32_sse_test_asm.lo nb_kernel_ia32_sse_test_asm.s
 >  clang -O3 -fomit-frame-pointer -finline-functions -Wall -Wno-unused -msse2 -std=gnu99 -pthread -I./include -c nb_kernel_ia32_sse_test_asm.s -o nb_kernel_ia32_sse_test_asm.o
 > clang: warning: argument unused during compilation: '-fomit-frame-pointer'
 > clang: warning: argument unused during compilation: '-finline-functions'
 > clang: warning: argument unused during compilation: '-Wall'
 > clang: warning: argument unused during compilation: '-Wno-unused'
 > clang: warning: argument unused during compilation: '-msse2'
 > clang: warning: argument unused during compilation: '-std=gnu99'
 > clang: warning: argument unused during compilation: '-pthread'
 > clang: warning: argument unused during compilation: '-I ./include'

This time, pdb2gmx was completely broken: it returned to the prompt
right after displaying the GROMACS motd (i.e. the names of the team).

At this point I decided to stop trying to get things to work on the 1
GHz Duron workstation, because I was not sure the CC=clang option was
the right way to do it, and after the command completed the results got
worse.

So, to verify one of my earlier statements, I started an i386 guest
running Debian GNU/Linux testing (Wheezy), kernel 3.0.0-1-686-pae, in
virtualbox-ose (3.2.10_OSE r66523). Only the base system, standard
system utilities and gromacs packages (including dependencies) were
freshly installed. The error could not be reproduced (following the
lysozyme (pdb: 1AKI) tutorial at http://goo.gl/dxzex).

Indeed, I am starting to think that the problem might be Duron-specific,
as already suggested by Steffen. So, given the nature of the current
problem and the age of the problematic workstation versus the highly
advanced capabilities and applications of GROMACS, I think trying to
squash this bug might not be worth the time and energy of those
involved. On the other hand, this is the first time that a Debian
package has not worked properly on the computer I am trying to run it
on (and that is what made me report the bug in the first place).
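
A quick way to test the Duron hypothesis, sketched under assumptions: the
clang build log above shows -msse2 among the compiler flags, so one
plausible cause is that the binaries use SSE2 instructions the processor
does not implement. On Linux the kernel lists the supported instruction
set extensions on the "flags" line of /proc/cpuinfo. The helper name
cpu_has_flag below is made up for illustration.

```shell
# cpu_has_flag FLAG [CPUINFO_FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line of
# CPUINFO_FILE (default: /proc/cpuinfo). Hypothetical helper, not part
# of GROMACS or Debian.
cpu_has_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null \
        | cut -d: -f2 \
        | tr -s ' \t' '\n' \
        | grep -qx "$flag"
}

# Report the two extensions relevant to the build flags seen in the log.
for f in sse sse2; do
    if cpu_has_flag "$f"; then
        echo "$f: supported"
    else
        echo "$f: NOT supported"
    fi
done
```

If sse2 is reported as not supported on the workstation but as supported
inside the virtual machine, that would be consistent with the error only
appearing on the physical computer.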

In conclusion, I am willing to contribute and help solve the problem
documented in the bug report on the given workstation, but only if the
suggested steps are described in sufficient detail for me to quickly
execute, compile, build, install or implement them. I have the given
computer at my disposal, I am subscribed to this bug and, when time
permits, will gladly try whatever is suggested.




