[Debichem-devel] Bug#725013: [Support] Bug#725013: gromacs-openmpi: grompp crashes with invalid opcode
Vassilis Virvilis
v.virvilis at biovista.com
Tue Oct 1 08:00:26 UTC 2013
On 09/30/2013 07:27 PM, Nicholas Breen wrote:
> reassign 725013 gromacs
> tags 725013 moreinfo
> thanks
>
>
> On Mon, Sep 30, 2013 at 04:39:48PM +0300, Vassilis Virvilis wrote:
>> Trying to run grompp grompp_d
>>
>> * What exactly did you do (or not do) that was effective (or
>> ineffective)?
>>
>> It crashes
>>
>> dmesg output:
>> [ 1699.966132] traps: grompp_d[9667] trap invalid opcode ip:7fb9311ac95d sp:7ffff7700ee8 error:0 in libgmx_d.so.8[7fb9310d0000+4e9000]
>> [ 1728.255893] traps: grompp[9684] trap invalid opcode ip:7f6807c2c65d sp:7fff560ed648 error:0 in libgmx.so.8[7f6807b51000+51b000]
>
> I can't reproduce this crash with my test data, and my system runs a similar
> Intel CPU (i5-2x00 series). Could you please attach a file that it crashes on
> (or a pdb2gmx/genbox/etc. sequence that creates one) and the exact command line
> that causes it to fail?
>
>
There is no need for any test data. It crashes just by running it,
before it even prints the help. Let me reiterate, because I have taken
some steps to pinpoint the bug, and now that I am rereading my bug report
I can see I wasn't clear enough.
The story so far:
1) apt-get update; apt-get dist-upgrade (30/9/2013)
2) reboot (since we have now a new kernel)
3) Let's run stuff
bill@odin:~$ grompp_d
:-) G R O M A C S (-:
Illegal instruction
bill@odin:~$ grompp
:-) G R O M A C S (-:
Illegal instruction
Here is the dmesg
>> [ 1699.966132] traps: grompp_d[9667] trap invalid opcode ip:7fb9311ac95d sp:7ffff7700ee8 error:0 in libgmx_d.so.8[7fb9310d0000+4e9000]
>> [ 1728.255893] traps: grompp[9684] trap invalid opcode ip:7f6807c2c65d sp:7fff560ed648 error:0 in libgmx.so.8[7f6807b51000+51b000]
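For reference, the two dmesg lines already say where inside each library the bad instruction sits: it is the ip value minus the mapping base given in the square brackets. A small sketch of that arithmetic in shell follows. fault_offset is a helper name made up here, and treating the result as an address inside the .so file assumes the mapping starts at file offset 0.

```sh
# fault_offset IP BASE: print IP - BASE in hex (hypothetical helper).
fault_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Values copied from the two dmesg lines.
fault_offset 0x7fb9311ac95d 0x7fb9310d0000   # grompp_d, libgmx_d.so.8
fault_offset 0x7f6807c2c65d 0x7f6807b51000   # grompp,   libgmx.so.8

# The offset can then be handed to objdump to see the instruction, e.g.:
#   objdump -d --start-address=0xdb65d --stop-address=0xdb66d /usr/lib/libgmx.so.8
```

This prints 0xdc95d and 0xdb65d.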
4) OK, let's see what the debugger says
bill@odin:~$ gdb grompp
GNU gdb (GDB) 7.6 (Debian 7.6-5)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/grompp...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/grompp
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
:-) G R O M A C S (-:
Program received signal SIGILL, Illegal instruction.
0x00007ffff6efe65d in rando () from /usr/lib/libgmx.so.8
(gdb) bt
#0 0x00007ffff6efe65d in rando () from /usr/lib/libgmx.so.8
#1 0x00007ffff6f6a14f in bromacs () from /usr/lib/libgmx.so.8
#2 0x00007ffff6f6ad0c in CopyRight () from /usr/lib/libgmx.so.8
#3 0x000055555555b3ab in cmain ()
#4 0x00007ffff657e995 in __libc_start_main (main=0x555555556f50 <main>,
argc=1, ubp_av=0x7fffffffe1d8, init=<optimized out>, fini=<optimized
out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe1c8) at
libc-start.c:260
#5 0x0000555555556f7e in _start ()
(gdb)
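Since the package has no debugging symbols, the most useful extra detail at this point is probably the offending instruction itself. A sketch of how gdb could be asked to print it non-interactively follows. show_sigill is a made-up wrapper name; -batch, -ex, --args and x/i $pc are standard gdb.

```sh
# show_sigill PROGRAM [ARGS...]: run PROGRAM under gdb and, when it stops
# (here on SIGILL), print the instruction at the program counter followed
# by a backtrace. Hypothetical wrapper, not part of gromacs or gdb.
show_sigill() {
    gdb -batch -ex run -ex 'x/i $pc' -ex bt --args "$@"
}

# Example:
#   show_sigill /usr/bin/grompp
```

The mnemonic printed by x/i should show which instruction-set extension the library expects.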
5) Does it happen if we build it ourselves? At least we would get line
information in the backtrace.
$ apt-get source gromacs-openmpi
$ sudo apt-get build-dep gromacs-openmpi
$ cd gromacs-4.6.3/
$ cmake .
$ make
$ find -name grompp
$ ./src/kernel/grompp --------- It works (prints the help). No crash.
6) OK, let's build a Debian package
$ apt-get source gromacs-openmpi
$ sudo apt-get build-dep gromacs-openmpi
$ cd gromacs-4.6.3/
$ dpkg-buildpackage
$ cd ..
$ dpkg -i ../gromacs_4.6.3-4_amd64.deb
$ grompp ----------- It crashes the same way as the original package.
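Steps 5 and 6 build the same source and differ only in how it was configured, so comparing the CMake settings of the two build trees may narrow things down. Below is a rough sketch for reading one setting out of a CMakeCache.txt. cmake_cache_value is a name invented here, and GMX_CPU_ACCELERATION is, as far as I know, the GROMACS 4.6 option that selects the SIMD kernels. Treat that name as an assumption and check it against the cache file.

```sh
# cmake_cache_value VAR [FILE]: print the value stored for VAR in a CMake
# cache file (lines look like VAR:TYPE=value). Hypothetical helper.
cmake_cache_value() {
    sed -n "s/^$1:[A-Z]*=//p" "${2:-CMakeCache.txt}"
}

# Example, run once in each build tree:
#   cmake_cache_value GMX_CPU_ACCELERATION
# The options the package build passes to cmake can be looked up with:
#   grep -n 'GMX_' debian/rules
```

If the two trees report different values, that would be the first thing to look at.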
7) Now I am installing it on an i5
It works on my i5. It looks like the problem occurs only on the i7. I have
tested on the two machines of the cluster. These are Xeons, and they
have the problem. Here is an excerpt from /proc/cpuinfo:
processor : 23
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU X5660 @ 2.80GHz
stepping : 2
microcode : 0x15
cpu MHz : 1600.000
cache size : 12288 KB
physical id : 1
siblings : 12
core id : 10
cpu cores : 6
apicid : 53
initial apicid : 53
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good
nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm ida
arat epb dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 5600.18
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
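To compare the working and the crashing machines quickly, the flags line can be checked for individual instruction-set extensions. The excerpt above lists sse4_1 and sse4_2 but no avx. Whether that difference is what triggers the SIGILL is only a guess at this point. has_cpu_flag is a helper name invented for this sketch.

```sh
# has_cpu_flag FLAG [FILE]: succeed if FLAG is listed on the first "flags"
# line of FILE (default /proc/cpuinfo). Hypothetical helper.
has_cpu_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" | cut -d: -f2 \
        | tr -s ' \t' '\n' | grep -qx -- "$1"
}

# Report a few SIMD-related flags on the current machine.
if [ -r /proc/cpuinfo ]; then
    for f in sse2 sse4_1 sse4_2 avx; do
        if has_cpu_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
    done
fi
```

Running the same loop on the i5 and on one of the Xeons would show which flags the two machines do not share.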
Thanks
--
__________________________________
Vassilis Virvilis Ph.D.
Head of IT
Biovista Inc.