[Debichem-devel] Bug#683467: aces3: divide-by-zero error in many test cases when running on one core only
Michael Banck
mbanck at debian.org
Wed Aug 1 00:18:58 UTC 2012
package: aces3
severity: important
tags: upstream
version: 3.0.6-1
> I now ran the full testsuite, and unfortunately I get a floating point
> exception in tran_rhf_ao_sv1.sio:
>
> | An instruction timer report will be printed
> |
> | Gather on company_rank succeeded.
> | Static pre-defined array # 2 is first used on line
> |328
> |
> |Program received signal SIGFPE: Floating-point exception - erroneous
> |arithmetic operation.
> |
> |Backtrace for this error:
> |#0 0x2AEA062F8667
> |#1 0x2AEA062F8C34
> |#2 0x2AEA06F104EF
> |#3 0x447C1C in vtdemo_init_ at vtdemo_init.F:814
That line reads:

      iproc_company_rank = mod(next_server-1,niocompany)
> #0 vtdemo_init (optable=..., noptable=245, array_table=..., narray_table=<error reading variable: Cannot access memory at address 0xc9>, index_table=..., nindex_table=32,
> segment_table=..., nsegment_table=193, scalar_table=..., nscalar_table=13, block_map_table=..., nblock_map_table=27364, proctab=..., address_table=..., blocksize=1185921,
> end_nfps=..., nshells=16, scf_energy=438.55129855222071, totenerg=0, damp_init=0.19999998807907104, cc_conv=9.9999999999999995e-08, scf_conv=9.9999999999999995e-07,
> stabvalue=0, excite=0, eom_tol=0, eom_roots=0, io_company_id=2, niocompany=0, need_predef=..., npre_defined=19, dryrun=.TRUE.) at vtdemo_init.F:814
and niocompany is indeed zero.
Apparently this code is relevant, from , line 107:
      niocompany = 0
      do i = 1, nprocs
         if (pst_get_company(i-1) .eq. io_company_id)
     *      niocompany = niocompany + 1
      enddo
We did not add a machine-specific section to tests/Makefile, so the
tests run with the default MPIRUN of "mpirun ./xaces3 >./job.out", which
results in only one core being used:
> nprocs = 1
I can't remember whether the above do-loop runs at all when nprocs is
1, and I am not sure what pst_get_company() is supposed to return, but
niocompany clearly must not stay zero, or else the mod() call above
will divide by zero.
Indeed, if I run the testsuite manually with mpirun -np 2, I no longer
get a floating point exception.
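
The suspected failure mode can be sketched in a few lines of free-form
Fortran. This is not taken from the aces3 sources: fake_get_company() is
a hypothetical stand-in for pst_get_company(), and it simply assumes
that the single rank does not belong to the IO company. The values of
io_company_id, nprocs and niocompany match the backtrace above;
next_server is an arbitrary assumed value. Note that a do-loop from 1 to
nprocs does execute once when nprocs is 1, so under this assumption the
count stays zero because the condition never matches, not because the
loop is skipped. The guard around mod() is only one possible way to
avoid the division by zero, not a proposed upstream fix.

```fortran
program niocompany_sketch
  implicit none
  ! Values taken from the backtrace in the report.
  integer, parameter :: io_company_id = 2
  integer :: nprocs, niocompany, i
  ! next_server is an assumed value, used only for illustration.
  integer :: next_server, iproc_company_rank

  nprocs = 1
  next_server = 1

  ! Same counting loop as in the report. With nprocs = 1 the body
  ! runs exactly once (i = 1).
  niocompany = 0
  do i = 1, nprocs
     if (fake_get_company(i-1) .eq. io_company_id) niocompany = niocompany + 1
  enddo

  ! Guard the modulo: mod(x, 0) is what raises SIGFPE in the report.
  if (niocompany .gt. 0) then
     iproc_company_rank = mod(next_server-1, niocompany)
     print *, 'iproc_company_rank =', iproc_company_rank
  else
     print *, 'no ranks in the IO company; skipping mod()'
  endif

contains

  ! Hypothetical stand-in for pst_get_company(). Assumption: every
  ! rank belongs to company 1, i.e. none belongs to the IO company.
  integer function fake_get_company(rank)
    integer, intent(in) :: rank
    fake_get_company = 1
  end function fake_get_company

end program niocompany_sketch
```

With these assumed values the program takes the guarded branch instead
of evaluating mod() with a zero divisor.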