Bug#953116: [petsc-maint] 32 bit vs 64 bit PETSc

Satish Balay balay at mcs.anl.gov
Sat May 23 16:45:05 BST 2020


On Sat, 23 May 2020, Drew Parsons wrote:

> On 2020-05-23 14:18, Jed Brown wrote:
> > Drew Parsons <dparsons at debian.org> writes:
> > 
> >> Hi, the Debian project is discussing whether we should start providing a
> >> 64 bit build of PETSc (which means we'd have to upgrade our entire
> >> computational library stack, starting from BLAS and going through MPI,
> >> MUMPS, etc).
> > 
> > You don't need to change BLAS or MPI.
> 
> I see, the PETSc API allows for PetscBLASInt and PetscMPIInt distinct from
> PetscInt. That gives us more flexibility. (In any case, the Debian BLAS
> maintainer is already providing blas64 packages. We've started discussions
> about MPI).
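A minimal sketch of how that separation is used in practice (the helper name is made up for illustration; PetscBLASIntCast and PetscMPIIntCast are the PETSc routines that convert a PetscInt down to the BLAS/MPI integer width and raise an error if the value does not fit in 32 bits):

#include <petscsys.h>

PetscErrorCode pass_sizes_to_blas_and_mpi(PetscInt n)
{
  PetscErrorCode ierr;
  PetscBLASInt   n_blas;   /* integer handed to BLAS/LAPACK calls */
  PetscMPIInt    n_mpi;    /* integer handed to MPI calls */

  PetscFunctionBegin;
  ierr = PetscBLASIntCast(n,&n_blas);CHKERRQ(ierr);  /* errors if n > 2^31-1 */
  ierr = PetscMPIIntCast(n,&n_mpi);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}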
> 
> But what about MUMPS? Would MUMPS need to be built with 64 bit support to work
> with 64-bit PETSc?
> (the MUMPS docs indicate that its 64 bit support needs 64-bit versions of
> BLAS, SCOTCH, METIS and MPI).
> 
> 
> 
> >> A default PETSc build uses 32 bit addressing to index vectors and
> >> matrices.  64 bit addressing can be switched on by configuring with
> >> --with-64-bit-indices=1, allowing much larger systems to be handled.
> >> 
> >> My question for petsc-maint is, is there a reason why 64 bit indexing is
> >> not already activated by default on 64-bit systems?  Certainly C
> >> pointers and type int would already be 64 bit on these systems.
> > 
> > Umm, x86-64 Linux is LP64, so int is 32-bit.  ILP64 is relatively exotic
> > these days.
> 
> 
> oh ok. I had assumed int was 64 bit on x86-64. Thanks for the correction.
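For reference, a tiny check of the LP64 model (nothing PETSc-specific; on x86-64 Linux this prints 4, 8 and 8, i.e. 32-bit int with 64-bit long and pointers):

#include <stdio.h>

int main(void)
{
  /* LP64: int stays 32-bit while long and pointers are 64-bit */
  printf("sizeof(int)=%zu sizeof(long)=%zu sizeof(void*)=%zu\n",
         sizeof(int), sizeof(long), sizeof(void *));
  return 0;
}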
> 
> 
> >> Is it a question of performance?  Is 32 bit indexing executed faster (in
> >> the sense of 2 operations per clock cycle), such that 64-bit addressing
> >> is accompanied with a drop in performance?
> > 
> > Sparse iterative solvers are entirely limited by memory bandwidth;
> > sizeof(double) + sizeof(int64_t) = 16 incurs a performance hit relative
> > to 12 for int32_t.  It has nothing to do with clock cycles for
> > instructions, just memory bandwidth (and usage, but that is less often
> > an issue).
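To make that arithmetic concrete, a minimal sketch assuming the usual CSR/AIJ layout of one double value plus one column index per stored nonzero (it ignores row offsets and vector traffic):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
  /* bytes streamed per stored nonzero in a CSR matrix-vector product:
     one matrix value plus one column index */
  size_t per_nz_32 = sizeof(double) + sizeof(int32_t); /* 8 + 4 = 12 */
  size_t per_nz_64 = sizeof(double) + sizeof(int64_t); /* 8 + 8 = 16 */
  printf("32-bit indices: %zu bytes/nonzero\n64-bit indices: %zu bytes/nonzero\n",
         per_nz_32, per_nz_64);
  printf("bandwidth overhead: %.0f%%\n",
         100.0 * (per_nz_64 - per_nz_32) / per_nz_32);
  return 0;
}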
> > 
> >> In that case we'd only want to use 64-bit PETSc if the system being
> >> modelled is large enough to actually need it. Or is there a different
> >> reason that 64 bit indexing is not switched on by default?
> > 
> > It's just about performance, as above.
> 
> 
> Thanks Jed.  That's good justification for us to keep our current 32-bit build
> then, and provide a separate 64-bit build alongside it.

One more issue: most external packages don't support 64-bit indices.

Note: OpenBLAS supports 64-bit indices, and MKL provides ILP64 builds of its libraries.

[MPICH/OpenMPI are, as far as I know, LP64.]


The primary reason PETSc defaults to 32-bit indices is that this is the compiler default on LP64 systems.

If Debian were building an ILP64 system [with compilers defaulting to 64-bit integers], then all packages would be ILP64 [obviously most packages are not tested in this mode, so they might break].

--with-64-bit-indices is the option for enabling 64-bit indices on LP64 systems. Most external packages don't support this mode, hence the use of PetscBLASInt, PetscMPIInt, etc.
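A quick sanity check of what a particular PETSc build actually uses (a minimal sketch; the exact sizes depend on the configure options, e.g. 8/4/4 with --with-64-bit-indices against 32-bit BLAS and MPI):

#include <petscsys.h>

int main(int argc,char **argv)
{
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* PetscInt follows --with-64-bit-indices; the BLAS/MPI integers stay 32-bit
     unless those libraries were themselves built with 64-bit integers */
  ierr = PetscPrintf(PETSC_COMM_WORLD,"PetscInt=%d PetscBLASInt=%d PetscMPIInt=%d bytes\n",
                     (int)sizeof(PetscInt),(int)sizeof(PetscBLASInt),(int)sizeof(PetscMPIInt));CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}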

Satish

> 
> 
> > There are two situations in
> > which 64-bit is needed.  Historically (supercomputing with thinner
> > nodes), it has been that you're solving problems with more than 2B dofs.
> > In today's age of fat nodes, it also happens that a matrix on a single
> > MPI rank has more than 2B nonzeros.  This is especially common when
> > using direct solvers.  We'd like to address the latter case by only
> > promoting the row offsets (thereby avoiding the memory hit of promoting
> > column indices):
> > 
> > https://gitlab.com/petsc/petsc/-/issues/333
> 
> An interesting extra challenge.
> 
> 
> > I wonder if you are aware of any static analysis tools that can
> > flag implicit conversions of this sort:
> > 
> > int64_t n = ...;
> > for (int32_t i=0; i<n; i++) {
> >   ...
> > }
> > 
> > There is -fsanitize=signed-integer-overflow (which generates a runtime
> > error message), but that requires data to cause overflow at every
> > possible location.
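A minimal reproduction of the loop pattern in question (illustrative only; compiling with gcc -fsanitize=signed-integer-overflow reports the i++ overflow at run time, precisely because this input happens to trigger it):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
  int64_t n = 3000000000LL;           /* exceeds INT32_MAX */
  int64_t visited = 0;
  for (int32_t i = 0; i < n; i++) {   /* i overflows before ever reaching n */
    visited++;
  }
  printf("visited %lld of %lld\n", (long long)visited, (long long)n);
  return 0;
}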
> 
> I'll ask the Debian gcc team and the Science team if they have ideas about
> this.
> 
> Drew
> 


