Dear NEMO community,
We have a rather large AGRIF configuration that runs when compiled with the Intel compilers but fails with the GNU compilers, which we would like to use because of a series of issues with Intel (e.g. this ticket and more).
A test configuration with a small nest does run when compiled with GNU.
In the following I describe what we have tested so far on two different HPC systems with NEMO 5 (the tagged version 5.0 and the recent state of branch 5.0).
In contrast to a build without AGRIF, when compiling with AGRIF we need
%LDFLAGS -Wl,--allow-multiple-definition
and
%CFLAGS -O0 -fcommon
to get the code compiled at all.
At runtime, we then get
Fortran runtime error: Integer overflow when calculating the amount of memory to allocate
Error termination. Backtrace:
In file BLD/ppsrc/agrif/modarrays.f90, around line 251: Error allocating 9306821842815638016 bytes: Cannot allocate memory
which is case 3 in
!---------------------------------------------------------------------------------------------------
subroutine Agrif_array_allocate ( variable, lb, ub, nbdim )
!---------------------------------------------------------------------------------------------------
type(Agrif_Variable), intent(inout) :: variable !< Variable struct for allocation
integer, dimension(nbdim), intent(in) :: lb !< Lower bound
integer, dimension(nbdim), intent(in) :: ub !< Upper bound
integer, intent(in) :: nbdim !< Dimension of the array
!
select case (nbdim)
case (1) ; allocate(variable%array1(lb(1):ub(1)))
case (2) ; allocate(variable%array2(lb(1):ub(1),lb(2):ub(2)))
case (3) ; allocate(variable%array3(lb(1):ub(1),lb(2):ub(2),lb(3):ub(3)))
case (4) ; allocate(variable%array4(lb(1):ub(1),lb(2):ub(2),lb(3):ub(3),lb(4):ub(4)))
case (5) ; allocate(variable%array5(lb(1):ub(1),lb(2):ub(2),lb(3):ub(3),lb(4):ub(4),lb(5):ub(5)))
case (6) ; allocate(variable%array6(lb(1):ub(1),lb(2):ub(2),lb(3):ub(3),lb(4):ub(4),lb(5):ub(5),lb(6):ub(6)))
end select
!---------------------------------------------------------------------------------------------------
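One way to narrow down where the bogus size comes from would be to check the bounds before the allocate statement is reached. The following is a minimal sketch of such a check, not AGRIF code: the function `bounds_plausible` and the limit `max_elem` are hypothetical names introduced here for illustration, and the limit of 10**9 elements is an arbitrary assumption.

```fortran
! Sketch only: bounds_plausible and max_elem are hypothetical, not part of AGRIF.
program check_bounds_demo
   use, intrinsic :: iso_fortran_env, only: int64
   implicit none
   ! Assumed sanity limit: 10**9 elements (8 GB for 8-byte reals).
   integer(int64), parameter :: max_elem = 10_int64**9
   integer :: lb(3), ub(3)

   ! Plausible bounds for a 3-D array.
   lb = [1, 1, 1]
   ub = [100, 50, 75]
   print *, 'lb =', lb, ' ub =', ub, ' plausible =', bounds_plausible(lb, ub, 3)

   ! Garbage upper bound, as an uninitialized integer might produce.
   ub(2) = huge(0)
   print *, 'lb =', lb, ' ub =', ub, ' plausible =', bounds_plausible(lb, ub, 3)

contains

   logical function bounds_plausible(lb, ub, nbdim)
      integer, intent(in) :: nbdim
      integer, intent(in) :: lb(nbdim), ub(nbdim)
      integer(int64) :: n_elem, extent
      integer :: i

      bounds_plausible = .true.
      n_elem = 1_int64
      do i = 1, nbdim
         ! 64-bit arithmetic so that the extent itself cannot overflow.
         extent = int(ub(i), int64) - int(lb(i), int64) + 1_int64
         ! Zero-size array: nothing to allocate, so nothing to flag.
         if (extent <= 0_int64) return
         ! Compare by division to avoid overflowing the running product.
         if (extent > max_elem / n_elem) then
            bounds_plausible = .false.
            return
         end if
         n_elem = n_elem * extent
      end do
   end function bounds_plausible

end program check_bounds_demo
```

Called with the actual `lb` and `ub` just before the `select case`, a check of this kind would show which dimension carries the implausible bound, which in turn points to the caller that set it.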
The requested allocation size is unreasonably large; it even exceeds the largest signed 64-bit integer. If I add -finit-local-zero to FCFLAGS (I also tried the more explicit zero-initialization flags -finit-integer=0 -finit-real=zero), the size in the error message drops to
Error allocating 2078764171264 bytes: Cannot allocate memory
I therefore assume that there are uninitialized variables in the AGRIF routines, which I think need to be fixed.
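To put the reduced figure into perspective, the byte count can be converted into an element count and a size in GiB. This is a small sketch under the assumption that `array3` holds 8-byte reals (as -fdefault-real-8 would imply); the program and variable names are made up for this example.

```fortran
program byte_count_demo
   use, intrinsic :: iso_fortran_env, only: int64
   implicit none
   ! Size reported in the second error message.
   integer(int64), parameter :: nbytes = 2078764171264_int64
   ! Assumption: 8-byte reals, as set by -fdefault-real-8.
   integer(int64), parameter :: bytes_per_real = 8_int64
   integer(int64) :: n_elem

   n_elem = nbytes / bytes_per_real
   print *, 'elements         :', n_elem                     ! 259845521408
   print *, 'size in GiB      :', nbytes / 2_int64**30       ! 1936
   print *, 'elements / 2**31 :', n_elem / 2_int64**31       ! 121
   print *, 'remainder        :', mod(n_elem, 2_int64**31)   ! 0
end program byte_count_demo
```

So even with zero-initialization the request is about 1.9 TiB. The element count is an exact multiple of 2**31, which might hint that one of the extents is of the order of the 32-bit integer range, but that remains speculation until the actual bounds are printed.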
As the requested size is (still) beyond the 32-bit integer range, my next attempt was to build a version with 8-byte default integers.
I tried to achieve this by compiling with -m64, -fdefault-integer-8 and -finteger-4-integer-8 (individually and together), which leads to argument mismatches in calls to some NF90 routines (these do not disappear with -fallow-argument-mismatch):
BLD/ppsrc/ioipsl/nc4interface.f90:72:13:
72 | iret = NF90_DEF_VAR_DEFLATE(nfid, nvid, ishuffle, ideflate, ideflate_level)
| 1
Error: Type mismatch in argument 'ncid' at (1); passed INTEGER(8) to INTEGER(4)
BLD/ppsrc/ioipsl/nc4interface.f90:53:13:
53 | iret = NF90_DEF_VAR_CHUNKING(nfid, nvid, ichunkalg, ichunksz)
| 1
Error: Type mismatch in argument 'ncid' at (1); passed INTEGER(8) to INTEGER(4)
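The mismatch can be reproduced in isolation. The sketch below uses a hypothetical stand-in routine, `lib_def_var`, whose interface fixes 4-byte integers, as the messages above indicate for the NF90 routines; it is not the netCDF API. The caller holds 8-byte integers, mimicking what promoted default integers would look like, and converts them explicitly at the call. This only illustrates the mechanism and is not meant as a proposed patch for IOIPSL.

```fortran
module lib4_demo
   use, intrinsic :: iso_fortran_env, only: int32
   implicit none
contains
   ! Hypothetical stand-in for a library routine with a 4-byte integer interface.
   integer(int32) function lib_def_var(ncid, varid)
      integer(int32), intent(in) :: ncid, varid
      lib_def_var = ncid + varid
   end function lib_def_var
end module lib4_demo

program kind_mismatch_demo
   use, intrinsic :: iso_fortran_env, only: int32, int64
   use lib4_demo, only: lib_def_var
   implicit none
   ! 8-byte integers, mimicking promoted default integers.
   integer(int64) :: nfid, nvid
   integer(int32) :: iret

   nfid = 3_int64
   nvid = 7_int64
   ! iret = lib_def_var(nfid, nvid)   ! rejected: INTEGER(8) passed to INTEGER(4)
   ! Explicit conversion to the kind the interface expects.
   iret = lib_def_var(int(nfid, int32), int(nvid, int32))
   print *, 'iret =', iret
end program kind_mismatch_demo
```

Because the routine is reached through a module and therefore has an explicit interface, the compiler can check every argument kind, which may be why -fallow-argument-mismatch has no effect here.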
I stopped here and would like to ask for input/suggestions on how to solve this on the NEMO/AGRIF side.
For reference, here’s the arch-file (tested with different optimization levels, all resulting in the same problem):
%NCDF_INC -I%NCDF_HOME/include -I%HDF5_HOME/include
%NCDF_LIB -L%HDF5_HOME/lib -L%NCDF_HOME/lib -L%NCDF_C_HOME/lib -lnetcdf -lnetcdff -lhdf5 -lz
%XIOS_INC -I%XIOS_HOME/inc
%XIOS_LIB -L%XIOS_HOME/lib -lxios -lstdc++
%CPP cpp -P -Dkey_nosignedzero
%FC mpif90
%FCFLAGS -O3 -fdefault-real-8 -ffree-line-length-none -fallow-argument-mismatch -finit-local-zero -finit-integer=0 -finit-real=zero
%FFLAGS %FCFLAGS
%LD mpif90
%FPPFLAGS -P -traditional
%LDFLAGS -Wl,--allow-multiple-definition
%AR ar
%ARFLAGS -r
%MK gmake
%USER_INC %XIOS_INC %NCDF_INC
%USER_LIB %XIOS_LIB %NCDF_LIB
%CC mpicc
%CFLAGS -O0 -fcommon