Hi all,
I am using NEMO v4.0.4 and I came across an issue when trying to output the variable mldr10_3 (mixed layer depth diagnosed from a density difference of 0.03 kg.m-3 relative to the 10 m depth level). The model output is polluted by wildly unrealistic values:
When the value range is bounded, the field does not look that bad, though. The unrealistic values are mostly (if not all) located over land or over shallow ocean regions; see the dark red points on this figure:
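For reference, the kind of bounding I mean can be sketched as below. This is only an illustration in plain Python: the function name `flag_unrealistic` and the bounds of 0 m and 6000 m are assumptions chosen for the example, not values taken from NEMO.

```python
import math

def flag_unrealistic(mld, lower=0.0, upper=6000.0):
    """Return (row, column) indices of values outside [lower, upper] or not finite.

    `mld` is a 2D field given as a list of rows; the default bounds are
    hypothetical and should be adapted to the configuration.
    """
    bad = []
    for j, row in enumerate(mld):
        for i, value in enumerate(row):
            if not math.isfinite(value) or value < lower or value > upper:
                bad.append((j, i))
    return bad

# Tiny made-up field: three plausible depths and three suspicious values.
field = [
    [10.0, 55.0, 1.0e20],
    [float("nan"), 120.0, -5.0],
]
suspicious = flag_unrealistic(field)
```

Comparing the flagged positions with the land-sea mask would show whether the bad points really are confined to land and shallow regions.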
We found that changing the number of processors used to run NEMO solves the problem. For the figure I posted above I was using 720 processors, but using only 480 processors fixes the issue.
What is puzzling is that 720 processors is supposedly the best MPI decomposition. I have this in ocean.output:
==> you use the best mpi decomposition
Number of mpi processes: 720
Number of ocean subdomains = 720
Number of suppressed land subdomains = 200
Whereas with 480 processors it seems that I am wasting resources (from ocean.output):
===>>> : W A R N I N G
=============
The number of mpi processes: 480
exceeds the maximum number of ocean subdomains = 475
we suppressed 100 land subdomains
BUT we had to keep 5 land subdomains that are useless...
--- YOU ARE WASTING CPU... ---
I believe this indicates that 720 processors should be the best option…
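The numbers in the two excerpts can be compared with a little arithmetic. The sketch below assumes that the full decomposition grid is the sum of the MPI processes and the suppressed land subdomains, which is my reading of the messages rather than something ocean.output states explicitly; the helper name `decomposition_summary` is made up for the example.

```python
def decomposition_summary(n_mpi, n_ocean, n_land_suppressed):
    """Summarise a decomposition from the counts printed in ocean.output.

    Assumption: every MPI process holds one subdomain, so processes beyond
    the number of ocean subdomains hold land-only ("useless") subdomains.
    """
    kept_land = max(n_mpi - n_ocean, 0)
    total_subdomains = n_mpi + n_land_suppressed
    wasted_fraction = kept_land / n_mpi
    return kept_land, total_subdomains, wasted_fraction

# Counts copied from the two ocean.output excerpts above.
summary_720 = decomposition_summary(720, 720, 200)
summary_480 = decomposition_summary(480, 475, 100)
```

Under that assumption the 480-process run keeps 5 land-only subdomains, about 1% of the processes, while the 720-process run keeps none.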
However, when looking at the processor decomposition in the 720-processor case, we can see that for processor 0 the neighbouring processors to the west, north and south are not “allocated”:
resulting internal parameters :
nproc = 0
nowe = -1 noea = 1
nono = -1 noso = -1
This is not the case when using 480 processors:
resulting internal parameters :
nproc = 0
nowe = 478 noea = 1
nono = 7 noso = -1
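To compare the two decompositions more systematically, the neighbour parameters can be pulled out of the log text. The sketch below assumes the log follows the `name = value` layout of the excerpts above and that -1 marks a neighbour that is not allocated; the function names are made up for the example.

```python
import re

NEIGHBOUR_KEYS = ("nowe", "noea", "nono", "noso")

def parse_parameters(text):
    """Collect every 'name = integer' pair found in a log excerpt."""
    pairs = re.findall(r"(\w+)\s*=\s*(-?\d+)", text)
    return {name: int(value) for name, value in pairs}

def missing_neighbours(params):
    """List the neighbour keys whose value is -1 (assumed to mean 'no neighbour')."""
    return [key for key in NEIGHBOUR_KEYS if params.get(key) == -1]

log_720 = """resulting internal parameters :
nproc = 0
nowe = -1 noea = 1
nono = -1 noso = -1
"""

log_480 = """resulting internal parameters :
nproc = 0
nowe = 478 noea = 1
nono = 7 noso = -1
"""

missing_720 = missing_neighbours(parse_parameters(log_720))
missing_480 = missing_neighbours(parse_parameters(log_480))
```

For processor 0 this gives three missing neighbours (west, north, south) with 720 processors and only the southern one with 480.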
We wonder if this could explain the problem we have with the mldr10_3 variable.
However, if that were the case, we would expect to see this problem in other variables too but, so far, mldr10_3 is the only one we have found with it. I also tried to output the variable mldr10_1, which is computed similarly to mldr10_3 (cf. routine diahth.F90), and for that one I have no problem.
Cheers,
Yohan