I run ORCA2-ICE simulations and often see artefacts in output variables along the grid cells where north-folding occurs. They sometimes take several years to appear, depending on the forcing and namelist settings. An example:
This is sea ice concentration, August year 0055, in a configuration of ORCA2-ICE (PISCES removed during build), main branch (#0bb9add6), with the standard input climatological forcing from SETTE v5.0. For the record, the following cfg namelist parameters are changed in this run but I don’t think these matter to the root problem:
Form drag activated (ln_Cx_ice_cst = .false.; ln_Cx_ice_frm = .true.)
As far as I can tell, the artefact is not:
A plotting issue (it can also be seen with, e.g., ncview)
Related to above custom namelist changes, since it can also be seen in a run without any of them activated (though it is less obvious; see below, top-left)
To do with input forcing (in this post I just show it occurring with SETTE forcing to keep the examples as reproducible as possible, but I have also seen this with other custom forcings, e.g., JRA-55-do)
In main (#0bb9add6), it seems to disappear with ln_nnogather = .false.
To emphasise that this occurs in the reference configuration (i.e., a minimal reproducible example), here is a comparison of (left) main vs. (right) 5.0.1, with and without ln_nnogather (top vs. bottom), in ORCA2-ICE, with no other namelist changes. Each panel is June year 45, and I have zoomed in on the colour scale so that it is clear. Note how ln_nnogather makes no difference in 5.0.1 (as expected, if I understand it correctly):
I think this suggests the newer north-folding code (or whatever is activated by ln_nnogather) is somehow not working correctly in main, but it’s beyond me to find out any more than this…
Yes, that does seem to resolve it. Repeat of the second figure above: June year 0045, with only the top-left panel changed after setting llfull = .TRUE. in mpp_nfd_generic.h90. The artefact also does not appear for at least a few more years (I ran up to year 0066 with no sign of it). The left panels now look the same:
Thank you for these tests.
We have tests within our continuous integration process which are supposed to detect this kind of error…
I will do additional tests on my side, but could you compare the restart files of two experiments which differ only by the definition of llfull = .TRUE. or .FALSE.?
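For reference, a minimal sketch of such a comparison, assuming the same field (e.g., one category of a_i) has already been read from each restart file into a list of rows. The function name compare_fields is hypothetical, and reading the NetCDF files is deliberately left out:

```python
def compare_fields(a, b):
    """Compare two 2D fields (lists of rows) and summarise where they differ.

    Hypothetical helper: 'a' and 'b' stand for the same variable taken from
    two restart files, e.g. runs with llfull = .FALSE. and llfull = .TRUE.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("fields must have the same shape")
    n_diff = 0
    max_abs_diff = 0.0
    rows = set()
    for j, (ra, rb) in enumerate(zip(a, b)):
        for va, vb in zip(ra, rb):
            d = abs(va - vb)
            if d > 0.0:  # exact comparison: the runs are expected to be identical
                n_diff += 1
                rows.add(j)
                max_abs_diff = max(max_abs_diff, d)
    return {
        "identical": n_diff == 0,
        "n_diff": n_diff,
        "max_abs_diff": max_abs_diff,
        "rows": sorted(rows),  # row (j) indices containing at least one difference
    }
```

The list of differing rows is the useful part here: differences confined to the last few rows would point at the north fold, whereas differences spread over the domain would not.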
The restarts for the above tests are written at the end of December, so the artefact is not really visible (it might not actually be present) in either case, although the variables do differ between the two restart files (I only looked at a_i). So I ran these simulations again, ending after 30 June year 0046 so that we get restarts in summer. Below, the artefact can be seen in the default run with llfull = .FALSE. (left) but not in the run where that alone is changed to .TRUE. (right):
Here I calculated at_i from a_i in the restart files, but it can also be seen in the individual thickness categories (most obviously in the smallest category).
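The at_i calculation is just a sum over the thickness categories. A minimal sketch, assuming a_i has been read into nested lists indexed as a_i[category][j][i] (the dimension order and the function name total_concentration are assumptions, not taken from the restart file itself):

```python
def total_concentration(a_i):
    """Return at_i[j][i] = sum over categories l of a_i[l][j][i].

    Assumes a_i is a list of 2D fields, one per ice thickness category,
    all with the same shape.
    """
    ncat = len(a_i)
    nj = len(a_i[0])
    ni = len(a_i[0][0])
    return [
        [sum(a_i[l][j][i] for l in range(ncat)) for i in range(ni)]
        for j in range(nj)
    ]
```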
Also, just to confirm: changing llfull removes the artefact in our custom configuration as well (various namelist changes; first figure in the original post).
By the way, I note llfull = .FALSE. in the 5.0.1 version of mpp_nfd_generic.h90 as well (the whole file is identical), even though the problem doesn’t seem to arise there (I assume you’re aware of this but just noting anyway as I’m curious why it only seems to matter in main).
Thank you for these additional tests, which confirm the problem in the main.
Setting llfull = .TRUE. is only a way to correct (or hide) a problem appearing somewhere else…
If the code were working properly, setting llfull to .TRUE. or .FALSE. should give you exactly the same results.
We will check why this is no longer the case in main and why we were not able to detect it in our sanity tests on GitLab.
We are not able to reproduce the problem (same revision, same ORCA2-LIM3 config). Assuming that you didn’t modify the code itself, we guess that you are using slightly different namelists. Would you like to share your namelist_cfg and namelist_ice_cfg with us?
In the runs I’ve shown in this thread we have not included any modified code (except the mpp_nfd_generic.h90 as discussed).
In the top-left panel of my earlier figure we also have no changes to namelist_cfg, except for two things I forgot to mention: ln_icebergs is set to .false., and some cn_dir are changed to point to the data on our system. I attach the namelist_cfg below anyway, as it is entirely possible I’ve missed or forgotten something (namelist_ice_cfg is not changed except for the restart directory paths, so I haven’t attached it here).
I link to our *.fcm files below. We compile with default/prod flags. I thought maybe optimisation could be the culprit, but -O3 is also set in the template arch-linux_gfortran.fcm which is what ours is based on.
Also, if helpful we have GCC 8.5.0 20210514 (Red Hat 8.5.0-28), OpenMPI 4.1.1, netcdf 4.7.0.
Is it also worth trying, at your end, the namelist changes from the very first figure/post in this thread, where the artefact is more obvious? (ln_Cx_ice_frm = .true.; nlay_i = 4; nlay_s = 1; ln_leadhfx = .false.; ln_pnd_TOPO = .true.).
A new file should appear in your working directory (fort.71), identifying the places and variables where the North Pole folding update may cause problems. Could you send me a fort.71 file produced by (the last chunk of) your problematic simulation?
I just set up the original run with those source changes and the fort.71 output is filling up very fast (almost 1MB after a few seconds of the run) so I might not have the disk space to complete the whole simulation until the artefact appears in the outputs. So I’m just attaching what I have here in case this is enough to go on, which it looks like it might be, but I’ll let you interpret…
Thank you for the checking file. I realised that it takes some time for the artifact to become visible in this diagnostic (basically 299 time steps here). I modified the lbc_lnk routines to limit the output to the error messages (and stop the simulation there). However, the diagnostic reveals a North Pole mismatch in the model prognostic variables, which doesn’t really help us to find the precise origin of the problem. More work needed from my side…
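To illustrate what a north-fold mismatch means, here is a rough sketch of a symmetry check on a global 2D field. It is not the lbc_lnk diagnostic described above. The index mapping is an assumption: a T-point pivot with no extra halo rows, where in 1-based indices field(i, nj) should equal sign * field(ni - i + 2, nj - 2). The actual mapping depends on the NEMO version, the grid point type and the halo width, and the name fold_mismatches is hypothetical.

```python
def fold_mismatches(field, sign=1.0, tol=0.0):
    """List the points of the top row that break the assumed fold symmetry.

    'field' is a list of rows (field[j][i], 0-based), with the last row
    being the northernmost one. Returns tuples
    (i, mirrored_i, value, expected) using 0-based column indices.
    """
    nj = len(field)
    ni = len(field[0])
    top = field[nj - 1]  # row nj in 1-based indexing
    src = field[nj - 3]  # row nj - 2 in 1-based indexing
    bad = []
    for k in range(1, ni):  # 0-based k = i - 1, for i = 2..ni
        m = ni - k          # 0-based form of i -> ni - i + 2 (assumed mapping)
        expected = sign * src[m]
        if abs(top[k] - expected) > tol:
            bad.append((k, m, top[k], expected))
    return bad
```

An empty list means the top row is consistent with the assumed mapping. For vector components the sign argument would be -1.0 rather than 1.0.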