NEMO output in Parcels v4: Request for test datasets and questions about metadata

Hi NEMO community,

I am working on Parcels v4 alongside Prof. Erik van Sebille. We want to support a wide range of hydrodynamic models, and we are writing ingestion code that converts NEMO output into xarray datasets with CF- and SGRID-compliant metadata, which the rest of our code can then pick up and use.

As part of this development, I would like to know:

  • How well does NEMO output comply with the CF and SGRID conventions? Is compliance mainly up to the data provider running the NEMO model, or does the model output provide it by default? Does this behaviour differ between versions of NEMO?
  • How can I find a wide range of NEMO output to use for testing? I am mainly interested in the metadata of these datasets. From that metadata (e.g., ncdump output or xarray.Dataset.to_dict(data=False)) we can create lightweight example datasets and integrate them into our test suite to test ingestion.
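
A minimal sketch of the "lightweight example dataset" idea, working on the metadata dictionary alone. It assumes the dictionary has the layout that xarray.Dataset.to_dict(data=False) is understood to produce (top-level "dims", "coords" and "data_vars", with "dims" and "shape" per variable); shrink_schema and max_size are hypothetical names used only for illustration.

```python
import copy


def shrink_schema(schema, max_size=4):
    """Return a copy of a metadata-only dataset dict with every dimension capped.

    `schema` is ASSUMED to look like xarray.Dataset.to_dict(data=False) output:
    top-level "dims", "coords" and "data_vars", where each variable carries
    "dims" and "shape" entries. Attributes and dtypes are left untouched.
    """
    small = copy.deepcopy(schema)
    # Cap every dimension length so the rebuilt dataset stays tiny.
    small["dims"] = {
        name: min(size, max_size) for name, size in schema["dims"].items()
    }
    # Recompute each variable's shape from its (capped) dimensions.
    for section in ("coords", "data_vars"):
        for var in small.get(section, {}).values():
            var["shape"] = tuple(small["dims"][d] for d in var["dims"])
    return small
```

The shrunken dictionary keeps all names and attributes, so a test fixture could be rebuilt from it by filling each variable with dummy values of the recorded shape and dtype.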

Also keen to chat about how we can best support users of Parcels when it comes to NEMO.

Hi Nick

To my understanding, NEMO always stores output in one file per “grid”:
<experiment_name>____grid_T.nc - includes all variables at T points (grid-cell center). Example: temperature, salinity
<experiment_name>____grid_U.nc - includes variables at U points (east of cell center). Example: u-velocity, wind stress in x direction
<experiment_name>____grid_V.nc - includes variables at V points (north of cell center). Example: v-velocity, wind stress in y direction

There are also grid_W files (vertical velocity) and grid_F files (vorticity). Ice and biogeochemistry fields are usually stored in separate files even though most of those fields are on the T grid.
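
For ingestion code, the file-per-grid layout could be detected from the file name. This is only a sketch: it assumes the name ends in grid_<letter>.nc as in the examples above, covers only the five grids mentioned here, and grid_of and group_by_grid are hypothetical helper names.

```python
import re
from pathlib import Path

# Grid suffixes mentioned in this thread; other setups may use different ones.
GRID_PATTERN = re.compile(r"grid_([TUVWF])\.nc$")


def grid_of(filename):
    """Return the grid letter encoded in a NEMO-style file name, or None."""
    match = GRID_PATTERN.search(Path(filename).name)
    return match.group(1) if match else None


def group_by_grid(filenames):
    """Group file names by grid letter; unrecognised files are keyed by None."""
    groups = {}
    for name in filenames:
        groups.setdefault(grid_of(name), []).append(name)
    return groups
```

Files that do not match (for example separate ice or biogeochemistry output) end up under the None key, so they can be handled explicitly rather than silently dropped.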

A “U” file looks something like this:

dimensions:
	axis_nbounds = 2 ;
	x = 360 ;
	y = 331 ;
	depthu = 75 ;
	time_counter = UNLIMITED ; // (12 currently)
variables:
	float nav_lat(y, x) ;
		nav_lat:standard_name = "latitude" ;
		nav_lat:long_name = "Latitude" ;
		nav_lat:units = "degrees_north" ;
	float nav_lon(y, x) ;
		nav_lon:standard_name = "longitude" ;
		nav_lon:long_name = "Longitude" ;
		nav_lon:units = "degrees_east" ;
	float depthu(depthu) ;
		depthu:name = "depthu" ;
		depthu:long_name = "Vertical U levels" ;
		depthu:units = "m" ;
		depthu:positive = "down" ;
		depthu:bounds = "depthu_bounds" ;
	float depthu_bounds(depthu, axis_nbounds) ;
		depthu_bounds:units = "m" ;
	double time_centered(time_counter) ;
		time_centered:standard_name = "time" ;
		time_centered:long_name = "Time axis" ;
		time_centered:calendar = "gregorian" ;
		time_centered:units = "seconds since 1800-01-01 00:00:00" ;
		time_centered:time_origin = "1800-01-01 00:00:00" ;
		time_centered:bounds = "time_centered_bounds" ;
	double time_centered_bounds(time_counter, axis_nbounds) ;
	double time_counter(time_counter) ;
		time_counter:axis = "T" ;
		time_counter:standard_name = "time" ;
		time_counter:long_name = "Time axis" ;
		time_counter:calendar = "gregorian" ;
		time_counter:units = "seconds since 1800-01-01 00:00:00" ;
		time_counter:time_origin = "1800-01-01 00:00:00" ;
		time_counter:bounds = "time_counter_bounds" ;
	double time_counter_bounds(time_counter, axis_nbounds) ;
	float e3u(time_counter, depthu, y, x) ;
		e3u:standard_name = "cell_thickness" ;
		e3u:long_name = "U-cell thickness" ;
		e3u:units = "m" ;
		e3u:online_operation = "average" ;
		e3u:interval_operation = "2700 s" ;
		e3u:interval_write = "1 month" ;
		e3u:cell_methods = "time: mean (interval: 2700 s)" ;
		e3u:_FillValue = 1.e+20f ;
		e3u:missing_value = 1.e+20f ;
		e3u:coordinates = "time_centered nav_lat nav_lon" ;
	float uos(time_counter, y, x) ;
		uos:long_name = "ocean surface current along i-axis" ;
		uos:units = "m/s" ;
		uos:online_operation = "average" ;
		uos:interval_operation = "2700 s" ;
		uos:interval_write = "1 month" ;
		uos:cell_methods = "time: mean (interval: 2700 s)" ;
		uos:_FillValue = 1.e+20f ;
		uos:missing_value = 1.e+20f ;
		uos:coordinates = "time_centered nav_lat nav_lon" ;
	float uo(time_counter, depthu, y, x) ;
		uo:standard_name = "sea_water_x_velocity" ;
		uo:long_name = "ocean current along i-axis" ;
		uo:units = "m/s" ;
		uo:online_operation = "average" ;
		uo:interval_operation = "2700 s" ;
		uo:interval_write = "1 month" ;
		uo:cell_methods = "time: mean (interval: 2700 s)" ;
		uo:_FillValue = 1.e+20f ;
		uo:missing_value = 1.e+20f ;
		uo:coordinates = "time_centered nav_lat nav_lon" ;

In the case of EC-Earth, the lon and lat fields are named “nav_lon, nav_lat” in all files even though the coordinates they hold are actually different in the T, U, and V files. Some centres call them “nav_lon_grid_T, nav_lat_grid_T” etc., which is probably better.

The field names are not standard. Users can name them as they please. I call the u-velocity “uo”, but some use “uoce” and others stick to the old “vozocrtx”. The cell thickness “e3u”, which is time-varying, is stored by some people (like me), but not all. If you are desperate, you can use the time-invariant “e3u” from the mesh file “domain_cfg.nc” and hope your mass-flux calculations are not too far off.
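
Because the names vary, ingestion code probably needs some kind of alias lookup. A minimal sketch, using only the three u-velocity names mentioned above (the list is certainly not exhaustive, and find_variable is a hypothetical helper):

```python
# Names for the u-velocity mentioned in this thread; not an exhaustive list.
U_VELOCITY_ALIASES = ("uo", "uoce", "vozocrtx")


def find_variable(available, aliases):
    """Return the first alias present in `available`, or None if none match."""
    available = set(available)
    for name in aliases:
        if name in available:
            return name
    return None
```

For example, find_variable(["nav_lat", "nav_lon", "uoce"], U_VELOCITY_ALIASES) returns "uoce". Where a standard_name attribute is present (as for “uo” in the file above), matching on that would likely be more robust than matching on the variable name.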

If you would like to have some test files, I could share some eORCA1 output. One year of monthly means would be maybe 1 GB in total for the T, U, and V files. Too much to attach here, but if there is an FTP site or something, send me a link and I’ll upload.

Hope this helps
/Joakim