(0.96.0) Improve the `NetCDFOutputWriter` experience #4046

ali-ramadhan · 2025-01-16T00:33:51Z

This PR updates the NetCDFOutputWriter to:

Properly work on RectilinearGrid and LatitudeLongitudeGrid (with correct and useful output attributes).
Save grid metrics with useful attributes.
Save immersed boundary information.
Properly support flat grids.
Work cleanly with LagrangianParticles output.
Allow for flexible dimension naming to satisfy desired naming conventions.
Write grid reconstruction metadata into a NetCDF group to support FieldTimeSeries construction from NetCDF (full support to be added in a subsequent PR).
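
As a rough illustration of the flexible dimension naming in the list above, the sketch below maps a direction and a stagger location to a dimension name. It is written in Python rather than Julia, and none of the function names come from Oceananigans. The `longitude_f`-style names match the file listing later in this thread, while the `xF`-style alternative is only an assumed example of another convention a user might want.

```python
# Hypothetical sketch of a "dimension name generator".
# This is NOT the Oceananigans API; all names here are illustrative.

# Base names for a latitude-longitude grid, as seen in the file listing
# in this thread (longitude_c, latitude_f, z_c, ...).
LATLON_BASE_NAMES = {"x": "longitude", "y": "latitude", "z": "z"}

# One-letter suffix for each stagger location.
LOCATION_SUFFIXES = {"center": "c", "face": "f"}


def latlon_dimension_name(direction, location):
    """Return names such as 'longitude_f' or 'z_c'."""
    return f"{LATLON_BASE_NAMES[direction]}_{LOCATION_SUFFIXES[location]}"


def short_dimension_name(direction, location):
    """Assumed alternative convention: 'xF', 'zC', and so on."""
    return direction + LOCATION_SUFFIXES[location].upper()


def dimension_names(generator):
    """Apply a generator to every (direction, location) pair."""
    return [
        generator(direction, location)
        for direction in ("x", "y", "z")
        for location in ("center", "face")
    ]


if __name__ == "__main__":
    print(dimension_names(latlon_dimension_name))
    print(dimension_names(short_dimension_name))
```

Passing the generator as a plain function keeps the naming convention separate from the code that writes the file, which is the design idea the "flexible dimension naming" item suggests.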

Tests have been added for all these features. Thank you to @tomchor and @almacarolina for helpful discussions during this refactor!

There's still a little bit left to do, but if anyone has any thoughts or feedback and would like to leave a review I'd appreciate it! There are a lot of line additions but thankfully the changes are almost fully isolated to just two files.

TODO:

Output grid areas and volumes too? But maybe it would be nice to first have xareas, volumes, etc. via KernelFunctionOperation?
Further improve the docstring and documentation for NetCDFOutputWriter to highlight all features.
Use NetCDFOutputWriter + some fancier features in an example.
Figure out which tests are missing.
Watch out for failing tests/examples and fix them.

Some comments:

I'm hoping to merge this PR and immediately work on adding support for NetCDF-backed FieldTimeSeries. I was planning on working on it here (as the branch name suggests), but it will involve refactoring field_time_series.jl, which seems best suited for a separate PR.
Supporting output of the free surface height is a bit hacky. Since it's a regular Field but behaves like a ReducedField, we cannot dispatch on its type to handle it correctly. Will open an issue to discuss.
I'm tagging v0.96.0 since this PR significantly changes the dimension names used in NetCDF files produced by NetCDFOutputWriter.

Resolves #1334 (via the dimension name generator)
Resolves #2248
Resolves #2770 (Maybe? You can save u, v, w, T, S, and η in the same NetCDF file but it's a bit hacky, see above.)
Resolves #3997

This PR makes progress on issue #3935
This PR supersedes PR #2652

ali-ramadhan · 2025-01-16T00:57:18Z

Here's what a NetCDF file looks like now in NCDatasets.jl and xarray to give an idea of the changes introduced by this PR.

NCDatasets.jl:

julia> ds = NCDataset("test/test_immersed_grid_latlon_no_halos_GPU.nc")
Dataset: test/test_immersed_grid_latlon_no_halos_GPU.nc
Group: /

Dimensions
   time = 6
   z_c = 16
   z_f = 17
   latitude_c = 16
   latitude_f = 17
   longitude_f = 17
   longitude_c = 16

Variables
  time   (6)
    Datatype:    Float64 (Float64)
    Dimensions:  time
    Attributes:
     units                = seconds
     long_name            = Time

  z_c   (16)
    Datatype:    Float32 (Float32)
    Dimensions:  z_c
    Attributes:
     units                = m
     long_name            = Locations of the cell centers in the z-direction.

  z_f   (17)
    Datatype:    Float32 (Float32)
    Dimensions:  z_f
    Attributes:
     units                = m
     long_name            = Locations of the cell faces in the z-direction.

  latitude_c   (16)
    Datatype:    Float32 (Float32)
    Dimensions:  latitude_c
    Attributes:
     units                = degrees north
     long_name            = Locations of the cell centers in the meridional direction.

  latitude_f   (17)
    Datatype:    Float32 (Float32)
    Dimensions:  latitude_f
    Attributes:
     units                = degrees north
     long_name            = Locations of the cell faces in the meridional direction.

  longitude_f   (17)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_f
    Attributes:
     units                = degrees east
     long_name            = Locations of the cell faces in the zonal direction.

  longitude_c   (16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c
    Attributes:
     units                = degrees east
     long_name            = Locations of the cell centers in the zonal direction.

  bottom_height   (16 × 16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_c

  dlatitude_c   (16)
    Datatype:    Float32 (Float32)
    Dimensions:  latitude_c
    Attributes:
     units                = degrees
     long_name            = Angular spacings between the cell centers in the meridional direction.

  dlatitude_f   (17)
    Datatype:    Float32 (Float32)
    Dimensions:  latitude_f
    Attributes:
     units                = degrees
     long_name            = Angular spacings between the cell faces in the meridional direction.

  dlongitude_c   (16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c
    Attributes:
     units                = degrees
     long_name            = Angular spacings between the cell centers in the zonal direction.

  dlongitude_f   (17)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_f
    Attributes:
     units                = degrees
     long_name            = Angular spacings between the cell faces in the zonal direction.

  dx_cc   (16 × 16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_c
    Attributes:
     units                = m
     long_name            = Geodesic spacings in the zonal direction between the cell located at (Center, Center).

  dx_cf   (16 × 17)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_f
    Attributes:
     units                = m
     long_name            = Geodesic spacings in the zonal direction between the cell located at (Center, Face).

  dx_fc   (17 × 16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_f × latitude_c
    Attributes:
     units                = m
     long_name            = Geodesic spacings in the zonal direction between the cell located at (Face, Center).

  dx_ff   (17 × 17)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_f × latitude_f
    Attributes:
     units                = m
     long_name            = Geodesic spacings in the zonal direction between the cell located at (Face, Face).

  dy_cc   (16 × 16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_c
    Attributes:
     units                = m
     long_name            = Geodesic spacings in the meridional direction between the cell located at (Center, Center).

  dy_cf   (16 × 17)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_f
    Attributes:
     units                = m
     long_name            = Geodesic spacings in the meridional direction between the cell located at (Center, Face).

  dy_fc   (17 × 16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_f × latitude_c
    Attributes:
     units                = m
     long_name            = Geodesic spacings in the meridional direction between the cell located at (Face, Center).

  dy_ff   (17 × 17)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_f × latitude_f
    Attributes:
     units                = m
     long_name            = Geodesic spacings in the meridional direction between the cell located at (Face, Face).

  dz_c   (16)
    Datatype:    Float32 (Float32)
    Dimensions:  z_c
    Attributes:
     units                = m
     long_name            = Spacings between the cell faces (located at the cell centers) in the z-direction.

  dz_f   (17)
    Datatype:    Float32 (Float32)
    Dimensions:  z_f
    Attributes:
     units                = m
     long_name            = Spacings between the cell centers (located at the cell faces) in the z-direction.

  immersed_boundary_mask_ccc   (16 × 16 × 16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_c × z_c

  immersed_boundary_mask_ccf   (16 × 16 × 17)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_c × z_f

  immersed_boundary_mask_cfc   (16 × 17 × 16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_f × z_c

  immersed_boundary_mask_fcc   (17 × 16 × 16)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_f × latitude_c × z_c

  S   (16 × 16 × 16 × 6)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_c × z_c × time
    Attributes:
     units                = practical salinity unit (psu)
     long_name            = Salinity

  T   (16 × 16 × 16 × 6)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_c × z_c × time
    Attributes:
     units                = °C
     long_name            = Temperature

  u   (17 × 16 × 16 × 6)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_f × latitude_c × z_c × time
    Attributes:
     units                = m/s
     long_name            = Velocity in the zonal direction (+ = east).

  v   (16 × 17 × 16 × 6)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_f × z_c × time
    Attributes:
     units                = m/s
     long_name            = Velocity in the meridional direction (+ = north).

  w   (16 × 16 × 17 × 6)
    Datatype:    Float32 (Float32)
    Dimensions:  longitude_c × latitude_c × z_f × time
    Attributes:
     units                = m/s
     long_name            = Velocity in the vertical direction (+ = up).

Global attributes
  Julia                = This file was generated using Julia Version 1.10.7
Commit 4976d05258e (2024-11-26 15:57 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 48 × AMD Ryzen Threadripper 7960X 24-Cores
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 48 virtual cores)
Environment:
  JULIA_TEST_FAILFAST = true
  LD_PRELOAD = /usr/NX/lib/libnxegl.so
  JULIA_LOAD_PATH = @:/tmp/jl_JcvMoH
  GPU: NVIDIA GeForce RTX 4090

  Oceananigans         = This file was generated using Oceananigans v0.96.0
  date                 = This file was generated on 2025-01-15T17:53:38.863 local time (2025-01-16T00:53:38.863 UTC).
  interval             = 1
  output iteration interval = Output was saved every 1 iteration(s).
  schedule             = IterationInterval
Groups
  Dataset: test/test_immersed_grid_latlon_no_halos_GPU.nc
  Group: grid_reconstruction

  Dimensions
     φ_f = 17
     λ_f = 17
     z_f = 17

  Variables
    φ_f     (17)
      Datatype:    Float32 (Float32)
      Dimensions:  φ_f

    λ_f     (17)
      Datatype:    Float32 (Float32)
      Dimensions:  λ_f

    z_f     (17)
      Datatype:    Float32 (Float32)
      Dimensions:  z_f

  Global attributes
    Hx                   = 2
    Hy                   = 3
    Hz                   = 4
    Nx                   = 16
    Ny                   = 16
    Nz                   = 16
    TX                   = Bounded
    TY                   = Bounded
    TZ                   = Bounded
    eltype               = Float64
    immersed_boundary_type = GridFittedBottom
    type                 = LatitudeLongitudeGrid
    z_spacing            = regular
    λ_spacing            = regular
    φ_spacing            = regular
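
The grid_reconstruction group above stores only face coordinates (φ_f, λ_f, z_f) together with sizes, halos, and topologies. As a small worked example of why face coordinates are enough for a regularly spaced, bounded direction, the Python sketch below derives cell centers and spacings from faces. The helper names are made up for illustration; the extents (-20 to 20 degrees longitude, -1000 to 0 m depth, 16 cells each) are taken from the coordinate values in the xarray output in this thread.

```python
# Illustrative only: derive cell centers and spacings from face coordinates
# for a regularly spaced, bounded direction. Helper names are hypothetical.


def regular_faces(start, stop, n_cells):
    """Return the n_cells + 1 face locations of a regularly spaced direction."""
    step = (stop - start) / n_cells
    return [start + i * step for i in range(n_cells + 1)]


def centers_from_faces(faces):
    """Cell centers are the midpoints between neighboring faces."""
    return [(left + right) / 2 for left, right in zip(faces, faces[1:])]


def spacings_from_faces(faces):
    """Cell widths are the differences between neighboring faces."""
    return [right - left for left, right in zip(faces, faces[1:])]


longitude_f = regular_faces(-20.0, 20.0, 16)   # 17 faces
longitude_c = centers_from_faces(longitude_f)  # 16 centers
z_f = regular_faces(-1000.0, 0.0, 16)
z_c = centers_from_faces(z_f)

if __name__ == "__main__":
    print(len(longitude_f), len(longitude_c))  # 17 16
    print(longitude_c[0], longitude_c[-1])     # -18.75 18.75
    print(z_c[0], z_c[-1])                     # -968.75 -31.25
```

The face and center counts (17 and 16) agree with the `longitude_f` and `longitude_c` dimension sizes in the listing above.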

xarray:

In [2]: ds = xr.open_dataset("test/test_immersed_grid_latlon_no_halos_GPU.nc")

In [3]: ds
Out[3]: 
<xarray.Dataset> Size: 589kB
Dimensions:                     (time: 6, z_c: 16, z_f: 17, latitude_c: 16,
                                 latitude_f: 17, longitude_f: 17,
                                 longitude_c: 16)
Coordinates:
  * time                        (time) timedelta64[ns] 48B 00:00:00 ... 00:00...
  * z_c                         (z_c) float32 64B -968.8 -906.2 ... -31.25
  * z_f                         (z_f) float32 68B -1e+03 -937.5 ... -62.5 0.0
  * latitude_c                  (latitude_c) float32 64B -9.375 -8.125 ... 9.375
  * latitude_f                  (latitude_f) float32 68B -10.0 -8.75 ... 10.0
  * longitude_f                 (longitude_f) float32 68B -20.0 -17.5 ... 20.0
  * longitude_c                 (longitude_c) float32 64B -18.75 ... 18.75
Data variables: (12/24)
    bottom_height               (latitude_c, longitude_c) float32 1kB ...
    dlatitude_c                 (latitude_c) float32 64B ...
    dlatitude_f                 (latitude_f) float32 68B ...
    dlongitude_c                (longitude_c) float32 64B ...
    dlongitude_f                (longitude_f) float32 68B ...
    dx_cc                       (latitude_c, longitude_c) float32 1kB ...
    ...                          ...
    immersed_boundary_mask_fcc  (z_c, latitude_c, longitude_f) float32 17kB ...
    S                           (time, z_c, latitude_c, longitude_c) float32 98kB ...
    T                           (time, z_c, latitude_c, longitude_c) float32 98kB ...
    u                           (time, z_c, latitude_c, longitude_f) float32 104kB ...
    v                           (time, z_c, latitude_f, longitude_c) float32 104kB ...
    w                           (time, z_f, latitude_c, longitude_c) float32 104kB ...
Attributes:
    Julia:                      This file was generated using Julia Version 1...
    Oceananigans:               This file was generated using Oceananigans v0...
    date:                       This file was generated on 2025-01-15T17:53:3...
    interval:                   1
    output iteration interval:  Output was saved every 1 iteration(s).
    schedule:                   IterationInterval

In [4]: ds.longitude_f
Out[4]: 
<xarray.DataArray 'longitude_f' (longitude_f: 17)> Size: 68B
array([-20. , -17.5, -15. , -12.5, -10. ,  -7.5,  -5. ,  -2.5,   0. ,   2.5,
         5. ,   7.5,  10. ,  12.5,  15. ,  17.5,  20. ], dtype=float32)
Coordinates:
  * longitude_f  (longitude_f) float32 68B -20.0 -17.5 -15.0 ... 15.0 17.5 20.0
Attributes:
    units:      degrees east
    long_name:  Locations of the cell faces in the zonal direction.

In [5]: ds.u
Out[5]: 
<xarray.DataArray 'u' (time: 6, z_c: 16, latitude_c: 16, longitude_f: 17)> Size: 104kB
[26112 values with dtype=float32]
Coordinates:
  * time         (time) timedelta64[ns] 48B 00:00:00 ... 00:00:00.500000
  * z_c          (z_c) float32 64B -968.8 -906.2 -843.8 ... -156.2 -93.75 -31.25
  * latitude_c   (latitude_c) float32 64B -9.375 -8.125 -6.875 ... 8.125 9.375
  * longitude_f  (longitude_f) float32 68B -20.0 -17.5 -15.0 ... 15.0 17.5 20.0
Attributes:
    units:      m/s
    long_name:  Velocity in the zonal direction (+ = east).

Hmmm, looks like I forgot to save attributes for the immersed-boundary-related variables (bottom_height and the immersed_boundary_mask_* variables).
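
One thing that can look confusing when comparing the two listings above is that NCDatasets.jl reports the dimensions of `u` as longitude_f × latitude_c × z_c × time, while xarray reports (time, z_c, latitude_c, longitude_f). Julia arrays are column-major, so NCDatasets.jl lists dimensions in the reverse of the order Python tools show. The sketch below, using only the dimension sizes printed above and hypothetical helper names, reverses the order and checks the value counts and sizes against what xarray prints.

```python
from math import prod

# Dimension sizes copied from the listings in this thread.
DIM_SIZES = {
    "time": 6,
    "z_c": 16, "z_f": 17,
    "latitude_c": 16, "latitude_f": 17,
    "longitude_c": 16, "longitude_f": 17,
}

# Dimension order as printed by NCDatasets.jl (fastest-varying first).
JULIA_DIMS = {
    "u": ("longitude_f", "latitude_c", "z_c", "time"),
    "v": ("longitude_c", "latitude_f", "z_c", "time"),
    "w": ("longitude_c", "latitude_c", "z_f", "time"),
    "T": ("longitude_c", "latitude_c", "z_c", "time"),
}

FLOAT32_BYTES = 4


def python_order(julia_order):
    """Reverse a column-major dimension listing into the row-major one."""
    return tuple(reversed(julia_order))


def n_values(dims):
    """Total number of array elements for a tuple of dimension names."""
    return prod(DIM_SIZES[name] for name in dims)


if __name__ == "__main__":
    for name, dims in JULIA_DIMS.items():
        count = n_values(dims)
        print(name, python_order(dims), count, count * FLOAT32_BYTES, "bytes")
```

For `u` this gives 17 × 16 × 16 × 6 = 26112 values, or 104448 bytes as Float32, matching the "26112 values" and "104kB" shown by xarray.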

tomchor · 2025-01-17T04:33:35Z