caijt961117/LBM2D-LETKF

About LBM2D-LETKF

LBM2D-LETKF is a microbenchmark code for ensemble data assimilation of turbulent flow using the 2D lattice Boltzmann method (LBM) and the local ensemble transform Kalman filter (LETKF). All computation runs on NVIDIA GPUs using CUDA, cuBLAS, and cuSOLVER. The typical test problem uses 256 x 256 grid points with an ensemble size of 4, 16, or 64; the required number of GPUs equals the ensemble size (4, 16, or 64 GPUs).

This microbenchmark code was developed to test the feasibility of applying ensemble data assimilation to CityLBM. CityLBM is an application for real-time, high-resolution simulation of urban wind flow and plume dispersion. See our publication [Onodera2021] for details.

Differences between CityLBM and this benchmark are summarized as follows:

| | CityLBM | LBM2D-LETKF |
|---|---|---|
| dimension | 3D | 2D |
| problem | realistic problem of wind flow and plume dispersion in urban areas with high-rise buildings | virtually configured (non-realistic) isotropic turbulence |
| mesh type | locally refined mesh (AMR) | uniform mesh |
| mesh size | > O(10^8) | O(10^4) |
| numerical model for velocity | LBM; cumulant collision [Geier2015]; CSM-LES [Kobayashi2005] | LBM; standard BGK collision; standard Smagorinsky model |
| numerical model for temperature | Boussinesq approximation with finite difference discretization | None |
| numerical model for plume dispersion | passive scalar with finite volume discretization | None |
| implementation of LETKF | not yet | yes |

The LETKF is a well-established ensemble data assimilation method for high-performance computing in the numerical weather prediction (NWP) community; for details, see e.g. [Hunt2007], [Miyoshi2016], [Yashiro2020].

Usage

Preparation

The following external tools and libraries are required:

  • GCC >= 7.4.0
  • CUDA >= 11.0
  • CUDA-aware MPI (e.g. the ompi-cuda module on Wisteria-A at ITC, UTokyo, or mpt/2.23-ga on the HPE SGI8600 at JAEA)
  • HDF5 (version 1.8.2 or 1.12.0 may work)
  • Boost C++

Optional: cuSOLVER is the default eigenvalue decomposition solver for LETKF, but EigenG-Batched, an open-source code released by RIKEN, is also supported. For performance reasons, EigenG-Batched is highly recommended for larger ensemble sizes (M > 32).

To use EigenG-Batched, execute

./config/enable_EigenG-Batched.sh

Compile

Select one of the following test modes:

| Test Mode | Description |
|---|---|
| NATURE | Compute the ground truth (nature run) |
| OBSERVE | Create observation data from the nature run |
| LYAPNOV | Test the Lyapunov exponent of the system (TBD) |
| DA_NUDGING | Run a data assimilation experiment using the nudging method (TBD) |
| DA_LETKF | Run a data assimilation experiment using LETKF |
| DA_DUMMY | Dummy |

For example, to compute the ground truth:

make clean
TEST=NATURE make

After compilation succeeds, the executable run/a.out is generated.

To check that every mode compiles, run

make test

Run

(1) Short validation test

To check the validity of the data assimilation, you can run an Observing System Simulation Experiment (OSSE) with LBM2D-LETKF. The OSSE consists of the following steps:

  1. Compute the nature run and create the observation data:
    make clean
    make TEST=OBSERVE
    mpiexec <mpiexec_options...> -np 1 run/a.out
    
    ** note: mpiexec_options depends on the environment. For details, refer to the example job scripts run/wistimer (for Wisteria-A) and run/jstimer (for HPE SGI8600).
  2. Compute data assimilation:
    make clean
    make TEST=DA_LETKF
    mpiexec <mpiexec_options...> -np <ensemble_size> run/a.out
    
    ** note: ensemble_size is, e.g. 4, 16, or 64
  3. Visualize the result. This generates contour snapshots of the vorticity field for the nature run, the observation, and the LETKF result:
    ./postprocess/plot-vorticity.py io/*/ens*/
    

After step 3, you can find the snapshots named io/*/0/vor_1990.png (an example is shown below). The LETKF result should closely resemble the nature run; if it does not, the validation test has failed.

[Figure: example vorticity snapshots. Panels: Nature run, Observation, LETKF (expected), Non-DA or invalid LETKF]
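
Steps 1 and 2 of the OSSE can also be scripted. The following is a minimal Python sketch that is not part of the repository: the helper names `osse_commands` and `run_osse` and their parameters are hypothetical, and `mpiexec_options` is left empty by default because it depends on the environment. The commands themselves are taken from the steps above. Step 3 is omitted because it relies on shell glob expansion of `io/*/ens*/`.

```python
import shlex
import subprocess


def osse_commands(ensemble_size, mpiexec_options=()):
    """Return the command lines of OSSE steps 1 and 2 as argument lists.

    Hypothetical helper: mpiexec_options is environment dependent and must
    be supplied by the user (see run/wistimer or run/jstimer).
    """
    mpiexec = ["mpiexec", *mpiexec_options]
    return [
        # Step 1: nature run and observation data (single process)
        ["make", "clean"],
        ["make", "TEST=OBSERVE"],
        [*mpiexec, "-np", "1", "run/a.out"],
        # Step 2: LETKF data assimilation, one process per ensemble member
        ["make", "clean"],
        ["make", "TEST=DA_LETKF"],
        [*mpiexec, "-np", str(ensemble_size), "run/a.out"],
    ]


def run_osse(ensemble_size, mpiexec_options=(), dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in osse_commands(ensemble_size, mpiexec_options):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    run_osse(ensemble_size=4)  # dry run: only prints the six commands
```

Keeping the command list separate from the runner makes it easy to inspect the commands before submitting them inside a job script.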

(2) Performance evaluation

OUT OF DATE: the results below were obtained with a 128 x 128 mesh.

The validation test also measures performance, so no additional run is needed. The results are written to io/elapsed_time_rank*.csv and io/letkf_elapsed_time_rank*.csv. For example:

  • io/elapsed_time_rank0.csv

    #tag,sec
    DA,1.68128
    _sync.DA,0.0289409
    _sync.forecast,0.0262086
    _sync.output,0.01511
    forecast,0.55471
    output,1.39295
    

    Here, the elapsed time [sec] is measured over 100 data assimilation cycles (10000 LBM steps). Each tag means:

    • DA: elapsed time of LETKF. Its breakdown is stored in io/letkf_elapsed_time_rank0.csv
    • forecast: elapsed time of LBM.
    • output: elapsed time of file output for validation test.
    • _sync.*: overhead; may be ignored.
  • io/letkf_elapsed_time_rank0.csv

    #tag,sec
    _ignored,2.00556
    lbm2euler,0.00115395
    load_obs,0.270094
    mpi_barrier,0.17283
    mpi_xk,0.0161288
    mpi_xsol,0.190429
    mpi_yk,0.064585
    pack_xk,0.0215552
    pack_yk,0.00485086
    set_yo,0.0119514
    solve,0.114645
    solve_comm_ovlp,0.809463
    update_xk,0.00277638
    

    These tags are roughly categorized into the following types:

    • Computation: lbm2euler, set_yo, solve, solve_comm_ovlp, update_xk
    • File read: load_obs
    • Communication: mpi_xk, mpi_xsol, mpi_yk.
      Note: Communication is partially overlapped with computation and file read. The elapsed times shown here cover only the non-overlapped part.
    • Others:
      • mpi_barrier: waiting time for synchronization caused by load imbalance in the computation.
      • _ignored: elapsed times outside the LETKF. Ignored.
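
The categorization above can be applied to a timing file with a few lines of Python. This is a minimal sketch, not part of the repository: the names `CATEGORIES`, `read_timings`, and `summarize` are hypothetical, and the sample text is the (out-of-date) example listed above. The tags pack_xk and pack_yk are not assigned to any category in the list, so the sketch reports them separately as "uncategorized" rather than guessing.

```python
import csv
import io

# Categories as listed above. pack_xk and pack_yk are not categorized there,
# so they end up in "uncategorized" (an assumption of this sketch).
CATEGORIES = {
    "computation": ["lbm2euler", "set_yo", "solve", "solve_comm_ovlp", "update_xk"],
    "file_read": ["load_obs"],
    "communication": ["mpi_xk", "mpi_xsol", "mpi_yk"],
    "barrier": ["mpi_barrier"],
}


def read_timings(text):
    """Parse '#tag,sec' CSV text into a {tag: seconds} dict."""
    rows = csv.reader(io.StringIO(text))
    return {r[0]: float(r[1]) for r in rows if r and not r[0].startswith("#")}


def summarize(timings):
    """Sum seconds per category; '_ignored' is dropped as described above."""
    totals = {
        name: sum(timings.get(tag, 0.0) for tag in tags)
        for name, tags in CATEGORIES.items()
    }
    known = {tag for tags in CATEGORIES.values() for tag in tags} | {"_ignored"}
    totals["uncategorized"] = sum(
        sec for tag, sec in timings.items() if tag not in known
    )
    return totals


SAMPLE = """#tag,sec
_ignored,2.00556
lbm2euler,0.00115395
load_obs,0.270094
mpi_barrier,0.17283
mpi_xk,0.0161288
mpi_xsol,0.190429
mpi_yk,0.064585
pack_xk,0.0215552
pack_yk,0.00485086
set_yo,0.0119514
solve,0.114645
solve_comm_ovlp,0.809463
update_xk,0.00277638
"""

if __name__ == "__main__":
    for name, sec in summarize(read_timings(SAMPLE)).items():
        print(f"{name:>14s}: {sec:.4f} s")
```

To analyze an actual run, replace `SAMPLE` with the contents of io/letkf_elapsed_time_rank0.csv, for example via `open(path).read()`.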

Examples of performance evaluations on Wisteria-A can be found in doc/result_example/aquarius.

Publication

  1. Y. Hasegawa, T. Imamura, T. Ina, N. Onodera, Y. Asahi, and Y. Idomura, "GPU Optimization of Lattice Boltzmann Method with Local Ensemble Transform Kalman Filter," in 2022 IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems (ScalAH), pp. 10-17, 2022. [doi:10.1109/ScalAH56622.2022.00007, arXiv:2308.03310, code: LBM2D-LETKF:v2.4.4]
  2. Y. Hasegawa, N. Onodera, Y. Asahi, T. Ina, T. Imamura, and Y. Idomura, "Continuous data assimilation of large eddy simulation by lattice Boltzmann method and local ensemble transform Kalman filter (LBM-LETKF)," Fluid Dynamics Research, vol. 55, no. 6, p. 065501, 2023. [doi:10.1088/1873-7005/ad06bd, arXiv:2308.03972, code: LBM2D-LETKF:v3.0.6]

Cite us

@inproceedings{Hasegawa2022-ScalAH,
    title = {{GPU Optimization of Lattice Boltzmann Method with Local Ensemble Transform Kalman Filter}},
    year = {2022},
    booktitle = {IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Heterogeneous Systems (ScalAH)},
    author = {Hasegawa, Yuta and Imamura, Toshiyuki and Ina, Takuya and Onodera, Naoyuki and Asahi, Yuuichi and Idomura, Yasuhiro},
    pages = {10--17},
    doi = {10.1109/ScalAH56622.2022.00007},
    arxivId = {2308.03310}
}
@article{Hasegawa2023-FDR,
    title = {{Continuous data assimilation of large eddy simulation by lattice Boltzmann method and local ensemble transform Kalman filter (LBM-LETKF)}},
    year = {2023},
    journal = {Fluid Dynamics Research},
    volume = {55},
    number = {6},
    pages = {065501},
    author = {Hasegawa, Yuta and Onodera, Naoyuki and Asahi, Yuuichi and Ina, Takuya and Imamura, Toshiyuki and Idomura, Yasuhiro},
    doi = {10.1088/1873-7005/ad06bd},
    arxivId = {2308.03972}
}