Fix runtime error when using sf_surface_mosaic = 1 with use_wudapt_lcz = 0 #1638

Merged
merged 2 commits into from
Jan 19, 2022

@epn09 epn09 commented Jan 12, 2022

TYPE: bug fix

KEYWORDS: noah mosaic, wudapt

SOURCE: Do Ngoc Khanh (Tokyo Institute of Technology)

DESCRIPTION OF CHANGES:
Fix runtime error when using sf_surface_mosaic = 1 with use_wudapt_lcz = 0.

Problem:
A segmentation fault occurs when running the model with sf_surface_mosaic = 1 and use_wudapt_lcz = 0, as described in detail in #1633.

Solution:
The lsm_mosaic routine in module_sf_noahdrv.F now checks use_wudapt_lcz and uses different code to define the urban categories depending on its value.
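
As a rough illustration of the branching described in the solution, here is a minimal Python sketch. It is not the Fortran in module_sf_noahdrv.F: the function names `urban_categories` and `is_urban_tile` are hypothetical, and the category lists are supplied by the caller because the real values come from the WRF land-use tables and are not shown in this PR. How the two category sets relate inside WRF is likewise an assumption here.

```python
def urban_categories(use_wudapt_lcz, default_urban, lcz_urban):
    """Return the set of land-use categories treated as urban.

    default_urban and lcz_urban are hypothetical, caller-supplied lists;
    the actual category numbers depend on the land-use table in use.
    """
    if use_wudapt_lcz == 1:
        # LCZ run: use the LCZ category list.
        return set(lcz_urban)
    # Non-LCZ run: never consult the LCZ list.
    return set(default_urban)


def is_urban_tile(category, use_wudapt_lcz, default_urban, lcz_urban):
    """True if a mosaic tile's category is urban under the given setting."""
    return category in urban_categories(use_wudapt_lcz, default_urban, lcz_urban)
```

The point of the sketch is only that the flag is consulted before any LCZ-specific category is used, so a run with use_wudapt_lcz = 0 never touches LCZ-only data.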

ISSUE:
Fixes #1633

LIST OF MODIFIED FILES:
M dyn_em/module_first_rk_step_part1.F
M phys/module_sf_noahdrv.F
M phys/module_surface_driver.F

TESTS CONDUCTED:

  1. When use_wudapt_lcz = 1: bit-by-bit identical output before and after modification.
  2. When use_wudapt_lcz = 0: Segmentation fault is fixed.
  3. The Jenkins tests are all passing.

RELEASE NOTE: Fix runtime error when using sf_surface_mosaic = 1 with use_wudapt_lcz = 0.

@weiwangncar
Collaborator

@epn09 Thank you for making this PR. Have you looked at how the non-mosaic version of Noah LSM handles the urban categories? That code did not need to check the namelist option use_wudapt_lcz and it appears to work fine.

@epn09
Contributor Author

epn09 commented Jan 13, 2022

@weiwangncar In the non-mosaic version of Noah LSM, IVGTYP is constant, and FRC_URB2D is assigned in urban_var_init in module_sf_urban.F (https://github.com/wrf-model/WRF/blob/master/phys/module_sf_urban.F#L2653-L2686), where use_wudapt_lcz is checked.

@weiwangncar
Collaborator

@epn09 This probably makes sense. Thanks. Would you be interested in testing an urban option using mosaic Noah LSM?

@epn09
Contributor Author

epn09 commented Jan 13, 2022

@weiwangncar I have run the mosaic Noah LSM with the single-layer UCM urban option (using the bug-fixed code) for a 3-day run. Compared with the run without the mosaic option, the result looks reasonable to me.

@weiwangncar
Collaborator

@epn09 Thank you for doing that test.

weiwangncar previously approved these changes Jan 18, 2022
@davegill davegill changed the base branch from master to develop January 19, 2022 03:26
@davegill davegill dismissed weiwangncar’s stale review January 19, 2022 03:26

The base branch was changed.

@davegill
Contributor

@weiwangncar
Wei,
I swapped the base branch from master to develop, and that means GitHub tossed out your review. Would you do your review again, please?

@davegill davegill merged commit f8c4b13 into wrf-model:develop Jan 19, 2022
@epn09 epn09 deleted the mosaic-bugfix branch January 20, 2022 01:22
davegill added a commit that referenced this pull request Jan 24, 2022
TYPE: bug fix

KEYWORDS: netcdfpar, Error

SOURCE: internal

DESCRIPTION OF CHANGES:
IMPORTANT: Without these mods, every commit since the parallel netcdf4 IO mods will fail the DA
build test in the regression test. For example, at least these commits:
```
fed10f4 Adding the WRF-Solar EPS model (#1547)
0bda5e0 Fix 4dvar build failure after commit 8b5bfe5 (#1652)
8b5bfe5 Thompson AA enhancements: BC aerosol, biomass burning emissions, and … (#1616)
9dc68ca After testing with UFS/GFS/FV-3, some tuning knob changes to Thompson-MP and icloud3 (cloud fraction) scheme (#1626)
96fd889 Update HONO, TERP, and CO2 emissions (#1644)
64fb190 SFCLAY=1, add shallow water roughness calculation (#1543)
609c2fc New module firebrand_spotting for WRF-Fire (#1540)
75bfe6d MYNN PBL clouds in photolysis option 4 (TUV) (#1622)
f8c4b13 Fix runtime error when using sf_surface_mosaic = 1 with use_wudapt_lcz = 0 (#1638)
b511c70 Run-time option for climate GHG for radiation (#1625)
8194c66 Bug fix for configuration option INTEL:HSW/BDW (#1645)
16c9287  bug fixes for radar_rf_opt=2 (#1642)
a82ce24 Sync with NoahMP Github version with all NoahMP updates since v4.3 (#1641)
7b642cc Bug fix for TAMDAR T VarBC (#1632)
92fd706 fix WRFDA build for Parallel netcdf-4 IO (#1634)
```
Problem:
With PR #1552 "Parallel netcdf-4 IO option" (SHA1 3cd4713), when the code was built without
the new parallel NetCDF4 compression, the build log had an `Error`.
```
> grep Error compile.log
Fatal Error: Cannot open module file ‘wrf_data_ncpar.mod’ for reading at (1): No such file or directory
make[2]: [diffwrf] Error 1 (ignored)
make[2]: [diffwrf] Error 1 (ignored)
wrf_io.f:117: Error: Can't open included file 'mpif.h'
make[2]: [wrf_io.o] Error 1 (ignored)
Fatal Error: Cannot open module file ‘wrf_data_ncpar.mod’ for reading at (1): No such file or directory
make[2]: [field_routines.o] Error 1 (ignored)
make[2]: [libwrfio_nfpar.a] Error 127 (ignored)
make[2]: [libwrfio_nfpar.a] Error 1 (ignored)
```
The problem was related to constructing the object files in the io_netcdfpar directory. When the 
option is not selected at compile time, then we do not care about errors in the directory that will 
never be used.
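
To tell the harmless lines from the real ones in a log like the one above, the `grep Error` output can be split into make errors that were ignored and everything else. The following is a minimal Python sketch, assuming the GNU make message format shown in the log; `split_error_lines` is a hypothetical helper, not part of the WRF build system.

```python
import re

# Matches lines such as "make[2]: [diffwrf] Error 1 (ignored)".
IGNORED_MAKE_ERROR = re.compile(r"^make(\[\d+\])?: \[.*\] Error \d+ \(ignored\)$")


def split_error_lines(log_text):
    """Split lines containing 'Error' into (ignored make errors, other errors)."""
    ignored, other = [], []
    for line in log_text.splitlines():
        if "Error" not in line:
            continue
        line = line.strip()
        if IGNORED_MAKE_ERROR.match(line):
            ignored.append(line)
        else:
            other.append(line)
    return ignored, other
```

Lines in the second list (compiler messages such as the missing `mpif.h` include) are the ones worth reading first; the first list only records that make carried on past them.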

Solution:
If the NETCDFPAR option is not selected at compile time, then SKIP going into the io_netcdfpar
directory altogether.
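
The skip logic can be pictured with a small Python sketch. This is an illustration only, not the actual change to the Makefile or configure scripts: `netcdfpar_make_command` is a hypothetical function, and the assumption is that an unset or empty NETCDFPAR turns the make invocation into a harmless `echo SKIPPING make ...`.

```python
def netcdfpar_make_command(env, make_args="-i -r"):
    """Build the shell command for the io_netcdfpar directory.

    When NETCDFPAR is unset or empty, the make call is prefixed with
    'echo SKIPPING' so that nothing is compiled in that directory.
    """
    prefix = "" if env.get("NETCDFPAR") else "echo SKIPPING "
    return f"cd ../io_netcdfpar ; {prefix}make {make_args}"
```

Calling it with `os.environ` would reflect the current shell; passing a plain dict, as the function allows, keeps the behavior easy to check.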

LIST OF MODIFIED FILES:
m Makefile
m arch/Config.pl
m arch/configure.defaults
m configure

TESTS CONDUCTED:
1. Without the NETCDFPAR parameter set, the build for the io_netcdfpar directory is bypassed:
```
          cd ../io_netcdfpar ; \
          echo SKIPPING make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.7.3/gnu/9.1.0" \

          cd ../io_netcdfpar ; \
          echo SKIPPING make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.7.3/gnu/9.1.0" \
```

2. When the NETCDFPAR env variable is set, the build includes the io_netcdfpar directory:
```
          cd ../io_netcdfpar ; \
           make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.8.0/gnu/9.1.0" \

          cd ../io_netcdfpar ; \
           make -i -r NETCDFPARPATH="/glade/u/apps/ch/opt/netcdf/4.8.0/gnu/9.1.0" \
```

3. Jenkins tests are all PASS.
@pvahmanilbl

@davegill and @weiwangncar I hope this is the right place to raise this issue: I have been getting a segmentation fault error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference
when I use
SF_SURFACE_MOSAIC = 1
This happens when the radiation scheme is called for the first time (6 min into the simulation). I checked, and the fix above is already in the WRFv44 I am using.

I am using
use_wudapt_lcz=1
num_land_cat=61
sf_urban_physics=1
When I remove SF_SURFACE_MOSAIC=1 there are no issues. I would really appreciate it if you can help me with this!

@kkeene44
Collaborator

@pvahmanilbl
The subject for this PR is specific to when use_wudapt_lcz=0. I just want to check that you're actually using =1.

@pvahmanilbl

@kkeene44 I am using use_wudapt_lcz=1. I have not tried it with use_wudapt_lcz=0. My configuration works with the non-mosaic (SF_SURFACE_MOSAIC=0) option, but when I use mosaic (SF_SURFACE_MOSAIC=1) it crashes with a seg fault. I posted here since this is the closest I found to the issue I am facing.

@kkeene44
Collaborator

@pvahmanilbl
I ran a test case with those settings and I'm able to run without problems for a case using sf_surface_mosaic=1 and use_wudapt_lcz=1. This must be related to your specific case. Can you post your issue to the WRF & MPAS-A Support Forum so we can look into the problem there? When you do, please attach your namelist.input file, and if you'd like to share larger files (e.g., wrfinput, wrfbdy), take a look at the home page of the forum for instructions on sharing large files. Thanks!

@pvahmanilbl

@kkeene44 thank you so much for your response! (sorry I am being slow; I was on paternity leave)
I will try the Forum one more time, but can I ask a quick question? When you run real.exe with the mosaic option, do you get the new variables (e.g., LANDUSEF2, MOSAIC_CAT_INDEX, etc.) in wrfinput with values in them? For me, all of these new variables in wrfinput have only zeros.

@kkeene44
Collaborator

kkeene44 commented Nov 7, 2023

@pvahmanilbl
Mine are also all 0, but if I'm reading the code correctly, the new variables are just initialized during the real.exe process. They are later used during WRF to update other variables (e.g., the values for mosaic_cat_index become IVGTYP).

@pvahmanilbl

@kkeene44 thanks so much for your response! I was trying to use CONUS physics and it didn't work with MOSAIC until I used:

sf_sfclay_physics = 1, 1, 1, 1,
bl_pbl_physics = 1, 1, 1, 1,
and it worked!

vlakshmanan-scala pushed a commit to scala-computing/WRF that referenced this pull request Apr 4, 2024
…z = 0 (wrf-model#1638)

vlakshmanan-scala pushed a commit to scala-computing/WRF that referenced this pull request Apr 4, 2024
Development

Successfully merging this pull request may close these issues.

Segfault with sf_surface_mosaic in WRF4.3.1.
6 participants