marine bufr2ioda converters fail in prepatmioda job #897

Closed
RussTreadon-NOAA opened this issue Feb 1, 2024 · 19 comments · Fixed by #937

Comments

@RussTreadon-NOAA
Contributor

While running a C48 3DEnVar JEDI parallel on Orion and Hera, the prepatmiodaobs job failed when executing the marine bufr2ioda converters. For example, bufr2ioda_subpfl_glider_profiles.py failed with

2024-02-01 16:20:29,785 - INFO     - run_bufr2ioda.py: Convert subpfl_glider_profiles...
2024-02-01 16:20:29,785 - INFO     - gen_bufr2ioda_json.py: Using /scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/sorc/gdas.cd/parm/ioda/bufr2ioda/bufr2ioda_subpfl_glider_profiles.json as input
2024-02-01 16:20:29,792 - INFO     - gen_bufr2ioda_json.py: Wrote to /scratch1/NCEPDEV/stmp2/role.jedipara/RUNDIRS/prjedi/prepatmobs.211737/subpfl_glider_profiles_2021032412.json

...

2024-02-01 16:20:29,987 - INFO     - run_bufr2ioda.py: Executing /scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/sorc/gdas.cd/ush/ioda/bufr2ioda/bufr2ioda_subpfl_glider_profiles.py -c /scratch1/NCEPDEV/stmp2/role.jedipara/RUNDIRS/prjedi/prepatmobs.211737/subpfl_glider_profiles_2021032412.json

...

Traceback (most recent call last):
  File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/sorc/gdas.cd/ush/ioda/bufr2ioda/bufr2ioda_subpfl_glider_profiles.py", line 307, in <module>
    log_level = 'DEBUG' if args.verbose else 'INFO'
NameError: name 'args' is not defined

...

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/scratch1/NCEPDEV/da/python/opt/core/miniconda3/4.6.14/envs/gdasapp/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/scratch1/NCEPDEV/da/python/opt/core/miniconda3/4.6.14/envs/gdasapp/lib/python3.7/multiprocessing/pool.py", line 47, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/ush/run_bufr2ioda.py", line 28, in mp_bufr_converter
    cmd()
  File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/ush/python/wxflow/executable.py", line 230, in __call__
    raise ProcessError(f"Command exited with status {proc.returncode}:", long_msg)
wxflow.executable.ProcessError: Command exited with status 1:
'/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/sorc/gdas.cd/ush/ioda/bufr2ioda/bufr2ioda_subpfl_glider_profiles.py' '-c' '/scratch1/NCEPDEV/stmp2/role.jedipara/RUNDIRS/prjedi/prepatmobs.211737/subpfl_glider_profiles_2021032412.json'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/ush/run_bufr2ioda.py", line 119, in <module>
    bufr2ioda(args.current_cycle, args.RUN, args.DMPDIR, args.config_template_dir, args.COM_OBS)
  File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/ush/python/wxflow/logger.py", line 266, in wrapper
    retval = func(*args, **kwargs)
  File "/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/ush/run_bufr2ioda.py", line 108, in bufr2ioda
    pool.starmap(mp_bufr_converter, zip(exename, config_files))
  File "/scratch1/NCEPDEV/da/python/opt/core/miniconda3/4.6.14/envs/gdasapp/lib/python3.7/multiprocessing/pool.py", line 276, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/scratch1/NCEPDEV/da/python/opt/core/miniconda3/4.6.14/envs/gdasapp/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
wxflow.executable.ProcessError: Command exited with status 1:
'/scratch1/NCEPDEV/da/role.jedipara/git/global-workflow/atmos/sorc/gdas.cd/ush/ioda/bufr2ioda/bufr2ioda_subpfl_glider_profiles.py' '-c' '/scratch1/NCEPDEV/stmp2/role.jedipara/RUNDIRS/prjedi/prepatmobs.211737/subpfl_glider_profiles_2021032412.json'
+ JGLOBAL_ATM_PREP_IODA_OBS[1]: postamble JGLOBAL_ATM_PREP_IODA_OBS 1706804425 1

As currently written, ush/ioda/bufr2ioda/run_bufr2ioda.py executes all bufr2ioda scripts in ush/ioda/bufr2ioda:

    # Specify observation types to be processed by a script
    BUFR_py_files = glob.glob(os.path.join(USH_IODA, 'bufr2ioda_*.py'))
    BUFR_py_files = [os.path.basename(f) for f in BUFR_py_files]
    BUFR_py = [f.replace('bufr2ioda_', '').replace('.py', '') for f in BUFR_py_files]

    config_files = []
    exename = []
    for obtype in BUFR_py:

Marine observations do not need to be processed in an atmosphere only parallel.

As a short-term patch, a sub-directory named marine was created in ush/ioda/bufr2ioda and the marine converters were moved into it. Because the glob in run_bufr2ioda.py is not recursive, the existing logic no longer sees the marine converters and prepatmiodaobs completes. This is not a solution; it's only a patch.

This issue is opened to document the problem and seek a permanent solution.

@guillaumevernieres
Contributor

@RussTreadon-NOAA , moving the marine scripts somewhere else is fine by me, but it seems that some logic is missing in the atmos obs processing. Shouldn't we only trigger the bufr-to-ioda conversion if the bufr obs is in obs_list.yaml?

@ShastriPaturi
Collaborator

@RussTreadon-NOAA, @emilyhcliu is testing marine in situ obs through PR #879.
Except for tesac_mammals_profiles, the other 7 of the 8 converters pass. I will have a fix for that.

@guillaumevernieres, I will need to do it anyway.

@RussTreadon-NOAA
Contributor Author

@guillaumevernieres , I'm not inclined to move the marine scripts into a separate folder unless we as a team feel it's better from an organizational point of view to move application-specific converters into aerosol, atmosphere, land, and marine directories. I'm not sure we want to go there.

Your suggestion is good. One could refactor run_bufr2ioda.py to read a YAML file listing the observations to process and, based on this, execute the appropriate converters.
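
For illustration, a minimal sketch of what such a refactor might look like, assuming a hypothetical obs_list.yaml with an observations key; the names below are illustrative and not the actual GDASApp or g-w interface.

    # Sketch only: select converters from a YAML obs list instead of glob-ing everything.
    # Assumes a hypothetical obs_list.yaml of the form:
    #   observations:
    #     - tesac_profiles
    #     - subpfl_glider_profiles
    import glob
    import os

    import yaml

    def select_converters(ush_ioda, obs_list_yaml):
        """Return obs types that both have a converter on disk and appear in the YAML list."""
        with open(obs_list_yaml) as f:
            requested = set((yaml.safe_load(f) or {}).get('observations', []))
        scripts = glob.glob(os.path.join(ush_ioda, 'bufr2ioda_*.py'))
        available = [os.path.basename(s).replace('bufr2ioda_', '').replace('.py', '') for s in scripts]
        # Only run converters that are both present on disk and requested in the YAML
        return [obtype for obtype in available if obtype in requested]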

@guillaumevernieres
Contributor

> @guillaumevernieres , I'm not inclined to move the marine scripts into a separate folder unless we as a team feel it's better from an organizational point of view to move application-specific converters into aerosol, atmosphere, land, and marine directories. I'm not sure we want to go there.
>
> Your suggestion is good. One could refactor run_bufr2ioda.py to read a YAML file listing the observations to process and, based on this, execute the appropriate converters.

@RussTreadon-NOAA , temporarily moving the marine converters somewhere else while more logic is added to run_bufr2ioda.py seems like a good/quick temporary solution?

@RussTreadon-NOAA
Contributor Author

@ShastriPaturi , a few questions

  1. Does PR #879 mean that g-w will require atmosphere-only parallels to run marine observation converters?
  2. Do you know if GFS v17 will populate operational obsproc gfs and gdas dump directories with marine data?
  3. For retrospective and real-time GFS v17 parallels, will EIB refactor the global dump archive to include marine data in gfs and gdas dump directories?

@emilyhcliu
Collaborator

emilyhcliu commented Feb 1, 2024

> @guillaumevernieres , I'm not inclined to move the marine scripts into a separate folder unless we as a team feel it's better from an organizational point of view to move application-specific converters into aerosol, atmosphere, land, and marine directories. I'm not sure we want to go there.
>
> Your suggestion is good. One could refactor run_bufr2ioda.py to read a YAML file listing the observations to process and, based on this, execute the appropriate converters.

I like the idea. Sort of like the list of obs to assimilate.

@ShastriPaturi
Collaborator

> @ShastriPaturi , a few questions
>
>   1. Does PR #879 mean that g-w will require atmosphere-only parallels to run marine observation converters?
>   2. Do you know if GFS v17 will populate operational obsproc gfs and gdas dump directories with marine data?
>   3. For retrospective and real-time GFS v17 parallels, will EIB refactor the global dump archive to include marine data in gfs and gdas dump directories?

@RussTreadon-NOAA:

  1. I do not know the answer to that.
  2. Yes. @ilianagenkova is populating the 6-hrly dumps for both gfs and gdas cycles, as we speak.
  3. I do not know the answer to that yet. I can definitely say that the global dump archive does have the 6-hrly dumps with marine data in the gfs and gdas dump directories. In fact, the dump directories are being populated from 2019 to real-time.

@RussTreadon-NOAA
Contributor Author

@ShastriPaturi , would you please point me to the machine and the directories where items 2 and 3 are occurring?

@ShastriPaturi
Collaborator

ShastriPaturi commented Feb 1, 2024

@RussTreadon-NOAA WCOSS2 (dev machine).
I can point you to the directory over there.

I will be making a copy of those on orion and hera for the retrospective runs.

@ilianagenkova
Collaborator

@RussTreadon-NOAA @ShastriPaturi , let's be clear here: I am only running an NRT cron, i.e. in a dev environment, that generates the marine in-situ augmented bufr dumps. They can be found (some dates on Cactus, others on Dogwood) in:
/lfs/h2/emc/obsproc/noscrub/iliana.genkova/CRON/SOCA/com/obsproc/v1.2/*

@RussTreadon-NOAA
Contributor Author

Got it. Thank you @ShastriPaturi . It makes sense that observations are parsed into type-specific sub-directories:

russ.treadon@dlogin06:/lfs/h2/emc/obsproc/noscrub/iliana.genkova/CRON/SOCA/com/obsproc/v1.2/gdas.20240201/00> ls -l
total 44
drwxr-sr-x 2 iliana.genkova obsproc  4096 Feb  1 12:00 adt
drwxr-sr-x 2 iliana.genkova obsproc 12288 Feb  1 05:54 atmos
drwxr-sr-x 2 iliana.genkova obsproc  4096 Feb  1 04:15 icec
drwxr-sr-x 2 iliana.genkova obsproc  4096 Feb  1 09:15 sss
drwxrwsr-x 2 iliana.genkova obsproc 20480 Feb  1 03:10 sst

Work will be needed in the GDASApp and g-w to properly handle this directory structure.
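
For illustration only, a hedged sketch of the kind of path handling this layout implies; the sub-directory names come from the listing above, and the helper below is hypothetical rather than existing GDASApp or g-w code.

    # Hypothetical sketch: dump files sit under obs-type sub-directories
    # (adt, atmos, icec, sss, sst) instead of directly in the cycle directory.
    import os

    def dump_path(com_obs, cycle_type, hh, obs_subdir, data_format):
        # Builds e.g. <com_obs>/<obs_subdir>/gdas.t06z.<data_format>.tm00.bufr_d
        bufrfile = f"{cycle_type}.t{hh}z.{data_format}.tm00.bufr_d"
        return os.path.join(com_obs, obs_subdir, bufrfile)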

@RussTreadon-NOAA
Contributor Author

Understood @ilianagenkova . GDA managers will need to properly reconfigure the GDA for GFS v17 retrospective parallels.

@ilianagenkova
Collaborator

ilianagenkova commented Feb 1, 2024 via email

@RussTreadon-NOAA
Contributor Author

Update

Now attempting to run a C192/C96L127 3DEnVar JEDI atmospheric DA parallel on Hera. The parallel was cold started from 2024020100. The 2024020106 gdasprepatmiodaobs job fails when processing marine data. An example follows.

bufr2ioda_tesac_profiles.py tries to process gdas.t06z.tesac.tm06.bufr_d. The tm06 string is not correct. The GDA dump file is named gdas.t06z.tesac.tm00.bufr_d. Change line 57 of bufr2ioda_tesac_profiles.py as follows:

-    bufrfile = f"{cycle_type}.t{hh}z.{data_format}.tm{hh}.bufr_d"
+    bufrfile = f"{cycle_type}.t{hh}z.{data_format}.tm00.bufr_d"

bufr2ioda_tesac_profiles.py also generates the following message:

Traceback (most recent call last):
  File "/scratch1/NCEPDEV/da/Russ.Treadon/git/global-workflow/jedi_atm_ci/sorc/gdas.cd/ush/ioda/bufr2ioda/bufr2ioda_tesac_profiles.py", line 301, in <module>
    log_level = 'DEBUG' if args.verbose else 'INFO'
NameError: name 'args' is not defined

Line 301 is the log_level line below

     start_time = time.time()
     config = "bufr2ioda_tesac_profiles.json"

     log_level = 'DEBUG' if args.verbose else 'INFO'

While argparse is imported in bufr2ioda_tesac_profiles.py, the script does not actually parse the argument list. Add the following to bufr2ioda_tesac_profiles.py:

     start_time = time.time()
     config = "bufr2ioda_tesac_profiles.json"

+    parser = argparse.ArgumentParser()
+    parser.add_argument('-c', '--config', type=str, help='Input JSON configuration', required=True)
+    parser.add_argument('-v', '--verbose', help='print debug logging information',
+                        action='store_true')
+    args = parser.parse_args()
+
     log_level = 'DEBUG' if args.verbose else 'INFO'

With the above changes made to a local copy of bufr2ioda_tesac_profiles.py, the script ran to completion. File gdas.t06z.tesac_profiles.tesac.nc was created.

We should test all marine bufr2ioda converters using the near real-time GDA.

@guillaumevernieres
Contributor

It looks like @ShastriPaturi is working on this (see PR #914). While there does appear to be a bug in the marine converter, your obs processing task should not convert ocean obs.

@RussTreadon-NOAA
Contributor Author

prepatmiodaobs executes ush/ioda/bufr2ioda/run_bufr2ioda.py. Logic in this script is such that it executes all bufr2ioda_* scripts in ush/ioda/bufr2ioda/.

    # Specify observation types to be processed by a script
    BUFR_py_files = glob.glob(os.path.join(USH_IODA, 'bufr2ioda_*.py'))
    BUFR_py_files = [os.path.basename(f) for f in BUFR_py_files]
    BUFR_py = [f.replace('bufr2ioda_', '').replace('.py', '') for f in BUFR_py_files]

    config_files = []
    exename = []
    for obtype in BUFR_py:

Script run_bufr2ioda.py needs to be refactored or the directory structure of ush/ioda/bufr2ioda needs to be changed.

@RussTreadon-NOAA
Contributor Author

g-w CI testing using 2021032318 through 2021032400 identified a situation requiring additional error checking. For 2021032400, the bufr dump file queried for marine mammal observations did not contain any such observations. This caused bufr2ioda_tesac_mammals_profiles.py to abort on a zero-length array.

Logic was added to the script to trap this situation, log the occurrence, and return.

     alpha_mask = [item.isalpha() for item in stationID]
     indices_true = [index for index, value in enumerate(alpha_mask) if value]
+    if len(indices_true) == 0:
+        logger.info(f"No marine mammals in {DATA_PATH}")
+        return

In examining the 2021032400 failure, it was noted that the 2021032318 job skipped processing because it looked for dump file gdas.t18z.tesac.tm18.bufr_d. The tm18 string is incorrect for gdas dump files. The correct string is tm00.

This was corrected via the following change to bufr2ioda_tesac_mammals_profiles.py

-    bufrfile = f"{cycle_type}.t{hh}z.{data_format}.tm{hh}.bufr_d"
+    bufrfile = f"{cycle_type}.t{hh}z.{data_format}.tm00.bufr_d"

A similar tm00 change was made to other marine bufr2ioda converters.

Finally, scripting was added to the marine bufr2ioda converters to trap the situation in which a query of a bufr dump file fails.
The revised scripts log the exception message and return instead of aborting.
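
A minimal sketch of that kind of guard; run_bufr_query() and logger below are placeholders standing in for the converter's actual BUFR query calls and logger, which are not reproduced here.

    def convert(DATA_PATH):
        try:
            # Placeholder for the converter's query of the bufr dump file
            data = run_bufr_query(DATA_PATH)
        except Exception as e:
            # Log the exception message and return instead of aborting
            logger.info(f"An exception occurred while querying {DATA_PATH}: {e}")
            return
        # ... continue processing 'data' into an IODA file ...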

@ShastriPaturi
Collaborator

ShastriPaturi commented Feb 29, 2024

@RussTreadon-NOAA, all the marine BUFR2IODA converters have been fixed in PR #879 and are part of #914.
FYI: in #914, run_bufr2ioda.py is not being used for marine in situ processing.

@RussTreadon-NOAA
Contributor Author

Thank you @ShastriPaturi for the update.

PR #914 adds bufr2ioda_insitu* converters. It does not modify existing marine bufr2ioda converters. PR #879 modifies existing marine bufr2ioda converters, but there are no traps for bufr dump file existence, bufr query errors, or no-data-present cases.

As currently written, run_bufr2ioda.py executes all bufr2ioda converters in ush/ioda/bufr2ioda. Hence my predicament. The changes in PR #879 and #914 are beneficial but not sufficient.

RussTreadon-NOAA added a commit to RussTreadon-NOAA/GDASApp that referenced this issue Mar 5, 2024
…g-w ci, enhance error checking for bufr2ioda_trackob_surface (NOAA-EMC#897)