[develop] Fix retrieve data. #810

Merged

christinaholtNOAA

DESCRIPTION OF CHANGES:

Fix a whitespace issue that leads to a premature exit when looking through the possible archive internal directories.
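The description does not show the diff, so the following is only an illustration of how a whitespace (indentation) slip can make a search loop exit early, assuming the search is a Python loop over candidate directories. The function names, the `exists` callback, and the directory names are all hypothetical and are not taken from the repository.

```python
def find_first_existing_buggy(candidates, exists):
    """Hypothetical buggy search: gives up after the first candidate."""
    for candidate in candidates:
        if exists(candidate):
            return candidate
        # Mis-indented: this runs inside the loop, so the search stops
        # as soon as the first candidate fails to match.
        return None


def find_first_existing(candidates, exists):
    """Hypothetical corrected search: checks every candidate."""
    for candidate in candidates:
        if exists(candidate):
            return candidate
    # Only give up once every candidate has been checked.
    return None


# Hypothetical archive-internal directories; only the second one "exists".
internal_dirs = ["./gfs.20230526/00", "./gfs.20230526/00/atmos"]
only_second = lambda d: d.endswith("atmos")

buggy_result = find_first_existing_buggy(internal_dirs, only_second)
fixed_result = find_first_existing(internal_dirs, only_second)
```

In this sketch the buggy version never reaches the second directory, which is the kind of premature exit the fix is meant to prevent.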

Also addresses an issue uncovered when a user has a pre-existing PYTHONPATH set. We should be prepending the path instead of appending it so that we override any other clones a user might have in their PYTHONPATH.
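As a minimal sketch of why the ordering matters, the snippet below builds a `PYTHONPATH` value both ways. Python searches `PYTHONPATH` entries in order, so whichever directory comes first wins when two clones provide a module of the same name. The helper names and the paths are hypothetical and are not the code in the "run tests" script.

```python
import os


def _split(path_value):
    """Split a PYTHONPATH-style string into its non-empty entries."""
    return [p for p in (path_value or "").split(os.pathsep) if p]


def prepend_to_pythonpath(new_dir, current=None):
    """Return a PYTHONPATH value with new_dir placed first (hypothetical helper)."""
    entries = [p for p in _split(current) if p != new_dir]
    return os.pathsep.join([new_dir] + entries)


def append_to_pythonpath(new_dir, current=None):
    """Return a PYTHONPATH value with new_dir placed last (hypothetical helper)."""
    entries = [p for p in _split(current) if p != new_dir]
    return os.pathsep.join(entries + [new_dir])


# Hypothetical paths: the user already points PYTHONPATH at another clone.
other_clone = "/home/user/other_clone/ush"
this_clone = "/home/user/this_clone/ush"

prepended = prepend_to_pythonpath(this_clone, other_clone)
appended = append_to_pythonpath(this_clone, other_clone)

# With prepending, this clone's ush directory is searched first;
# with appending, the other clone's modules would shadow it.
first_when_prepended = prepended.split(os.pathsep)[0]
first_when_appended = appended.split(os.pathsep)[0]
```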

This fixes an issue with the nco_grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_timeoffset_suite_GFS_v16 test.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • hera.intel
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)

Several manual tests were run in conjunction with @MichaelLueken and @mkavulich to isolate the sort of behavior that led to the failure for the NCO test. The Intel fundamental tests were run in a clean clone in a different directory structure than my usual (see below) to ensure that the NCO directories were indeed "fresh".

After the failure was identified, a few different tests were run with the user PYTHONPATH set to other clones' ush dirs. This type of failure is addressed by using "prepend" instead of "append" in the "run tests" script.

DEPENDENCIES:

n/a

DOCUMENTATION:

n/a

ISSUE:

Related to bug in Issue #652

The intermittent nature of that failure could be explained by the location of the tester's clone and how the NCO directories are set. For example, I tend to clone the repo under a different name in the same directory, and always have. This means that all my tests for all repos have always had the same NCO directories by default. So, back when this test wasn't broken, I pulled the data and it has lived on disk ever since, which is why all my fundamental tests pass.

When I clone the repo anywhere else and run the fundamental tests, I observe the data failure associated with not finding the data in the correct internal archive directory structure. The get_data_* tests do not complain, and the make_ics and make_lbcs tasks fail with no data.

I believe this is why some folks may have seen intermittent or random failures, as outlined in Issue #652.

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

CONTRIBUTORS (optional):

Thanks @mkavulich @MichaelLueken for running tests to help isolate the problem!

@MichaelLueken MichaelLueken left a comment

@christinaholtNOAA These changes look good to me! I was able to run the fundamental tests and the coverage suite on Hera Intel (with the expected failure of grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1beta from issue #805) without issue. Approving now.


@mkavulich mkavulich left a comment

Thanks for the fix, looks good

@MichaelLueken MichaelLueken added the run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests label May 26, 2023
@MichaelLueken

@christinaholtNOAA The Jenkins automated tests failed on Gaea (known issue), Hera functional tests, and Orion due to the inability to clone the Externals. The tests on Jet, Cheyenne Intel, and the expected tests on Cheyenne GNU passed without issue. The WE2E coverage tests were manually run on Hera and Cheyenne, where they passed without issue. Additionally, manually running the functional tests script on Hera showed that the data was successfully found and pulled (I don't know why reruns of the Jenkins tests on Hera still show failures). Moving forward with merging this update now.

@MichaelLueken MichaelLueken merged commit 11e95e3 into ufs-community:develop May 26, 2023
michelleharrold pushed a commit to michelleharrold/ufs-srweather-app that referenced this pull request Jun 7, 2023
* remove wcoss_dell_p3

* remove block for tide and gyre
michelleharrold pushed a commit to michelleharrold/ufs-srweather-app that referenced this pull request Jun 7, 2023
…community#778)

* Construct var_defns components from dictionary.

* Bring back config_defaults.yaml

* Add support for sourcing yaml file into shell script.

* Remove newline for printing config, json config fix.

* Make QUILTING a sub-dictionary in predef_grids

* Reorganize config_defaults.yaml by task and feature.

* Bug fix with QUILTING=true.

* Structure a dictionary based on a template dictionary.

* Convert all WE2E config files to yaml.

* Take care of problematic chars when converting to shell string.

* Process only selected keys of config.

* Add symlinked yaml config files.

* Actually use yaml config files for WE2E tests.

* Delete all shell WE2E configs.

* Don't check for single quotes in test description.

* Make WE2E work with yaml configs.

* Make yaml default config format.

* Bug fix in run_WE2E script.

* Add utility to check validity of yaml config file.

* Add config utility interface in ush directory.

* Remove unused check_expt_config_vars script.

* Add description to default config.

* Reorganize source_config.

* Add XML as one of the config formats.

* Update custom_ESGgrid config.

* Bug fix due to update.

* Change ensemble seed.

* Change POST_OUTPUT group due to merge.

* Make xml and ini configs work.

* Maintain config structure down to var_defns.

* Add function to load structured shell config, put description under metadata

* Flatten dicts before importing env now that shell config is structured.

* Support python regex for selecting dict keys.

* Add capability of sourcing task specific portion of config file.

* Access var_defns via env variable.

* Make names of tasks consistent with ex- and j- job script names.

* Append pid to temp file.

* Prettify user config, don't use " in xml texts.

* Compare timestamp of csv vs all files instead of directory.

* Fixes for some pylint suggestions.

* Convert new configs to yaml.

* Format python files with black (no functional change).

* More readable yaml/json formats by using more data types.
Only datetime type is now in quotes.

* More readable yaml config files for WE2E and default configs.

* Make config_defaults itself more readable.

* Correct pyyaml list indentation issue.

* Fix indentation in all config files.

* Use unquoted WTIME in config_defaults

* Cosmetic changes.

* Fix due to merge.

* Make __init__.py clearer.

* Fixes due to merge.

* Minor edits of comments.

* Remove wcoss_dell_p3 from workflow (ufs-community#810)

* remove wcoss_dell_p3

* remove block for tide and gyre

* Replace deprecated NCAR python environment with conda on Cheyenne (ufs-community#812)

* Fix issue on get_extrn_lbcs when FCST_LEN_HRS>=40 with netcdf (ufs-community#814)

* activate b file on hpss for >40h

* add a new we2e test for fcst_len_hrs>40

* reduce fcst time for we2e

* Convert new test case to yaml.

* Fix formatting due to merge.

* Convert new test case to yaml.

* Fix unittest.

* Merge develop

* Remove exception logic from __init__.py

* Minor change to cmd concat.

* Make grid gen methods return dictionary, simplifies code a lot.

* Add a comment why we are suppressing yaml import exception.

* Minor change to beautify unittest output.

* Add status badge for functional tests.

* Reorder tasks in config_default and we2e test cases to match order in FV3LAM.xml

* Keep single quotes and newlines in we2e test description.

* Revert back to not rounding to 10 digits

Co-authored-by: Chan-Hoo.Jeon-NOAA <60152248+chan-hoo@users.noreply.github.com>
Co-authored-by: Michael Kavulich <kavulich@ucar.edu>
@christinaholtNOAA christinaholtNOAA deleted the fix_retrieve_data branch July 2, 2024 16:48
Labels
run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests
3 participants