Slow runtime for regrid_data_plane MAX interpolation in MET-10.0.0_beta5 #1778

Closed
21 tasks
LoganDawson-NOAA opened this issue May 5, 2021 · 6 comments · Fixed by #1785
Assignees
Labels
component: code optimization (Code optimization issue); MET: Library Code; priority: blocker (Blocker); requestor: NOAA/EMC (NOAA Environmental Modeling Center); type: bug (Fix something that is not working)
Milestone

Comments

@LoganDawson-NOAA

LoganDawson-NOAA commented May 5, 2021

Describe the Problem

When testing the beta5 release for METplus-4.0.0 and MET-10.0.0, I discovered a regrid_data_plane MAX interpolation step that is taking ~8x longer (~12 minutes vs. 90 seconds) to run with MET-10.0.0_beta5 than with MET-9.1. When I initially ran this step in a batch job and from the command line, I thought MET was hanging and unable to complete the interpolation. The input file for this step is a file output by regrid_data_plane at a prior step, so I knew MET shouldn't have a problem reading the file. I ran the command again and realized the interpolation command is simply taking longer to run.

Expected Behavior

With MET-9.1, this regrid_data_plane MAX interpolation step took on the order of 90 seconds to complete. With MET-10.0.0_beta5, the same step took upwards of 12 minutes to complete.

Environment

Describe your runtime environment:
1. Machine: WCOSS Dell
2. OS: Linux
3. MET-10.0.0_beta5

To Reproduce

Describe the steps to reproduce the behavior:
1. Interpolate the GRIB2 data to G227 using the following command:
met/10.0.0-beta5/bin/regrid_data_plane -v 5 -method BUDGET -width 2 -field 'name="MergedReflectivityQCComposite"; level="Z500";' -name MergedReflectivityQCComposite MergedReflectivityQCComposite_00.50_20210504-190000.grib2 G227.nc MergedReflectivityQCComposite_00.50_20210504-190000_g227.nc

2. Run MAX interpolate on the prior step's output using the following command:
met/10.0.0-beta5/bin/regrid_data_plane -v 5 -method MAX -width 17 -field 'name="MergedReflectivityQCComposite"; level="Z500";' -name MergedReflectivityQCComposite MergedReflectivityQCComposite_00.50_20210504-190000_g227.nc G227.nc MergedReflectivityQCComposite_MAX40_20210504-190000_g227.nc

3. See the significantly slower runtime for the second command. Running the command(s) within a script that can provide timing info would be useful.

4. Running the same two commands with a MET-9.1 executable will show that the runtime is approximately the same for the first interpolation step, but the second step is considerably slower with MET-10.0.0_beta5 than with MET-9.1.
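One way to collect the timing info mentioned in step 3 is a small wrapper function. This is a minimal sketch: `time_cmd` is a hypothetical helper name, not part of MET or METplus, and it only reports whole seconds of wall-clock time.

```shell
#!/bin/sh
# Hypothetical helper (not part of MET): run a command and report
# its wall-clock time in whole seconds along with its exit status.
time_cmd() {
    label=$1
    shift
    start=$(date +%s)
    if "$@"; then
        status=0
    else
        status=$?
    fi
    end=$(date +%s)
    echo "${label}: $((end - start))s (exit status ${status})"
    return "$status"
}

# Harmless demonstration; replace "true" with the full
# regrid_data_plane command line from step 1 or step 2.
time_cmd "demo" true
```

Running both steps through the same wrapper with each MET version gives directly comparable numbers. The shell's built-in `time` keyword is an alternative when sub-second resolution is needed.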

Both the input file (for step 1 and step 2) and the "to grid" file have been posted to the ftp.rap.ucar.edu site.

These commands were generated by METplus-4.0.0_beta5 and METplus-3.1 in my test environment. I can provide the METplus conf files and commands if those would be helpful, but the issue appears isolated to MET itself. I have however attached METplus log files, which provide the output I've seen and the timing information as well.

master_metplus.METplus-3.1.log.20210505164635.txt
master_metplus.METplus-4.0.0-beta5.log.20210504230632.txt

Relevant Deadlines

Investigating before MET-10.0.0 official release would be helpful.

Funding Source

NONE

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Review projects and select relevant Repository and Organization ones or add "alert:NEED PROJECT ASSIGNMENT" label
  • Select milestone to relevant bugfix version

Define Related Issue(s)

Consider the impact to the other METplus components.

Bugfix Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of main_<Version>.
    Branch name: bugfix_<Issue Number>_main_<Version>_<Description>
  • Fix the bug and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into main_<Version>.
    Pull request: bugfix <Issue Number> main_<Version> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s), Project(s), Milestone, and Linked issues
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Complete the steps above to fix the bug on the develop branch.
    Branch name: bugfix_<Issue Number>_develop_<Description>
    Pull request: bugfix <Issue Number> develop <Description>
  • Close this issue.
LoganDawson-NOAA added the "type: bug" label May 5, 2021
JohnHalleyGotway added this to the MET 10.0.0 milestone May 5, 2021
@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented May 5, 2021

Pulled the sample data to my local machine and am testing there. Checked run times by varying the MAX regridding widths:

  • width = 3 has runtimes of 0m28.264s and 0m30.619s for main_v9.1/develop.
  • width = 9 has runtimes of 3m6.029s and 3m4.778s for main_v9.1/develop.
  • width = 17 has runtimes of 12m7.563s and 11m51.669s for main_v9.1/develop.

Running the same tests on WCOSS/Venus.

cd /gpfs/dell2/emc/verification/noscrub/John.H.Gotway/MET/MET_Help/dawson_data_20210505
module use /gpfs/dell2/emc/verification/noscrub/emc.metplus/modulefiles
module load met/9.1.2
./run_pdp.sh
  • width = 3 has runtimes of 0m32.130s and 0m31.453s for 9.1.2/10.0.0-beta5.
  • width = 9 has runtimes of 3m17.513s and 3m18.486s for 9.1.2/10.0.0-beta5.
  • width = 17 has runtimes of 12m23.442s and 12m25.119s for 9.1.2/10.0.0-beta5 (still running)
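The runtimes above grow quickly with the width. The sketch below divides each main_v9.1 laptop runtime by width squared, on the assumption (not confirmed in this thread) that MAX examines a width x width square of points around each grid point. The helper names `to_seconds` and `per_point` are made up for illustration.

```shell
#!/bin/sh
# Convert a bash "time" style value such as 3m6.029s to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the runtime divided by width^2.
# ASSUMPTION: the neighborhood is a width x width square.
# $1 = width, $2 = runtime such as 3m6.029s
per_point() {
    secs=$(to_seconds "$2")
    awk -v w="$1" -v s="$secs" 'BEGIN {
        printf "width %2d: %3d points, %.2f s per point\n", w, w * w, s / (w * w)
    }'
}

# main_v9.1 runtimes from the local tests listed above.
per_point 3  0m28.264s
per_point 9  3m6.029s
per_point 17 12m7.563s
```

The per-point figure stays within roughly 2.3 to 3.1 seconds across the three widths, which would be consistent with cost proportional to neighborhood area rather than a fixed overhead.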

@LoganDawson-NOAA I'm not saying these are desirable runtimes, but I'm just not able to replicate the discrepancy between versions that you describe. I could proceed with profiling the code to see if/how runtimes could be improved in general, but I'm not convinced that this is a new problem introduced by development for MET version 10.0.0.

Is there any other way you'd recommend I test this out?

Note that this is a very special case of calling regrid_data_plane. You're using it to SMOOTH the data rather than actually regrid it. I'd recommend we consider the following enhancements in a future version:

  • Complete development for MET #1759 (Speed up the MODE convolution step using the algorithm for the fractional coverage field) to make the convolution (i.e. smoothing) step in MODE as fast as we can.
  • Make sure these data smoothing operations are defined in a common place in the library code.
  • Enhance regrid_data_plane to check for input grid = output grid. In that case, call the much faster library smoothing functions instead of the slower regridding functions.
  • Profile the code to see how often the code is range checking the x, y values passed to DataPlane::get(x,y). Doing that millions of times slows things down! Change the code to minimize those calls.
  • Consider using MPI and/or OpenMP to spread processing across multiple processors.

@LoganDawson-NOAA
Author

@JohnHalleyGotway you're correct that this isn't a new problem introduced in MET-10.0.0. It is however a behavior that's come in after the initial release of MET-9.1, which is the version I've been using since last fall.

I grabbed your run_pdp.sh script and ran a slightly modified version to test MET-9.1, MET-9.1.1, MET-9.1.2, and MET-10.0.0_beta5 (all with width = 17).

  • MET-9.1: 1m15.684s
  • MET-9.1.1: 12m21.113s
  • MET-9.1.2: 12m20.935s
  • MET-10.0.0_beta5: 12m26.169s
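To put the four runtimes above on a common scale, the sketch below converts each to seconds and prints it relative to the first row (MET-9.1). `slowdown` is a hypothetical helper name that reads "version runtime" pairs on standard input.

```shell
#!/bin/sh
# Hypothetical helper: read "version runtime" pairs and print each
# runtime as a multiple of the first row's runtime.
slowdown() {
    awk '{
        split($2, t, /[ms]/)          # "12m21.113s" -> t[1]=12, t[2]=21.113
        secs = t[1] * 60 + t[2]
        if (NR == 1) base = secs      # first row is the baseline
        printf "%s %.1fx\n", $1, secs / base
    }'
}

# width = 17 runtimes reported above.
slowdown <<'EOF'
MET-9.1 1m15.684s
MET-9.1.1 12m21.113s
MET-9.1.2 12m20.935s
MET-10.0.0_beta5 12m26.169s
EOF
```

With these inputs, the three later versions all come out just under ten times slower than MET-9.1.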

I didn't test smaller widths last night, but I think you should see the faster runtime with the met/9.1 executable. Maybe the source of the performance change will be easier to track down between the MET-9.1 and MET-9.1.1 code?

And my apologies for not having spotted this sooner with the previous bugfix releases. With my routine verification, it's typically easier to stick with the major releases unless there's a specific new capability I need in a bugfix release. I'm now recognizing the necessity and value in at least fully testing my entire suite of use cases, even with bugfix releases, to make sure prior performance and capabilities are unchanged.

Your idea on moving this smoothing process to a faster capability makes a lot of sense, and I'd definitely be happy to test it with data from my use case.

@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented May 6, 2021

@LoganDawson-NOAA this is great feedback. Thanks for narrowing it down! I'll debug the difference in runtimes between 9.1 and 9.1.1.
...
Testing the runtime of 9.1 vs 9.1.1 locally on my laptop reveals no apparent discrepancy:
For MET-9.1 and 9.1.1, I got runtimes of 11m9.961s and 11m55.222s.
...
But testing on WCOSS, I do see a major discrepancy between the two.
For MET-9.1 and 9.1.1, I got runtimes of 1m16.568s and 12m24.640s.
...
I did also check to make sure that 9.1 and 9.1.1 produce the same result by running pcp_combine.

pcp_combine -subtract rdp_9.1.nc rdp_9.1.1.nc rdp_9.1_minus_9.1.1.nc -field 'name="MergedReflectivityQCComposite"; level="(*,*)";'

The differences in the output are all 0 or bad data values.
...
There was no obvious code change for 9.1.1 to explain this:
Release Notes: https://met.readthedocs.io/en/v9.1.3/Users_Guide/overview.html#version-9-1-1-release-notes-20201118
Closed Issues: https://github.com/dtcenter/MET/milestone/68?closed=1
Perhaps the change is due to how 9.1 and 9.1.1 were configured/compiled on WCOSS?

@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented May 6, 2021

Compare compilation options:

I compared the compilation flags in config.log for the build configurations of 9.1 and 9.1.1 on WCOSS:

egrep "CFLAGS=|CXXFLAGS=|FFLAGS=" /gpfs/dell2/emc/verification/noscrub/emc.metplus/met/9.1/met-9.1/config.log

CFLAGS='-D__64BIT__'
CXXFLAGS='-D__64BIT__'
FFLAGS='-g'

egrep "CFLAGS=|CXXFLAGS=|FFLAGS=" /gpfs/dell2/emc/verification/noscrub/emc.metplus/met/9.1.1/met-9.1.1/config.log

CFLAGS='-g -O2'
CXXFLAGS='-g'
FFLAGS='-g'

I doubt the -D__64BIT__ flag is the culprit here, but compiling with -g may very likely impact runtime!
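The comparison above can be scripted so that only the differing flags are shown. This is a sketch: `show_flags` and `compare_flags` are hypothetical helper names, and anchoring the pattern at the start of the line is an assumption about how config.log records these variables.

```shell
#!/bin/sh
# Hypothetical helper: print the compiler-flag assignments from a config.log.
# ASSUMPTION: the variables appear at the start of a line.
show_flags() {
    grep -E '^(CFLAGS|CXXFLAGS|FFLAGS)=' "$1" | sort
}

# Hypothetical helper: diff the flag settings of two config.log files.
# Returns 0 when the flags match and non-zero when they differ.
compare_flags() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    show_flags "$1" > "$tmp_a"
    show_flags "$2" > "$tmp_b"
    if diff "$tmp_a" "$tmp_b"; then
        status=0
    else
        status=$?
    fi
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}

# Usage: compare_flags <path to 9.1 config.log> <path to 9.1.1 config.log>
```

For the two builds above this would report the CFLAGS and CXXFLAGS lines and omit FFLAGS, which is identical in both.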

Test removing -g:

I compiled met-9.1 on my laptop without -g, but the runtime was still 12m1.905s. This is compiled with GNU.

Next, I compiled MET's develop branch on WCOSS with/without -g in:
/gpfs/dell2/emc/verification/noscrub/John.H.Gotway/MET/MET_development/MET-no-dash-g/met/bin/regrid_data_plane

The run completes in a (relatively) blazing 1m16.978s without -g vs 12m26.862s with -g.

So the culprit here is compiling with the -g option.

@jprestop should we remove '-g' from the default compilation settings in configure.ac?

: ${CXXFLAGS="-g"}

Any idea about this comment?

# The CXXFLAGS default to "-O2 -g".  The optimization is causing
# problems so just set it to "-g" if the user hasn't overridden it
# themselves.
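For readers unfamiliar with the `: ${CXXFLAGS="-g"}` line, it is the standard shell default-assignment idiom: the value is assigned only when the variable is unset. The sketch below shows the three cases in plain shell, outside of autoconf.

```shell
#!/bin/sh
# Case 1: variable is unset, so the default is assigned.
unset CXXFLAGS
: ${CXXFLAGS="-g"}
echo "unset -> '${CXXFLAGS}'"

# Case 2: variable already set by the user, so it is left alone.
CXXFLAGS="-O3"
: ${CXXFLAGS="-g"}
echo "set   -> '${CXXFLAGS}'"

# Case 3: variable set but empty. With "=" (no colon) an empty
# value still counts as set, so the default is NOT applied.
CXXFLAGS=""
: ${CXXFLAGS="-g"}
echo "empty -> '${CXXFLAGS}'"
```

The third case matters when a user deliberately exports an empty CXXFLAGS: the empty value is respected rather than replaced by the default.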

@LoganDawson-NOAA
Author

@JohnHalleyGotway thanks for tracking down the runtime difference as a difference in the compilation settings used for the older and newer versions. If it's possible to compile 10.0.0 without the -g option, that'd be great, but I totally understand if that option is necessary to allow some other capability to work correctly.

@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented May 7, 2021

I am changing the handling of CFLAGS, FFLAGS, and CXXFLAGS in configure.ac.
Prior to this change, running ./configure revealed their default settings:

egrep "CFLAGS=|FFLAGS=|CXXFLAGS=" config.log
CFLAGS='-g -O2'
CXXFLAGS='-g'
FFLAGS='-g -O2'

Next, I updated configure.ac.

# For GNU compilers, CFLAGS, CXXFLAGS, and FFLAGS default to "-O2 -g".
# The CXXFLAGS "-O2" optimization has caused problems in the past.
# For Intel compilers, "-g" slows down runtimes considerably (MET #1778).
# For development, retain the "-g" option. Otherwise, discard it.

AM_COND_IF([ENABLE_DEVELOPMENT], [: ${CFLAGS="-g -O2"}], [: ${CFLAGS="-O2"}])
AM_COND_IF([ENABLE_DEVELOPMENT], [: ${CXXFLAGS="-g"}  ], [: ${CXXFLAGS=""}] )
AM_COND_IF([ENABLE_DEVELOPMENT], [: ${FFLAGS="-g -O2"}], [: ${FFLAGS="-O2"}])

And reran bootstrap and configure with MET_DEVELOPMENT not set:

unset MET_DEVELOPMENT; ./bootstrap; ./configure --prefix=`pwd`
egrep "CFLAGS=|FFLAGS=|CXXFLAGS=" config.log
CFLAGS='-O2'
CXXFLAGS=''
FFLAGS='-O2'

With MET_DEVELOPMENT set:

export MET_DEVELOPMENT=true; ./bootstrap; ./configure --prefix=`pwd`
egrep "CFLAGS=|FFLAGS=|CXXFLAGS=" config.log
CFLAGS='-g -O2'
CXXFLAGS='-g'
FFLAGS='-g -O2'

And testing the environment variable overrides:

export CFLAGS='-my_cflags'; export CXXFLAGS='-mycxxflags'; export FFLAGS='-my_fflags'
./bootstrap; ./configure --prefix=`pwd`
egrep "CFLAGS=|FFLAGS=|CXXFLAGS=" config.log
CFLAGS='-my_cflags'
CXXFLAGS='-mycxxflags'
FFLAGS='-my_fflags'

So this is all working as expected.
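The selection logic tested above can be mimicked in plain shell for illustration. This is only a sketch of the behavior, not the configure.ac code itself: `pick_default_flags` is a hypothetical function, and its "yes"/"no" argument stands in for the ENABLE_DEVELOPMENT conditional.

```shell
#!/bin/sh
# Hypothetical stand-in for the AM_COND_IF lines: choose default
# flags depending on whether development is enabled, without
# overriding anything the user has already set.
# $1 = "yes" when development is enabled, anything else otherwise.
pick_default_flags() {
    if [ "$1" = "yes" ]; then
        : ${CFLAGS="-g -O2"}
        : ${CXXFLAGS="-g"}
        : ${FFLAGS="-g -O2"}
    else
        : ${CFLAGS="-O2"}
        : ${CXXFLAGS=""}
        : ${FFLAGS="-O2"}
    fi
    echo "CFLAGS='${CFLAGS}' CXXFLAGS='${CXXFLAGS}' FFLAGS='${FFLAGS}'"
}

# Each case runs in a subshell so its assignments do not leak
# into the next case.
( unset CFLAGS CXXFLAGS FFLAGS; pick_default_flags no )
( unset CFLAGS CXXFLAGS FFLAGS; pick_default_flags yes )
( CFLAGS='-my_cflags'; CXXFLAGS='-mycxxflags'; FFLAGS='-my_fflags'; pick_default_flags no )
```

The three calls correspond to the three configure runs above: development off, development on, and user-supplied overrides.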

JohnHalleyGotway added a commit that referenced this issue May 7, 2021
…ing development, compile with the -g debug option. Otherwise, remove it by default.
JohnHalleyGotway added a commit that referenced this issue May 7, 2021
…so, change the default MET version from 8.1 to development.
@JohnHalleyGotway JohnHalleyGotway mentioned this issue May 7, 2021
12 tasks
@JohnHalleyGotway JohnHalleyGotway linked a pull request May 7, 2021 that will close this issue
12 tasks
JohnHalleyGotway added a commit that referenced this issue May 8, 2021
* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URLs in the README and configure.ac file. Also, change the default MET version from 8.1 to development.
JohnHalleyGotway added a commit that referenced this issue May 24, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Renamed to read_tmp_ascii to use for point point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because of uneven spacing

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* 454 Deleted instead of commenting out

* 454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat output.

* Per #1735, print a warning message if the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input is a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, update the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separate ATCF line types for rapid intensification (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_length (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URLs in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made changes suggested by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Updated MET to compile using GNU version 10 compilers and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmp_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added associated config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* GitHub Issue #1801: Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: bikegeek <minnawin@ucar.edu>
Co-authored-by: bikegeek <3753118+bikegeek@users.noreply.github.com>
JohnHalleyGotway added a commit that referenced this issue Jun 1, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for both point and MPR.

* Renamed to read_tmp_ascii to use for both point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because of uneven spacing

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* #1454 Deleted instead of commenting out

* #1454 Deleted instead of commenting out
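
The #1454 change from a fixed tolerance to one based on a percentage of the delta can be illustrated with a short sketch. This is Python for illustration only, not the MET C++ code; the function name and the 1% default are assumptions that do not come from the commit messages.

```python
def is_evenly_spaced(values, rel_tol=0.01):
    """Check lat/lon spacing against a tolerance scaled by the first delta.

    rel_tol is a fraction of the first delta (1% here is an ASSUMED default).
    """
    if len(values) < 3:
        return True
    delta = values[1] - values[0]
    allowed = abs(delta) * rel_tol
    # Compare every consecutive difference with the first one
    return all(abs((b - a) - delta) <= allowed
               for a, b in zip(values, values[1:]))


print(is_evenly_spaced([0.0, 0.25, 0.5, 0.75]))   # regular 0.25-degree spacing
print(is_evenly_spaced([0.0, 1.0, 2.0, 3.5]))     # last step is 50% too large
```

Scaling the tolerance by the delta means a coarse grid and a fine grid are held to the same relative standard, which a single fixed tolerance cannot do.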

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.
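
The single-bin versus multi-bin behavior described for derive_climo_vals() can be sketched as below. This is a hedged Python illustration, not the MET implementation: the function name is hypothetical, and sampling a normal distribution at the interior equal-probability boundaries is an assumption chosen only because it would yield 9 values for 10 bins, in line with the "9-member" log message above.

```python
from statistics import NormalDist


def derive_climo_vals_sketch(mean, stdev, n_bin):
    """Return climatological reference values (hypothetical helper).

    n_bin == 1: just the climo mean, as described in the commit message.
    n_bin  > 1: ASSUMED scheme, sampling the normal inverse CDF at the
                interior bin boundaries k / n_bin for k = 1 .. n_bin - 1.
    """
    if n_bin == 1:
        return [mean]
    dist = NormalDist(mean, stdev)
    return [dist.inv_cdf(k / n_bin) for k in range(1, n_bin)]


single = derive_climo_vals_sketch(280.0, 5.0, 1)    # single deterministic reference
members = derive_climo_vals_sketch(280.0, 5.0, 10)  # climatological ensemble
print(single, len(members))
```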

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.
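
The leading-zero problem mentioned here is easy to reproduce. The Python sketch below is illustrative only; both helper names and the exact string-key format are assumptions, loosely modeled on the VAR_i and VAR_j naming in the commit message.

```python
def int_key(i, j):
    """Naive joint key: concatenate the 0-based indices and convert to an integer."""
    return int(f"{i}{j}")


def str_key(i, j):
    """String key that keeps both indices distinct (format is an ASSUMPTION)."""
    return f"VAR_{i}_VAR_{j}"


# The leading zero is lost, so the pair (0, 1) collapses to the plain integer 1
print(int_key(0, 1), str_key(0, 1))
```

Making the indices 1-based, as the commit message suggests, would avoid the lost leading zero for the integer form.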

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.
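
The include/exclude string filtering added for #1733 can be pictured with a small sketch. This is Python for illustration, not the tc_stat or stat_analysis source: the function name, the dictionary layout, and the sample column values are all hypothetical, with inc_map and exc_map standing in for the IncMap and ExcMap names mentioned above.

```python
def keep_line(line, inc_map, exc_map):
    """Decide whether a line passes the string filters (hypothetical helper).

    line:    dict mapping column name -> string value
    inc_map: dict mapping column name -> set of values that are required
    exc_map: dict mapping column name -> set of values that are rejected
    """
    for col, allowed in inc_map.items():
        if line.get(col) not in allowed:
            return False
    for col, rejected in exc_map.items():
        if line.get(col) in rejected:
            return False
    return True


line = {"BASIN": "AL", "LEVEL": "HU"}                      # made-up sample values
print(keep_line(line, {"BASIN": {"AL"}}, {"LEVEL": {"TD"}}))
```

Keeping the two maps separate mirrors the naming split between the include and exclude options, so one column can carry both kinds of filter without ambiguity.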

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state that genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_track_to_point option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in write_job_aggr_ssvar where I was using the wrong row counter when writing .stat output.

* Per #1735, print a warning message if the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print a consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input is a regular numpy array instead of a masked array.

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, update the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separate ATCF line types for rapid intensification (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_length (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issues caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URLs in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made suggested changes by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Added update MET to compile using GNU version 10 compilers and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmp_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added associated config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* GitHub Issue #1801: Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

* Migrate issue and PR template changes from PR #1803 into the develop branch so that they'll be available for future releases.

* Per met-help question (https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99964) clarify the description of the obs_thresh option.

* Update README.md

Adding GitHub Discussions information

* changed non-unicode apostrophe and fixed typo in URL

* Feature 1581 api point obs (#1812)

* #1581 Initial release

* #1581 Added met_nc_point_obs.cc met_nc_point_obs.h

* Removed nc_ in function names and moved them to the struct members

* #1581 Added HDR_TYPE_ARR_LEN

* #1581 Changed API calls (API names)

* #1581 Cleanup

* #1581 Removed duplicated definitions:hdr_arr_len, hdr_typ_arr_len, and obs_arr_len

* #1581 Removed duplicated definitions:hdr_arr_len and obs_arr_len

* #1581 Removed duplicated definitions: str_len, hdr_arr_len, and obs_arr_len

* Added vx_nc_obs library

* #1581 Using common APIs

* #1581 Corrected API calls because of renaming for common APIs

* #1581 Moved function from nc_obs_util to nc_point_obs2

* #1581 Renamed met_nc_point_obs to nc_point_obs

* #1581 API ica changed from obs_vars to nc_point_obs

* #1581 Initial release

* #1581 Renamed from met_nc_point_obs to nc_point_obs

* 1581 Renamed met_nc_point_obs to nc_point_obs

* Per #1581, update the Makefile.am for lidar2nc to fix a linker error. Unfortunately, the vx_config library now depends on the vx_gsl_prob library. threshold.cc in vx_config includes a call to normal_cdf_inv(double, double, double) which is defined in vx_gsl_prob. This adds to the complexity of dependencies for MET's libraries. Just linking to -lvx_gsl_prob one more time does fix the linker problem but doesn't solve the messy dependencies.

* #1581 Added method for NcDataBuffer

* Cleanup

* #1581 Cleanup

* #1581 Cleanup

* #1591 Cleanup

* #1591 Corrected API

* #1581 Avoid reading PB header twice

* #1581 Warning if PB header is not defined but read_pb_hdr_data is called

* #1581 Cleanup libraries

* 1581 cleanup

* 1581 cleanup

* 1581 cleanup

* #1581 Cleanup for Fortify (removed unused variables)

* #1581 Cleanup

* #1581 Cleanup

* #1581 Use MetNcPointObsIn instead of MetNcPointObs

* #1581 Use MetNcPointObsOut instead of MetNcPointObs2Write

* #1581 Separated nc_point_obs2.cc to nc_point_obs_in.cc and nc_point_obs_out.cc

* #1581 Renamed nc_point_obs2.cc to nc_point_obs_in.cc And added add nc_point_obs_in.h nc_point_obs_out.h nc_point_obs_out.cc

* #1581 Removed APIs related with writing point obs

* #1581 Changed copyright years

* #1581 Cleanup

* #1581 Updated copyright year

* #1581 Cleanup

* #1581 Renamed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Renamed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Added more APIs

Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: bikegeek <minnawin@ucar.edu>
Co-authored-by: bikegeek <3753118+bikegeek@users.noreply.github.com>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
JohnHalleyGotway added a commit that referenced this issue Jun 13, 2021
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point and MPR.

* Renamed to read_tmp_ascii to use for point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because the data are not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance
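The switch from a fixed tolerance to a percentage of the delta can be sketched as follows. This is only an illustration in Python, not the MET source: the function name `is_evenly_spaced` and the default `pct_tol` of 1 percent are assumptions made for the example.

```python
def is_evenly_spaced(coords, pct_tol=1.0):
    """Check that each step between neighbouring coordinates stays within
    pct_tol percent of the first step (hypothetical helper, not MET code)."""
    if len(coords) < 3:
        return True  # fewer than two steps, nothing to compare
    delta = coords[1] - coords[0]
    if delta == 0:
        return False  # a zero step cannot define a regular grid
    # The allowed deviation scales with the grid spacing itself.
    allowed = abs(delta) * pct_tol / 100.0
    return all(abs((b - a) - delta) <= allowed
               for a, b in zip(coords[1:], coords[2:]))


if __name__ == "__main__":
    print(is_evenly_spaced([0.0, 0.5, 1.0, 1.5]))
    print(is_evenly_spaced([0.0, 0.5, 1.0, 1.6]))
```

Because the allowance scales with the spacing, a coarse grid and a fine grid are judged by the same relative standard, which a single fixed tolerance cannot do.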

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* 454 Deleted instead of commenting out

* 454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.
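A rough Python sketch of the branching described above: one bin returns the climatological mean, more bins return a sample of the distribution. The commit message does not say how the values are sampled, so the equally spaced probabilities and the normal distribution used here are assumptions, and this `derive_climo_vals` is a stand-in rather than the MET function.

```python
from statistics import NormalDist


def derive_climo_vals(mean, stdev, n_bin):
    """Hypothetical stand-in for the utility named in the commit message."""
    if n_bin == 1:
        # Single reference value: just the climatological mean.
        return [mean]
    dist = NormalDist(mu=mean, sigma=stdev)
    # Assumption: sample at equally spaced probabilities strictly inside (0, 1).
    probs = [(i + 1) / (n_bin + 1) for i in range(n_bin)]
    return [dist.inv_cdf(p) for p in probs]


if __name__ == "__main__":
    print(derive_climo_vals(10.0, 2.0, 1))
    print(len(derive_climo_vals(10.0, 2.0, 9)))
```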

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as the integer 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.
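The leading-zero problem with a zero-based integer key can be shown in a few lines of Python. The three helper names below are made up for this illustration, and the exact string format MET uses for its VAR_i and VAR_j keys is not given in the commit message, so `string_key` is only one plausible shape.

```python
def naive_key(i, j):
    # Concatenate zero-based indices as digits, then cast to int.
    # The leading zero is lost, so (0, 1) collapses to plain 1.
    return int(f"{i}{j}")


def string_key(i, j):
    # String keys keep both indices intact (format is an assumption).
    return f"VAR_{i}_VAR_{j}"


def one_based_key(i, j):
    # The alternative mentioned above: shift both indices by one.
    # Only unambiguous while each shifted index is a single digit.
    return int(f"{i + 1}{j + 1}")


if __name__ == "__main__":
    print(naive_key(0, 1), string_key(0, 1), one_based_key(0, 1))
```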

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.
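As a rough picture of how an include map and an exclude map combine, here is a minimal Python sketch. `keep_line`, the dict-based line representation, and the sample column values are all hypothetical; the real filtering lives in the MET C++ code and may differ in detail.

```python
def keep_line(line, inc_map, exc_map):
    """Return True if a line passes both string filters (illustrative only).

    line    -- dict mapping column name to string value
    inc_map -- column name -> set of values that are allowed
    exc_map -- column name -> set of values that are rejected
    """
    for col, allowed in inc_map.items():
        if line.get(col) not in allowed:
            return False
    for col, banned in exc_map.items():
        if line.get(col) in banned:
            return False
    return True


if __name__ == "__main__":
    rows = [
        {"AMODEL": "GFS", "BASIN": "AL"},
        {"AMODEL": "GFS", "BASIN": "EP"},
        {"AMODEL": "HWRF", "BASIN": "AL"},
    ]
    inc = {"AMODEL": {"GFS"}}   # in the spirit of -column_str
    exc = {"BASIN": {"EP"}}     # in the spirit of -column_str_exc
    print([r for r in rows if keep_line(r, inc, exc)])
```

Keeping the two maps separate avoids overloading one option name with two meanings, which is the conflict the commit describes.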

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

* Feature GitHub actions (#1742)

* Adding files to build documentation via GitHub Actions

* Removing html_theme_options

* Removed warnings.log from help section

* Feature 1575 large_diffs (#1741)

* Per #1575, add mpr_column and mpr_thresh entries to all of the Grid-Stat and Point-Stat config files.

* Per #1575, define config strings to be parsed from the config files.

* Per #1575, store col_name_ptr and col_thresh_ptr in PairBase. They are being used for PairDataPoint to do MPR filtering in Grid-Stat and Point-Stat. But they could eventually be extended to filter ORANK columns for Ensemble-Stat.

* Per #1575, add MPR filtering logic to pair_data_point.cc. Include filtering logic in PairDataPoint instead of VxPairDataPoint since Grid-Stat uses PairDataPoint.

* Per #1575, update point_stat to parse the mpr_column and mpr_thresh config file options. Include the MPR rejection reason code counts in the log output.

* Per #1575, updated Grid-Stat to parse mpr_column and mpr_thresh options.

* Per #1575, update Point-Stat to store mpr_sa and mpr_ta locally and then call set_mpr_filt() after the VxPairDataPoint object has been sized and allocated.

* Per #1575, renamed PairDataEnsemble::subset_pairs() to subset_pairs_obs_thresh() to be a little more explicit about things. I'll do the same for PairDataPoint using names subset_pairs_cnt_thresh() and subset_pairs_mpr_thresh().

* Per #1575, some cleanup, moving check_fo_thresh() utility function from vx_config to vx_statistics library.

* Per #1575, when implementing this for Grid-Stat, I realized that there isn't much benefit in storing col_name_ptr and col_name_thresh in PairBase. These changes remove that.

* Per #1575, updating pair_data_point.h/.cc to handle the subsetting of data based on the MPR thresh.

* Per #1575, rename subset_pairs() to subset_pairs_cnt_thresh() to be a bit more explicit with the naming conventions.

* Per #1575, no real changes here. Just reorganizing the location of the mpr_sa and mpr_ta members.

* Per #1575, make the subset_pairs() utility function a member function of the PairDataPoint class named subset_pairs_cnt_thresh() and update the application code to call it.

* Per #1575, need to actually set the mpr_thresh!

* Per #1575, update subset_pairs_mpr_thresh() to make sure the StringArray and ThreshArray lengths are the same.

* Per #1575, replace PairDataPoint::subset_pairs_mpr_thresh() with a utility function named apply_mpr_thresh_mask(). This is for Grid-Stat to apply the mpr_thresh settings after the DataPlane pairs have been created but prior to applying any smoothing operations.

* Per #1575, add documentation about mpr_column and mpr_thresh.

* Per #1575, mpr_columns can also include CLIMO_CDF.

* Per #1575, add tests for Grid-Stat and Point-Stat to exercise the mpr_column and mpr_thresh config file options.

* Feature 1319 no pickle (#1720)

* Try path insert.

* sys.path insert.

* Per #1319, adding David's changes back into the feature_1319_no_pickle branch. It compiles but TEST: python_numpy_plot_data_plane_pickle fails when testing on my Mac. Committing now to test on kiowa.

* Per #1319, small update to write_tmp_dataplane.py script. Had a couple of if statements that should really be elif.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>

* Feature 1736 out_stat (#1744)

* Per #1736, if -out_stat was used for aggregate or aggregate_stat jobs, do not write output to the -out or log output.

* Per #1736, clarify stat_analysis documentation for -out_stat option.

* Per #1736, for jobs which can write .stat output, don't waste time populating the output AsciiTable unless it's actually going to be written.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Per #1319, this is a hotfix to the develop branch. While running unit_python.xml works via the command line, it fails when run through cron. The problem is the PATH setting. Need to have the anaconda bin directory in the path for it to succeed. Adding that for the single test.

* Just lining up a log message in the output of gen_vx_mask.

* Per #1319, setting PATH as an envvar might cause problems. All variables set prior to the test are unset afterwards! So we'd run all the rest of the tests after unit_python.xml with an empty path. That would likely cause any subsequent call to Rscript to fail. Recommend tightening up this logic when we move these tests to GHA.

* Trying to get the PATH setting correct for unit_python.xml.

* Changed weblink for METplus documentation

* per #1319 added netCDF4 python package to MET docker image so it is available for python embedding cases that use MET_PYTHON_EXE

* Feature 1747 pylonglong (#1748)

* Per #1747, update MET to interpret longlong values as integers. NetCDF file attributes that have an LL suffix are read into python as numpy.int64 objects. Right now MET fails when trying to read those as integers. Update the parsing logic to interpret those as ints.

* Per #1747, since MET can now interpret both long and longlong's as ints, there's no need to cast nx and ny to ints in the read_tmp_dataplane.py script anymore.

* Per #1747, this is slightly unrelated. But after installing the netCDF4 module on kiowa for /usr/local/met-python3/bin/python3, we should no longer need a custom PATH setting to get unit_python.xml to work. Reverting the change I made to it a couple of days ago to get it working.

* Hotfix for the develop branch in tc_pairs.cc. The METplus unit tests kept failing through GHA with a divide by zero error. It occurs in compute_track_err() but only for a very specific set of data. The bdeck valid increment evaluates to 0 which causes the divide by 0 error. It also can evaluate to bad data (e.g. -9999). The fix is to check for 0 and bad data. If found, use the constant best_track_time_step value instead.
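The guard described in this hotfix amounts to validating the divisor before using it. Here is a minimal Python sketch, assuming a fallback of 21600 seconds for `BEST_TRACK_TIME_STEP`; the commit message does not state the constant's value, and both function names are made up for the example.

```python
BAD_DATA = -9999               # bad-data value quoted in the commit message
BEST_TRACK_TIME_STEP = 21600   # hypothetical fallback, in seconds


def valid_increment(inc, fallback=BEST_TRACK_TIME_STEP):
    """Return a usable time increment, replacing 0 or bad data with the fallback."""
    if inc == 0 or inc == BAD_DATA:
        return fallback
    return inc


def steps_between(t0, t1, inc):
    """Number of whole increments between two times, safe against a zero divisor."""
    return (t1 - t0) // valid_increment(inc)


if __name__ == "__main__":
    print(steps_between(0, 43200, 0))
    print(steps_between(0, 43200, 3600))
```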

* Turned specific section numbers into linked sections because section numbers can change

* Removed hard-coded references to section numbers

* Feature 1714 tc_gen (#1750)

* Per #1714, add tc_gen genesis_match_window configuration option to define a search window relative to the forecast genesis time.

* Per #1714, clarify docs to state the genesis_match_window.end = 12 allows for matches for early forecasts. Also add an example of this option to the tc_gen unit test.

* Per #1714, switch ops_hit_tdiff to ops_hit_window.

* Per #1714, skip genesis events for tracks where the cyclone number is > 50.

* Per #1714, only discard cyclone numbers > 50 from the Best track, not the forecast tracks.

* Per #1716, add note to the tc_gen chapter about skipping Best tracks with cyclone number > 50.

* Per #1714, adding genesis_match_point_to_track config file option for TC-Gen. Note that this version of the code is close but doesn't actually compile yet. I still need to figure out exactly how to process the operational tracks. Should this logic also apply to the matching for those tracks?

* Per #1714, the logic for checking the operational tracks is pretty simple. We only store/check operational track points for lead time = 0. So applying the genesis_match_point_to_track boolean config option does not make sense.

* Per #1714, update the tc-gen user's guide chapter to describe the updated logic and new config file option.

* Per #1714, fix the logic of the is_match() function.

* Per #1714, reconfigure the call to tc_gen to exercise the new genesis_match_point_to_track option.

* Per #1714, just fixing spacing in source code.

* Committing 2 small changes not specifically related to #1714, but related to the processing of genesis tracks. When getting items from ATCFGenLines, the columns to be shifted are off by one. We had been shifting offset 2 up to 3, but it should have remained at 2. Also when initializing a TrackInfo object, set the StormID by calling ATCFLineBase::storm_id() instead of constructing it from BASIN:CYCLONE:YYYY. For ATCFGenLines we want to set the Storm ID equal to the 3rd column rather than constructing it!

* Per #1714, fix an error in the logic of GenesisInfo::is_match(const GenesisInfo &,...). I was using the index of the current GenesisInfo object instead of the one from the input argument. Fix this by adding GenesisInfo::genesis() member function to return a reference to the TrackPoint for Genesis.

* Per #1714, correcting logic for parsing the storm_id and warning_time columns for ATCFGen and regular ATCF line types. For ATCFGen line types, the code was incorrectly using the 3rd column when it should have used the 4th column!

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* feature_1552_gcc_10 (#1752)

* Cleaned bash comparison operators; Made changes for MET to compile using GNU 10.1.0 compilers

* Updated documentation for new flag for BUFRLIB compilation

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* #1755 Support time string for slicing at MetNcCFDataFile::collect_time_offsets

* Feature 1735 stat_analysis (#1754)

* Per #1735, enhance Stat-Analysis to support multiple -out_thresh settings when processing aggregate_stat jobs for MPR line types. This applies to output for FHO, CTC, CTS, ECLV, CNT, SL1L2, and SAL1L2.

* Per #1735, since we are writing multiple output line types to the same .stat file, adding stat_row counter to the job class. This way all functions that write to it can increment that row counter when needed.

* Per #1735, lots of little changes here to enable the aggregate_stat job type to write multiple output line types.

* Per #1735, update documentation for stat_analysis.

* Per #1735, add AsciiTable::expand() function to increase the AsciiTable dimensions and also update the Stat-Analysis handling of the output .stat file it writes.

* Per #1735, fix bug in the write_job_aggr_ssvar where I was using the wrong row counter when writing .stat output.

* Per #1735, print a warning message if the continuous filtering logic results in 0 matched pairs. Also, match existing logic to NOT WRITE any output, not even header rows, when the output AsciiTable contains no results.

* Per #1735, update unit_stat_analysis.xml to consolidate jobs stat_analysis_AGG_STAT_ORANK_RHIST and stat_analysis_AGG_STAT_ORANK_PHIST down into 1 job with 2 output line types. Rename the output files accordingly.

* Per #1735, update the STATAnalysis config file for processing MPR data by tweaking the jobs to write multiple output line types and/or apply multiple output thresholds.

* Per #1753, this one change to write_tmp_data.py solves this problem. When creating the variable to write the temp NetCDF file, we just need to pass through the fill value for the data. Also, make the script less verbose.

* Per #1753, make the read_tmp_dataplane.py script less verbose.

* Per #1753, there are 3 calls to the user's python version throughout MET. Update all 3 to print consistent log message when writing/reading the temp file. In particular, print the system command that is being executed at Debug(4) to make it easier to replicate python embedding problems that may arise.

* Per #1753, wrap the call to get_fill_value() in a try block in case the input is a regular numpy array instead of a masked array.
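
The try block mentioned here can be sketched as follows. This is a hedged illustration rather than the code in write_tmp_data.py: it relies on the fact that NumPy masked arrays provide `get_fill_value()` while plain ndarrays do not, but uses small stand-in classes so that it runs without NumPy. The names `PlainArray`, `MaskedArray`, `fill_value_of`, and the `DEFAULT_FILL` fallback are all hypothetical.

```python
DEFAULT_FILL = -9999.0  # assumed fallback value; hypothetical constant for this sketch


class PlainArray:
    """Stand-in for a regular array: it has no get_fill_value() method."""

    def __init__(self, values):
        self.values = list(values)


class MaskedArray(PlainArray):
    """Stand-in for a masked array that carries its own fill value."""

    def __init__(self, values, fill_value):
        super().__init__(values)
        self._fill_value = fill_value

    def get_fill_value(self):
        return self._fill_value


def fill_value_of(data, default=DEFAULT_FILL):
    """Use the array's own fill value when it has one, else the default."""
    try:
        return data.get_fill_value()
    except AttributeError:
        # Regular arrays have no fill value to pass through.
        return default
```

Catching only `AttributeError` keeps unrelated failures visible instead of silently replacing them with the default.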

* Per #1620, correct bug in read_ascii_mpr.py script. The MPR line type has 37 columns in it, not 36! (#1760)

* Feature 1700 python (#1765)

* Per #1700, no real change, removing extra newline.

* Per #1700, move global_python.h from vx_data2d_python over to the vx_python3_utils library where it belongs better.

* Per #1700, no code changes. Just removing commented out code.

* Per #1700, lots of little changes to make the python scripts consistent, updating the write*.py functions to add the user script directory to the system path, and remove extraneous log messages.

* Per #1700, rename generic_python.py to set_python_env.py. Still actually need to change the source code to handle this change!

* Per #1700 remove the pickle import.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, work in progress. Replaced pickle file with tmp file.

* Per #1700, update read_tmp_ascii.py to work for both ascii2nc and stat_analysis. Just create an object named ascii_data and have both instances read it.

* Per #1700, getting closer. Work in progress. Just need to get user-python embedding working for stat-analysis.

* Per #1700, removing extraneous cout.

* Per #1700, fix logic in PyLineDataFile::do_tmp_ascii() to get stat_analysis python embedding working again.

* Per #1700, just comments.

* Per #1700, replace references to pickle with user_python

* Per #1700, update documentation to replace pickle with temp files.

* Feature 1766 v10.0.0_beta5 (#1767)

* Per #1766, update the release date and add release notes for v10.0.0-beta5.

* Per #1766 and #1728, update the copyright notice year to 2021.

* Bugfix 1768 edeck (#1769)

* Per #1768, update logic in ATCFProbLine::read_line(). If read_line() from the base class returns bad status, have this one return bad status as well. But do NOT return bad status for unsupported line types. Just print a Debug(4) log message instead.

* Per #1768, update the probability line types to match those listed in https://www.nrlmry.navy.mil/atcf_web/docs/database/new/edeck.txt. That documentation was last updated in 11/2020, so presumably these reflect NHC's latest changes.

* Per #1768, renaming enumerated value from ATCFLineType_ProbRIRW to ATCFLineType_ProbRI since there are now separate ATCF line types for rapid intensification (RI) and weakening (RW). Will work more with this data in future issues to verify more of these probability types.

* Feature 1771 release_notes (#1772)

* Per #1771, draft version of combined met-10.0.0 release notes.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, add bolding to indicate emphasis.

* Per #1771, also update the flowchart for met-10.0.0.

* Per #1771, update flowchart to indicate that tc_gen now has netcdf output.

* Per #1771, rotate the authorship list for met-10.0.0, shifting Barb from first author down to the end.

* Committing hotfix to the develop branch to fix a bad merge that caused the MET compilation to fail.

* Update compile_MET_all.sh

Added "-L${LIB_LIBPNG}" to rpath to fix problem on WCOSS

* Update pull_request_template.md

Added entry for completion date of pull request review.

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issue caused by tabs vs spaces.

* Per #1777, fixing memory management in DbfHeader::set_subrecords(). It is dynamically allocating a buffer based on the record_length (e.g. 5) but then reading 32 characters into it! Deleting the dynamically allocated buf variable causes it to abort. Since we always read 32 bytes here, switch to a static buffer of that size rather than dynamically allocating. (#1779)

Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>

* Feature 1731 authorship (#1776)

* Per #1731, add ioda2nc documentation.

* Changes to make the authorship list consistent with METplus.

* Rename CIR to CIRA.

* Per #1731, fix alignment issue caused by tabs vs spaces.

* Updated input sources to include newly acceptable data formats

* Per #1731, add more details about the grid-diag bin definition.

* Per #1731, clarify wording.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Changes to align Xarray language with that used in Xarray documentation, to clarify only DataArray Xarray structures are supported and not Xarray Dataset structures, and an example of how to quickly create a DataArray for MET from a Dataset.

* Corrects plural of DataArray.

* Aligns references to NumPy arrays with NumPy docs to refer to them as ndarrays.

* Fixes formatting of note for Xarray.

* Update met/docs/Users_Guide/tc-pairs.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Fixes hyperlink reference.

* Fixes line spacing in note directive.

* Changes to reference again.

* #1782 Set the time offset to 0 if the time dimension does not exist at the data variable

* Updates to link and note directive.

* Corrects plural of DataArray once more.

* Adds more clarity to NumPy heading.

* Adds bold for emphasis.

* Removes bold in note directive.

* Feature 1778 debug (#1785)

* Per #1778, please see #1778 (comment) for details. Basically, when doing development, compile with the -g debug option. Otherwise, remove it by default.

* Per #1778, update stale URLs in the README and configure.ac file. Also, change the default MET version from 8.1 to development.

* Made suggested changes by Tara Jensen

* Update python embedding docs to list required packages for the base python version.

* Feature 1786 v10.0.0 (#1787)

* Per #1786, updates for the v10.0.0 release. Note that no changes were needed in conf.py. It had already been updated.

* Updated MET to compile using GNU version 10 compilers and PGI version 20 compilers

* Made updates to improve compilation

* Per #1768, added a couple more 10.0.0 release notes.

* Adding pgi config file from cheyenne

Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>

* Per #1789, remove duplicate plot_point_obs configuration section. (#1791)

* Update the version of Fortify on kiowa from 19.2.0 to 20.2.1.

* #1795 Release memory at time_values

* Bugfix 1798 develop py_grid_string (#1800)

* Per #1798, fix up the read_tmp_dataplane.py script to handle a grid string or dictionary.

* Per #1798, add a test to unit_python.xml to exercise this bugfix.

* #1794 Corrected the offset for Filter

* Github Issue #1801: Comment out code that checks for BEST track to support extra-tropical cyclone tracks not verified against BEST tracks.

* #1795 Cleanup

* #1795 Create DataCube for 2D or 3D only, not both to avoid memory leak

* Bugfix 1395 develop comp script (#1804)

* Updated compile script and added associated config files

* Added jet config file

* Updated orion file

* Added new stampede config file and modulefiles for various machines

* GitHub Issue #1801 Remove code that checks for -bmodel filter to support plotting of extra-tropical cyclone tracks that aren't verified against BEST tracks.

* Migrate issue and PR template changes from PR #1803 into the develop branch so that they'll be available for future releases.

* Per met-help question (https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=99964) clarify the description of the obs_thresh option.

* Update README.md

Adding GitHub Discussions information

* changed non-unicode apostrophe and fixed typo in URL

* Feature 1581 api point obs (#1812)

* #1581 Initial release

* #1581 Added met_nc_point_obs.cc met_nc_point_obs.h

* Removed nc_ in function names and moved them to the struct members

* #1581 Added HDR_TYPE_ARR_LEN

* #1581 Changed API calls (API names)

* #1581 Cleanup

* #1581 Removed duplicated definitions:hdr_arr_len, hdr_typ_arr_len, and obs_arr_len

* #1581 Removed duplicated definitions:hdr_arr_len and obs_arr_len

* #1581 Removed duplicated definitions: str_len, hdr_arr_len, and obs_arr_len

* Added vx_nc_obs library

* #1581 Using common APIs

* #1581 Corrected API calls because of renaming for common APIs

* #1581 Moved function from nc_obs_util to nc_point_obs2

* #1581 Renamed met_nc_point_obs to nc_point_obs

* #1581 API ica changed from obs_vars to nc_point_obs

* #1581 Initial release

* #1581 Renamed from met_nc_point_obs to nc_point_obs

* 1581 Renamed met_nc_point_obs to nc_point_obs

* Per #1581, update the Makefile.am for lidar2nc to fix a linker error. Unfortunately, the vx_config library now depends on the vx_gsl_prob library. threshold.cc in vx_config includes a call to normal_cdf_inv(double, double, double) which is defined in vx_gsl_prob. This adds to the complexity of dependencies for MET's libraries. Just linking to -lvx_gsl_prob one more time does fix the linker problem but doesn't solve the messy dependencies.

* #1581 Added method for NcDataBuffer

* Cleanup

* #1581 Cleanup

* #1581 Cleanup

* #1591 Cleanup

* #1591 Corrected API

* #1581 Avoid reading PB header twice

* #1581 Warning if PB header is not defined but read_pb_hdr_data is called

* #1581 Cleanup libraries

* 1581 cleanup

* 1581 cleanup

* 1581 cleanup

* #1581 Cleanup for Fortify (removed unused variables)

* #1581 Cleanup

* #1581 Cleanup

* #1581 Use MetNcPointObsIn instead of MetNcPointObs

* #1581 Use MetNcPointObsOut instead of MetNcPointObs2Write

* #1581 Separated nc_point_obs2.cc to nc_point_obs_in.cc and nc_point_obs_out.cc

* #1581 Renamed nc_point_obs2.cc to nc_point_obs_in.cc And added add nc_point_obs_in.h nc_point_obs_out.h nc_point_obs_out.cc

* #1581 Removed APIs related with writing point obs

* #1581 Changed copyright years

* #1581 Cleanup

* #1581 Updated copyright year

* #1581 Cleanup

* #1581 Renamed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Renamed read_obs_data_strings to read_obs_data_table_lookups

* #1581 Added more APIs

Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Feature 1792 gen_vx_mask (#1816)

* Per issue #1792, Change -type from optional to required. Set default_mask_type to MaskType_None. Added a check on mask_type to see if it's set and print error message accordingly.
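
The check described in this commit can be sketched as follows. This is a minimal Python illustration of the idea, not the gen_vx_mask.cc implementation: the `MaskType` enum lists only two illustrative values, and `parse_mask_type`, `DEFAULT_MASK_TYPE`, and the error text are hypothetical names chosen for the example.

```python
import enum


class MaskType(enum.Enum):
    """Illustrative subset of masking types; NONE means 'not set'."""
    NONE = "none"
    POLY = "poly"
    GRID = "grid"


DEFAULT_MASK_TYPE = MaskType.NONE  # mirrors the new 'unset' default


def parse_mask_type(args):
    """Scan command line arguments for -type and require that it was set."""
    mask_type = DEFAULT_MASK_TYPE
    for i, arg in enumerate(args):
        if arg == "-type" and i + 1 < len(args):
            mask_type = MaskType(args[i + 1])
    if mask_type is MaskType.NONE:
        # The option used to be optional; now its absence is an error.
        raise ValueError("the -type option is required")
    return mask_type
```

Starting from an explicit "none" default makes a missing option distinguishable from a valid choice, which is what lets the tool report the error.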

* Update test_gen_vx_mask.sh

* For the first test, added -type poly, since the masking type is now required. SL

* For all of the Poly unit tests added -type poly to the command line. The mask type is now required. SL

* Modified document to indicate that -type string (masking type) is now required on the command line for gen_vx_mask. SL

* Update met/docs/Users_Guide/masking.rst

Co-authored-by: johnhg <johnhg@ucar.edu>

* Update met/src/tools/other/gen_vx_mask/gen_vx_mask.cc

Co-authored-by: johnhg <johnhg@ucar.edu>

Co-authored-by: Seth Linden <linden@kiowa.rap.ucar.edu>
Co-authored-by: johnhg <johnhg@ucar.edu>

* Fix 2 minor formatting errors in the release notes.

* PR #1816 for issue #1792 unexpectedly caused the NB to fail on 20210605. We changed -type from optional to required, but missed adding the -type option in unit_met_test_scripts.xml and unit_ref_config.xml. This is a hotfix to resolve that.

* Feature 1811 anchor links (#1822)

* testing new anchoring link idea.

* testing without the bold asterisk

* tinkering with the look

* another attempt

* another attempt #2

* another attempt #3

* another attempt #4

* making sure new anchor works as expected.

* seeing if link will save with spaces instead of dashes

* need underscores to link

* is it fixed?

* testing

* testing 2

* testing 4

* testing 5

* testing 6

* testing 7

* testing 8

* going back to test original problem

* able to link with spaces instead of underscores. Testing if a return is possible to keep under 79 character limit.

* double checking everything is still working.

* DO NOT break ref lines apart, it won't work.

* trying a shorter name.

* continuing to add anchors

* updating lines 1946 thru 2214 with anchors

* updating lines 2214 thru 3371 with anchors

* updating lines 3371 to the end with anchors

* testing anchor

* testing anchor

* testing anchor 3

* testing anchor 4

* testing anchor 45 percent

* testing anchor final half

* fixing typo

* numbering fcst, obs_1 and 2 to create different links.

* finding more anchors that need numbers to keep them separate.

* fixing warnings

* fixing warnings

* fixing typo

* Feature 1749 hss (#1825)

* Per #1749, updating the MET version number from 10.0 to 10.1 prior to adding new columns of output to existing line types.

* Per #1749, adding 10.1 columns to the Makefile.am

* Per #1749, changes for the mechanics of adding the HSS_EC statistic to the MCTS line type. Still need to actually compute it and make the expected correct value configurable.

* Per #1749, add hss_ec_value as a configurable option for Point-Stat and Grid-Stat. Still need to actually compute it correctly, add it to other test config files, add support to series_analysis/stat_analysis, update the docs, and write up corresponding issues for other METplus components.

* Per #1749, fix the column offsets for the HSS_EC columns.

* Per #1749, add correct definition of HSS_EC.

* Per #1749, pass hss_ec_value from the config file into the computation of the MCTS statistics.

* Per #1749, add hss_ec_value entry to all the Grid-Stat config files.

* Per #1749, update the documentation about the HSS_EC statistic.

* Per #1749, add the -hss_ec_value job command option to Stat-Analysis.

* Per #1749, no real code changes here. Just changing to consistent ordering with hss_ec_value preceding rank_corr_flag.

* Per #1749, update docs for stat_analysis supporting hss_ec_value.

* Per #1749, add HSS_EC to Series-Analysis, but only with a constant hss_ec_value for now.

* Per #1749, add EC_VALUE to the MCTC line type definition.

* Per #1749, move ECvalue from the MCTSInfo class into the ContingencyTable class so that it's available to be included in the MCTC output line type.

* Per #1749, update point_stat, grid_stat, and series_analysis to accommodate the move of ECvalue from the MCTSInfo class to the ContingencyTable class.

* Per #1749, update library code to write EC_VALUE to the MCTC line type and update the User's Guide docs.

* Per #1749, update stat_analysis code for the addition of EC_VALUE in the MCTC line type.

* Per #1749, write EC_VALUE to the MCTC output line type.

* Per #1749, store the ec_value that was actually used to compute the stats.

* Per #1749, parsing EC_VALUE from the MCTC line type.

* Per #1749, move the MCTC EC_VALUE column to the end of the line, as requested by METdatadb.

* Per #932, need to write MCTS HSS_EC value to temp file during the bootstrapping process.

* Added new reference for Ou 2016

* Layout correction

* Added generalized HSS, removed word from HSS_EC

* Per #1749, change the hss_ec_value config entry to match new conventions.

Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>

* Feature 1826 v10.1.0_beta1 (#1828)

* Per #1826, update the version in the docs to 10.1.0-beta1 and add release notes for this development version.

* Per #1826, change the beta1 release date to 6/11 so that I can do it today.

* Removing Randy and David from the email notification list for nightly run scripts.

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
Co-authored-by: George McCabe <mccabe@ucar.edu>
Co-authored-by: John.H.Gotway@noaa.gov <John.H.Gotway@v72a1.ncep.noaa.gov>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: bikegeek <minnawin@ucar.edu>
Co-authored-by: bikegeek <3753118+bikegeek@users.noreply.github.com>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
Co-authored-by: Seth Linden <linden@ucar.edu>
Co-authored-by: Seth Linden <linden@kiowa.rap.ucar.edu>
Co-authored-by: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com>