Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Testing Time #241

Merged
merged 89 commits into from
Oct 12, 2023
Merged

Improve Testing Time #241

merged 89 commits into from
Oct 12, 2023

Conversation

ssolson
Copy link
Contributor

@ssolson ssolson commented May 11, 2023

A major headwind to MHKiT development is the time it takes to run the testing suite.

This PR decreases the testing time by creating a data cache prior to running the testing suite. This step can be seen as the first block in the work flow below called prepare-cache:
image

The primary approach here is to use data caching to only call the APIs once. This should additionally reduce test failures and subsequent test re-runs.

While hindcast calls are the primary source of delay and failled API calls the caching was added to all of the API calls listed below:
API Calls to add cacheing to:

  1. wave/
    • ndbc
    • cdip
    • hindcast
  2. tidal/
    • noaa
  3. river/
    • usgs

The caching is handled through a single function handle_cache located in mhkit/utils/. A new handle cache test was added for this module.

To utilize the caching functionality in our testing suite the main.yml file added the prepare-cache step as shown in the flow diagram above. All tests depend on this step and will not continue whiteout a successful prepare-cache run.

This PR removes python 3.7 support due to the most recent release breaking the last 3.7 compatibility we had (See discussion in thread below).

An important follow on to this PR is to only run the hindcast calls if the hindcast files have been modified and to add python 3.10 support.

A modification was made to the contours.py file to fix a deprecation with latest matplotlib release v3.8.0.

@akeeste akeeste self-requested a review May 11, 2023 18:42
@ssolson
Copy link
Contributor Author

ssolson commented May 15, 2023

API Calls to add cacheing to:

  1. wave/
    • ndbc
    • cdip
    • hindcast
  2. tidal/
    • noaa
  3. river/
    • usgs

@ssolson
Copy link
Contributor Author

ssolson commented May 22, 2023

@jmcvey3 I have not looked at your latest PR or into these error but have you already solved these dolfyn xarray failures?
image

@ssolson
Copy link
Contributor Author

ssolson commented Sep 22, 2023

Python 3.7 is failing for all tests due to a Dolfyn method and an xarray new release.

  • In our current tests on master we only run py 3.7 tests on a conda install (no pip install or hindcast checks).
  • MHKiT's latest master release was on August 11.
  • xarray released a new version on August 19 that seems to have broken the py 3.7 conda compatibility.

Python 3.7 was released on 27 June 2018 making it over 5 years old. My vote is to drop 3.7 support in this PR and in a follow on PR add Python 3.10 support for the next MHKiT release.

image

Copy link
Contributor Author

@ssolson ssolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akeeste This PR is ready for review

.github/workflows/main.yml Outdated Show resolved Hide resolved
mhkit/river/io/usgs.py Outdated Show resolved Hide resolved
mhkit/tests/river/test_resource.py Outdated Show resolved Hide resolved
mhkit/tests/tidal/test_io.py Show resolved Hide resolved
mhkit/tests/tidal/test_io.py Show resolved Hide resolved
mhkit/wave/io/cdip.py Outdated Show resolved Hide resolved
mhkit/wave/io/hindcast/hindcast.py Outdated Show resolved Hide resolved
mhkit/wave/io/hindcast/wind_toolkit.py Show resolved Hide resolved
mhkit/wave/io/ndbc.py Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
Copy link
Contributor

@akeeste akeeste left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssolson I added a couple minor docstring revisions on my first pass. Thanks for documenting what's going on and where, that was helpful. I want to give this another pass and make sure I'm understanding everything. I'll add any more questions/comments then.

Notes:
------
- To access the WIND Toolkit hindcast data, users need to configure `h5pyd`
for data access on HSDS (see the WTK_hindcast_example notebook for more details).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for data access on HSDS (see the WTK_hindcast_example notebook for more details).
for data access on HSDS (see the metocean_example or WPTO_hindcast_example
notebooks for more details).


Note: To access the WIND Toolkit hindcast data, you will need to
configure h5pyd for data access on HSDS. Please see the
WTK_hindcast_example notebook for more information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
WTK_hindcast_example notebook for more information.
metocean_example or WPTO_hindcast_example notebooks
for more information.

Copy link
Contributor

@akeeste akeeste left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssolson Overall the caching seems clean, well documented, and functional. The decreased time is a big improvement. I'm not familiar with the caching process beyond this review though. So I'll defer to your judgement on the implementation and we can learn as we go if issues arise.

@ssolson ssolson merged commit 8db27f5 into MHKiT-Software:develop Oct 12, 2023
1 check passed
@ssolson ssolson mentioned this pull request Apr 24, 2024
@ssolson ssolson mentioned this pull request May 6, 2024
ssolson added a commit that referenced this pull request May 8, 2024
# MHKiT v0.8.0
We're excited to announce the release of MHKiT v0.8.0, which brings a host of new features, enhancements, and bug fixes across various modules, ensuring compatibility with Python 3.10 and 3.11, and introducing full xarray support for more flexible data handling. Significant updates in the Wave and DOLfYN modules improve functionality and extend capabilities.

## Python 3.10 & 3.11 Support
MHKiT now supports python 3.10 and 3.11. Support for 3.12 will follow in the next minor update.
- #240


## Wave Module
### Enhancements:
**Automatic Threshold Calculation for Peaks-Over-Threshold**: We've introduced a new feature that automatically calculates the "best" threshold for identifying significant wave events. This method, originally developed by Neary, V. S., et al. in their 2020 study, has now been translated from Matlab to Python, enhancing our existing peaks-over-threshold functionality.

**Wave Heights Analysis**: A new function, `wave_heights`, has been added to extract the heights of individual waves from a time series. This function uses zero up-crossing analysis to accurately measure wave heights, improving upon our previous methods which only provided the maximum value between up-crossings.

**Enhanced Zero Crossing Analysis**: Building on the above, the zero crossing code previously embedded in `global_peaks` has been isolated into a helper function. This modular approach not only refines the codebase but also supports new functionalities such as calculating wave heights, zero crossing periods, and identifying crests.

### Bug Fixes:
**Contour Sampling Error in Wave Contours**: A bug identified in `mhkit.wave.contours.samples_contour` has been resolved. The issue occurred when period samples defined using the maximum period resulted in values outside the interpolation range of the contour data. This was corrected by ensuring that all sampling points are within the interpolation range and adjusting the contour data selection process accordingly.

- #268 
- #252 
- #278


## Xarray Support
MHKiT functions now fully support the use of xarray for passing and returning data.

- #279 
- #282
- #285
- #302
- #310


## DOLfYN

Thanks to the many user contributions and users of MHKiT the DOLFYN module include a significant number of enhancements and bug fixes. 

### Enhancements:
**Altimeter Support**: Enhanced the Nortek Signature Reader to add capability for reading ADCP dual profile configurations.

**Data Handling Improvements**: Introduced logic to skip messy header data that can accumulate during measurements collected via Nortek software on PCs and Macs.

**Instrument Noise Subtraction**: Added a function to subtract instrument noise from turbulence intensity estimation using RMS calculations, providing results that differ by approximately 1% from the existing standard deviation-based "I" property.

**Improved File Handling**: Updates for RDI files to handle changing "number of cells" and variable "cell sizes," which are now bin-averaged into the largest cell size.

### Bug Fixes:
**Power Spectra Calculation**: Fixed a bug where a given noise value was not being subtracted from the power spectra, and noise was inadvertently added as an input to dissipation rate calculations.

**Improved Header Handling**: Allowed RDI reader to skip junk headers effectively.

**Nortek Reader C Types Update**: Adjusted C types in the Nortek reader to handle below-zero water temperatures and to allow pitch and roll values to go negative.


- #280 
- #289
- #290
- #292
- #293
- #294
- #299


## River & Tidal: D3D
Added limits to `variable_interpolation` and added 3 array input capability to `create_points`
- #271

## Developer Experience
### Black formatting
Black formatting is now enforced on all MHKiT files. This ensures consistent formatting across the MHKiT package.
- #281

### Linting & Type Hints
MHKiT is in the process of enforcing pylint and adding type hints to all functions. Currently this has been achieved and is enforced in the Loads and Power modules.
- #288 
- #296 

### CI/CD
This release introduces significant reduction in testing time for development. This is achieved by reducing the number of tests for pulls against the develop branch and only running hindcast test when changes are made to it. A bug in the hindcast CI was fixed which only ran on changes to the hindcast tests instead of the hindcast module. Additionally the wave and wind hindcast needed to be separated in 2 jobs due to the excessive time taken to run a wind cache. This created a number of follow on PRs around solidifying the logic of these job. A special case for Python 3.8, pip, and Mac OS was added to use homebrew to install NetCDF and HDF5 to get tests to pass.
- #241
- #270
- #306
- #311
- #317
- #318
- #319
- #320
- #324

### Clean Up
MHKiT fixed an implementation error where functions used assert instead of built in errors for type and value checking. Additionally these PRs removed unused files, fixed typos, and created an argument which allows users to run CDIP API calls silently.
- #276
- #272
- #273
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants