Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise warning if no time coords found with decode_times #409

Merged
merged 3 commits into from
Feb 6, 2023

Conversation

tomvothecoder
Copy link
Collaborator

@tomvothecoder tomvothecoder commented Jan 24, 2023

Description

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

@tomvothecoder tomvothecoder self-assigned this Jan 24, 2023
@tomvothecoder tomvothecoder added type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors. Priority: High patch A patch release, includes backwards compatible bug fixes. labels Jan 24, 2023
@tomvothecoder
Copy link
Collaborator Author

Hey @lee1043 and @pochedls, this PR is ready for review.

I tested it locally using dummy datasets without time coordinates and it raises the warning as expected. It would be good to test using the dataset from the example in #396 too.

xcdat/dataset.py Outdated
Comment on lines 103 to 106
try:
ds = decode_time(ds)
except KeyError as e:
logger.warning(e)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the key fix – if decode_time fails then ds remains as is and a warning is printed. No need to update the decode_times boolean. This makes sense to me.

@codecov-commenter
Copy link

codecov-commenter commented Jan 24, 2023

Codecov Report

Base: 100.00% // Head: 100.00% // No change to project coverage 👍

Coverage data is based on head (d331ccf) compared to base (1a0afb4).
Patch coverage: 100.00% of modified lines in pull request are covered.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #409   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           14        14           
  Lines         1254      1260    +6     
=========================================
+ Hits          1254      1260    +6     
Impacted Files Coverage Δ
xcdat/dataset.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Collaborator

@pochedls pochedls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

I just had one question about where logger warnings go – I think they are generally unseen in the conda version (unless you specifically try to capture it). Is that right?

xcdat/dataset.py Outdated
Comment on lines 402 to 405
try:
ds_new = decode_time(ds_new)
except KeyError as e:
logger.warning(e)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at this issue and it prints out the following in my xcdat_dev environment:

2023-01-25 14:39:40,142 [WARNING]: dataset.py(open_dataset:106) >> "No time coordinates were found in this dataset to decode. Make sure time coordinates exist and their CF 'axis' or 'standard_name' attribute is set (e.g., ds['time'].attrs['axis'] = 'T' or ds['time'].attrs['standard_name'] = 'time'). Afterwards, try decoding with decode_time() again."

This is what we wanted...but as a user I don't want this printed to the console...does this print to the console in the conda version or is this only evident in the developer environment?

Copy link
Collaborator Author

@tomvothecoder tomvothecoder Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change the level to .debug if we want to avoid printing these warnings to the console. This is reasonable if the default behavior is decode_times=True and we expect decode_times=True to work in most cases.

One scenario to consider is if the dataset has time coordinates without CF attributes (e.g., axis or standard_name) but the user expects the coordinates to be decodable. In this case, the message won't be raised unless the user sets the xcdat logger level to .debug, or if they try to run temporal averaging operations which will throw the KeyError.

Let me know what you think about the above scenario and how frequent you think that might occur.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've convinced me we should keep the code as you have it. @lee1043 - thoughts?

@lee1043
Copy link
Collaborator

lee1043 commented Jan 26, 2023

I tested it locally using dummy datasets without time coordinates and it raises the warning as expected. It would be good to test using the dataset from the example in #396 too.

@tomvothecoder below is what I see when I open the example data in #396 from terminal. It is great that data opened with no problem.

Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:41:54) [Clang 13.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import xcdat as xc
>>> path = 'target_grid.nc'
>>> ds = xc.open_dataset(path)
2023-01-26 11:23:21,603 [WARNING]: dataset.py(open_dataset:106) >> "No time coordinates were found in this dataset to decode. Make sure time coordinates exist and their CF 'axis' or 'standard_name' attribute is set (e.g., ds['time'].attrs['axis'] = 'T' or ds['time'].attrs['standard_name'] = 'time'). Afterwards, try decoding with `decode_time()` again."

I understand @pochedls prefers error message to not shown in console and I agree. In parallel, I think revising the error message slightly may be helpful for clarifying. I suggest something like below:
No time coordinates were found in this dataset to decode. If existence of time coordinates is expected, make sure time coordinates exist and their CF 'axis' or 'standard_name' attribute is set (e.g., ds['time'].attrs['axis'] = 'T' or ds['time'].attrs['standard_name'] = 'time')

@tomvothecoder
Copy link
Collaborator Author

tomvothecoder commented Feb 3, 2023

@tomvothecoder below is what I see when I open the example data in #396 from terminal. It is great that data opened with no problem.

Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:41:54) [Clang 13.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import xcdat as xc
>>> path = 'target_grid.nc'
>>> ds = xc.open_dataset(path)
2023-01-26 11:23:21,603 [WARNING]: dataset.py(open_dataset:106) >> "No time coordinates were found in this dataset to decode. Make sure time coordinates exist and their CF 'axis' or 'standard_name' attribute is set (e.g., ds['time'].attrs['axis'] = 'T' or ds['time'].attrs['standard_name'] = 'time'). Afterwards, try decoding with `decode_time()` again."

I understand @pochedls prefers error message to not shown in console and I agree.

Thank you for your input @lee1043. @pochedls is actually in agreement with my revision to raise a warning instead of silencing it. A warning is raised if time coordinates cannot be detected with decode_times=True (the default).
Time coordinates can't be detected if they don't exist or don't have the appropriate CF attributes to map to them.

If we expect decode_times=True to work in most cases, I think it is reasonable to raise a warning since it won't appear often and they can override decode_times=False if they want. Also, it is probably not good idea to silence potential warnings or errors because the user might expect time coordinates to be decoded/decodeable but that isn't true (and they find out later in their temporal workflow such as averaging).

In parallel, I think revising the error message slightly may be helpful for clarifying. I suggest something like below: No time coordinates were found in this dataset to decode. If existence of time coordinates is expected, make sure time coordinates exist and their CF 'axis' or 'standard_name' attribute is set (e.g., ds['time'].attrs['axis'] = 'T' or ds['time'].attrs['standard_name'] = 'time')

Good suggestion. I agree with revising the warning message to make it clearer.

Comment on lines +319 to +324
"No time coordinates were found in this dataset to decode. If time "
"coordinates were expected to exist, make sure they are detectable by "
"setting the CF 'axis' or 'standard_name' attribute (e.g., "
"ds['time'].attrs['axis'] = 'T' or "
"ds['time'].attrs['standard_name'] = 'time'). Afterwards, try decoding "
"again with `xcdat.decode_time`."
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated KeyError in decode_time(), which is raised as a logger warning in open_dataset() and open_mfdataset()

@tomvothecoder
Copy link
Collaborator Author

Merging this PR

@tomvothecoder tomvothecoder merged commit c1989c9 into main Feb 6, 2023
@tomvothecoder tomvothecoder deleted the bug/396-decode-times branch February 6, 2023 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
patch A patch release, includes backwards compatible bug fixes. type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: decode_times=True as default triggers error when open netcdf with no time dimension
4 participants