Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pydantic ObsEntry #1360

Merged
merged 21 commits into from
Oct 7, 2024
Merged

Pydantic ObsEntry #1360

merged 21 commits into from
Oct 7, 2024

Conversation

lewisblake
Copy link
Member

@lewisblake lewisblake commented Oct 2, 2024

Change Summary

Deeper work motivated by desire to generalize model naming convention.
First step of making ObsCollection and ModelCollection child classes of BaseModel
Partly related to improving the ColocatedData interface.
Rename fileconventions -> file_conventions
EvalSetup.get_obs_entry() now returns an instance of ObsEntry, not a dictionary. Clean up parts of the code which relied on this dictionary to instead, f. eks., access instance attributes.
Remove HasColocator._get_diurnal_only() and associated tests. Instead give ObsEntry an attribute diurnal_only with a default of False.

Related issue number

Part of #1085

Checklist

  • Start with a draft-PR
  • The PR title is a good summary of the changes
  • PR is set to AeroTools and a tentative milestone
  • Documentation reflects the changes where applicable
  • Tests for the changes exist where applicable
  • Tests pass locally
  • Tests pass on CI
  • At least 1 reviewer is selected
  • Make PR ready to review

@lewisblake lewisblake changed the title Deeper work motivated by desire to generalize model naming convention Pydantic ObsEntry Oct 4, 2024
@lewisblake lewisblake requested a review from dulte October 4, 2024 14:52
@lewisblake lewisblake added this to the m2024-10 milestone Oct 4, 2024
Copy link

codecov bot commented Oct 4, 2024

Codecov Report

Attention: Patch coverage is 85.71429% with 13 lines in your changes missing coverage. Please review.

Project coverage is 78.85%. Comparing base (af3a797) to head (114ecdd).
Report is 37 commits behind head on main-dev.

Files with missing lines Patch % Lines
pyaerocom/aeroval/obsentry.py 80.35% 11 Missing ⚠️
pyaerocom/aeroval/experiment_processor.py 66.66% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           main-dev    #1360      +/-   ##
============================================
+ Coverage     78.84%   78.85%   +0.01%     
============================================
  Files           136      136              
  Lines         20790    20849      +59     
============================================
+ Hits          16392    16441      +49     
- Misses         4398     4408      +10     
Flag Coverage Δ
unittests 78.85% <85.71%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

dulte
dulte previously requested changes Oct 4, 2024
Copy link
Collaborator

@dulte dulte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Some dead code. But other than that, I think it can be merged

pyaerocom/aeroval/collections.py Outdated Show resolved Hide resolved
tests/aeroval/test_setup_classes.py Outdated Show resolved Hide resolved
@lewisblake lewisblake requested a review from dulte October 6, 2024 17:41
@lewisblake lewisblake requested review from thorbjoernl and removed request for heikoklein October 7, 2024 10:42
Copy link
Collaborator

@thorbjoernl thorbjoernl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nitpicking a bit, otherwise lgtm

pyaerocom/aeroval/collections.py Outdated Show resolved Hide resolved
pyaerocom/aeroval/setup_classes.py Outdated Show resolved Hide resolved
@@ -320,7 +320,7 @@ class ColocationSetup(BaseModel):
@classmethod
def validate_obs_vars(cls, v):
if isinstance(v, str):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I just don't understand what these pydantic validators are supposed to do but shouldn't this raise an exception for unexpected input? Right now it just returns the value?

The example in field_validator docs raises ValueError for unexpected input.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obs_vars can be either a string or tuple of strings. What this is doing is coercing a provided string into a tuple of strings (for consistency). If it is not a string or a tuple of strings, then pydantic raises an error immediately.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so the typehint on obs_vars is what is accepted as input in a sense? Is there a way to make vscode use the narrower set of hints that the variable actually contains after coercion? Because right now a lot of the tooling struggles with assuming that the types can be any of the original types.

Copy link
Member Author

@lewisblake lewisblake Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so the typehint on obs_vars is what is accepted as input in a sense?

Yes

Is there a way to make vscode use the narrower set of hints that the variable actually contains after coercion? Because right now a lot of the tooling struggles with assuming that the types can be any of the original types.

I'm not sure.

pyaerocom/aeroval/obsentry.py Show resolved Hide resolved
pyaerocom/aeroval/obsentry.py Show resolved Hide resolved
pyaerocom/aeroval/obsentry.py Show resolved Hide resolved
pyaerocom/aeroval/obsentry.py Show resolved Hide resolved
pyaerocom/aeroval/collections.py Show resolved Hide resolved
Copy link
Member

@heikoklein heikoklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK-ish.

This PR breaks the pyaerocom interface at several places. It does not try to create a interoperability with the old interface. What it tries, i.e. introduction of immutable code, could most likely be achieved without all of these breaks, i.e. by introducing __getitem__.

Since the broken parts are mostly internal, I approve.

Some of the comments don't make any sense to me, so maybe just remove them. Easier to understand code than comments.

@@ -127,7 +104,7 @@ def get_colocator(self, model_name: str = None, obs_name: str = None) -> Colocat
mod_cfg = self.cfg.get_model_entry(model_name)
col_cfg["model_cfg"] = mod_cfg

# LB: Hack and at what lowlevel_helpers's import_from was doing
# Hack and at what lowlevel_helpers's import_from was doing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been in the code for many months now, just removing my initials. It's saying that the following lines of code were taken from a function which was broken apart.

# LB: Hack and at what lowlevel_helpers's import_from was doing
for key, val in obs_cfg.items():
# Hack and at what lowlevel_helpers's import_from was doing
for key, val in obs_cfg.model_dump().items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unclear/unnecessary comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

pyaerocom/aeroval/collections.py Outdated Show resolved Hide resolved


class ObsEntry(BaseModel):
"""Observation configuration for evaluation (BaseModel)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I don't like it if the interface changes (here access to all attributes from ["XXX"] to .XXX) This might destroy existing code. Ok since obsentry is mostly internal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that where the user interacts with the code, we should try to minimize the changes to the interface. But also I think it's hard to provide structure to the code if everything is fundamentally a BrowseDict without everything becoming very nested. There's a balance there, and I think this is a good place to move away from the older dictionary-style accessing.

list
DESCRIPTION.

tuple[str, ...]
"""
return self.obs_vars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary change of signature

return list(self.obs_vars)

would have worked as well, and didn't require the change to (v,) tuple code. A tuple[...] is in most cases better a list from a user-point perspective.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of an effort to make objects in the code immutable which should not change at run time, like has been changed elsewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention is to make obsentry immutable, which my example would also satisfy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree making ObsEntry immutable is great. Now implemented.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, not as easy as I thought, but worth coming back to. #1367

Copy link
Collaborator

@thorbjoernl thorbjoernl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🐐

@lewisblake lewisblake removed the request for review from dulte October 7, 2024 14:24
@lewisblake lewisblake dismissed dulte’s stale review October 7, 2024 14:26

You aren't at work today.

@lewisblake lewisblake merged commit bfef4c4 into main-dev Oct 7, 2024
8 checks passed
@lewisblake lewisblake deleted the generalize-model_naming_convention branch October 7, 2024 14:26
@lewisblake lewisblake restored the generalize-model_naming_convention branch October 7, 2024 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants