Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Obs only #593

Merged
merged 11 commits into from
Apr 25, 2022
Merged

Obs only #593

merged 11 commits into from
Apr 25, 2022

Conversation

dulte
Copy link
Collaborator

@dulte dulte commented Mar 4, 2022

Way of handling dummy data to be able obs only evaluation. See #436

@dulte
Copy link
Collaborator Author

dulte commented Mar 4, 2022

The _make_dummy_model function is only partially annotated, since importing EvalSetup lead to a circular import

@dulte dulte requested a review from avaldebe March 4, 2022 14:09
@codecov
Copy link

codecov bot commented Mar 4, 2022

Codecov Report

Merging #593 (eaa8f4f) into main-dev (27a7dac) will decrease coverage by 0.27%.
The diff coverage is 31.63%.

Impacted file tree graph

@@             Coverage Diff              @@
##           main-dev     #593      +/-   ##
============================================
- Coverage     76.98%   76.70%   -0.28%     
============================================
  Files            97       97              
  Lines         17494    17589      +95     
============================================
+ Hits          13467    13492      +25     
- Misses         4027     4097      +70     
Flag Coverage Δ
unittests 76.70% <31.63%> (-0.28%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyaerocom/helpers.py 72.56% <8.33%> (-3.92%) ⬇️
pyaerocom/aeroval/helpers.py 65.00% <35.89%> (-27.86%) ⬇️
pyaerocom/aeroval/experiment_output.py 87.10% <50.00%> (-0.37%) ⬇️
pyaerocom/aeroval/experiment_processor.py 73.43% <50.00%> (-6.96%) ⬇️
pyaerocom/aeroval/setupclasses.py 93.33% <100.00%> (+0.11%) ⬆️
pyaerocom/griddeddata.py 61.38% <100.00%> (ø)
pyaerocom/_lowlevel_helpers.py 82.48% <0.00%> (-0.51%) ⬇️
pyaerocom/config.py 83.39% <0.00%> (-0.17%) ⬇️
pyaerocom/io/read_eea_aqerep_base.py 56.52% <0.00%> (+0.13%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 74d1283...eaa8f4f. Read the comment docs.

Copy link
Collaborator

@avaldebe avaldebe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some minor comments about the code itself.
More importantly, I would appreciate the private functions to be collected on a private module with a descriptive name.

pyaerocom/aeroval/experiment_output.py Outdated Show resolved Hide resolved
pyaerocom/aeroval/experiment_output.py Outdated Show resolved Hide resolved
obs_list = self.cfg.obs_cfg.keylist(obs_name)
if self.cfg.model_cfg == {}:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could use the fact that an empty dictionary "is falsy"

        if not self.cfg.model_cfg:
            ...

this also covers the case when self.cfg.model_cfg is None

Is it possible for the model_cfg attribute to be missing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is possible. I think aeroval will crash if no model_cfg is given

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and model_cfg = None?

pyaerocom/aeroval/helpers.py Outdated Show resolved Hide resolved
pyaerocom/griddeddata.py Outdated Show resolved Hide resolved
pyaerocom/griddeddata.py Show resolved Hide resolved
Comment on lines 1690 to 1693
start = min([int(per.split("-")[0]) for per in periods])
stop = max(
[int(per.split("-")[1]) if len(per.split("-")) > 1 else int(per) for per in periods]
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you do not need to pass a list to min/max a generator expression will work just as well

    start = min(int(per.split("-")[0]) for per in periods)
    stop = max(
        int(per.split("-")[1]) if "-" in per 1 else int(per) for per in periods
    )

pyaerocom/helpers.py Outdated Show resolved Hide resolved
@dulte
Copy link
Collaborator Author

dulte commented Mar 7, 2022

I think I've made all the changes requested

Copy link
Collaborator

@avaldebe avaldebe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor requests regarding the use of pathlib.Path instead of os.path.*.

Please, feel free to ignore them if you do not see the added value

pyaerocom/aeroval/helpers.py Outdated Show resolved Hide resolved
Comment on lines +159 to +161
outdir = os.path.join(tmpdir, f"{model_id}/renamed")

os.makedirs(outdir, exist_ok=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if GriddedData.to_netcdf accepts pathlib.Path objects but it might be cleaner to write this part with a Path object

    outdir = Path(tmpdir) / f"{model_id}/renamed"
    outdir.mkdir(parents=True, exist_ok=True)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GriddedData.to_netcdf uses os.path.join, so I think it must be str. If we are going to change this, then it should be its own PR

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.path.join accepts path-like objects (see docs),
which means that it will accept pathlib.Path objects

Copy link
Collaborator

@avaldebe avaldebe Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not using a Path object?

    outdir = Path(tmpdir) / f"{model_id}/renamed"
    outdir.makedir(parents=True, exist_ok=True)

as I wrote on my previous review

GriddedData.to_netcdf uses os.path.join, so I think it must be str. If we are going to change this, then it should be its own PR

os.path.join accepts path-like objects (see docs),
which means that it will accept pathlib.Path objects

Copy link
Collaborator

@avaldebe avaldebe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nitpicks

Comment on lines +159 to +161
outdir = os.path.join(tmpdir, f"{model_id}/renamed")

os.makedirs(outdir, exist_ok=True)
Copy link
Collaborator

@avaldebe avaldebe Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not using a Path object?

    outdir = Path(tmpdir) / f"{model_id}/renamed"
    outdir.makedir(parents=True, exist_ok=True)

as I wrote on my previous review

GriddedData.to_netcdf uses os.path.join, so I think it must be str. If we are going to change this, then it should be its own PR

os.path.join accepts path-like objects (see docs),
which means that it will accept pathlib.Path objects



def get_max_period_range(periods):
start = min([int(per.split("-")[0]) for per in periods])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can pass the generator expression directly to min without creating a list, e.g.

    start = min(int(per.split("-")[0]) for per in periods)


def get_max_period_range(periods):
start = min([int(per.split("-")[0]) for per in periods])
stop = max(int(per.split("-")[1]) if len(per.split("-")) > 1 else int(per) for per in periods)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can simplify the generator expression by processing the last element on the split,
if you are concerned about invalid periods you could add an assert

    assert all(per.count("-") <= 1 for per in periods), f"invalid period in {periods=}"
    stop = max(int(per.split("-")[-1]) for per in periods)

Comment on lines +1701 to +1702
if freq not in TS_TYPE_TO_PANDAS_FREQ.keys():
raise ValueError(f"{freq} not a recognized frequency")
Copy link
Collaborator

@avaldebe avaldebe Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can test membership on a dictionary directly without calling .keys()

    if freq not in TS_TYPE_TO_PANDAS_FREQ:
        raise ValueError(f"{freq} not a recognized frequency")

@dulte dulte merged commit ee732d8 into main-dev Apr 25, 2022
@dulte dulte deleted the obs-only branch April 25, 2022 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants