Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regulatory calculations do not work with latest pandas version #153

Closed
rschwant opened this issue Dec 30, 2022 · 5 comments
Closed

Regulatory calculations do not work with latest pandas version #153

rschwant opened this issue Dec 30, 2022 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@rschwant
Copy link
Collaborator

The regulatory calculations error with the latest pandas version. The error is

KeyError: 'time_local'

in this code:
File /scratch2/BMC/rcm1/rhs/miniconda3/envs/mm_dev_all/lib/python3.9/site-packages/melodies_monet/plots/surfplots.py:74, in calc_8hr_rolling_max_v1(df, col, window)
72 df_rolling_max = df_rolling.groupby("siteid").resample("D", on="time_local").max(min_count=18).reset_index(drop=True).dropna()
73 df = df.reset_index(drop=True)
---> 74 return df.merge(df_rolling_max, on=["siteid", "time_local"])

@mlirenzhenmayi I think this looks similar to the issue you brought up awhile ago and maybe we just didn't notice the problem because Hera only just allowed the latest version of Pandas on it and maybe you were testing this on a different system like Meiyu? @jianheACM, do you have time next week to help fix this problem, since you know this regulatory code the best? Meng's suggestion awhile back may work well, but we should probably test that it doesn't change the answer and works for all options. Thanks!

@rschwant rschwant added the bug Something isn't working label Dec 30, 2022
@rschwant
Copy link
Collaborator Author

rschwant commented Dec 30, 2022

I forgot to mention. A quick fix until this is resolved is to downgrade your pandas version. I tested it and it works well. So while your conda environment is active type:

conda install 'pandas <1.5'

@zmoon
Copy link
Collaborator

zmoon commented Dec 31, 2022

Seems related to this stuff pandas-dev/pandas#47079. That is, it seems that the previous behavior of retaining on ('time_local') when applying .max() was considered a bug, which has now been fixed.

Maybe good fix would be to add a new temporary local time column which can be safely dropped after using, like suggested here

-    df_rolling_max = df_rolling.groupby("siteid").resample("D", on="time_local").max(min_count=18).reset_index(drop=True).dropna()
+    df_rolling["key"] = df["time_local"]
+    df_rolling_max = df_rolling.groupby("siteid").resample("D", on="key").max(min_count=18).reset_index(drop=True).dropna()

That way df_rolling_max still has 'time_local'.

@jianheACM
Copy link
Collaborator

@zmoon @rschwant I just pushed my changes to develp_reg2.
The changes are for the pandas issue mentioned here and also for the colormap issue with new matplotlib.
I will submit a PR for your review. Thanks,

@rschwant
Copy link
Collaborator Author

The fix for this has been merged to develop, I will close this once we merge develop to main.
#154

@rschwant
Copy link
Collaborator Author

This fix is now in the main branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants