Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reformate demographic_id and refactor disaggregate.py #25

Merged
merged 5 commits into from
Mar 6, 2023

Conversation

AHsu98
Copy link
Contributor

@AHsu98 AHsu98 commented Mar 3, 2023

Pretty significant changes here, we should probably update pypi version after this and bump version.

  • Reformatted the io of how we use demographic_id, now you can optionally put in a list of columns, and we won't have to do the ugly tuple hack if we have a age x location x year slice
  • Removed stuff like split_dataframe_rate and added an extra argument to split_dataframe

The version in this branch is what I used to make the example from Lauryn's data.

AHsu98 added 5 commits March 2, 2023 21:03
Cleaned up high level splitting api to make the output type an argument rather than use separate functions
Fixed bug with argument order, got rid of split_dataframe_rate
Also updated example notebook for dataframe splitting to match current changes.
Minor fixes to docstrings, typing
@AHsu98 AHsu98 requested a review from zhengp0 March 3, 2023 07:51
model=model,
observed_total_se=x['obs_se']
if demographic_id_columns is not None:
splitting_df['demographic_id']=list(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @AHsu98, I don't think create 'demgraphic_id' column is necessary, especially you will delete it at the end of the function. Maybe I am missing something. If you can explain a little here that will be great!

Pandas set_index function can take multiple column to create the index, and reset_index will flatten out the hierarchical index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, setting it as demographic id here was kind of a hack, I don't really like it. I set it as a tuple because otherwise I was getting errors around the index name not matching with the population df, I'll make an issue and try to fix it later!

@AHsu98 AHsu98 merged commit 704f6ec into main Mar 6, 2023
@AHsu98 AHsu98 deleted the reformat-demographic-id branch August 24, 2023 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants