feature: optional pandas and polars support #467

Merged (1 commit) on Jul 1, 2024


TyberiusPrime
Contributor

Fixes #394.

I recently ran into an issue where my pypipegraph2 failed to recalculate nodes downstream of a changed output because deepdiff assigned the same hash to different DataFrames.

Turns out, it was essentially only hashing the column names.

This PR fixes that for pandas, and while I had it open, for polars as well.

The new code paths are optional: each is only active if the corresponding pandas/polars import succeeds.

The added tests of course require pandas and polars. I ran them against both the older versions I listed in requirements-dev.txt and the current versions.

I observe 3 failing and 3 erroring test cases locally, but they also failed before I touched the code, so I'll blame them on my local venv.

@seperman
Owner

Hi @TyberiusPrime
Thanks for the PR! Can you please make your PR against the dev branch, not the master branch?
There are some conflicts with your PR against the dev branch.
Please ping me once you have updated the PR!

@TyberiusPrime
Contributor Author

TyberiusPrime commented Jun 28, 2024

My apologies, I had rebased against dev before creating the PR (but after starting the creation...) and GitHub somehow didn't pick that up.

Give me a minute to learn how to fix this.

edit: Turns out it's as easy as hitting 'edit' at the top and selecting a new target branch. Now the diff looks much more reasonable as well.

@TyberiusPrime changed the base branch from master to dev on June 28, 2024, 17:53
@seperman
Owner

seperman commented Jul 1, 2024

LGTM! Thanks @TyberiusPrime
There is a minor bug in the requirements-dev.txt of your PR. I will fix it.

@seperman merged commit ee36c1d into seperman:dev on Jul 1, 2024
@seperman
Owner

@TyberiusPrime DeepDiff 8.0.0 is published and it includes your contribution. Thank you!

TyberiusPrime added a commit to TyberiusPrime/pypipegraph2 that referenced this pull request Aug 28, 2024
Development

Successfully merging this pull request may close these issues.

DeepHash: Different dataframes get the same hash
2 participants