-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Try to make dataset objects totally unhashable #42054
Conversation
0e3cd50
to
7cac5ae
Compare
We should consider extra...important for tests.
c71a8f5
to
0695519
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with this change, but not sure whether it should be categorized as a breaking change
This should generally not affect too many usages, but there’s no harm adding a note I guess. We can’t tell how those are used… |
This reverts commit 1c4a00b.
cc: @uranusjr @Lee-W -> we revert it for now as it causes some tests involving open-lineage (in amazon and common.io providers) to fail - see #42059 discussion. Apparently there is an implicit dependency betweeen datasets implementation and amazon/common.io tests Please re-do it and add "full tests needed" label for this change to make sure all the tests are passing. |
I think the OpenLineage is not involved here, but the hook level lineage collection is causing different providers to create datasets, so there is a dependency there. |
Using the hash property on Dataset (and DatasetAlias) is problematic since subclassing is a possibility, and user code may not correctly implement hashing for Airflow’s purposes.