-
Notifications
You must be signed in to change notification settings - Fork 14.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update return types of get_key
methods on S3Hook
#30923
Update return types of get_key
methods on S3Hook
#30923
Conversation
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable to me, triggered CI and re-ran that failed PROD image build.
Ah no ... pre-commit install/run and docs build are recommende/ |
373d3af
to
67efa3d
Compare
I believe have fixed the CI failures on static-checks and build-docs. I am slightly anxious about this change. In some sense, it is a breaking API change (at least at the type checking level), and it may result in type checking failures to airflow users when they upgrade, which is annoying for users to need to manage. But on the other hand, the types are currently wrong, and users who rely on them may have runtime errors. That is how I found this issue. And in particular, I am not sure if Airflow has a policy on this kind of change. Personally, I would like the correct types. But if you want something different (or to close the PR without merging), I am happy to oblige. |
Triggered CI again (the first time contributor workflow can be frustrating @jonshea, bear with us)
This is interesting, if we do consider it a breaking change, we also don't really have an (easy) way to cleanly deprecate type-related changes like this with a deprecation warning, while keeping it backwards compatible. However, I think this can pretty clearly be seen as a bug fix, and as such can be shipped without backcompat/warnings. But I'd be interested to hear what others think. |
I strongly agree with this. This looks like a bug to me and should be fixed without any deprecation process |
Type change is not really breaking IMHO. We do not have a definition of what Breaking change is and mainly because it is impossible. Hyrum's law succintly describes it and the obligatory XKCD comic shows a good example of it [1]. So we always have to rely on personal judgment on when a change is likely to break many people's "production" workflow. This one might easily break someone's CI workflow for example. But it is extremely unlikely to break production workflow. There are few cases only when type-hinting is used at runtime with the capacity of breaking things (for example when it is used in Pydantic definition of data objects being serialized). If your code worked before the change and correctly handled S3 objects returned, it will continue to work, if it did not handle some edge cases, and your MyPy might not have flagged it before, but it will continue to NOT work the same way it did NOT work before. So this one is definitely not breaking. your CI might break, but this is not a breaking change for airlfow release. [1] Obligatory XKCD |
BTW. The reason build-docs have been failing for this one are workarounded in #31002 - this change has generated dependencies change which in turn require docs for all providers to be built, and it took all the time doc build job had allocated on Public Runners. |
`S3Hook` has two methods, `get_key` and `get_wildcard_key`, which use an AWS `ServiceResource` to fetch object data from S3. Both methods are correctly documented as returning an instance of `S3.Object`, but their return types are annotated with `S3Transfer`. This is incorrect. The actual return type, `S3.Object`, is not a subtype of `S3Transfer`, and the two types have many different methods. This PR uses the `mypy-boto3-s3` package to set a correct return type of S3 resource `Object` for `get_key` and `get_wildcard_key`.
67efa3d
to
75b7edc
Compare
I rebased on If there is anything else you need from me, or that I can help with, please let me know. |
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions. |
S3Hook
has two methods,get_key
andget_wildcard_key
, which use an AWSServiceResource
to fetch object data from S3. Both methods are correctly documented as returning an instance ofS3.Object
, but their return types are annotated withS3Transfer
. This is incorrect. The actual return type,S3.Object
, is not a subtype ofS3Transfer
, and the two types have many different methods.This PR uses the
mypy-boto3-s3
package to set a correct return type of S3 resourceObject
forget_key
andget_wildcard_key
.