Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mypy errors in airflow/utils/ #20482

Merged
merged 5 commits into from
Jan 3, 2022
Merged

Conversation

ephraimbuddy
Copy link
Contributor


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@ephraimbuddy ephraimbuddy requested review from ashb, uranusjr and potiuk and removed request for ashb and uranusjr December 23, 2021 09:30
@@ -48,7 +48,7 @@ class VariableAccessor:
class ConnectionAccessor:
def get(self, key: str, default_conn: Any = None) -> Any: ...

class Context(TypedDict, total=False):
class Context(TypedDict):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure of this one cc: @uranusjr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you need that ? This means that we can omit any of the fields but IMHO that would be great if we can make assumption that all fields are present in Context - so that whenever we need, we just add empty/default values for them whenever we find it necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn’t really matter because totality is checked on initialisation, and we don’t actually instantiate the context object directly, only force-cast a dict into one. And I chose total=False because it is arguably more correct semantically—the context does not always contain all the keys listed in this fake TypedDict.

What is the reason behind this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error:

airflow/utils/context.pyi:51: error: Unexpected keyword argument "total" for "__init_subclass__"
of "object"
    class Context(TypedDict, total=False):  
    ^
Found 1 error in 1 file (checked 60 source files)

try_number=try_number,
)
else:
return "" # mypy
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also another place I'm confused about the best approach to it and whether to work on the parse_template_string itself

Copy link
Member

@potiuk potiuk Dec 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way parse_template_string works - it guarantess that either filename_jinja_template or filename_template is set. There is no way both will be None. The only "real" case where (currently) the if can be reached is if the "template_string" is "". But this is wrong anyway.

I'd say:

  • raise RuntimeError in parse_template_string if tamplate_string is ""
  • raise RuntimeError instead of return ""' - with "This should never happen" kind of message.
  • change conditions for self.filename_template and self.filename_jinja_template to is not None explicitly.

@@ -133,10 +138,10 @@ def make_aware(value: Optional[dt.datetime], timezone: Optional["Timezone"] = No
value = value.replace(fold=1)
if hasattr(timezone, 'localize'):
# This method is available for pytz time zones.
return timezone.localize(value)
return cast(Any, timezone).localize(value)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use #type: ignore instead?

Copy link
Member

@potiuk potiuk Dec 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a discussion about MyPy supporting this case here: python/mypy#1424

One of the nice solutions suggested in the thred as a workaround would translate to:

localize = getattr(timezone, 'localize', None)
if localize is not None:
    return localize(value)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to getattr. A less-known fact: For user-defined classes (i.e. not something like int or str), hasattr is actually roughly equivalent to

def hasattr(obj, name):
    try:
        getattr(obj, name)
    except AttributeError:
        return False
    return True

So if you need to get the value at some point, getattr is always superior to hasattr.

@@ -82,13 +82,15 @@ def _render_filename(self, ti: "TaskInstance", try_number: int) -> str:
context = Context(ti=ti, ts=ti.get_dagrun().logical_date.isoformat())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess "total = False" came from here. I'd say we should have a way to create "default" Context with all the fields present with default values and override the fields needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this on mypy python/mypy#7722 which fixes this but it seems it didn't fix it for stub file. Should we instead ignore the error and create issue in mypy? I'm thinking it's a bug considering the fix on the linked PR to the issue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we should refactor Context.__init__() and how it interacts with TaskInstance.get_template_context(). Let’s have this now and I’ll do the refactoring later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But removing total=False shouldn’t work here—it should be the other way around because this requires non-totality! We should keep total=False and find another solution.

https://www.python.org/dev/peps/pep-0589/#totality

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have ignored it, for now, to wait for refactoring in another PR. Hope I understand correctly?

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few suggestions left :)

@ephraimbuddy
Copy link
Contributor Author

Few suggestions left :)

Thank you! Will work on them.

class Context(TypedDict, total=False):
class Context(TypedDict, total=False): # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the error you get if you don’t ignore here? (For debugging reference later.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here:

airflow/utils/context.pyi:51: error: Unexpected keyword argument "total" for "__init_subclass__"
of "object"
    class Context(TypedDict, total=False):  
    ^
Found 1 error in 1 file (checked 60 source files)

Copy link
Member

@potiuk potiuk Dec 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

total=False is Python 3.8 + feature https://docs.python.org/3/whatsnew/3.8.html
Not sure if it will be really helpful. I'd say we should specify all possible keys in Context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have removed it since we still support python 3.7

airflow/utils/python_virtualenv.py Outdated Show resolved Hide resolved
airflow/utils/timezone.py Outdated Show resolved Hide resolved
Comment on lines 202 to 206
def coerce_datetime(v: Optional[dt.datetime]) -> Optional[DateTime]:
def coerce_datetime(v: Optional[dt.datetime]) -> Optional[dt.datetime]:
Copy link
Member

@uranusjr uranusjr Dec 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn’t be necessary; coerce_datetime always returns None or a pendulum.DateTime. Also I believe this will cause some errors to pop up elsewhere because there are functions expecting this to return a pendulum.DateTime specifically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the error here when changed back

airflow/utils/timezone.py:213: error: Incompatible return value type (got "datetime", expected
"Optional[DateTime]")
            return v if v.tzinfo else make_aware(v)
                   ^
Found 1 error in 1 file (checked 60 source files)

Copy link
Member

@potiuk potiuk Dec 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need additional overload of make_aware with DateTime which should fix it. https://mypy.readthedocs.io/en/stable/more_types.html#function-overloading

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried adding an additional overload of make_aware but mypy errors that it'll never be matched because of the arguments.

airflow/utils/timezone.py:113: error: Overloaded function signature 3 will
never be matched: signature 2's parameter type(s) are the same or broader
    def make_aware(value: dt.datetime, timezone: Optional[dt.tzinfo] = N...
    ^

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I figured it out. The preoblm is that DateTime derives from dt.datetime. And we have to specify DateTime overloads before datetimes ones.

What worked for me:

  • make_aware overloads:
@overload
def make_aware(value: None, timezone: Optional[dt.tzinfo] = None) -> None:
    ...


@overload
def make_aware(value: DateTime, timezone: Optional[dt.tzinfo] = None) -> DateTime:
    ...


@overload
def make_aware(value: dt.datetime, timezone: Optional[dt.tzinfo] = None) -> dt.datetime:
    ...


def make_aware(value: Optional[dt.datetime], timezone: Optional[dt.tzinfo] = None) -> Optional[dt.datetime]:
  • coerce_datetime overloads:
@overload
def coerce_datetime(v: None) -> None:
    ...


@overload
def coerce_datetime(v: DateTime) -> DateTime:
    ...


@overload
def coerce_datetime(v: dt.datetime) -> DateTime:
    ...


def coerce_datetime(v: Optional[dt.datetime]) -> Optional[DateTime]:

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Dec 24, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@khalidmammadov
Copy link
Contributor

Duplication of effort here #20504 ...
I wish we all mention/reference these changes here #19891 as per convention so to avoid time waste.

@ephraimbuddy
Copy link
Contributor Author

Duplication of effort here #20504 ...
I wish we all mention/reference these changes here #19891 as per convention so to avoid time waste.

My bad...

@khalidmammadov
Copy link
Contributor

Duplication of effort here #20504 ...
I wish we all mention/reference these changes here #19891 as per convention so to avoid time waste.

My bad...

No worries. I will close that. Please take a look to that anyway to see if you want to amend or express anything here differently, particularly those overloadings. I am not sure why they are necessary

@ephraimbuddy
Copy link
Contributor Author

Duplication of effort here #20504 ...
I wish we all mention/reference these changes here #19891 as per convention so to avoid time waste.

My bad...

No worries. I will close that. Please take a look to that anyway to see if you want to amend or express anything here differently, particularly those overloadings. I am not sure why they are necessary

The overloads are because we don't want a breaking change here(I think).

Comment on lines 31 to 32
if TYPE_CHECKING:
from pendulum.tz.timezone import Timezone
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it still needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed I guess.

@potiuk potiuk added the mypy Fixing MyPy problems after bumpin MyPy to 0.990 label Dec 31, 2021
@uranusjr uranusjr merged commit 138f38f into apache:main Jan 3, 2022
@uranusjr uranusjr deleted the fix-mypy-for-utils branch January 3, 2022 03:24
@kaxil kaxil added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:logging changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) full tests needed We need to run full set of tests for this PR to merge mypy Fixing MyPy problems after bumpin MyPy to 0.990
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants