-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more precise inference for enum attributes #6867
Conversation
This pull request makes two changes to enum attributes. First, this PR refines type inference for expressions like `MyEnum.FOO` and `MyEnum.FOO.name`. Those two expressions will continue to evaluate to `MyEnum` and `str` respectively under normal conditions, but will evaluate to `Literal[MyEnum.FOO]` and `Literal["FOO"]` respectively when used in Literal contexts. Second, the type of `MyEnum.FOO.value` will be more precise when possible: mypy will evaluate that expression to the type of whatever FOO was assigned in the enum definition, falling back to `Any` as a default. Somewhat relatedly, this diff adds a few tests confirming we handle enum.auto() correctly. Two additional notes: 1. The changes I made to the `name` and `value` fields up above are strictly speaking unsafe. While those files are normally read-only (doing `MyEnum.FOO.name = blah` is a runtime error), it's actually possible to change those fields anyway by altering the `_name_` and `_value_` fields which are *not* protected. But I think this use case is probably rare -- I'm planning on investigating the feasibility of just having mypy just disallow modifying these attributes altogether after I investigate how enums are used in some internal codebases in a little more detail. 2. I would have liked to make `MyEnum.FOO.value` also return an even more precise type when used in literal contexts similar to `MyEnum.FOO.name`, but I think our plugin system needs to be a bit more flexible first.
I guess this is kinda-sorta a follow-up to #5599? It makes some enum attributes implicitly Final for the purposes of type inference, but doesn't disallow assignment to them. (I'm planning on tackling that in a separate PR.) |
test-data/unit/check-protocols.test
Outdated
@@ -2427,7 +2427,7 @@ from typing import Protocol | |||
class P(Protocol): ... | |||
class C(P): ... | |||
|
|||
reveal_type(C.register(int)) # E: Revealed type is 'def () -> builtins.int' | |||
reveal_type(C.register(int)) # E: Revealed type is 'def (x: builtins.object =, base: builtins.int =) -> builtins.int' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why the revealed type is the constructor signature -- when I try directly running mypy on this, I get Type[builtins.int]
instead. I guess this is just some test-related artifact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a different fixture for builtins might help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! It's good to have less hacky enum support. I just left some nits.
By the way, what's the expected use case for inferring a literal type for things like A.x.name
? Is it to allow looking things up from a TypedDict?
What's the status of supporting Literal[Enum.foo]
? The test cases don't seem to cover that, but the PR message implies that this PR is related to it.
# Note: 'enum.EnumMeta' is deliberately excluded from this list. Classes that directly use | ||
# enum.EnumMeta do not necessarily automatically have the 'name' and 'value' attributes. | ||
ENUM_PREFIXES = ['enum.Enum', 'enum.IntEnum', 'enum.Flag', 'enum.IntFlag'] | ||
ENUM_NAME_ACCESS = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if using a set would be a bit faster. get_attribute_hook
is called very often so it might even make a small difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out it is indeed faster, at least based on some microbenchmarking I did. I thought the list would be small enough that overhead would be about the same either way, but that was wrong.
(In retrospect, I guess doing on average 4 to 8 some_str.__eq__(...)
calls per containment check is always going to be noticeably more expensive then doing a __hash__(...)
followed by maybe an __eq__(...)
, at least in Python.)
test-data/unit/check-protocols.test
Outdated
@@ -2427,7 +2427,7 @@ from typing import Protocol | |||
class P(Protocol): ... | |||
class C(P): ... | |||
|
|||
reveal_type(C.register(int)) # E: Revealed type is 'def () -> builtins.int' | |||
reveal_type(C.register(int)) # E: Revealed type is 'def (x: builtins.object =, base: builtins.int =) -> builtins.int' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a different fixture for builtins might help.
test-data/unit/lib-stub/builtins.pyi
Outdated
@@ -9,6 +9,8 @@ class type: | |||
|
|||
# These are provided here for convenience. | |||
class int: | |||
# Note: this is a simplification of the actual signature | |||
def __init__(self, x: object = ..., base: int = ...) -> None: pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not add anything to builtins.pyi
if there's a reasonable way of avoiding it. What about adding this to, say, fixtures/primitives.pyi
? How many test cases need this?
F3.x.value # E: "F3" has no attribute "value" | ||
F3.x._value_ # E: "F3" has no attribute "_value_" | ||
|
||
[case testEnumAttributeChangeIncremental] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that testing deserialization of the related types would also be an interesting test case. I wonder if one exists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not sure if we have one. Do you know which file I should add the test to? (I don't remember where we keep the deserialization tests.)
Thanks for the review! I'll work on making the changes you suggested later today. Just to quickly answer the two questions you asked though...
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates! Just a few more issues remain.
test-requirements.txt
Outdated
@@ -4,6 +4,7 @@ flake8-bugbear; python_version >= '3.5' | |||
flake8-pyi; python_version >= '3.6' | |||
lxml==4.2.4 | |||
mypy_extensions>=0.4.0,<0.5.0 | |||
typing_extensions>=3.7.0,<4.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really understand why I needed to add this in as an explicit dependency -- or rather, how we managed to do without it up until now.
All of the other pull requests seemed fine? But I wasn't able to import typing_extensions
within the new enums
plugin module without it... But mypy is chock-full of imports of typing_extensions
in other modules??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Michael0x2a All the other files do:
MYPY=False
if MYPY:
from typing_extensions import Final
I believe you don't have a guard for the import? I think doing the same guard would fix the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh, that would do it, thanks!
(I guess just blindly grepping for "typing_extensions" might not have quite given me the full picture...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This pull request makes two changes to enum attributes. First, this PR refines type inference for expressions like `MyEnum.FOO` and `MyEnum.FOO.name`. Those two expressions will continue to evaluate to `MyEnum` and `str` respectively under normal conditions, but will evaluate to `Literal[MyEnum.FOO]` and `Literal["FOO"]` respectively when used in Literal contexts. Second, the type of `MyEnum.FOO.value` will be more precise when possible: mypy will evaluate that expression to the type of whatever FOO was assigned in the enum definition, falling back to `Any` as a default. Somewhat relatedly, this diff adds a few tests confirming we handle enum.auto() correctly. Two additional notes: 1. The changes I made to the `name` and `value` fields up above are strictly speaking unsafe. While those files are normally read-only (doing `MyEnum.FOO.name = blah` is a runtime error), it's actually possible to change those fields anyway by altering the `_name_` and `_value_` fields which are *not* protected. But I think this use case is probably rare -- I'm planning on investigating the feasibility of just having mypy just disallow modifying these attributes altogether after I investigate how enums are used in some internal codebases in a little more detail. 2. I would have liked to make `MyEnum.FOO.value` also return an even more precise type when used in literal contexts similar to `MyEnum.FOO.name`, but I think our plugin system needs to be a bit more flexible first.
This pull request makes two changes to enum attributes.
First, this PR refines type inference for expressions like
MyEnum.FOO
andMyEnum.FOO.name
. Those two expressions will continue to evaluate toMyEnum
andstr
respectively under normal conditions, but will evaluate toLiteral[MyEnum.FOO]
andLiteral["FOO"]
respectively when used in Literal contexts.Second, the type of
MyEnum.FOO.value
will be more precise when possible: mypy will evaluate that expression to the type of whatever FOO was assigned in the enum definition, falling back toAny
as adefault.
Somewhat relatedly, this diff adds a few tests confirming we handle enum.auto() correctly.
Two additional notes:
The changes I made to the
name
andvalue
fields up above are strictly speaking unsafe. While those files are normally read-only (doingMyEnum.FOO.name = blah
is a runtime error), it's actually possible to change those fields anyway by altering the_name_
and_value_
fields which are not protected.But I think this use case is probably rare -- I'm planning on investigating the feasibility of just having mypy just disallow modifying these attributes altogether after I investigate how enums are used in some internal codebases in a little more detail.
I would have liked to make
MyEnum.FOO.value
also return an even more precise type when used in literal contexts similar toMyEnum.FOO.name
, but I think our plugin system needs to be a bit more flexible first.