Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Infer boolean literal expression #12688

Merged
merged 1 commit into from
Aug 5, 2024
Merged

Conversation

dhruvmanila
Copy link
Member

Summary

This PR implements type inference for boolean literal expressions.

Test Plan

Add test cases for True and False.

@dhruvmanila dhruvmanila added the red-knot Multi-file analysis & type inference label Aug 5, 2024
Copy link
Contributor

github-actions bot commented Aug 5, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. These changes all look good to me.

I have a concern about union types, though. How do we want a union of boolean literals to be represented internally? For an object that could be the True constant or could be the False constant, should we infer Literal[True] | Literal[False] (a union of literals), or should we eagerly normalize it to <instance of bool> (since we know that the bool class is special: there will only be two possible instance of bool).

  • If the former, I think we will want to modify the Display implementation for DisplayUnionType so that Literal[True] | Literal[False] is simplified to Literal[True, False], the same as we do for numeric literals
  • If the latter, some larger changes will be required

Mypy appears never to normalize Literal[True, False] to bool; pyright appears to always do so. (Link, link.) I think I weakly favour pyright's approach here, but not sure.

A separate question (which I think can almost certainly be deferred for now) is that @carljm added some fancy logic for int literals in infer_binary_expression. We could also add some understanding of bool literals to that method, since True + True == True + 1 == 1 + 1 == 2

@@ -175,6 +177,7 @@ impl<'db> Type<'db> {
// TODO raise error
Type::Unknown
}
Type::BooleanLiteral(_) => Type::Unknown,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we'll want to fall back to looking up attributes and methods on the builtins.bool class in our vendored typeshed stubs. But I think we can defer that for this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the meaning of "member" in this context? From what I understand, it's the type of an attribute on the given type. So, for instance, set.append would be a function type. Is this understanding correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your understanding is correct there

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think even the concern I have about union types can probably be deferred for now. So I'm happy with this landing as-is, as long as we keep track of these open questions somewhere.

@dhruvmanila
Copy link
Member Author

dhruvmanila commented Aug 5, 2024

Mypy appears never to normalize Literal[True, False] to bool; pyright appears to always do so. (Link, link.) I think I weakly favour pyright's approach here, but not sure.

What's the benefit of the one or the other? I think I'd prefer Pyright's approach as well because it can be resolved.

A separate question (which I think can almost certainly be deferred for now) is that @carljm added some fancy logic for int literals in infer_binary_expression. We could also add some understanding of bool literals to that method, since True + True == True + 1 == 1 + 1 == 2

Yes, I started playing around with binary expressions as well but thought to have it as it's own PR which would include all possible combinations.

@AlexWaygood
Copy link
Member

What's the benefit of the one or the other? I think I'd prefer Pyright's approach as well because it can be resolved.

I think one "benefit" of mypy's approach is that you don't have to apply quite so much special casing to boolean literals specifically inside the type checker implementation when figuring out how unions of boolean literals resolve. Literal[True] | Literal[False] resolves to Literal[True, False] in exactly the same way that Literal[1] | Literal[2] resolves to Literal[1, 2] and Literal["foo"] | Literal[True] resolves to Literal["foo", True].

One case where normalizing bool back to Literal[True, False] will be necessary will be reachability analysis. In the following snippet, it's self evident that the third case branch is unreachable, and the type checker is not required to special-case boolean types in any way in order to understand this; it follows from its generalised understanding of Literal types:

from typing import Literal

def f(x: Literal[True, False]):
    match x:
        case True:
            ...
        case False:
            ...
        case unreachable:
            ...

However: we'll need to apply that special casing to bool anyway. We'll also need to be able to understand types annotated with bool as being functionally equivalent to Literal[True, False], or we won't understand the third branch in this function as being unreachable:

from typing import Literal

def f(x: bool):
    match x:
        case True:
            ...
        case False:
            ...
        case unreachable:
            ...

So leaving the union unnormalized as Literal[True, False] doesn't really get us out of having to special-case bool.

I also... wouldn't necessarily jump to the conclusion that this is a deliberate choice by mypy with an explicit motivation 😄 it's a very old type checker, and support for Literal types was added to mypy after mypy had already been around for many years. To me, pyright's approach also seems significantly better here.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you!

Enums are a similar case where we'll need to understand sealed types (i.e. types with a finite number of inhabitants). I'm fine with delaying handling of the equivalence of bool and Literal[True, False] for now. I added #12694 to track this issue.

@carljm carljm merged commit a8e2ba5 into main Aug 5, 2024
20 checks passed
@carljm carljm deleted the dhruv/infer-bool-literal branch August 5, 2024 18:30
dylwil3 pushed a commit to dylwil3/ruff that referenced this pull request Aug 7, 2024
## Summary

This PR implements type inference for boolean literal expressions.

## Test Plan

Add test cases for `True` and `False`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants