-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/unknown kinds #13500
Feat/unknown kinds #13500
Conversation
|
I've very limited understanding of Python's type inference, so take this with a grain of salt. I'm also not entirely convinced if we actually want this. I think I would like to see specific use cases where it demonstrates that the information is indeed useful (and doesn't get lost in translation). I think I would prefer a bit flag over an enum because it seems more obvious to me how the information propagates through different types of inference operations. IMO, the bit flag should be additive. If we have any type inference where an input type is x = [1, 2, 3];
y = x[0] Let's pretend that Red Knot doesn't support inferring the list type, but it does support the subscript.
|
This is achievable through ensuring that whenever we'd encounter a new occurence of That method is responsible for aggregating multiple unknown causes together. It's critical that we use this method in the right places. In the current draft, I am simplifying the implementation of this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
I have mixed feelings about it. I see the potential value for debugging, but I'm not sure how well this will play out in practice. If you see "red-knot-limitation" you still have to figure out where that came from. But I guess it is useful to at least know you're not dealing with intended behavior.
I think I would slightly favor adding this. I agree with Micha that it should be a bitflag.
The main alternative I would want to consider is that we instead just add a new Type
variant, Type::Todo
, which behaves like Unknown
. I think this could give us similar debugging benefits with less runtime cost, although still some cost in verbosity of Type
matches. There are also hypothetical future uses (for better diagnostics) that this wouldn't help with, but I'm hesitant to add complexity and runtime cost for hypothetical features we haven't designed or committed to yet.
One thing that might help clarify would be some specific examples of where mypy makes use of its Any kinds, so we can really decide whether that's something we will definitely want to emulate.
|
||
/// The symbol was unannotated and the true type can't be reasonably inferred | ||
UnannotatedSymbol, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that this case exists, and I think we should limit the cases to those we actually use, not add more speculatively.
/// A "second-order" Unknown type, that results from an interaction with | ||
/// a "first-order" Unknown type. For example, if the type of `x` is `Unknown`, | ||
/// the type of `x + 1` will also be `Unknown` | ||
SecondOrder, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not convinced we need this kind; I think I'd rather propagate the original unknown kind in the x + 1
example.
/// Special kind for intersections that is equal to all other kinds. This allows to consider a | ||
/// single `Type::Unknown(_)` in intersections. Should not be used in other contexts. | ||
AllKinds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not convinced we should have this; I think I'd prefer to just not simplify differing unknowns in intersections. I don't expect this case to arise often, because most intersections will come from narrowing constraints, and in most cases we'll just not return a narrowing constraint at all rather than return one with an unknown type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, does that mean that having potentially different unknown kinds inside an intersection is not a concern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think that's a problem.
|
||
impl UnknownTypeKind { | ||
/// Method to aggregate multiple `UnknownTypeKind`s when they interact in any context | ||
pub(crate) fn union(self, _other: Self) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the kind of an unknown should just always be a bitflag, and then the behavior of union is clear. (It doesn't make sense to me to have an enum with a special variant which is a bitflag of other variants; that seems like complexity that doesn't buy anything useful.)
// REVIEWERS: should we display 'RedKnotLimitation' differently? | ||
// => intent: any test including 'red-knot-limitation' should be updated one day |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so -- if we don't do this, the debugging value is limited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexWaygood and I looked more carefully at the mypy use cases, and we are not convinced that we will ever want this data for any purpose other than as a debugging aid while red-knot has a lot of "not implemented" inference. We aren't 100% convinced that we won't eventually want it, either, but it remains unclear enough that it still doesn't feel like it makes sense to add it speculatively. So at the moment we are leaning towards simplifying this PR to just use a new top-level NotImplemented
variant, which behaves like Any
and Unknown
but represents inference that is not yet implemented in red-knot. @MichaReiser, does this sound reasonable to you?
Mypy uses its Any provenance info to be able to alert you when an implicit Any creeps into your code from some particular source (e.g. a failed import, an unannotated function, an unparameterized generic). What's unclear to me about the utility here is that these are all things that can be errored directly where they happen, if you want to be strict about them; erroring directly where they happen doesn't require tracking the provenance of Any
.
I guess the use case is maybe that the implicit Any happens in third-party code that you aren't checking or maintaining, but you still don't want the Any to sneak into your code? But in this case I'm not clear why it would matter whether the Any comes from an unresolved import, vs a type error, vs an explicit Any -- if you aren't maintaining or checking the code where it came from, one Any is just like another, why does it matter specifically how it originated?
@hauntsaninja, you recently provided some great feedback from a mypy perspective on some other red-knot issues -- any thoughts here?
pub enum UnknownTypeKind { | ||
/// Temporary variant that indicates that we *should* | ||
/// be able to infer a type here in due course, but currently can't | ||
RedKnotLimitation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I like the name NotImplemented
here
/// Special kind for intersections that is equal to all other kinds. This allows to consider a | ||
/// single `Type::Unknown(_)` in intersections. Should not be used in other contexts. | ||
AllKinds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think that's a problem.
I agree with nearly everything @carljm said here, with one small exception: tne issue with calling the variant |
Overall yes, but it depends on how this |
I agree that we should consider this carefully to maximize the debugging value. I think that following this proposed rule might cause I think the general rule should be that To take a specific example, the inferred result of addition must be Unknown if either operand is Unknown. That is, |
Makes sense. Summarising this in a comment on the todo variant would be useful |
Hey guys, thank you so much for spending so much time thinking about the problem. The direction is clear, I'll edit this PR when I can to implement what's been decided so far, and I'll request for another review then. Thank you again 😁 |
I think it comes up if an Any is explicitly annotated or comes about from a lack of annotation (assuming an absence of inference). It also matters what configuration options people use in practice. For various often historical reasons, |
Hey guys, closing this in favour of a new PR to allow for discussions focused on the implementation |
This variant shows inference that is not yet implemented.. ## Summary PR #13500 reopened the idea of adding a new type variant to keep track of not-implemented features in Red Knot. It was based off of #12986 with a more generic approach of keeping track of different kind of unknowns. Discussion in #13500 agreed that keeping track of different `Unknown` is complicated for now, and this feature is better achieved through a new variant of `Type`. ### Requirements Requirements for this implementation can be summed up with some extracts of comment from @carljm on the previous PR > So at the moment we are leaning towards simplifying this PR to just use a new top-level variant, which behaves like Any and Unknown but represents inference that is not yet implemented in red-knot. > I think the general rule should be that Todo should propagate only when the presence of the input Todo caused the output to be unknown. > > To take a specific example, the inferred result of addition must be Unknown if either operand is Unknown. That is, Unknown + X will always be Unknown regardless of what X is. (Same for X + Unknown.) In this case, I believe that Unknown + Todo (or Todo + Unknown) should result in Unknown, not result in Todo. If we fix the upstream source of the Todo, the result would still be Unknown, so it's not useful to propagate the Todo in this case: it wrongly suggests that the output is unknown because of a todo item. ## Test Plan This PR does not introduce new tests, but it did required to edit some tests with the display of `[Type::Todo]` (currently `@Todo`), which suggests that those test are placeholders requirements for features we don't support yet.
Summary
Previous Work
With #12986, @AlexWaygood tried to introduce a lower level of granularity into the
Unknown
type, similarly to what mypy does for any.The main benefits (he targeted) were related to
unresolved-import
diagnostics.Current Approach
I rebased Alex's PR and continued his work following some comments on a previous PR.
I would like to take his work as a basis to achieve a slightly different objective. This PR does not target enabling any specific user-facing feature. It's attempting to find a way for us to record more accurate information when we have it (e.g. detail the current
Unknown
into multiple kinds) without breaking other functionalities.As a benefit, I hope this will allow us to
RedKnotLimitation
)Any
(if mypy needs it, maybe we will too)This PR is a draft and doesn't have all the answers, but I'll try to address the limiting factors that led Alex to close his previous work
Previous Objections
Better alternative for
unresolved-imports
One of the critical objections was that the previous PR's main goal could be achieved with an easier solution. See this comment.
I think this wouldn't apply here as I have a different core objective with this PR.
Behaviour for union
For unions, multiple
UnknownKindType
are aggregated together through the methodThe specific of which implementation makes the most sense depends on how we use unions. There are many options. The most trivial would be to only return one kind (e.g. the first found). The most complicated would be to record a bitflag of all kinds encountered.
Behaviour for intersections
For intersections, I tried out an implementation with a special kind,
AllKinds
, which is equal to all other kinds.There can only be one kind of Unknown in a positive or negative intersection, making all unknown kinds valid (or invalid) for an intersection that contains any other unknown kind.
Another idea I had would be to make all
UnknownKind
indistinguishable byEq
&Hash
so only code usingmatch
would make difference between theUnknownTypeKind
, if they need to do so. I didn't implement this in the draft because I'm not sure it's a good idea.Test Plan
Unknown
Union
andIntersection
that requires additional codeP.S: this is a draft, any feedback is welcome. I can try any implementation you'd like to see for the different problems stated above. If this PR's cost/value isn't worth it, no problem with closing!