[red-knot] Track root cause of why a type is inferred as `Unknown` #12986

AlexWaygood · 2024-08-19T11:00:26Z

Summary

This PR adds detailed information to our type inference so we can track exactly why a symbol has been inferred as Unknown. This allows us to restore the unresolved-import red-knot lint to the level of accuracy it had prior to a9847af, because we can now distinguish between Unknown types that were caused by unresolved imports and other kinds of Unknown types. The approach is similar to the one mypy uses.

Test Plan

There are two ways in which this PR is tested:

1. The assertion in the benchmark is changed: six spurious "Unresolved import" diagnostics go away, so the number of diagnostics drops from 34 to 28. The only remaining "unresolved import" diagnostic is emitted on this line, because we don't understand `*` imports yet and `Iterable` is defined in the typeshed stub for `collections.abc` here.
2. A test is unskipped in `red_knot_workspace/src/lint.rs` asserting that a project structure like this (where the `foo` module does not exist) only triggers an "unresolved import" diagnostic in `a.py`, and not in `b.py`:
   - `/src/a.py`: `import foo as foo`
   - `/src/b.py`: `from a import foo`

AlexWaygood · 2024-08-19T11:01:27Z

crates/red_knot_python_semantic/src/types.rs

        }
    }
 }

+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]


I derived Ord and PartialOrd here because the Type enum derives them, but I'm not sure it makes sense to do so. I don't really understand why the Type enum derives them in the first place.

Yeah, me neither. I would remove Ord from Type

I removed the PartialOrd and Ord implementations in ff6b148

crates/red_knot_python_semantic/src/types.rs

github-actions · 2024-08-19T11:14:22Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

MichaReiser

How does this new type work when unifying or intersecting types? Do we keep all different variants?

AlexWaygood · 2024-08-19T11:48:08Z

> How does this new type work when unifying or intersecting types? Do we keep all different variants?

Good question. Currently yes, you could end up with a union such as int | Unknown(UnknownTypeKind::UnresolvedImport) | Unknown(UnknownTypeKind::TypeError).

I suppose we should flatten them into a single Unknown member. But then the question becomes: what kind of Unknown should it be? Maybe UnknownTypeKind::SecondOrder, since it results from the union of two Unknown elements?
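The flattening idea floated above could look roughly like the following. This is only a sketch under stated assumptions: the toy `Type` enum, the `flatten_union` helper, and the choice to move the merged `Unknown` to the end of the union are all hypothetical, and `SecondOrder` is the variant proposed in this comment rather than something that exists in the codebase.

```rust
/// Hypothetical kinds, named after the variants mentioned in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum UnknownTypeKind {
    UnresolvedImport,
    TypeError,
    /// Proposed above: the result of merging two different `Unknown` kinds.
    SecondOrder,
}

/// A toy stand-in for the real `Type` enum (assumption, not the real definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Type {
    Int,
    Str,
    Unknown(UnknownTypeKind),
}

/// Collect union elements, keeping at most one `Unknown` member.
/// The merged `Unknown` is appended last, so element order is not preserved for it.
pub fn flatten_union(elements: &[Type]) -> Vec<Type> {
    let mut result: Vec<Type> = Vec::new();
    let mut unknown: Option<UnknownTypeKind> = None;

    for &element in elements {
        match element {
            Type::Unknown(kind) => {
                unknown = Some(match unknown {
                    // First `Unknown` seen: remember its kind.
                    None => kind,
                    // Same kind again: nothing changes.
                    Some(previous) if previous == kind => kind,
                    // Two different kinds collapse into `SecondOrder`.
                    Some(_) => UnknownTypeKind::SecondOrder,
                });
            }
            other => {
                // Deduplicate ordinary members.
                if !result.contains(&other) {
                    result.push(other);
                }
            }
        }
    }

    if let Some(kind) = unknown {
        result.push(Type::Unknown(kind));
    }
    result
}
```

With this sketch, `int | Unknown(UnresolvedImport) | Unknown(TypeError)` would collapse to `int | Unknown(SecondOrder)`, which also shows the cost: the original causes are no longer recoverable from the union.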

MichaReiser · 2024-08-20T06:56:39Z

I would love to wait with this PR to get @carljm's opinion. I feel uncertain about the type's definition because there are no "obvious" definitions for unification and intersection (unless we make it a bit set, at least for unification).

It's further unclear if we still need to distinguish between the two types, now that the check for unresolved imports moves into the type checker.

I had a quick look at pyright and, from what I understand, it uses Unknown for unresolved imports but has only a single Unknown type.

https://github.com/microsoft/pyright/blob/52a47010b9db2ad6801b4b35d987cb9f4c923c18/packages/pyright-internal/src/analyzer/typeEvaluator.ts#L18629-L18676
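To make the bit-set idea from this comment concrete, here is a minimal sketch in which each reason is one bit and unification is a bitwise OR. The `UnknownKinds` type and its flag names are hypothetical and only illustrate the shape of such a design.

```rust
/// Hypothetical bit set of reasons why a type is `Unknown`.
/// The flag names are illustrative and not taken from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct UnknownKinds(u8);

impl UnknownKinds {
    pub const UNRESOLVED_IMPORT: Self = Self(1 << 0);
    pub const TYPE_ERROR: Self = Self(1 << 1);

    /// Unifying two `Unknown` types keeps every contributing reason.
    pub const fn union(self, other: Self) -> Self {
        Self(self.0 | other.0)
    }

    /// Returns `true` if every reason in `other` is also present in `self`.
    pub const fn contains(self, other: Self) -> bool {
        (self.0 & other.0) == other.0
    }
}
```

Unification is then idempotent, commutative, and associative, which is what makes it "obvious". The sketch deliberately leaves out intersection, since, as noted above, it is less clear what that should mean for the reasons.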

AlexWaygood · 2024-08-20T07:01:04Z

> I would love to wait with this PR to get @carljm's opinion. I feel uncertain about the type's definition because there are no "obvious" definitions for unification and intersection

Agreed, I'm definitely not going to merge until we've heard Carl's thoughts here! There's no rush on this.

AlexWaygood · 2024-08-20T11:58:47Z

> I feel uncertain about the type's definition because there are no "obvious" definitions for unification and intersection (unless we make it a bit set, at least for unification).

FWIW, I think simplifying unions is inevitably going to become more complex than what we currently have, whether we go with this PR or not, because of the fact that we will want to simplify Literal[True] | Literal[False] to Instance(builtins.bool), and similarly for other sealed types such as enums.
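As an illustration of the kind of simplification described here, the sketch below collapses `Literal[True] | Literal[False]` into a `bool` instance. The `Ty` enum and `simplify_union` function are hypothetical stand-ins, not the real union builder, and the sketch does not handle other redundancies (for example, a literal that is already covered by a `bool` member).

```rust
/// Toy type representation (assumption; the real `Type` enum is richer).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Ty {
    BooleanLiteral(bool),
    /// Stand-in for `Instance(builtins.bool)`.
    BoolInstance,
    IntInstance,
}

/// Deduplicate union members and collapse `Literal[True] | Literal[False]`
/// into a single `bool` instance.
pub fn simplify_union(elements: Vec<Ty>) -> Vec<Ty> {
    let mut result: Vec<Ty> = Vec::new();
    for element in elements {
        if !result.contains(&element) {
            result.push(element);
        }
    }

    let has_true = result.contains(&Ty::BooleanLiteral(true));
    let has_false = result.contains(&Ty::BooleanLiteral(false));
    if has_true && has_false {
        // Both inhabitants of the sealed type are present: replace them.
        result.retain(|ty| !matches!(ty, Ty::BooleanLiteral(_)));
        if !result.contains(&Ty::BoolInstance) {
            result.push(Ty::BoolInstance);
        }
    }
    result
}
```

The same pattern would extend to enums: once every member of a sealed type appears in the union, the members can be replaced by the type itself.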

> It's further unclear if we still need to distinguish between the two types, now that the check for unresolved imports moves into the type checker.

Hmm, yeah, not sure. I'll try to pull out some of the improvements here into a separate PR that doesn't introduce the new UnknownTypeKind enum, so that we can evaluate in isolation exactly whether it improves anything for us right now.

AlexWaygood · 2024-08-21T14:28:02Z

crates/red_knot_python_semantic/src/types.rs

-    #[ignore = "\
-A spurious second 'Unresolved import' diagnostic message is emitted on `b.py`, \
-despite the symbol existing in the symbol table for `a.py`"]


Following ecd9e6a, the key benefit of this PR is now that this test can be unskipped. This improvement is also reflected in the fact that six spurious "unresolved import" diagnostics go away in the benchmarks.

codspeed-hq · 2024-08-21T14:36:22Z

CodSpeed Performance Report

Merging #12986 will degrade performances by 5.41%

Comparing alex/unknown-kind (698de48) with main (ecd9e6a)

Summary

❌ 1 regressions
✅ 31 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `main` | `alex/unknown-kind` | Change |
|---|---|---|---|---|
| ❌ | `linter/default-rules[numpy/globals.py]` | 170.5 µs | 180.3 µs | -5.41% |

carljm · 2024-08-22T01:31:20Z

I'm open to the idea that we may need to do this eventually, but there are complexity and efficiency disadvantages to smuggling additional context through the type system this way, so I'd rather not do it until it's really clear exactly why we need it. And I don't think that's currently clear, given that the issue with the unresolved-imports check in main branch that this PR fixes is also easily fixable with a much smaller change: #13007

I don't think we should ever need this for the sake of deciding whether or not to emit a diagnostic. Unknown may result from a type error, but I don't think its existence should ever create a type error. The fact that main branch currently treats an import of any Unknown typed value as an error is IMO not correct, and is fixed by the change linked above.

I would be interested in knowing more about precisely how mypy uses its extra information about the origin of Any.

AlexWaygood · 2024-08-22T11:37:15Z

> I don't think we should ever need this for the sake of deciding whether or not to emit a diagnostic. Unknown may result from a type error, but I don't think its existence should ever create a type error. The fact that main branch currently treats an import of any Unknown typed value as an error is IMO not correct, and is fixed by the change linked above.

Thanks, I think this is correct; treating Unknown as a type error goes against the spirit of gradual typing. We need to catch the error before we transform the Unbound into Unknown. My mistake.
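A minimal sketch of "catch the error before we transform the Unbound into Unknown", using the `a.py` / `b.py` example from the test plan. The `Ty` enum, the `import_type` helper, and the diagnostic message format are all hypothetical and only model the ordering of the two steps.

```rust
/// Toy type representation (assumption, not the real `Type` enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    Unbound,
    Unknown,
    Int,
}

/// Hypothetical helper: report the diagnostic while the lookup result is
/// still `Unbound`, and only then fall back to `Unknown`.
pub fn import_type(lookup: Ty, name: &str, diagnostics: &mut Vec<String>) -> Ty {
    if lookup == Ty::Unbound {
        // The import itself failed, so this is where the error belongs.
        diagnostics.push(format!("Unresolved import `{name}`"));
        Ty::Unknown
    } else {
        // An `Unknown` that arrives from elsewhere is not an error here.
        lookup
    }
}
```

In this model, `import foo` in `a.py` sees `Unbound` and reports once, while `from a import foo` in `b.py` sees the already-converted `Unknown` and stays silent, without needing to know why the type is `Unknown`.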

AlexWaygood · 2024-08-22T13:02:57Z

> I would be interested in knowing more about precisely how mypy uses its extra information about the origin of Any.

Much of it seems to be used for precise diagnostics when strict mode is enabled. Mypy --strict enables several checks that prevent implicit Any types from accidentally percolating through your code. So mypy wants to be able to distinguish between Anys that result from missing generic arguments, unannotated variables, and objects explicitly annotated as Any (for example).

E.g. here the type of Any is checked by the has_any_from_unimported_type() function in order to implement mypy's optional error code [no-any-unimported].

AlexWaygood · 2024-08-22T13:06:01Z

I still think there's a good chance that we'll need something like this eventually, but I agree that with the fixes in #13055, there are few immediate benefits to this right now. Thanks both!

carljm · 2024-08-22T16:52:35Z

> Much of it seems to be used for precise diagnostics when strict mode is enabled.

Yeah, that makes sense. I agree, we'll probably need something like this for the same reason, when we want to have an Any-forbidding strict mode and emit useful diagnostics for it.

@carljm

This variant shows inference that is not yet implemented.

## Summary

PR #13500 reopened the idea of adding a new type variant to keep track of not-implemented features in Red Knot. It was based on #12986, with a more generic approach of keeping track of different kinds of unknowns. Discussion in #13500 concluded that keeping track of different kinds of `Unknown` is complicated for now, and that this feature is better achieved through a new variant of `Type`.

### Requirements

The requirements for this implementation can be summed up with some extracts from @carljm's comments on the previous PR:

> So at the moment we are leaning towards simplifying this PR to just use a new top-level variant, which behaves like Any and Unknown but represents inference that is not yet implemented in red-knot.

> I think the general rule should be that Todo should propagate only when the presence of the input Todo caused the output to be unknown.
>
> To take a specific example, the inferred result of addition must be Unknown if either operand is Unknown. That is, Unknown + X will always be Unknown regardless of what X is. (Same for X + Unknown.) In this case, I believe that Unknown + Todo (or Todo + Unknown) should result in Unknown, not result in Todo. If we fix the upstream source of the Todo, the result would still be Unknown, so it's not useful to propagate the Todo in this case: it wrongly suggests that the output is unknown because of a todo item.

## Test Plan

This PR does not introduce new tests, but it did require editing some tests to display `[Type::Todo]` (currently `@Todo`), which suggests that those tests are placeholder requirements for features we don't support yet.
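The propagation rule quoted above can be sketched for the addition example as follows. The `Ty` enum and `infer_add` function are hypothetical, the real inference code handles many more cases, and the sketch assumes that a `Todo` operand paired with a fully known operand yields `Todo`.

```rust
/// Toy type representation (assumption, not the real `Type` enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    Unknown,
    Todo,
    Int,
}

/// Infer the result type of `left + right` for this toy model.
pub fn infer_add(left: Ty, right: Ty) -> Ty {
    match (left, right) {
        // `Unknown + X` is `Unknown` regardless of `X`, including `X = Todo`:
        // fixing the upstream todo would not change the result.
        (Ty::Unknown, _) | (_, Ty::Unknown) => Ty::Unknown,
        // Here the `Todo` really is the reason the result is not known.
        (Ty::Todo, _) | (_, Ty::Todo) => Ty::Todo,
        (Ty::Int, Ty::Int) => Ty::Int,
    }
}
```

The ordering of the match arms carries the rule: `Unknown` is checked first so that it wins over `Todo`.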

AlexWaygood added the red-knot Multi-file analysis & type inference label Aug 19, 2024

AlexWaygood requested review from carljm and MichaReiser as code owners August 19, 2024 11:00

MichaReiser reviewed Aug 19, 2024

AlexWaygood force-pushed the alex/unknown-kind branch from 232789d to d4a194b Compare August 20, 2024 11:48

AlexWaygood mentioned this pull request Aug 20, 2024

[red-knot] Improve the unresolved-import check #13007

Merged

AlexWaygood added 3 commits August 21, 2024 15:22

[red-knot] Track root cause of why a type is inferred as Unknown

6de39c2

Remove Ord implementations

8e8e0d9

handle unions properly

698de48

AlexWaygood force-pushed the alex/unknown-kind branch from d4a194b to 698de48 Compare August 21, 2024 14:24

AlexWaygood closed this Aug 22, 2024

AlexWaygood deleted the alex/unknown-kind branch August 22, 2024 13:07

AlexWaygood mentioned this pull request Sep 18, 2024

Fix/#13070 defer annotations when future is active #13395

Merged

This was referenced Sep 24, 2024

Feat/unknown kinds #13500

Closed

[red-knot] feat: introduce a new [Type::Todo] variant #13548

Merged
