Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Fix merged type after if-else without explicit else branch #14621

Conversation

sransara
Copy link
Contributor

Summary

Closes: #14593

The final type of a variable after if-statement without explicit else branch should be similar to having an explicit else branch.

Test Plan

Originally failed test cases from the bug are added.

Copy link
Contributor

github-actions bot commented Nov 27, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

self.flow_merge(pre_if);
// if there's no else clause, we should act as if we had an empty else clause,
// by again following the same logic as above for elif/else clauses.
post_clauses.push(self.flow_snapshot());
Copy link
Contributor Author

@sransara sransara Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated the logic in above for loop of node.elif_else_clauses in here. Another approach would be to have a dummy else clause. I wasn't sure if a dummy node with a made up range would cause issues down the line. But I can switch if that pattern is preferred than duplicating.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructing an actual dummy node will cause issues, and I don't think it's necessary. But I do think we could remove the logic duplication here. Instead of iterating over clauses above, we could iterate over (test, body) pairs, where both elements of the pair are Option. For a normal else clause, test would be None but body would be Some; if there's no else clause, we'd insert a (None, None) at the end.

That said, without trying to write it, I'm not sure if the code necessary to do this might end up being larger / harder to understand than the duplication you currently have here. Open to whatever you think is best, having worked on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion. This is better. Slight change I did was with body. It is a vector/slice. So I opted to using empty slice instead if Option. I have updated the code now.

@MichaReiser MichaReiser added the red-knot Multi-file analysis & type inference label Nov 27, 2024
@@ -55,3 +55,16 @@ else:
# TODO should be Never
reveal_type(x) # revealed: Literal[1, 2]
```

## After if-statement without explicit else branch, assume else branch with negative clause

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, need also test for x is not changed in the if branch like:

def optional_int() -> int | None: ...

x = optional_int()

if x is None: pass

reveal_type(x)  # revealed: int | None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I didn't see it being tested else where. I added another variable in the same mdtest here. I wonder if a new mdtest file would be better to test post if block narrowing scenarios.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if a new mdtest file would be better to test post if block narrowing scenarios.

I think that's a good idea. We should also add tests for scenarios with (multiple) elif branches, but no else branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new test file and added couple of more tests.

self.flow_merge(pre_if);
// if there's no else clause, we should act as if we had an empty else clause,
// by again following the same logic as above for elif/else clauses.
post_clauses.push(self.flow_snapshot());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructing an actual dummy node will cause issues, and I don't think it's necessary. But I do think we could remove the logic duplication here. Instead of iterating over clauses above, we could iterate over (test, body) pairs, where both elements of the pair are Option. For a normal else clause, test would be None but body would be Some; if there's no else clause, we'd insert a (None, None) at the end.

That said, without trying to write it, I'm not sure if the code necessary to do this might end up being larger / harder to understand than the duplication you currently have here. Open to whatever you think is best, having worked on this.

@carljm
Copy link
Contributor

carljm commented Nov 28, 2024

Thank you, this is great!

sransara and others added 5 commits November 28, 2024 00:33
…f_statement.md

Co-authored-by: Carl Meyer <carl@oddbird.net>
…f_statement.md

Co-authored-by: Carl Meyer <carl@oddbird.net>
…f_statement.md

Co-authored-by: Carl Meyer <carl@oddbird.net>
…f_statement.md

Co-authored-by: Carl Meyer <carl@oddbird.net>
…f_statement.md

Co-authored-by: Carl Meyer <carl@oddbird.net>
@sransara
Copy link
Contributor Author

Constructing an actual dummy node will cause issues, and I don't think it's necessary. But I do think we could remove the logic duplication here. Instead of iterating over clauses above, we could iterate over (test, body) pairs, where both elements of the pair are Option. For a normal else clause, test would be None but body would be Some; if there's no else clause, we'd insert a (None, None) at the end.

That's a neat idea. I think it will work. Let me try that.

sransara and others added 2 commits November 28, 2024 06:07
Copy link

codspeed-hq bot commented Nov 28, 2024

CodSpeed Performance Report

Merging #14621 will not alter performance

Comparing sransara:gh-14593-inverse-narrowing-without-explicit-else-branches (864acb6) with main (6f1cf5b)

Summary

✅ 32 untouched benchmarks

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a somewhat significant performance regression. Could we avoid taking a post_clauses snapshot when e.g. the constraints are empty?

@sransara
Copy link
Contributor Author

There seems to be a somewhat significant performance regression. Could we avoid taking a post_clauses snapshot when e.g. the constraints are empty?

I'm taking a look. Actually my previous commit with collecting to vector 29014a8 didn't complain about perf regression. I wonder being fancy with iter chains is actually slowing it down.

@AlexWaygood
Copy link
Member

Actually my previous commit with collecting to vector 29014a8 didn't complain about perf regression. I wonder being fancy with iter chains is actually slowing it down.

yeah, this is pretty surprising. I would expect collecting into a vector to be expensive, and using an iter chain to be relatively cheap. Not sure what's going on here.

@AlexWaygood
Copy link
Member

I think what might be going on is that there was always a performance regression, but it was right on the borderline of the arbitrary 4% line that codspeed uses to determine whether or not to comment on the PR. So purely by chance, previous measurements on the PR "got lucky" due to random noise in the benchmarks, and codspeed never commented on the PR. But the penultimate measurement that codspeed made went just over the line, and cause it to comment and fail the check.

The check is now passing again, but you can see that there is still a perf regression in the benchmaks that's reasonably significant, it's just once again not quite big enough for codspeed to complain about it: https://codspeed.io/astral-sh/ruff/branches/sransara:gh-14593-inverse-narrowing-without-explicit-else-branches

@sransara
Copy link
Contributor Author

That makes sense. Looking into @MichaReiser's suggestion.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!!

@carljm
Copy link
Contributor

carljm commented Nov 28, 2024

I don't think Micha's suggestion to bring down the regression is possible. At this stage (semantic indexing) there is always at least one constraint; it's simply the test expression(s) of the if and each elif, stored for later analysis. We can't know until type inference whether those test expressions apply any actual narrowing constraints.

I think this falls into the general bucket we are already aware of that snapshots are expensive and we may need to improve perf of constructing the use def map. Short of tackling that larger project, I think we just have to accept this regression as the cost of correct semantics here.

So I will go ahead and merge this as is. Thank you for the PR!

@carljm carljm merged commit 3f6c65e into astral-sh:main Nov 28, 2024
21 checks passed
@sransara
Copy link
Contributor Author

Thank you for the review and merge!

Looking at the restore and snapshot implementation, I can see how the symbol_states can keep growing to a point that IndexVec.clone become an expensive operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[red-knot] Inverse narrowing without explicit else branches
6 participants