-
Notifications
You must be signed in to change notification settings - Fork 753
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return to more structured type rules for block and if #1148
Conversation
…en even if it has an unreachable child, keep it with that concrete type. this means we no longe allow the silly case of a block with an unreachable in the middle and a concrete as the final element while the block is unreachable - after this change, the block would have the type of the final element
… a result, even if the if condition is unreachable, to parallel block
100e4dd
to
b87c7b4
Compare
It's not clear to me when you say "valid" whether you're referring to valid wasm or valid binaryen ir -- but neither of those examples are valid wasm. The stack is only valid at the end of a block if the contents of the stack match the block's signature. Since the signature in both cases is required to be empty, there can't be an |
725ae8b
to
2f33046
Compare
Yeah, I should have been clearer. I wasn't sure if those are valid wasm or not - looks like not, based on what you say. But we started to accept them in binaryen as a result of the wasm stack changes. I don't remember the full reasoning back then - maybe we just relaxed validation rules because it seemed simplest - but I think we can do things better with the changes in this PR. |
Looking at the spec test issues here more carefully, some just needed to be fixed. They were already fixed upstream, so what might have happened is when we switched to the stack stuff the tests were not yet fixed at that time. |
Hmm, maybe those tests were from before block signatures were added. Btw, |
Yeah, binaryen still supports |
52912c0
to
d1874d5
Compare
… taken or not. whether they are dead code should not affect how they influence other types in our IR disable wasm2wasm tests FIXME
d1874d5
to
2b73e69
Compare
Added a commit to also clean up the "untaken br" issue mentioned above. That means that we don't ignore unreachable brs in type checking. Consider
Then before this change the type of the block would be unreachable (since the br_if is not taken). After this change the type would be none, since has a br referring to it. This gets rid of a bunch of unnecessary complexity (ignoring untaken branches, handling cases where a branch becomes taken or untaken), and complements the other commits in returning us to a more structured type system. |
Thinking some more on this, I think the core issue is the type system changes from #903 etc., when we moved to make a block unreachable if any of its children is unreachable. When we made that change we still allowed the annoying case mentioned at the top of this issue,
The block has type unreachable despite having a final element which has a concrete type, which is wacky. We probably should have made that not validate back then, and that's basically all this PR does - see the The one slightly odd result from this is that the block in the example should have type i32, while if the last element were I think this makes sense to do despite the asymmetry, because in wasm there is already an asymmetry there, And actually the change in this PR makes things better there too, while we could strip the type in that trivial case, we couldn't for
In other words, requiring the last element to be valid for the type, as this PR enforces, means we can't change types in silly ways like removing the i32 from that block. Furthermore, this is more efficient: when updating the type of a block, if the last element is concrete, we don't need to scan the other children to see if one is unreachable. Finally, this change makes the code simpler and it seems more resilient to fuzzing. @dschuff, @jgravelle-google, you suggested the type system change in #903 etc. - thoughts on this change? If you don't have time to do a full review of the code, it would still be helpful to know if you think this direction makes sense or not for the type system. |
I don't think I have strong opinions on principle. Making the type of a block match the last element (and especially having the asymmetry with none vs concrete types) does seem more 'wacky' to me on the face of it than making the children of a block symmetric. But I think this decision should be driven by its consequences, especially 1) ease and simplicity/minimality of the conversion to/from wasm, and 2) ease of implementing and reasoning about optimization. (and if there is weirdness, it's a good property that the weirdness directly corresponds to weirdness in wasm itself). IMO You've made a strong enough case here about # 2, what is the impact on conversion to and from wasm? |
I think this makes conversion to wasm simpler and better in that we have fewer blocks with type (However, we can't get rid of that unreachable-handling code because we do still have unreachable blocks without a concrete final element, like I don't think conversion from wasm is affected, since there are no unreachable blocks like that coming from there. I.e., wasm wouldn't tell us to create stuff like |
…t to anyhow, and it's more complicated and error prone)
f23a92d
to
1ae7726
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the "taken" terminology thing predates this change to no need to block it on that.
@@ -103,11 +103,11 @@ inline std::set<Name> getBranchTargets(Expression* ast) { | |||
// Finds if there are branches targeting a name. Note that since names are | |||
// unique in our IR, we just need to look for the name, and do not need | |||
// to analyze scoping. | |||
// By default we ignore untaken branches. You can set named to | |||
// note those as well, then any named branch is noted, even if untaken | |||
// By default we consider untaken branches (so any named use). You can unset named to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This terminology "untaken" branches is confusing to me, since whether a branch is taken is a dynamic property that changes from run to run. Do you mean unreachable branches?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it should be "unreachable branches" I guess. I'll do that in a followup.
This PR undos some damage from the wasm stack change. When we adapted to that change, we started to allow things like this:
The block has a return value flowing out, but is not marked as having a value, and it is valid because that value is in unreachable code. The same thing can happen in an if-else with an unreachable condition and arm:
This is bad for optimization as it means the optimizer needs to know if we are in unreachable code. For example, if the optimizer removes the
unreachable
in that block, suddenly it won't be valid any more. The fuzzer found a bunch of bugs on this, and it's better to fix it properly in the type system than to hack each pass so it is aware of this aspect of unreachable code.Concretely, this PR takes us back to the more structured logic from before that wasm change, and in these two examples both the block and if would be marked
(result i32)
. In other words,(As a followup, perhaps we should also get rid of the branch change that also happened at the same time, where unreachable branches are ignored for type checking. Not sure about that one.)