-
Notifications
You must be signed in to change notification settings - Fork 396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check call has one predecessor before considering eliminating guard #6619
Check call has one predecessor before considering eliminating guard #6619
Conversation
Devin @jdmpapin, may I ask you to review this change? I won't ask Vijay to review, as he advised me on how I should try to fix the problem. Vijay @vijaysun-omr, FYI. |
Jenkins build all |
Jenkins build plinux,zos |
Jenkins build plinux |
Putting into my own words what went wrong in eclipse-openj9/openj9#15311 to make sure that I've properly understood... There was a cold virtual call block with multiple predecessors (probably copies of some original), each with its own nonoverridden guard node. When GVP processes such a guard node, it looks at the call to determine whether the guard can be eliminated, and in particular it finds a constraint for the receiver. But GVP generally expects a node not to be constrained until processing has reached its point of evaluation, and here we are constraining the node before processing has even reached its containing block. To constrain a load - edit: rather, to 'get the constraint' for a load - we use Because the nonoverridden guard logic constrains (edit: gets the constraint for) the load out of turn, Furthermore, whenever all the defs reaching a use have global constraints, the merged constraint also applies globally to the value number of the use. This is because a use with a single def shares the value number of the def, and a use with multiple defs has a fresh value number, i.e. anything with that value number comes from the same use. Now to put it all together. While processing the nonoverridden guard in one predecessor, I'm not fully clear on the reason why in the second predecessor VP did not consider the def from the first predecessor to be have been seen, but I assume it's because def constraints are treated like local constraints and propagated along edges. @vijaysun-omr, @hzongaro, please confirm whether this is the right understanding of the problem. Then, supposing (optimistically) that I do understand properly, I expect the fix to work (at least when I was worried about the case where the target block itself defines the auto before the call (e.g. due to priv arg remat), but I think it will be fine. In that case, the load will have only that def, and since it won't have been seen, there will be no seen defs, and However, as it stands I think this will unnecessarily prevent the guard from being remembered in I also have a few notes about the comment introduced just above the code change. First, the remat (or similar) case means that we can process all of the predecessors without seeing all definitions that reach from within the same loop iteration. Second, I don't believe it would be correct to (only) check that all predecessors have been processed. When Since all the above is so long and can't easily be "resolved" on a point-by-point basis, I'll list my concerns compactly here:
|
Thanks for reviewing, Devin @jdmpapin! I'm just poring over your detailed comments and questions. |
On second thought, I think it might be possible to get a non-null constraint in this case, when it's the second pass through a block in a loop and there is a store in the target block but not elsewhere, e.g. because of priv arg remat followed by dead store elimination when the inlined body does not use the argument. In that case, there might be a constraint on the back-edge, and if so, I think we would end up with that constraint. I think that would still be okay though, because any such constraint from the back-edge would have originally come from the def constraint created at the same def by VP's first pass over the loop. Constraints created in the first pass are known to hold on every iteration, and when the corresponding constraints differ in the second pass, they should only be strengthened. |
Yes - having discussed this off-line with Vijay @vijaysun-omr, while processing the second predecessor, the definition from the first predecessor is considered to have not been seen because that definition was on a path that was not part of the path that leads to the second predecessor.
Yes, my understanding of the problem is the same as yours. Thank you for describing it so thoroughly and clearly!
Just so I'm clear - are you suggesting I should revise this pull request so that, if the number of predecessors for the target block is greater than one, an entry will be added to I'll revise the comment to account for your insights and observations. Thank you, again! |
👍
👍
Yes, I think that any time we leave the nonoverridden guard in place, it should be added to |
Thanks, that's an improvement! However, a couple of my earlier points are still outstanding. In particular:
The reason to require (2) that
This is the case where a def occurs between G and the call. But if the call is in a block other than Here, the call node can still be found as long as In a sense, we'd have an incorrect constraint precisely because we considered a def that only reaches the call from elsewhere. |
In case some other similar issue exists, I was wondering if it would make sense to have the code that removes the dead-code and inserts a 'return' to also generate code that would crash (i.e. store to 0). Allowing the dummy return to execute can have far more sinister results (like unexpected behaviour or incorrect calculations). I know this is not directly related to this fix, but maybe this is an opportunity to introduce a change like this? |
Jenkins build all |
I'll defer to @hzongaro on this (since he looked at my case as well), but I believe the issue I was debugging which Henry confirmed was a duplicate resulted in the insertion of a dummy return. I agree that the link to mustTakeException() is not a direct link. I also don't want to slow down the delivery of this fix. |
Oh, I wasn't aware that an interaction had been seen in another duplicate |
The ppc64le Linux failure looks like #6571 Jenkins build plinux |
Thanks, Kevin @klangman - that's a very good idea. However, I'm inclined to handle that as a separate improvement as most of the distinct symptoms that are ultimately rooted in this bug do not involve |
Jenkins build plinux |
Everything has passed except for the known pp64le Linux failure |
@hzongaro, you can squash now |
In applying constraints for a non-overridden call guard, constrainIfcmpeqne looks at the guarded call node to see whether the guard can be eliminated. That's done using canFoldNonOverriddenGuard which calls mergeDefConstrants. That can have the side effect of creating global constraints. However, if there are paths to the call block that do not go through the block that contains the if the information used to create those global constraints might be incomplete. This change adds an additional check of whether the block containing the guarded call has only one predecessor and whether the block containing the call is actually the target of the branch before checking whether the guard can be eliminated. That avoids the possibility of other merge points between the block containing the if and the block containing the call, and hence that definitions from paths that do not include the if might reach the call.
98a034a
to
f247823
Compare
Squashing introduced no new changes, and the PR builds ran beforehand |
When applying constraints for a non-overridden call guard,
constrainIfcmpeqne
looks at the guarded call node to see whether the guard can be eliminated. However, if the call block has more than one predecessor, not all definitions that reach the call block might have been visited on the current walk. That might result in incorrect constraints being applied to the references on the call because of the incomplete information.This change adds an additional check of whether the block containing the guarded call has only one predecessor before checking whether the guard can be eliminated. That is a simple way of ensuring all the predecessors of the guarded call must already have been walked.
This change fixes eclipse-openj9/openj9#14489 and eclipse-openj9/openj9#15311