-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARM64 - Allow double constants to CSE more often #65176
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsWill try to address: #64794
|
Regarding the arm64 build failures, you can repro it on your machine if you do
|
The builds are still failing with AV. Not sure if |
It is valid, but crossgen2 is failing as it's hitting an assertion. I think we need to make more adjustments to this. I'm not familiar enough with CSE yet so I would need to spend some time with this. |
I verified and it repros on local machine.
|
Draft Pull Request was automatically closed for inactivity. Please let us know if you'd like to reopen it. |
@kunalspathak This is basically done, but we need to merge #71174 to get an accurate assessment of the regressions. |
@dotnet/jit-contrib This is ready. |
@kunalspathak been looking at the
So basically, the pattern I think this is probably fine as this doesn't seem to be common. I did try an experiment by marking the |
What's the current "max CSE" count? Given all the improvements we've had to various places causing more CSEs to be possible, is it potentially worth revisiting this count and upping it (for .NET 8)? |
It's 64 at the moment. @kunalspathak I think the change is the best we can do at the moment. I can't quite figure out why we are not hoisting it out a second time from Increasing the |
Looking at some diffs of We use and then, we store it on stack again (not needed): It might be not because of this PR, but with that, some odd decisions are happening in LSRA that we should understand first, before merging it. Can you get me a before/after JITDump so I can understand why LSRA is behaving this way? |
I went through other diffs and they looks good except the weird LSRA issue and some unwanted diffs in test. Please double check why they are coming. |
Diffs in The diffs in |
@kunalspathak Been deeply looking at With this PR, it creates a CSE for a double constant: But because constant propagation does not happen for CSEs, constant folding does not occur and an assertion prop for I did some tinkering with assertion prop where I could get it to recognize that the cse variable represents a constant but I don't know if we want to do that. I didn't get a lot of diffs, though it would fix this regression when combined with this PR. But, this is part of |
Sounds good to me.
Do you mean CSE limits? I would be still interested in knowing the reasoning behind #65176 (comment) because they do show-up in some benchmark diffs (in hot loops), especially the ones where we spill the constant on stack. |
Pushing this to .NET 8 as we need to investigate more in the regressions. |
@TIHan please put this to draft until you are ready to work on this. |
Will try to address: #35257
Description
At the moment on ARM64, we say that double constants are cheap enough to not CSE if it fits within the range of valid immediate values for
fmov
. I believe the reason we do this is because we think the constants are more likely to be contained in lowering. However, there are no ARM64 floating-point instructions that support immediate values apart fromfmov
, therefore it's not beneficial to disallow CSE more often for double constants.This PR almost matches the behavior of x64 for double constants simply by increasing the cost by 1 when the value is in the immediate value range for
fmov
.There are regressions, but when combined with this change: #71174, there are less regressions because it will not allow CSE'ing of double constants when part of a
GT_SIMD
Init
/InitN
intrinsic.Acceptance Criteria
GT_SIMD
vector constructors #71174