Implement machine dependent peephole optimizer at lower time #8035

Open
russellhadley opened this issue May 9, 2017 · 5 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue
Milestone

Comments

@russellhadley
Contributor

russellhadley commented May 9, 2017

The new pass is intended to exploit machine-dependent instructions and allow low-level optimization based on specific instruction semantics.

Goals:

  • Run before register allocation to remove issues with false dependencies and allow reduction in register pressure.
  • Exploit specific features of the target ISA, including but not limited to:
    • Immediate encoding size/shift semantics
    • Particular condition flag implementation
    • Target dependent address mode formation
  • Allow more sophisticated instruction selection:
    • bt{s|r|c} formation
    • Aggressive optimization around condition flag/branch sequences.
    • Avoid machine glass jaws such as LCP (length-changing prefix) stalls.
  • Reorder/rework instructions to reduce register pressure.
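
As a rough illustration of the bt{s|r|c} formation goal, here is a minimal sketch of an adjacent-window peephole over a toy instruction list. Everything in it is hypothetical and not taken from the JIT: the `(opcode, dst, src)` tuple encoding, the `is_temp` naming rule, and `form_bts` itself. It assumes 32-bit register operands, that the temp is dead after the `or`, and that the flags produced by the `or` are not consumed, since `bts` sets flags differently.

```python
from typing import List, Tuple

# Hypothetical toy IR: (opcode, dst, src). Not the JIT's real representation.
Instr = Tuple[str, str, str]


def is_temp(name: str) -> bool:
    # Assumption: names starting with "t" are single-def, single-use temps,
    # so deleting their definition is safe once the use is rewritten.
    return name.startswith("t")


def form_bts(code: List[Instr]) -> List[Instr]:
    """Rewrite `mov t, 1; shl t, cnt; or dst, t` into `bts dst, cnt`."""
    out: List[Instr] = []
    i = 0
    while i < len(code):
        window = code[i:i + 3]
        if (len(window) == 3
                and window[0][0] == "mov" and window[0][2] == "1"
                and window[1][0] == "shl" and window[1][1] == window[0][1]
                and window[1][2] != window[0][1]
                and window[2][0] == "or" and window[2][2] == window[0][1]
                and window[2][1] != window[0][1]
                and is_temp(window[0][1])):
            # Three instructions collapse into one; the temp disappears.
            out.append(("bts", window[2][1], window[1][2]))
            i += 3
        else:
            out.append(code[i])
            i += 1
    return out


before = [("mov", "t1", "1"), ("shl", "t1", "ecx"), ("or", "eax", "t1"), ("ret", "", "")]
after = form_bts(before)
```

Running before register allocation is what makes the temp easy to reason about here: it has no physical register yet, so removing it directly lowers register pressure.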

Particular open questions:

  • Which dataflow formulation should be used? Adjacent in window? Expression temp def/use? Unaliased SSA or some cheaper extended basic block approximation?
  • Run just before RA, or, at a higher tier, run after it?
  • How to encapsulate a transform.
  • How to enforce a high level of debug dump/tracing functionality for managing the transforms.
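
One possible shape for the last two questions, offered only as a sketch and not as a proposal for the actual JIT: each transform is a named object with a window size and a rewrite function, and the driver records a trace line every time a transform fires. The names `Transform`, `run_peephole`, and `drop_self_copy`, and the tuple IR with virtual registers `v1`, `v2`, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical toy IR over virtual registers: (opcode, dst, src).
Instr = Tuple[str, str, str]


@dataclass
class Transform:
    name: str      # shown in the trace so every rewrite is attributable
    window: int    # number of adjacent instructions the transform inspects
    rewrite: Callable[[List[Instr]], Optional[List[Instr]]]  # None = no match


def run_peephole(code: List[Instr], transforms: List[Transform],
                 trace: List[str]) -> List[Instr]:
    out: List[Instr] = []
    i = 0
    while i < len(code):
        for t in transforms:
            win = code[i:i + t.window]
            new = t.rewrite(win) if len(win) == t.window else None
            if new is not None:
                # The driver owns tracing, so no transform can skip it.
                trace.append(f"{t.name} @ {i}: {win} -> {new}")
                out.extend(new)
                i += t.window
                break
        else:
            out.append(code[i])
            i += 1
    return out


def drop_self_copy(win: List[Instr]) -> Optional[List[Instr]]:
    # A copy of a virtual register onto itself is a no-op in this toy IR.
    op, dst, src = win[0]
    return [] if op == "mov" and dst == src else None


trace: List[str] = []
result = run_peephole(
    [("mov", "v1", "v1"), ("add", "v1", "v2")],
    [Transform("drop-self-copy", 1, drop_self_copy)],
    trace,
)
```

Putting the trace in the driver rather than in each transform is one way to enforce the dump requirement by construction.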

This is a big and perhaps controversial feature, but is intended to allow for more proactive engagement on code selection opts for collaborators with interests in particular targets.

category:design
theme:optimization
skill-level:expert
cost:extra-large
impact:large

@russellhadley
Contributor Author

Add random repro case for bit shifting that we're currently missing on x86. https://gist.github.com/russellhadley/e55fc9a918626166d98077aa3c8047c6

@mikedn
Contributor

mikedn commented May 10, 2017

Add random repro case for bit shifting that we're currently missing on x86.

Presumably you're referring to the and x, 31 added by the C# compiler. That can easily be removed in lowering or morph. I once tried it in morph; it saves ~500 bytes in jitdiff fx.
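
To spell out why the mask is redundant: a 32-bit x86 shift already uses only the low 5 bits of the count, so masking the count with 31 first changes nothing. The sketch below models that behavior and shows a toy tree rewrite that drops the mask. The tuple tree encoding and the names `x86_shl32` and `strip_shift_mask` are hypothetical, and the rewrite only handles the 32-bit case with a mask of exactly 31.

```python
MASK32 = 0xFFFFFFFF


def x86_shl32(value: int, count: int) -> int:
    # Model of a 32-bit SHL: the hardware uses only the low 5 bits of count.
    return (value << (count & 31)) & MASK32


def strip_shift_mask(node):
    """Rewrite (shift, x, ("and", y, 31)) into (shift, x, y) in a toy tree.

    Hypothetical encoding: tuples are (op, lhs, rhs); strings and ints are leaves.
    """
    if isinstance(node, tuple):
        # Rewrite children first; the opcode string passes through unchanged.
        node = tuple(strip_shift_mask(child) for child in node)
        if (node[0] in ("shl", "shr", "sar")
                and isinstance(node[2], tuple)
                and node[2][0] == "and"
                and node[2][2] == 31):
            return (node[0], node[1], node[2][1])
    return node


# The explicit mask and the hardware mask agree for every count value.
same = all(x86_shl32(0x12345678, c & 31) == x86_shl32(0x12345678, c)
           for c in range(256))

tree = ("shl", "x", ("and", "y", 31))
stripped = strip_shift_mask(tree)
```

A mask narrower than 31, such as `and y, 15`, changes the result and has to stay, which is why the sketch checks the constant exactly.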

@russellhadley
Contributor Author

@mikedn If you have the change we'd take it. :) The example was in an old issue in our internal database and I was just cleaning house.

@mikedn
Contributor

mikedn commented May 10, 2017

@russellhadley The morph version is in dotnet/coreclr#8744. I did it in morph thinking that it may be a good idea to get rid of unnecessary instructions early but then gave up on it because morph is already too hairy. I suppose I'll make a lower version one of these days. After all the example you have is written by me, I just found it on my HDD :)

@mikedn
Contributor

mikedn commented May 15, 2017

PR for shift count masking removal in lowering: dotnet/coreclr#11594

msftgits transferred this issue from dotnet/coreclr on Jan 31, 2020
msftgits added this to the Future milestone on Jan 31, 2020
BruceForstall added the JitUntriaged (CLR JIT issues needing additional triage) label on Oct 28, 2020
BruceForstall removed the JitUntriaged (CLR JIT issues needing additional triage) label on Nov 24, 2020