introduce a "try" ZIR and AIR instruction #11772
Comments
This change looks good to me, and I'll be keen to implement it in the self-hosted x86_64 backend to see how it fares.
In general, LLVM is bad (and slow) at optimizing a lot of "guards" that exit the function. This is a very severe issue in speculatively optimizing JIT compilers, which have a ton of branches that exit the function. Zig has nowhere near as many "guards" as JavaScriptCore, but this appears to be a less severe case of the same issue: there are lots of branches that immediately exit the function if a check fails. Here is a fascinating post on why JavaScriptCore switched away from LLVM to its own B3 backend.
I don't think you should switch away from LLVM entirely, but adding a dedicated `try` instruction should improve your lowering to LLVM bitcode (and even unlock new optimizations for the self-hosted backend).
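For context on why these guard branches are so pervasive: in Zig, `try` is defined as sugar for `catch |err| return err`, so every `try` site is a branch whose failure arm exits the function. A minimal sketch:

```zig
const std = @import("std");

fn foo() !u32 {
    return 42;
}

// `try foo()` ...
fn viaTry() !u32 {
    return try foo();
}

// ... is equivalent to this explicit guard:
fn desugared() !u32 {
    return foo() catch |err| return err;
}

pub fn main() !void {
    std.debug.print("{} {}\n", .{ try viaTry(), try desugared() });
}
```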
Some quick stats on the ZIR part of this:

AstGen.zig Before:

AstGen.zig After:

That's a 5% reduction in total ZIR bytes from introducing the new `try` instruction.
This introduces two ZIR instructions:

* `try`
* `try_inline`

This is part of an effort to implement #11772.
Implements semantic analysis for the new `try`/`try_inline` ZIR instructions. Adds the new `try`/`try_ptr` AIR instructions and implements them for the LLVM backend. Fixes not calling `rvalue()` for `tryExpr` in AstGen. This is part of an effort to implement #11772.
Landed in d1bfc83.
Motivation
While investigating #11498 I noticed that one of the problems - perhaps the main problem - is that our stage1 LLVM lowering manages to get optimized into something like this:
Whereas our stage2 LLVM lowering remains in roughly the form it was emitted, which looks more like this:
A direct fix to this would be to implement #283 - something that I plan to investigate as well. However, this led me to notice a related difference in the input LLVM IR that gets generated in stage1 vs stage2. Here is an example:
Specifically, for the `try foo();` line, stage1 lowers to this LLVM IR:

while stage2 lowers to this:
Perhaps this difference is related to LLVM's relative ability to optimize. At the least, the former LLVM IR is simpler since it has fewer basic blocks, creating fewer potential issues for LLVM in terms of both optimization and compilation speed.
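To make the structural difference concrete, here is a hypothetical reduction of the simpler shape (illustrative LLVM IR only, not actual compiler output; the `{ i16, i32 }` error-union layout and the label names are assumptions). The error code is tested once, and the error arm returns directly, so no merge block is needed; the more complex shape instead threads the result through an extra block before a separate return:

```llvm
declare { i16, i32 } @foo()

; simpler shape: test, branch, error arm exits the function
define i16 @caller() {
Entry:
  %eu = call { i16, i32 } @foo()
  %err = extractvalue { i16, i32 } %eu, 0
  %is_err = icmp ne i16 %err, 0
  br i1 %is_err, label %ErrRet, label %Ok

ErrRet:                      ; error path: leaves the function immediately
  ret i16 %err

Ok:                          ; payload is live here; execution continues
  %payload = extractvalue { i16, i32 } %eu, 1
  ret i16 0
}
```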
So I wanted to start exploring how to emit the better code like stage1 does. The first thing I did was look at the corresponding AIR:
This is the AIR for the `try foo();` line. After pondering a potential optimization during the LLVM backend in order to lower this as desired, I made the following observations:

* Such an optimization would either require special handling when lowering `block`, doing redundant work as the regular lowering, or it would require noticing too late that the optimization could have happened and then rewriting the LLVM IR using LLVM's API.
* `try` is extremely common in Zig source code.

Proposed Changes
In short summary: lower `try` more efficiently, taking advantage of a new special-purpose ZIR instruction and AIR instruction.

Taking the same Zig source code example from above, here is the proposed difference in ZIR:
ZIR Before
ZIR After
ZIR Explanation
This new `try` instruction would implicitly check whether the operand is an error, and its result value would be the unwrapped payload. The body that it provides is what executes in the "is an error" case, which you can see here is running the defer expressions. In the future, if #283 is implemented, this would change to simply take another operand, which is the block to break from in case of an error. In both cases the `try` body would have the guarantee that there is no control flow possible from inside the try body to directly after the try instruction.

Sema
Sema would have the option to lower the ZIR `try` instruction the same as before, using already-existing AIR instructions; however, introducing a `try` AIR instruction as well would result in additional savings and better codegen without optimizations.

AIR Before
This is the same snippet as above in the Motivation section; however, I reproduced it below for comparison:
AIR After
Conclusion
This AIR is trivial to lower to only one branch / one basic block, rather than the two required by the `block`/`cond_br` combo.

This would require adding support for this new kind of control flow in all the codegen backends; however, I think it will be worth it because it offers both improved compilation speed and codegen, which is the kind of tradeoff we are intentionally making for this compiler.
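As a sketch of the invariant that makes this cheap to lower (hand-written Zig approximating the control flow, with hypothetical names; not actual backend output): the error body always leaves the function, so the backend needs exactly one conditional branch and no merge block afterward:

```zig
fn foo() !u32 {
    return 7;
}

fn loweredShape() !u32 {
    const eu = foo(); // the `try` operand
    if (eu) |payload| {
        // the result of `try` is the unwrapped payload;
        // straight-line code simply continues here
        return payload;
    } else |err| {
        // the `try` body: defers would run here, and control can
        // never rejoin the code after the `try` instruction
        return err;
    }
}
```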