Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IR] Function definition/function call support in Taichi IR #602

Open
archibate opened this issue Mar 15, 2020 · 21 comments
Open

[IR] Function definition/function call support in Taichi IR #602

archibate opened this issue Mar 15, 2020 · 21 comments
Assignees
Labels
feature request Suggest an idea on this project

Comments

@archibate
Copy link
Collaborator

archibate commented Mar 15, 2020

Concisely describe the proposed feature
I would like to add real function support so that no more IR spam space wasting and finally support recursion.

Describe the solution you'd like (if any)

  1. Add FuncBodyStmt & its parser, whose is_container_stmt = true, contains a scope/block.
  2. In OpenGL backend: emit GLSL function when visit(FuncBodyStmt).
  3. Add FuncCallStmt, then implement call in OpenGL backend.
  4. In LLVM backend: use llvm::Function for support, this can be harder than GL one.
  5. Add FuncReturnStmt for them and support return.
  6. Add ti.inline/ti.noinline decorator, also may do detection if a function is better inlined.

Additional comments
I'm giving up #543, it begin from stage 4.
This issue is also related to #536.

@archibate archibate added the feature request Suggest an idea on this project label Mar 15, 2020
@archibate
Copy link
Collaborator Author

archibate commented Mar 15, 2020

@yuanming-hu Should we have distinct FrontendFuncBodyStmt and FuncBodyStmt?
I found FrontendWhileStmt differ from WhileStmt because: Expr cond vs Stmt mask.
Or maybe only FrontendFuncCallStmt is needed? Since Expr args[].

@yuanming-hu
Copy link
Member

@yuanming-hu Should we have distinct FrontendFuncBodyStmt and FuncBodyStmt?
I found FrontendWhileStmt differ from WhileStmt because: Expr cond vs Stmt mask.
Or maybe only FrontendFuncCallStmt is needed? Since Expr args[].

I think so. We will need FrontendFuncDefStmt and FuncCallExpression for the frontend, and FunctionDefStmt and FuncCallStmt for the middle-end.

Before start implementation, we should thoroughly discuss implementation plans. Introducing functions to Taichi IR is a lot of work and needs careful considerations.

@archibate
Copy link
Collaborator Author

Great! I'm adding FuncBodyStmt, with std::unique_ptr<Block> body, like WhileStmt does, was this ok?

@archibate
Copy link
Collaborator Author

archibate commented Mar 15, 2020

Found GL build error.Fixed after merging with master.

@yuanming-hu yuanming-hu changed the title real function support rather than force inline Function definition/function call support in Taichi IR Mar 18, 2020
@yuanming-hu
Copy link
Member

yuanming-hu commented Mar 18, 2020

Describe the solution you'd like (if any)

  1. Add FuncBodyStmt & its parser, whose is_container_stmt = true, contains a scope/block.
  2. In OpenGL backend: emit GLSL function when visit(FuncBodyStmt).
  3. Add FuncCallStmt, then implement call in OpenGL backend.
  4. In LLVM backend: use llvm::Function for support, this can be harder than GL one.
  5. Add FuncReturnStmt for them and support return.
  6. Add ti.inline/ti.noinline decorator, also may do detection if a function is better inlined.

Additional comments
I'm giving up #543, it begin from stage 4.
This issue is also related to #536.

This basically makes sense to me!

Before we introduce the new IR, as you mentioned (#612 (comment)), we'd better consider where to store the definitions of the functions.

Currently, the Taichi IR only has a kernel body (which is a Block). We'd better change this to separate function definitions from the main program.

Pre-mature thoughts

I suggest we create a Module class, with two members

  • functions (std::vector<std::unique_ptr<Stmt>>) to hold all the function definitions
  • main (std::unique_ptr<Block>) to hold the original kernel body (without function definitions)

New IR to be introduced:

  • FrontendFuncDefStmt
    • members: return value, arguments, ...
  • FrontendFuncRetStmt
  • FuncCallExpression
    • func (Stmt *): a raw pointer to FrontendFuncDefStmt
    • arguments (std::vector<Expr>): arguments

(after AST lowering, )

  • FuncDefStmt
  • FuncRetStmt
  • FuncCallStmt
    • func (Stmt *): a raw pointer to FunctionDefStmt
    • arguments (std::vector<Stmt *>): input arguments

Future steps:

  • Functions that return a tuple? This will be useful for functions that return a ti.Vector/Matrix
  • A function inlining pass

More considerations:

  • Frontend function template instantiation
  • Think about vectorization
  • Think about autodiff

@archibate
Copy link
Collaborator Author

archibate commented Mar 18, 2020

I suggest we create a Module class, with two members
* functions (std::vector<std::unique_ptr<Stmt>>) to hold all the function definitions
* main (std::unique_ptr<Block>) to hold the original kernel body (without function definitions)

You may also want main to bestd::vector<std::unique_ptr<Stmt>> for multiple kernel? Namely kernels?

New IR to be introduced:
* FrontendFuncDefStmt
* FuncCallExpression
* func (Stmt *): a raw pointer to FrontendFuncDefStmt
* arguments (std::vector<Expr>): arguments

Currently I'm only doing FuncCallStmt, so no return value worrying me.

@yuanming-hu
Copy link
Member

That makes sense - let me think about it.

(Taking a bath before it's too late. Will be back to this in 30 min)

@archibate
Copy link
Collaborator Author

archibate commented Mar 18, 2020

You also want CompiledFunction implemented some way:

plan A:
std::map<FuncId, std::unique_ptr<Stmt>> for functions, where FuncId contains name and argtypes, with operator==() and __hash__() equipped.

plan B:
std::map<std::string, std::unique_ptr<FuncDefStmt>> for functions, where std::string contains name, and std::map<FuncArgInfo, std::unique_ptr<CompiledFunction> in FuncDefStmt.

plan C:
std::map<FuncId, std::unique_ptr<CompiledFunction>> for functions, where FuncId contains name and argtypes, with operator==() and __hash__() equipped.

@yuanming-hu
Copy link
Member

Back. What does CompiledFunction hold exactly?

@yuanming-hu
Copy link
Member

I feel like this needs a lot of changes to be made and therefore its implementation plan must be considered very carefully. Taichi IR is the core of this project and I believe introducing function support to it worth the time. I suggest we postpone implementing this until things are mature.

Maybe it's also worth considering to finish #583 and #601 first.

@archibate
Copy link
Collaborator Author

Back. What does CompiledFunction hold exactly?

Potentially source_code for GL backend..

@yuanming-hu
Copy link
Member

Back. What does CompiledFunction hold exactly?

Potentially source_code for GL backend..

I see. I guess probably it's easier to re-emit the source code during code-gen instead of saving it? I have to sleep now. Let's try to finalize the two PRs mentioned above first - Good night!

@archibate archibate changed the title Function definition/function call support in Taichi IR [IR] Function definition/function call support in Taichi IR Mar 26, 2020
@archibate archibate mentioned this issue Mar 29, 2020
10 tasks
@k-ye
Copy link
Member

k-ye commented Mar 30, 2020

I've been following some works regarding IRs/lambda calculus supporting automatic differentiation as first-class concepts (e.g. Relay IR in TVM). This makes me wonder if we should consider the IR works at a (potentially much) larger scope.

  • Such an IR might help us loose the Kernel Simplicity Rule? Or is this rule coming from the fact that the gradient computation happens across the Taichi kernels (i.e. where the Tape is necessary)?
  • Maybe such an IR will also enable us to compute derivatives of arbitrary order?

But I'm not sure if my concerns here are valid. @yuanming-hu to comment on this..

A few references that I've found so far (I haven't understood them completely, nor are they close to be exhaustive):

@yuanming-hu
Copy link
Member

I almost forgot the need for supporting autodiff here. Thanks for reminding me of this. Autodiffing functions wouldn't be too hard though. Given a function that takes (a, b, c, d) and returns (x,y, z), we just need to compute d(x,y, z)/d(a, b, c, d) using classic reverse-mode autodiff. Note that in Taichi functions (a, b, c, d, x, y, z) is simply a tuple of scalar values.

  • Such an IR might help us loose the Kernel Simplicity Rule? Or is this rule coming from the fact that the gradient computation happens across the Taichi kernels (i.e. where the Tape is necessary)?

The kernel simplicity rule happens because we currently do not support auto diff for-loops with mutable local variables that are carried through iterations. For example,

p = 1
for i in range(10):
  p += p

ret[None] = p

#581 added support for autodiff mutable variables within if, but the case with for needs implementation. The kernel simplicity rule is imposed to make sure users do not allocate mutable local vars outside the for loop, so that there're no mutable local vars that are carried in the for loops to be differentiated.

  • Maybe such an IR will also enable us to compute derivatives of arbitrary order?

This can be a completely new research direction :-)

Thanks for the references. They seem very interesting. I did read part of these when I was designing DiffTaichi. I'll take a deeper look if necessary.

@MarisaKirisame
Copy link

Relay Dev who designed&implemented the AD algorithm here. Arbitrary order of derivative is trivial as long as the output of ad can be feed back as the input of ad.
One thing taichi should consider, if it want higher order ad, is to support forward mode ad, as the forward-over-reverse (do reverse for first order, then do forward mode ad for higher order) pattern is more efficient then doing reverse multiple time.
Hessian vector product(which also require forward mode ad) is also useful as it is algorithmically more efficient then computing the Hessian.
(TBH I am just subscribing myself to this thread, lol)

@k-ye
Copy link
Member

k-ye commented Apr 2, 2020

Thanks!

Could you also briefly explain the purpose of Let? I'm under the impression that Let converts the IR to be ANF form, which then enables the transformation operator (backpropagator) to work on closures or arbitrary functions?

I guess Taichi doesn't really need to support closures (in the short term). On the other hand, closure at Taichi level is not the same thing as that at the IR level. Would this support be a requirement for the transformation operator to support function defs or higher-order gradients? That is, maybe the operator would produce closures during the IR transformation?

@yuanming-hu
Copy link
Member

Relay Dev who designed&implemented the AD algorithm here. Arbitrary order of derivative is trivial as long as the output of ad can be feed back as the input of ad.
(TBH I am just subscribing myself to this thread, lol)

Welcome! :-) I agree it is trivial as long as our make_adjoint pass (reverse-mode AD) can take its own output as input. However, this is not the case right now in Taichi IR. Compared to a more functional IR system designed for deep learning (Relay as a representative example, correct me if I misunderstand anything) Taichi IR has more branching and even mutable local variables. We also plan to support auto-diffing while loops. To differentiate general SSA programs with control flows, we have to keep the computational history of the mutable local variables using stacks (following https://arxiv.org/pdf/1810.07951.pdf).

This means stack instructions will be inserted into the IR after autodiff. However, currently make_adjoint does not process the stack instructions (which belongs to the output of the previous autodiff) correctly.

One thing taichi should consider, if it want higher order ad, is to support forward mode ad, as the forward-over-reverse (do reverse for first order, then do forward mode ad for higher order) pattern is more efficient then doing reverse multiple time.

This is interesting - are you talking about evaluating the contraction of the n-th order gradient of L (which is a n-order tensor) with an n-1-th order tensor? If so I agree that we can use n-1 forward-mode AD plus 1 reverse-mode AD. I also agree that forward-mode AD is easier & cheaper than reverse-mode AD here.

For n=2 (Hessian vector prod as you mentioned, which I believe is the most common case), I think we can do either forward + reverse or reverse + forward.

Hessian vector product(which also require forward mode ad) is also useful as it is algorithmically more efficient then computing the Hessian.

Right, I think for some problems Taichi is trying to solve, we won't even have the space to store the Hessian matrix with O(#DOF^2) entries. After implementing this, users can always assemble the whole Hessian using its columns provided by our Hessian vector autodiff.

Thanks for the discussions here. I think one possibility is to implement a forward-mode AD pass to that we can do Hessian vector product, which will enable the use of second-order optimizers.

@MarisaKirisame
Copy link

MarisaKirisame commented Apr 3, 2020

The biggest difference between taichi and relay is that they are not even on the same level. Relay do not care about kernel - it treat all kernel as black box, and rely on tvm to optimize them. So, relay support diffing mutable variable and branch just like pytorch.

However, differentiating tvm kernel is hard. There is some pr working on it, but rn we just do it by hand.

If you want to support more kernel, I highly recommend reading the relevant chapter in Evaluating Derivatives. It cover way more cases (with more optimizations) then the Zygote paper.
The halide autodiff paper is good too.

for Hessian Vector Product reverse then forward make more sense, as a function take multiple input and have single output, which make it amicable for reverse mode.

For higher order it is essentially Hessian Vector Product too, but when going forward mode ad, instead of the classical dual number the right hand side is a taylor series truncated to nth order (classical dual number is just right hand side truncated to first order).

ailzhang pushed a commit that referenced this issue Nov 2, 2022
…name (#6495)

Issue: #602

### Brief Summary

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
lin-hitonami added a commit that referenced this issue Nov 3, 2022
Issue: #602

### Brief Summary
We only need to disassemble the arguments and pass them to the real
function one by one just as we pass matrix arguments to the kernel.

It does not support local matrix and element of matrix field as argument
when real_matrix=True yet.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
lin-hitonami added a commit that referenced this issue Nov 7, 2022
Issue: #602

### Brief Summary

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
lin-hitonami added a commit that referenced this issue Nov 24, 2022
Issue: #602 #6590

### Brief Summary
Only supports scalar struct (every element in the struct is a scalar)
for now.

This PR does the following things:
1. Let `FuncCallStmt` return the `real_func_ret_struct *` result buffer
instead of returning the return value directly.
2. Add `GetElementStmt` and `GetElementExpression` to get the i-th
return value in a result buffer
3. Add `StructType.from_real_func_ret` to construct the returned struct
to the `StructType` in Python

Will add support for nested struct and matrix in struct in the following
PRs.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
lin-hitonami added a commit that referenced this issue Nov 28, 2022
#6734)

Issue: #602 #6590
Also fixed the bug that scalarize pass is not run on real functions
thanks to @jim19930609.
### Brief Summary

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023
…name (taichi-dev#6495)

Issue: taichi-dev#602

### Brief Summary

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023
Issue: taichi-dev#602

### Brief Summary
We only need to disassemble the arguments and pass them to the real
function one by one just as we pass matrix arguments to the kernel.

It does not support local matrix and element of matrix field as argument
when real_matrix=True yet.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023
…6522)

Issue: taichi-dev#602

### Brief Summary

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023
Issue: taichi-dev#602 taichi-dev#6590

### Brief Summary
Only supports scalar struct (every element in the struct is a scalar)
for now.

This PR does the following things:
1. Let `FuncCallStmt` return the `real_func_ret_struct *` result buffer
instead of returning the return value directly.
2. Add `GetElementStmt` and `GetElementExpression` to get the i-th
return value in a result buffer
3. Add `StructType.from_real_func_ret` to construct the returned struct
to the `StructType` in Python

Will add support for nested struct and matrix in struct in the following
PRs.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
quadpixels pushed a commit to quadpixels/taichi that referenced this issue May 13, 2023
taichi-dev#6734)

Issue: taichi-dev#602 taichi-dev#6590
Also fixed the bug that scalarize pass is not run on real functions
thanks to @jim19930609.
### Brief Summary

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
lin-hitonami added a commit that referenced this issue Jun 6, 2023
Issue: #602 

Assuming `FuncCallStmt` can load or modify any address. I will write a
pass to find out all the addresses loaded and modified in a
`FuncCallStmt` later.

<!--
copilot:all
-->
### <samp>🤖 Generated by Copilot at cf4766c</samp>

### Summary
📞🔧🧪

<!--
1. 📞 for enhancing the control flow graph analysis and optimization to
support function calls.
2. 🔧 for improving the simplification of basic blocks by removing a
redundant check and adding a condition for function call statements.
3. 🧪 for testing some experimental features of taichi, such as template
arguments and assertions in kernels.
-->
This pull request enhances the control flow graph analysis and
optimization to support function calls in taichi kernels. It also
improves the simplification of basic blocks and tests some experimental
features of taichi. The main files affected are
`taichi/ir/control_flow_graph.cpp`, `taichi/transforms/simplify.cpp`,
and `tests/python/test_function.py`.

> _We unleash the power of `function calls`_
> _We optimize the `control flow graph`_
> _We test the limits of `taichi`'s core_
> _We simplify the `basic blocks` of wrath_

### Walkthrough
* Add support for function calls in control flow graph analysis and
optimization
([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR220-R222),
[link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL979-R983))
* Prevent simplifying function call statements that may have side
effects or modify global variables
([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-58b0ebe6a129091d8ae4753ba2bba80c7cc000e7f8eab635a337094582f543edL98-L100),
[link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-58b0ebe6a129091d8ae4753ba2bba80c7cc000e7f8eab635a337094582f543edR107-R108))
* Test the experimental template feature that allows passing arguments
to functions as template parameters
([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L158-R166),
[link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L175-R177))
* Test the support for assertions inside kernels
([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L327-L329),
[link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L335-R344),
[link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L361-R362))
* Remove a redundant check for function call statements in
`BasicBlockSimplify`
([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-58b0ebe6a129091d8ae4753ba2bba80c7cc000e7f8eab635a337094582f543edL98-L100))
* Remove a comment that is no longer relevant in `test_function.py`
([link](https://github.com/taichi-dev/taichi/pull/8139/files?diff=unified&w=0#diff-d6f5c74e17462c8ff96d5bba06ebf81d16c015ca667fa945513a80be17bef017L327-L329))
lin-hitonami added a commit that referenced this issue Jun 12, 2023
…8155)

Issue: #602
Pass `gather_func_store_dests` gathers all destinations whose content
may change after a real function is called. The change may happen in the
real function or in another real function that the real function calls.
This pass uses Tarjan's strongly connected components algorithm to find
the store destinations for all real functions a kernel calls, and store
them in `store_dests` of the respective function.

The global pointers are lowered in `lower_access`, so we need to gather
the store destinations twice: before and after pass `lower_access`.

<!--
copilot:all
-->
### <samp>🤖 Generated by Copilot at 2c5586e</samp>

### Summary
📝🛠️🚀

<!--
1. 📝 This emoji represents the addition of a new file and a new analysis
pass declaration, which are documentation-related changes.
2. 🛠️ This emoji represents the update of the `ControlFlowGraph` class
and the removal of some redundant or incorrect checks, which are
bug-fixing or improvement-related changes.
3. 🚀 This emoji represents the introduction of a new enum type, a new
method, and a new parameter, which are feature-related changes.
-->
This pull request introduces a new analysis pass
`gather_func_store_dests` that can handle function calls in the IR and
optimize their memory access and aliasing. It updates the `Function`,
`FuncCallStmt`, and `ControlFlowGraph` classes and the
`compile_function` and `compile_taichi_functions` transforms to use a
new enum type `IRStage` and a new parameter `target_stage` to track and
control the IR stage of each function. It also modifies some existing
analysis functions and adds some include directives and forward
declarations to support the new pass.

> _To optimize function calls in the IR_
> _We need a new pass to infer_
> _The store destinations_
> _At different stages_
> _And use `IRStage` instead of `IRType` for sure_

### Walkthrough
* Add a new analysis pass `gather_func_store_dests` to collect the store
destinations of each function in the IR
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-0bfbe49ff08844a76d5d2e1c5b81c2cf813be4a9089422b997bc380ec9a68eadR1-R103),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f6bc75768d2e24c782fefa45a7232d0e2b2bae091e697040e7f442a77d80ad45L216-R216))
* Modify the `FuncCallStmt` class to inherit from the `Store` trait and
implement the `get_store_destination` method, using the arguments of the
function call and the `store_dests` set of the called function
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-05e2a2d0a9c9879a4fb5fde9baf5a43738c7601fc53e234a40ab9bc27d1512a5R277-R289),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-917d9436dcaafa0f1e41ae9bad90273a303f036f00da94e417788a7fa1dc5260L1062-R1062),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-917d9436dcaafa0f1e41ae9bad90273a303f036f00da94e417788a7fa1dc5260R1074-R1080))
* Remove or modify the checks for `FuncCallStmt` in the
`ControlFlowGraph` class, and use the `store_dests` set of the called
function to update the reaching definition analysis
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL164-L167),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL219-R216),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR695),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL982-R977),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR988-R990))
* Add a new member variable `func_store_dests` to the `ControlFlowGraph`
class, which is a map from `Function` pointers to sets of `Stmt`
pointers, representing the store destinations of each function in the IR
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-67e7205404aa056a1553f930af38b359e460f98a4ec335faec7d54aaf9df727fR117-R118))
* Replace the old enum type `IRType` with the new enum type `IRStage`,
which has more values to indicate different IR stages of function
compilation
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-aa860f71a793b08676a24cab247b43f5ed8d105a6493eeb1a035369b916bddc2L17-R17),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-aa860f71a793b08676a24cab247b43f5ed8d105a6493eeb1a035369b916bddc2L32-R32),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dL9-R20),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dL31-R50),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f78d8ce92dcf8a10d2a446d35cc26f47fd2a42314b0799d263196b6eb858fe76L13-R33),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f78d8ce92dcf8a10d2a446d35cc26f47fd2a42314b0799d263196b6eb858fe76L39-R48),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL330-R390))
* Modify the signature of the `compile_function` function to use the new
parameter `target_stage` instead of the old parameter `start_from_ast`,
to indicate the desired IR stage of the function compilation
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L199-R200),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL330-R390))
* Modify the definition of the `compile_to_offloads` function to add two
calls to the new analysis pass `gather_func_store_dests`, before and
after the call to the `compile_taichi_functions` function, and to pass
different `target_stage` parameters to the `compile_taichi_functions`
function
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL47-R51))
* Add or modify the include directives and forward declarations for the
header files `function.h`, `statements.h`, and `unordered_set` in the
source files and header files that use the `Function` class, the
`FuncCallStmt` class, or the `std::unordered_set` container
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR9),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-67e7205404aa056a1553f930af38b359e460f98a4ec335faec7d54aaf9df727fR10),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-05e2a2d0a9c9879a4fb5fde9baf5a43738c7601fc53e234a40ab9bc27d1512a5R5),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934R20),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dR3))
* Modify some comments in the header file `transforms.h` to remove the
mentions of not demoting dense struct fors or reducing the number of
statements before inlining, since these are no longer relevant or
necessary after the new analysis pass
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L160-R161),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L192-R190))
L2ncE pushed a commit to L2ncE/taichi that referenced this issue Jun 20, 2023
…aichi-dev#8155)

Issue: taichi-dev#602
Pass `gather_func_store_dests` gathers all destinations whose content
may change after a real function is called. The change may happen in the
real function or in another real function that the real function calls.
This pass uses Tarjan's strongly connected components algorithm to find
the store destinations for all real functions a kernel calls, and store
them in `store_dests` of the respective function.

The global pointers are lowered in `lower_access`, so we need to gather
the store destinations twice: before and after pass `lower_access`.

<!--
copilot:all
-->
### <samp>🤖 Generated by Copilot at 2c5586e</samp>

### Summary
📝🛠️🚀

<!--
1. 📝 This emoji represents the addition of a new file and a new analysis
pass declaration, which are documentation-related changes.
2. 🛠️ This emoji represents the update of the `ControlFlowGraph` class
and the removal of some redundant or incorrect checks, which are
bug-fixing or improvement-related changes.
3. 🚀 This emoji represents the introduction of a new enum type, a new
method, and a new parameter, which are feature-related changes.
-->
This pull request introduces a new analysis pass
`gather_func_store_dests` that can handle function calls in the IR and
optimize their memory access and aliasing. It updates the `Function`,
`FuncCallStmt`, and `ControlFlowGraph` classes and the
`compile_function` and `compile_taichi_functions` transforms to use a
new enum type `IRStage` and a new parameter `target_stage` to track and
control the IR stage of each function. It also modifies some existing
analysis functions and adds some include directives and forward
declarations to support the new pass.

> _To optimize function calls in the IR_
> _We need a new pass to infer_
> _The store destinations_
> _At different stages_
> _And use `IRStage` instead of `IRType` for sure_

### Walkthrough
* Add a new analysis pass `gather_func_store_dests` to collect the store
destinations of each function in the IR
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-0bfbe49ff08844a76d5d2e1c5b81c2cf813be4a9089422b997bc380ec9a68eadR1-R103),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f6bc75768d2e24c782fefa45a7232d0e2b2bae091e697040e7f442a77d80ad45L216-R216))
* Modify the `FuncCallStmt` class to inherit from the `Store` trait and
implement the `get_store_destination` method, using the arguments of the
function call and the `store_dests` set of the called function
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-05e2a2d0a9c9879a4fb5fde9baf5a43738c7601fc53e234a40ab9bc27d1512a5R277-R289),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-917d9436dcaafa0f1e41ae9bad90273a303f036f00da94e417788a7fa1dc5260L1062-R1062),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-917d9436dcaafa0f1e41ae9bad90273a303f036f00da94e417788a7fa1dc5260R1074-R1080))
* Remove or modify the checks for `FuncCallStmt` in the
`ControlFlowGraph` class, and use the `store_dests` set of the called
function to update the reaching definition analysis
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL164-L167),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL219-R216),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR695),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL982-R977),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR988-R990))
* Add a new member variable `func_store_dests` to the `ControlFlowGraph`
class, which is a map from `Function` pointers to sets of `Stmt`
pointers, representing the store destinations of each function in the IR
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-67e7205404aa056a1553f930af38b359e460f98a4ec335faec7d54aaf9df727fR117-R118))
* Replace the old enum type `IRType` with the new enum type `IRStage`,
which has more values to indicate different IR stages of function
compilation
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-aa860f71a793b08676a24cab247b43f5ed8d105a6493eeb1a035369b916bddc2L17-R17),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-aa860f71a793b08676a24cab247b43f5ed8d105a6493eeb1a035369b916bddc2L32-R32),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dL9-R20),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dL31-R50),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f78d8ce92dcf8a10d2a446d35cc26f47fd2a42314b0799d263196b6eb858fe76L13-R33),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f78d8ce92dcf8a10d2a446d35cc26f47fd2a42314b0799d263196b6eb858fe76L39-R48),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL330-R390))
* Modify the signature of the `compile_function` function to use the new
parameter `target_stage` instead of the old parameter `start_from_ast`,
to indicate the desired IR stage of the function compilation
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L199-R200),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL330-R390))
* Modify the definition of the `compile_to_offloads` function to add two
calls to the new analysis pass `gather_func_store_dests`, before and
after the call to the `compile_taichi_functions` function, and to pass
different `target_stage` parameters to the `compile_taichi_functions`
function
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL47-R51))
* Add or modify the include directives and forward declarations for the
header files `function.h`, `statements.h`, and `unordered_set` in the
source files and header files that use the `Function` class, the
`FuncCallStmt` class, or the `std::unordered_set` container
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR9),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-67e7205404aa056a1553f930af38b359e460f98a4ec335faec7d54aaf9df727fR10),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-05e2a2d0a9c9879a4fb5fde9baf5a43738c7601fc53e234a40ab9bc27d1512a5R5),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934R20),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dR3))
* Modify some comments in the header file `transforms.h` to remove the
mentions of not demoting dense struct fors or reducing the number of
statements before inlining, since these are no longer relevant or
necessary after the new analysis pass
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L160-R161),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L192-R190))
PGZXB pushed a commit to PGZXB/taichi that referenced this issue Jul 13, 2023
…aichi-dev#8155)

Issue: taichi-dev#602
Pass `gather_func_store_dests` gathers all destinations whose content
may change after a real function is called. The change may happen in the
real function or in another real function that the real function calls.
This pass uses Tarjan's strongly connected components algorithm to find
the store destinations for all real functions a kernel calls, and store
them in `store_dests` of the respective function.

The global pointers are lowered in `lower_access`, so we need to gather
the store destinations twice: before and after pass `lower_access`.

<!--
copilot:all
-->
### <samp>🤖 Generated by Copilot at 2c5586e</samp>

### Summary
📝🛠️🚀

<!--
1. 📝 This emoji represents the addition of a new file and a new analysis
pass declaration, which are documentation-related changes.
2. 🛠️ This emoji represents the update of the `ControlFlowGraph` class
and the removal of some redundant or incorrect checks, which are
bug-fixing or improvement-related changes.
3. 🚀 This emoji represents the introduction of a new enum type, a new
method, and a new parameter, which are feature-related changes.
-->
This pull request introduces a new analysis pass
`gather_func_store_dests` that can handle function calls in the IR and
optimize their memory access and aliasing. It updates the `Function`,
`FuncCallStmt`, and `ControlFlowGraph` classes and the
`compile_function` and `compile_taichi_functions` transforms to use a
new enum type `IRStage` and a new parameter `target_stage` to track and
control the IR stage of each function. It also modifies some existing
analysis functions and adds some include directives and forward
declarations to support the new pass.

> _To optimize function calls in the IR_
> _We need a new pass to infer_
> _The store destinations_
> _At different stages_
> _And use `IRStage` instead of `IRType` for sure_

### Walkthrough
* Add a new analysis pass `gather_func_store_dests` to collect the store
destinations of each function in the IR
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-0bfbe49ff08844a76d5d2e1c5b81c2cf813be4a9089422b997bc380ec9a68eadR1-R103),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f6bc75768d2e24c782fefa45a7232d0e2b2bae091e697040e7f442a77d80ad45L216-R216))
* Modify the `FuncCallStmt` class to inherit from the `Store` trait and
implement the `get_store_destination` method, using the arguments of the
function call and the `store_dests` set of the called function
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-05e2a2d0a9c9879a4fb5fde9baf5a43738c7601fc53e234a40ab9bc27d1512a5R277-R289),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-917d9436dcaafa0f1e41ae9bad90273a303f036f00da94e417788a7fa1dc5260L1062-R1062),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-917d9436dcaafa0f1e41ae9bad90273a303f036f00da94e417788a7fa1dc5260R1074-R1080))
* Remove or modify the checks for `FuncCallStmt` in the
`ControlFlowGraph` class, and use the `store_dests` set of the called
function to update the reaching definition analysis
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL164-L167),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL219-R216),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR695),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fL982-R977),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR988-R990))
* Add a new member variable `func_store_dests` to the `ControlFlowGraph`
class, which is a map from `Function` pointers to sets of `Stmt`
pointers, representing the store destinations of each function in the IR
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-67e7205404aa056a1553f930af38b359e460f98a4ec335faec7d54aaf9df727fR117-R118))
* Replace the old enum type `IRType` with the new enum type `IRStage`,
which has more values to indicate different IR stages of function
compilation
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-aa860f71a793b08676a24cab247b43f5ed8d105a6493eeb1a035369b916bddc2L17-R17),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-aa860f71a793b08676a24cab247b43f5ed8d105a6493eeb1a035369b916bddc2L32-R32),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dL9-R20),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dL31-R50),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f78d8ce92dcf8a10d2a446d35cc26f47fd2a42314b0799d263196b6eb858fe76L13-R33),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-f78d8ce92dcf8a10d2a446d35cc26f47fd2a42314b0799d263196b6eb858fe76L39-R48),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL330-R390))
* Modify the signature of the `compile_function` function to use the new
parameter `target_stage` instead of the old parameter `start_from_ast`,
to indicate the desired IR stage of the function compilation
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L199-R200),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL330-R390))
* Modify the definition of the `compile_to_offloads` function to add two
calls to the new analysis pass `gather_func_store_dests`, before and
after the call to the `compile_taichi_functions` function, and to pass
different `target_stage` parameters to the `compile_taichi_functions`
function
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL47-R51))
* Add or modify the include directives and forward declarations for the
header files `function.h`, `statements.h`, and `unordered_set` in the
source files and header files that use the `Function` class, the
`FuncCallStmt` class, or the `std::unordered_set` container
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-837b90142d1730f6a3ab20c91f1f35c95335ef82a021c74fd4dbdb05ff0e164fR9),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-67e7205404aa056a1553f930af38b359e460f98a4ec335faec7d54aaf9df727fR10),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-05e2a2d0a9c9879a4fb5fde9baf5a43738c7601fc53e234a40ab9bc27d1512a5R5),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934R20),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-af3316673541832f351d12d7c2f45b3c49ba5caeafdad3a6356cb13d2524be3dR3))
* Modify some comments in the header file `transforms.h` to remove the
mentions of not demoting dense struct fors or reducing the number of
statements before inlining, since these are no longer relevant or
necessary after the new analysis pass
([link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L160-R161),
[link](https://github.com/taichi-dev/taichi/pull/8155/files?diff=unified&w=0#diff-448ac6e85e192a27e5ec7c54cd8a91545dc7c83f62d030eafb9c190383cfe934L192-R190))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Suggest an idea on this project
Projects
None yet
Development

No branches or pull requests

6 participants