[perf] Fix Taichi CPU backend compile parameter to pair performance with Numba. (#7731)

Issue: #7442

### Brief Summary

In the linked issue, Numba is substantially faster than Taichi (roughly 2x
in the benchmark below) because Taichi's generated CPU code is not
automatically vectorized.
The root cause is that the `fast_flag` was not passed correctly to the
CPU code generator.

To solve this problem, `fast_flag` is now set during the initialization of
the CPU codegen. Numba and Taichi now show comparable performance.
Here is the performance comparison:
numba:            13052.542478 MFlops
taichi (master):   6544.274409 MFlops
taichi (this PR): 12778.240179 MFlops

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
zxlbig and pre-commit-ci[bot] authored Apr 13, 2023
1 parent 0d26ffa commit 4eea1ec
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions taichi/codegen/llvm/codegen_llvm.cpp
@@ -2542,6 +2542,13 @@ void TaskCodeGenLLVM::initialize_context() {
   TI_ASSERT(tlctx != nullptr);
   llvm_context = tlctx->get_this_thread_context();
   builder = std::make_unique<llvm::IRBuilder<>>(*llvm_context);
+  if (compile_config.fast_math) {
+    llvm::FastMathFlags fast_flags;
+    fast_flags.setNoInfs();
+    fast_flags.setNoSignedZeros();
+    fast_flags.setAllowReassoc();
+    builder->setFastMathFlags(fast_flags);
+  }
 }
 
 llvm::Value *TaskCodeGenLLVM::get_arg(int i) {
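
A likely reason `setAllowReassoc()` matters here: a vectorized floating-point reduction adds elements in a different order than the scalar loop, and floating-point addition is not associative. Without permission to reassociate, the optimizer generally has to keep the strict left-to-right order. The sketch below is not Taichi or LLVM code; it is a plain C++ illustration, and the names `sum_sequential`, `sum_four_lanes`, and `demo_input` are hypothetical. It mimics a 4-lane SIMD reduction by hand and shows that the reordered sum can differ from the sequential one.

```cpp
#include <cstddef>
#include <vector>

// Strict left-to-right sum: the evaluation order the compiler must
// preserve when reassociation is not allowed.
float sum_sequential(const std::vector<float>& xs) {
    float acc = 0.0f;
    for (float x : xs) {
        acc += x;
    }
    return acc;
}

// Sum with four independent accumulators, mimicking by hand what a
// 4-lane vectorized reduction does. The additions are reordered.
float sum_four_lanes(const std::vector<float>& xs) {
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t i = 0;
    for (; i + 4 <= xs.size(); i += 4) {
        for (int k = 0; k < 4; ++k) {
            lane[k] += xs[i + k];
        }
    }
    // Horizontal reduction of the four partial sums.
    float acc = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    // Scalar tail for lengths that are not a multiple of 4.
    for (; i < xs.size(); ++i) {
        acc += xs[i];
    }
    return acc;
}

// Hypothetical input chosen so that rounding makes the order visible:
// 1e8f + 1.0f rounds back to 1e8f in single precision.
std::vector<float> demo_input() {
    return {1e8f, 1.0f, 1.0f, 1.0f, -1e8f, 1.0f, 1.0f, 1.0f};
}
```

Assuming IEEE-754 single precision evaluated without extra intermediate precision and without fast-math compiler options, `sum_sequential(demo_input())` loses the three small values absorbed by `1e8f`, while `sum_four_lanes(demo_input())` cancels the two large values inside one lane first, so the two results differ. Enabling reassociation tells the optimizer that such differences are acceptable, which is what permits the vectorized form.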
