[opt] Add ExtractPointers pass for dynamic index #7051

Merged (2 commits), Jan 5, 2023

@strongoier strongoier commented Jan 4, 2023

Issue: #2590

Brief Summary

Under the pure dynamic_index setting, MatrixPtrStmts are not scalarized. This produces 2n more instructions (n ConstStmts and n MatrixPtrStmts) than the scalarized setting, where n is the number of usages of MatrixPtrStmts. This PR adds an ExtractPointers pass to eliminate the redundant instructions. See the comments in the code for details.
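To make the redundancy concrete, here is a toy Python model of the idea. It is only a sketch and not the Taichi implementation: the `Const`, `MatrixPtr` and `Load` classes and the `extract_pointers` function are hypothetical stand-ins, and it handles straight-line code only, whereas a real pass also has to ensure that the surviving statement is visible at every use.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Const:        # stand-in for ConstStmt
    name: str
    value: int


@dataclass(frozen=True)
class MatrixPtr:    # stand-in for MatrixPtrStmt
    name: str
    base: str       # name of the matrix being indexed
    index: str      # name of the Const that holds the element index


@dataclass(frozen=True)
class Load:         # any statement that consumes a pointer
    name: str
    ptr: str


def extract_pointers(stmts):
    """Keep one Const per value and one MatrixPtr per (base, value) pair."""
    const_for_value = {}   # value -> name of the first Const with that value
    ptr_for_key = {}       # (base, value) -> name of the first matching MatrixPtr
    value_of = {}          # Const name -> value, including dropped Consts
    rename = {}            # dropped statement name -> surviving statement name
    out = []
    for s in stmts:
        if isinstance(s, Const):
            value_of[s.name] = s.value
            keep = const_for_value.setdefault(s.value, s.name)
            if keep == s.name:
                out.append(s)
            else:
                rename[s.name] = keep
        elif isinstance(s, MatrixPtr):
            keep = ptr_for_key.setdefault((s.base, value_of[s.index]), s.name)
            if keep == s.name:
                out.append(replace(s, index=rename.get(s.index, s.index)))
            else:
                rename[s.name] = keep
        else:
            # Redirect the use to the surviving pointer, if it was renamed.
            out.append(replace(s, ptr=rename.get(s.ptr, s.ptr)))
    return out


# Three uses of m[0], each emitted with its own Const and MatrixPtr.
before = []
for i in range(3):
    before += [Const(f"c{i}", 0), MatrixPtr(f"p{i}", "m", f"c{i}"), Load(f"v{i}", f"p{i}")]

after = extract_pointers(before)
print(len(before), len(after))  # 9 5
```

In this toy, n uses of the same element cost 2n extra statements before the rewrite and 2 afterwards, one `Const` and one `MatrixPtr` per distinct element.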

After this PR, the number of instructions after the scalarize() pass for the script in #6933 under dynamic index drops from 49589 to 26581, and the compilation time drops from 20.02s to 7.82s.
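For a sense of scale, the relative improvement can be worked out from the four figures above. This is plain arithmetic on the reported numbers; the variable names are illustrative.

```python
instructions_before, instructions_after = 49589, 26581
seconds_before, seconds_after = 20.02, 7.82

# Fraction of instructions removed, and ratio of old to new compile time.
instruction_reduction = (instructions_before - instructions_after) / instructions_before
compile_speedup = seconds_before / seconds_after

print(f"{instruction_reduction:.1%} fewer instructions")  # 46.4% fewer instructions
print(f"{compile_speedup:.2f}x faster compilation")       # 2.56x faster compilation
```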

@strongoier strongoier added the full-ci Run complete set of CI tests label Jan 4, 2023
@strongoier strongoier added this to the v1.4.0 milestone Jan 4, 2023
@jim19930609 jim19930609 left a comment

I was thinking MatrixInitStmt might have the same optimization opportunity if we change std::vector<Stmt*> elements to something like std::vector<std::variant<Stmt*, TypedConstant>> elements:

%1 = ConstStmt(1);
%2 = ConstStmt(2);
%3 = MatrixInitStmt({%1, %1, %2, %2});
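A rough Python sketch of what that suggestion could look like, for illustration only: each element is either a reference to another statement or an inline constant, so a matrix built from constants needs no separate ConstStmts. The `StmtRef` and `TypedConstant` classes, the `inline_constants` helper and the `i32` type tag are all assumptions made for this example, not Taichi code.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class StmtRef:          # reference to the result of another statement
    name: str


@dataclass(frozen=True)
class TypedConstant:    # constant stored inline in the element list
    dtype: str
    value: int


Element = Union[StmtRef, TypedConstant]


def inline_constants(const_defs: Dict[str, TypedConstant],
                     elements: List[Element]) -> List[Element]:
    """Replace references to known constants with the constant itself."""
    return [
        const_defs.get(e.name, e) if isinstance(e, StmtRef) else e
        for e in elements
    ]


# %1 = ConstStmt(1); %2 = ConstStmt(2); %3 = MatrixInitStmt({%1, %1, %2, %2});
const_defs = {"%1": TypedConstant("i32", 1), "%2": TypedConstant("i32", 2)}
elements = [StmtRef("%1"), StmtRef("%1"), StmtRef("%2"), StmtRef("%2")]

inlined = inline_constants(const_defs, elements)
print([e.value for e in inlined])  # [1, 1, 2, 2]

# A reference to a computed (non-constant) value is left untouched.
mixed = inline_constants(const_defs, [StmtRef("%1"), StmtRef("%5")])
```

Once every element is inlined, the two ConstStmts have no remaining users in this example and could be dropped.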

@strongoier
Contributor Author

> I was thinking MatrixInitStmt might have the same optimization opportunity if we change std::vector<Stmt*> elements to something like std::vector<std::variant<Stmt*, TypedConstant>> elements:
>
> %1 = ConstStmt(1);
> %2 = ConstStmt(2);
> %3 = MatrixInitStmt({%1, %1, %2, %2});

Ah nice point. My concern here is that MatrixInitStmt is composed of actual computation results, which might not be that repetitive. But yeah we might explore this idea in other scenarios in the future. Or we should have a better CSE pass in general (it is currently too slow to handle the case in this PR).

@strongoier strongoier merged commit ae0882c into taichi-dev:master Jan 5, 2023
@strongoier strongoier deleted the extract-ptrs branch January 5, 2023 02:15
feisuzhu pushed a commit to feisuzhu/taichi that referenced this pull request Jan 5, 2023
Issue: taichi-dev#2590

### Brief Summary

Under pure `dynamic_index` setting, `MatrixPtrStmt`s are not scalarized.
It actually produces `2n` more instructions (`n` `ConstStmt`s and `n`
`MatrixPtrStmt`s) than the scalarized setting, where `n` is the number
of usages of `MatrixPtrStmt`s. This PR adds `ExtractPointers` pass to
eliminate all the redundant instructions. See comments in the code for
details.

After this PR, the number of instructions after the `scalarize()` pass
of the script in taichi-dev#6933 under dynamic index reduces from 49589 to 26581,
and the compilation time reduces from 20.02s to 7.82s.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
quadpixels pushed a commit to quadpixels/taichi that referenced this pull request May 13, 2023