Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenCL][Textures] Always use SSA for texture loading #14397

Merged
merged 8 commits into from
Mar 30, 2023

Conversation

echuraev
Copy link
Contributor

In some cases we must use SSA for textures loading but we didn't do
that. Example of such cases:

  1. Storing texture (NCHW4c) directly (w/o temporary buffer) to the
    output buffer (NCHW). In this case we have to use SSA because we
    need to get only one channel from the pixel. In case of storing to
    the local buffer the SSA was used because the buffer was allocated
    in kernel and the logic was written that if the buffer was allocated
    then we should use SSA. But if we store the same texture directly to
    the output buffer then SSA wasn't used and this OpenCL code wasn't
    compiled.
  2. Casting texture (NCHW4c) to another data type and then storing it to
    the buffer (NCHW). The SSA for textures was disabled in case of cast
    operation. As a result it was necessary to take a channel from the
    pixel but we got the vector data type (e.g. float4) and then we
    tried to cast it to scalar data type. This code also wasn't
    compiled.

In this PR SSA form was enabled for all cases when texture2d_load is
used. The relevant tests cases were added.

@tvm-bot
Copy link
Collaborator

tvm-bot commented Mar 24, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@echuraev
Copy link
Contributor Author

Took tests on injective for Adreno from PR #13780. The PR #13780 will be closed because the original problem should be solved in this PR.

@echuraev echuraev force-pushed the echuraev/fix_ssa_textures branch 3 times, most recently from 0c182fd to 9d12480 Compare March 24, 2023 11:25
@echuraev echuraev marked this pull request as ready for review March 25, 2023 03:45
@echuraev echuraev requested a review from csullivan March 25, 2023 03:45
@echuraev
Copy link
Contributor Author

cc: @elvin-n, @masahi, @csullivan please review it.

Comment on lines -475 to -486
// Only use local SSA if texture is not already being stored
if (need_texture_ssa_) {
std::string rhs = SSAGetID(ss.str(), op->dtype.with_lanes(4));
if (op->args.back().as<RampNode>()) {
os << rhs;
} else {
os << "((";
this->PrintType(op->dtype.with_lanes(1), os);
os << "*)&" << rhs << ")[";
this->PrintExpr(op->args.back(), os);
os << "]";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, without checking if the data is already stored, this ended up repeated stores, resulting in less optimal opencl code. It might be optimized by the runtime compiler but I recall a non-trivial perf difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review! I have added several new test cases. They cover case when the data was storage and local variable is reused. Also add a new one which check that writing float [4] to image2d_t will be handled correctly:

float out_local[4];
// ...
write_imagef(out, (int2)((((((int)get_group_id(0)) * 14) + ((int)get_local_id(0))) % 38), (((((int)get_group_id(0)) * 64) + (((int)get_local_id(0)) >> 1)) / 19)), vload4(0, out_local + 0));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, if you find a case with this in the future please consider adding it as a follow up test. Otherwise all LGTM, thanks @echuraev!

@echuraev echuraev force-pushed the echuraev/fix_ssa_textures branch from f9ac7ce to 4925686 Compare March 28, 2023 09:47
@echuraev
Copy link
Contributor Author

@tvm-bot rerun

Comment on lines -475 to -486
// Only use local SSA if texture is not already being stored
if (need_texture_ssa_) {
std::string rhs = SSAGetID(ss.str(), op->dtype.with_lanes(4));
if (op->args.back().as<RampNode>()) {
os << rhs;
} else {
os << "((";
this->PrintType(op->dtype.with_lanes(1), os);
os << "*)&" << rhs << ")[";
this->PrintExpr(op->args.back(), os);
os << "]";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, if you find a case with this in the future please consider adding it as a follow up test. Otherwise all LGTM, thanks @echuraev!

echuraev and others added 8 commits March 29, 2023 06:51
In some cases we must use SSA for textures loading but we didn't do
that. Example of such cases:
1. Storing texture (NCHW4c) directly (w/o temporary buffer) to the
   output buffer (NCHW). In this case we have to use SSA because we
   need to get only one channel from the pixel. In case of storing to
   the local buffer the SSA was used because the buffer was allocated
   in kernel and the logic was written that if the buffer was allocated
   then we should use SSA. But if we store the same texture directly to
   the output buffer then SSA wasn't used and this OpenCL code wasn't
   compiled.
2. Casting texture (NCHW4c) to another data type and then storing it to
   the buffer (NCHW). The SSA for textures was disabled in case of cast
   operation. As a result it was necessary to take an channel from the
   pixel but we got the vector data type (e.g. float4) and then we
   tried to cast it to scalar data type. This code also wasn't
   compiled.

In this PR SSA form was enabled for all cases when `texture2d_load` is
used. The relevant tests cases were added.
@echuraev echuraev force-pushed the echuraev/fix_ssa_textures branch from 93a11bf to be82808 Compare March 29, 2023 03:52
@echuraev
Copy link
Contributor Author

@tvm-bot rerun

1 similar comment
@masahi
Copy link
Member

masahi commented Mar 29, 2023

@tvm-bot rerun

@echuraev echuraev merged commit 4011280 into apache:main Mar 30, 2023
@echuraev echuraev deleted the echuraev/fix_ssa_textures branch March 30, 2023 05:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants