Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Hexagon] Enable depthwise conv2d NHWC with an HWIO kernel layout #13414

Merged
merged 11 commits into from
Dec 13, 2022

Conversation

farshidsp
Copy link
Contributor

This PR adds support for depthwise_conv2d_NHWC with an HWIO kernel layout.

@Lunderberg @mehrdadh

@tvm-bot
Copy link
Collaborator

tvm-bot commented Nov 17, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@farshidsp farshidsp marked this pull request as ready for review November 17, 2022 07:08
Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the feature, but I'm not quite following how the permutation specified results in a HWIO kernel layout.

* Filter[
di, dj, idxdiv(c, channel_multiplier), idxmod(c, channel_multiplier)
].astype(out_dtype)
* Filter.__getitem__(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explicit __getitem__ isn't required. Filter[x,y,z] is syntactic sugar for Filter[(x,y,z)]. If you already have a tuple, then Filter[tuple( ... )] can be called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, I have changed this as your recommendation.

kernel_permutation_to = [0, 1] + list(range(2, dim + 2))
elif kernel_layout == "HWIO":
filter_height, filter_width, channel_multiplier, filter_channel = Filter.shape
kernel_permutation_to = [dim + 1, dim] + list(range(dim))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing how this permutation generates HWIO. This defines kernel_permutation_to as [3, 2, 0, 1], so kernel_permutation_from is [2, 3, 1, 0]. With the usage below, that would permute from [di, dj, c//channel_multiplier, c%channel_multiplier] to [c//channel_multiplier, c%channel_multiplier, dj, di], which would be OIWH.

Should this be list(range(dim)) + [dim + 1, dim] instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lunderberg You are right, my bad. Not sure why I thought this would result in HWIO. Thanks so much for catching this bug.

filter_height, filter_width, channel_multiplier, filter_channel = Filter.shape
kernel_permutation_to = [dim + 1, dim] + list(range(dim))

kernel_permutation_from = np.argsort(kernel_permutation_to)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a benefit to defining in terms of kernel_permutation_to instead of directly defining kernel_permutation_from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to follow this as much as possible so in the future we can add more features if needed. If you think this is not the best way I could change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. It looks like that implementation is trying to be more clever, and to identify the permutation by inspecting the string. In this case, since we're only supporting two explicitly enabled shapes, I'd lean for defining a kernel_permutation directly for each one.

if kernel_layout == 'HWOI':
    kernel_permutaiton = [0,1,2,3]
elif kernel_layout == 'HWIO':
    kernel_permutaiton = [0,1,3,2]
else:
    raise ValueError(f'Unsupported kernel layout: {kernel_layout}')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, if there are many locations where the same kernel permutation definitions are used, it may be useful to pull it out into a common utility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the way of permutation from your recommendation. @Lunderberg

@TejashShah
Copy link

@farshidsp Is kernel layout or filter layout a better word to indicate the change in layout?

@farshidsp
Copy link
Contributor Author

@farshidsp Is kernel layout or filter layout a better word to indicate the change in layout?

In other topi files, we have used kernel_permutation to indicate the change in the kernel layout and that's why I used the same term.

Copy link
Contributor

@Lunderberg Lunderberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the changes, and LGTM!

@Lunderberg Lunderberg merged commit 12311dc into apache:main Dec 13, 2022
fzi-peccia pushed a commit to fzi-peccia/tvm that referenced this pull request Mar 27, 2023
…ache#13414)

Enable depthwise conv2d NHWC with HWIO kernel layout.  The default kernel layout is HWOI, matched to previous behavior.
mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023
…ache#13414)

Enable depthwise conv2d NHWC with HWIO kernel layout.  The default kernel layout is HWOI, matched to previous behavior.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants