Generate dot() in the Metal backend #7085

Merged
merged 2 commits into from
Oct 17, 2022

vksnk
Member

@vksnk vksnk commented Oct 14, 2022

This makes the Metal backend generate a dot() call for vector_reduce(Add, mul(float, float)). I tested it locally to confirm that dot() is actually generated. It would be nice to have something similar to simd_op_check for GPU targets, but that doesn't exist yet (#7084).
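
As an illustration of the pattern described above, here is a minimal, self-contained C++ sketch. It does not use Halide's IR or the actual Metal code generator: the `Expr` struct, the `emit` function, and the 2-to-4 lane restriction are hypothetical stand-ins chosen for this example. The idea is only that a sum-reduction over a float multiply is recognized and printed as `dot(a, b)`, while anything else falls back to an explicit lane-by-lane sum.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical mini-IR for illustration only; these are not Halide's classes.
enum class Kind { Var, Mul, ReduceAdd };
enum class Type { Float32, Int32 };

struct Expr {
    Kind kind;
    Type type;                   // element type
    int lanes;                   // vector width (1 for a reduction result)
    std::string name;            // used only when kind == Kind::Var
    std::shared_ptr<Expr> a, b;  // operands (b is unused for ReduceAdd)
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr var(const std::string &n, Type t, int lanes) {
    return std::make_shared<Expr>(Expr{Kind::Var, t, lanes, n, nullptr, nullptr});
}
ExprPtr mul(ExprPtr x, ExprPtr y) {
    return std::make_shared<Expr>(Expr{Kind::Mul, x->type, x->lanes, "", x, y});
}
ExprPtr reduce_add(ExprPtr v) {
    return std::make_shared<Expr>(Expr{Kind::ReduceAdd, v->type, 1, "", v, nullptr});
}

// Print an expression as shader-like source text.
std::string emit(const ExprPtr &e) {
    switch (e->kind) {
    case Kind::Var:
        return e->name;
    case Kind::Mul:
        return "(" + emit(e->a) + " * " + emit(e->b) + ")";
    case Kind::ReduceAdd: {
        const ExprPtr &v = e->a;
        // Assumption for this sketch: only 2- to 4-lane float vectors qualify.
        bool lanes_ok = v->lanes >= 2 && v->lanes <= 4;
        if (v->kind == Kind::Mul && v->type == Type::Float32 && lanes_ok) {
            // Pattern matched: reduce_add(mul(a, b)) on floats becomes dot(a, b).
            return "dot(" + emit(v->a) + ", " + emit(v->b) + ")";
        }
        // Fallback: add the lanes one by one.
        std::string s = emit(v);
        std::string out = "(";
        for (int i = 0; i < v->lanes; i++) {
            if (i) out += " + ";
            out += s + "[" + std::to_string(i) + "]";
        }
        return out + ")";
    }
    }
    return "";
}
```

With this sketch, `emit(reduce_add(mul(var("a", Type::Float32, 4), var("b", Type::Float32, 4))))` yields `dot(a, b)`, whereas the same expression built with `Type::Int32` takes the fallback path, mirroring the "Restrict dot() to floats" commit in this PR.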

@steven-johnson
Contributor

Monday morning review ping

@shoaibkamil
Contributor

We should also think longer-term about how we want to do these kinds of pattern matches. @rootjalex @abadams In the future, should we consider moving rules like these into earlier passes and applying them via pattern-matching?

@steven-johnson steven-johnson merged commit e70b7d9 into main Oct 17, 2022
@steven-johnson steven-johnson added the backport me This change should be backported to release versions label Oct 17, 2022
@steven-johnson
Contributor

Marking for backport to release/15.x

@steven-johnson steven-johnson deleted the vksnk/metal-dot branch October 17, 2022 22:08
@rootjalex
Member

@shoaibkamil We've discussed separating instruction selection from CodeGen (I have a slightly stale but still active PR open for doing so on x86, #6884). I was planning on doing this for ARM/HVX as well, and we could definitely do it for some of the GPU backends too. We still need to spend some time on the correct design of the IR for this sort of thing; @abadams and I have not yet come to a conclusion on a few design principles.

@rootjalex
Member

rootjalex commented Oct 18, 2022

> I was planning on doing this for ARM/HVX as well

I know HVX already has HexagonOptimize, but I want to turn some of those passes into proper term-rewriting systems. The current model of "exact pattern goes to specific intrinsic" is rather restrictive and cannot express many of the rules that my project has generated.
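
To make the contrast with "exact pattern goes to specific intrinsic" concrete, here is a small, hypothetical term-rewriting sketch in C++. Nothing in it comes from Halide or HexagonOptimize: the `Term` struct, the `?`-prefixed wildcard convention, and the two example rules are assumptions made for illustration, and types are ignored entirely (the real change in this PR restricts dot() to floats). It shows rules written as data (a pattern and a replacement) that are applied bottom-up until none fires, so that one rule can expose a match for another.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical term: an operator name plus children.
// A leaf whose name starts with '?' is a wildcard when used in a rule.
struct Term {
    std::string op;
    std::vector<Term> args;
};
using Bindings = std::map<std::string, Term>;

Term leaf(const std::string &name) { return Term{name, {}}; }
Term node(const std::string &op, std::vector<Term> args) { return Term{op, args}; }

bool same_term(const Term &a, const Term &b) {
    if (a.op != b.op || a.args.size() != b.args.size()) return false;
    for (size_t i = 0; i < a.args.size(); i++)
        if (!same_term(a.args[i], b.args[i])) return false;
    return true;
}

// Match pattern `pat` against `t`, recording what each wildcard binds to.
bool match(const Term &pat, const Term &t, Bindings &env) {
    if (!pat.op.empty() && pat.op[0] == '?') {
        auto it = env.find(pat.op);
        if (it == env.end()) { env.emplace(pat.op, t); return true; }
        return same_term(it->second, t);  // repeated wildcard must agree
    }
    if (pat.op != t.op || pat.args.size() != t.args.size()) return false;
    for (size_t i = 0; i < pat.args.size(); i++)
        if (!match(pat.args[i], t.args[i], env)) return false;
    return true;
}

// Instantiate a rule's right-hand side with the recorded bindings.
Term substitute(const Term &tmpl, const Bindings &env) {
    if (!tmpl.op.empty() && tmpl.op[0] == '?') return env.at(tmpl.op);
    Term out{tmpl.op, {}};
    for (const Term &a : tmpl.args) out.args.push_back(substitute(a, env));
    return out;
}

struct Rule { Term lhs, rhs; };

// Rewrite children first, then apply rules at the root until none fires.
// Termination is not guaranteed in general; it depends on the rule set.
Term rewrite(Term t, const std::vector<Rule> &rules) {
    for (Term &a : t.args) a = rewrite(a, rules);
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Rule &r : rules) {
            Bindings env;
            if (match(r.lhs, t, env)) {
                t = rewrite(substitute(r.rhs, env), rules);
                changed = true;
                break;
            }
        }
    }
    return t;
}

std::string show(const Term &t) {
    if (t.args.empty()) return t.op;
    std::string s = t.op + "(";
    for (size_t i = 0; i < t.args.size(); i++) {
        if (i) s += ", ";
        s += show(t.args[i]);
    }
    return s + ")";
}

// Two illustrative rules (hypothetical, untyped).
std::vector<Rule> example_rules() {
    return {
        {node("reduce_add", {node("mul", {leaf("?a"), leaf("?b")})}),
         node("dot", {leaf("?a"), leaf("?b")})},
        {node("mul", {leaf("?a"), leaf("1")}), leaf("?a")},
    };
}
```

In this sketch, `reduce_add(mul(mul(x, 1), y))` is first simplified to `reduce_add(mul(x, y))` by the second rule and then rewritten to `dot(x, y)` by the first. An exact-pattern matcher would need a separate case for that composition, which is one way to read the restriction described above.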

steven-johnson pushed a commit that referenced this pull request Oct 24, 2022
* dot() support for Metal backend)

* Restrict dot() to floats
steven-johnson added a commit that referenced this pull request Oct 24, 2022
* Generate dot() in the Metal backend (#7085)

* dot() support for Metal backend)

* Restrict dot() to floats

* Fix subtle CMake Install bugs (#7103)

* Update CMakeLists.txt

* Update CMakeLists.txt

* Fix some dead links to the 'master' branch (#7107)

* Attempt to fix pip build issues (#7098)

* Add evaluate() and evaluate_may_gpu() to Python bindings (#7108)

* Add evaluate() and evaluate_may_gpu() to Python bindings

* pacify clang-tidy

Co-authored-by: Volodymyr Kysenko <vksnk@google.com>
Co-authored-by: Andrew Adams <andrew.b.adams@gmail.com>
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
* dot() support for Metal backend)

* Restrict dot() to floats
Labels
backport me This change should be backported to release versions
4 participants