[ACL] Stateless feature impacts winograd convolution performance #2324
Comments
This might be a known problem because we put in a workaround for winograd.
@theComputeKid I checked the latest
@alvoron Sorry, I only half typed my thoughts. What I meant was: when we converted the conv to stateless, we found segfaults in ACL caused by thread-safety issues. While we work on fixing those in ACL, we put a workaround into oneDNN that reinitializes the object every time, and this workaround for the segfault is probably what is causing the performance issue.
@theComputeKid I see, so we keep this issue open (since it's valid) and wait for the fix from your side, right?
@alvoron Absolutely, I'm tracking open issues. Please also keep your other issue in ACL open, since we first fix ACL and then remove the workaround in oneDNN.
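The workaround described in the comments above (reinitializing the object on every run) can be illustrated with a toy sketch. This is not oneDNN or ACL code: `ToyConv`, `configure`, and both `execute_*` helpers are hypothetical names, and the setup step is only a stand-in for whatever one-time work a real operator performs. The sketch counts how often the setup step runs under each pattern.

```python
class ToyConv:
    """Hypothetical stand-in for an operator with a one-time setup step."""

    configure_calls = 0  # counts how often the setup step runs

    def __init__(self, weights):
        self.weights = weights
        self.transformed = None

    def configure(self):
        # Stand-in for expensive one-time setup (assumption, not ACL's actual work).
        ToyConv.configure_calls += 1
        self.transformed = [w * 2 for w in self.weights]

    def run(self, x):
        return sum(w * x for w in self.transformed)


def execute_reinit_every_time(weights, inputs):
    # Workaround pattern: a fresh object is built and configured per execution.
    results = []
    for x in inputs:
        op = ToyConv(weights)
        op.configure()
        results.append(op.run(x))
    return results


def execute_configure_once(weights, inputs):
    # Intended pattern: configure once, then reuse the object for every execution.
    op = ToyConv(weights)
    op.configure()
    return [op.run(x) for x in inputs]
```

Both helpers return the same results, but the first pays the setup cost once per execution instead of once overall, which is the kind of overhead the comment suspects behind the regression.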
The ACL stateless feature, integrated into oneDNN in recent releases, degrades Winograd convolution performance.
The performance issue has been reproduced on Ampere and Apple M2 Pro.
Several benchdnn reproducers
ACL without the stateless feature gives 0.39 ms / 0.24 ms / 0.23 ms, respectively, on Apple M2 Pro.
ACL with the stateless feature (vanilla ACL 24.11.1) gives 17.79 ms / 4.06 ms / 1.14 ms, respectively, on Apple M2 Pro.
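To put the two sets of timings side by side, the slowdown per reproducer can be computed directly from the numbers above. This is a small sketch; the variable and function names are illustrative, and the only inputs are the six timings reported for Apple M2 Pro.

```python
# Timings in milliseconds from the Apple M2 Pro runs reported above.
without_stateless_ms = [0.39, 0.24, 0.23]
with_stateless_ms = [17.79, 4.06, 1.14]


def slowdowns(baseline, regressed):
    """Return the per-reproducer ratio regressed / baseline."""
    return [r / b for b, r in zip(baseline, regressed)]


ratios = slowdowns(without_stateless_ms, with_stateless_ms)
for i, ratio in enumerate(ratios, start=1):
    print(f"reproducer {i}: {ratio:.1f}x slower")
```

The ratios work out to roughly 46x, 17x, and 5x for the three reproducers.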
To get ACL without the stateless feature, the following commits were reverted: