ht_dec.c: Improve MSVC arm64 popcount performance #1479

Merged (3 commits) on Dec 9, 2023

Conversation

PeterJohnson
Contributor

Use NEON instructions for ARM64 (implementation based on microsoft/STL#2127).

Godbolt output here: https://godbolt.org/z/q7GPTqT14
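
For readers without the diff at hand, here is a minimal sketch of the general technique, not the code merged in this PR. It assumes the standard ACLE intrinsics `vcreate_u8`, `vcnt_u8`, and `vaddv_u8` from `<arm_neon.h>` are usable on the ARM64 target; the function name `sketch_popcount32` and the macro `SKETCH_HAVE_NEON` are hypothetical, and the non-ARM64 branch is a generic bit-twiddling fallback added only so the example compiles everywhere.

```c
#include <stdint.h>

/* Assumption: on ARM64 targets (MSVC defines _M_ARM64, GCC/Clang define
 * __aarch64__) the header <arm_neon.h> provides the ACLE NEON intrinsics. */
#if defined(_M_ARM64) || defined(__aarch64__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1 /* hypothetical macro, for this sketch only */
#endif

/* Hypothetical helper: number of set bits in a 32-bit value. */
static uint32_t sketch_popcount32(uint32_t val)
{
#if defined(SKETCH_HAVE_NEON)
    /* Place the value in a 64-bit NEON register (upper 32 bits are zero),
     * count the set bits in each of the eight bytes (CNT), then sum the
     * eight per-byte counts across the vector (ADDV). */
    const uint8x8_t per_byte = vcnt_u8(vcreate_u8((uint64_t)val));
    return (uint32_t)vaddv_u8(per_byte);
#else
    /* Portable fallback: classic parallel bit count. */
    val = val - ((val >> 1) & 0x55555555u);
    val = (val & 0x33333333u) + ((val >> 2) & 0x33333333u);
    val = (val + (val >> 4)) & 0x0F0F0F0Fu;
    return (val * 0x01010101u) >> 24;
#endif
}
```

Both branches return the same result for every input, so the NEON path only changes how the count is computed, not what callers observe.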

@asmorkalov

Friendly reminder.

@rouault
Collaborator

rouault commented Dec 8, 2023

Is testing for `defined(OPJ_COMPILER_MSVC) && defined(_M_ARM64)` sufficient to guarantee that NEON instructions are available?
https://github.com/microsoft/STL/pull/2127/files has many more conditions.

@rouault rouault merged commit 41c25e3 into uclouvain:master Dec 9, 2023
12 checks passed
asmorkalov pushed a commit to opencv/opencv that referenced this pull request Dec 10, 2023
ht_dec.c: Improve MSVC arm64 popcount performance #24205

Use NEON instructions for ARM64 (implementation based on microsoft/STL#2127, which is Apache licensed).

Godbolt output here: https://godbolt.org/z/q7GPTqT14
Related patch to openjpeg: uclouvain/openjpeg#1479

### Pull Request Readiness Checklist

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under the GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
thewoz pushed a commit to thewoz/opencv that referenced this pull request May 29, 2024
ht_dec.c: Improve MSVC arm64 popcount performance opencv#24205
