-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🚀 Optimized entropy functions #69
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* The values are passed as pointers, and the overall logic has been simplified.
* See commit fd906f0
* The benchmarks have not changed on the entropy-function. The local tests indicated that the new implementation was faster - this will be investigated.
serkor1
added
enhancement
New feature or request
optimze
Various optimizations to source code
labels
Feb 1, 2025
2 tasks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to 81d28c7 in 1 minute and 16 seconds
More details
- Looked at
1074
lines of code in27
files - Skipped
1
files when reviewing. - Skipped posting
12
drafted comments based on config settings.
1. src/classification_CrossEntropy.h:49
- Draft comment:
Using#pragma omp simd
and#pragma omp parallel for
together can lead to inefficiencies. Consider using only#pragma omp parallel for
for the outer loop parallelization. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The combination of parallel for and simd here actually makes sense - they operate at different levels of parallelism. Parallel for splits work across threads, while simd enables vectorization within each thread. The operations in the inner loops (simple additions and multiplications) are perfect candidates for SIMD vectorization. The comment appears to be incorrect about inefficiencies.
I could be wrong about the interaction between these OpenMP directives. There might be specific hardware architectures where this combination is problematic.
While there might be edge cases, the code follows a common and well-established pattern of combining thread-level and SIMD parallelism. The operations are simple enough that vectorization should be beneficial.
The comment should be deleted as it suggests removing SIMD directives that are actually beneficial for performance when properly combined with parallel for as done in this code.
2. src/classification_CrossEntropy.h:62
- Draft comment:
Using#pragma omp simd
and#pragma omp parallel for
together can lead to inefficiencies. Consider using only#pragma omp parallel for
for the outer loop parallelization. - Reason this comment was not posted:
Marked as duplicate.
3. src/classification_CrossEntropy.h:123
- Draft comment:
Using#pragma omp simd
and#pragma omp parallel for
together can lead to inefficiencies. Consider using only#pragma omp parallel for
for the outer loop parallelization. - Reason this comment was not posted:
Marked as duplicate.
4. src/classification_Entropy.h:44
- Draft comment:
Using#pragma omp simd
and#pragma omp parallel for
together can lead to inefficiencies. Consider using only#pragma omp parallel for
for the outer loop parallelization. - Reason this comment was not posted:
Marked as duplicate.
5. src/classification_Entropy.h:57
- Draft comment:
Using#pragma omp simd
and#pragma omp parallel for
together can lead to inefficiencies. Consider using only#pragma omp parallel for
for the outer loop parallelization. - Reason this comment was not posted:
Marked as duplicate.
6. src/classification_Entropy.h:117
- Draft comment:
Using#pragma omp simd
and#pragma omp parallel for
together can lead to inefficiencies. Consider using only#pragma omp parallel for
for the outer loop parallelization. - Reason this comment was not posted:
Marked as duplicate.
7. src/classification_RelativeEntropy.h:50
- Draft comment:
Using#pragma omp simd
and#pragma omp parallel for
together can lead to inefficiencies. Consider using only#pragma omp parallel for
for the outer loop parallelization. - Reason this comment was not posted:
Marked as duplicate.
8. src/classification_RelativeEntropy.h:65
- Draft comment:
Using#pragma omp simd
and#pragma omp parallel for
together can lead to inefficiencies. Consider using only#pragma omp parallel for
for the outer loop parallelization. - Reason this comment was not posted:
Marked as duplicate.
9. src/classification_CrossEntropy.h:13
- Draft comment:
All Rcpp functions and classes should be namespace qualified. For example, useRcpp::NumericVector
instead ofNumericVector
. This applies throughout the C++ code. - Reason this comment was not posted:
Comment was on unchanged code.
10. src/classification_Entropy.cpp:14
- Draft comment:
All Rcpp functions and classes should be namespace qualified. For example, useRcpp::NumericVector
instead ofNumericVector
. This applies throughout the C++ code. - Reason this comment was not posted:
Marked as duplicate.
11. src/classification_Entropy.h:11
- Draft comment:
All Rcpp functions and classes should be namespace qualified. For example, useRcpp::NumericVector
instead ofNumericVector
. This applies throughout the C++ code. - Reason this comment was not posted:
Marked as duplicate.
12. src/classification_RelativeEntropy.h:14
- Draft comment:
All Rcpp functions and classes should be namespace qualified. For example, useRcpp::NumericVector
instead ofNumericVector
. This applies throughout the C++ code. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_L0sjpnWlN4E1W2HS
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
📚 What?
This PR optimizes all entropy functions. The run-time on
200 x 1e6
matrices have been decreased significantly, see below:Before optimization (Without OpenMP)
After optimization (Without OpenMP)