Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Metadata Immutability] Change different indices lookup objects from array type to lists #14723
base: main
Are you sure you want to change the base?
[Metadata Immutability] Change different indices lookup objects from array type to lists #14723
Changes from all commits
c261581
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Check warning on line 521 in server/src/main/java/org/opensearch/action/admin/cluster/health/TransportClusterHealthAction.java
Codecov / codecov/patch
server/src/main/java/org/opensearch/action/admin/cluster/health/TransportClusterHealthAction.java#L521
Check warning on line 339 in server/src/main/java/org/opensearch/action/admin/indices/datastream/GetDataStreamAction.java
Codecov / codecov/patch
server/src/main/java/org/opensearch/action/admin/indices/datastream/GetDataStreamAction.java#L339
Check warning on line 709 in server/src/main/java/org/opensearch/cluster/metadata/IndexNameExpressionResolver.java
Codecov / codecov/patch
server/src/main/java/org/opensearch/cluster/metadata/IndexNameExpressionResolver.java#L709
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be expensive operation to copy the set to a list every time for index resolution.
Have run micro benchmark on your changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here ConcreteAllIndices(allIndices) are unmodifiableSet, so List.copyOf() will not really create a copy .
There is an implementation Note in javadoc of this API - "If the given Collection is an unmodifiable List, calling copyOf will generally not create a copy". Also if we look in to the internal implementation of Lisy.copyOf() it's clear that, it's not creating a copy if collection is an AbstractImmutableList.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are doing
Collections.unmodifiableSet(allIndices);
which creates new object ofUnmodifiableSet
.UnmodifiableSet
extendsUnmodifiableCollection
and it is not extendingAbstractImmutableList
. It will end up copying the elements. Can you cross check?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are correct. So what do you suggest here?
concrete indices treated as Set in the existing implementation -
OpenSearch/server/src/main/java/org/opensearch/cluster/metadata/Metadata.java
Line 1636 in 67a2e4c
That's why I kept only concrete indices as Set and all other indices as List. If we keep concrete indices also as List, then this conversion won't be required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes right, lets check the usage in the code if it has to be Set, else i agree keeping it as list would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akolarkunnu Can you please take these changes to closure by profiling health API as in #15492
That should bring confidence on whether or not to get these changes in code or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will resume this work soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran performance test by crating indices using a script, which ran for 2 hours and 3 minutes. On comparing with changes and without changes, with changes I am consistently able to create ~30 more indices. So it is clearly a performance improvement.
Also I captured FlameGraph of 5 mins after 2 hours of index creation by using async profiler, and I don't see anything unusual there. Attaching both cpu and alloc with and without changes.
flamegraphs.zip
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akolarkunnu : Thanks for benchmarking the creation times. Can you please benchmark
resolveEmptyOrTrivialWildcard
method with and without your change. This resolution would happen in the context of every indexing/ search request which could turn out to be very expensive with this copy operationThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwetathareja A quick clarification - so if
resolveEmptyOrTrivialWildcard
is called in every search/indexing request - will the indexing/search benchmarks suffice to establish the impact?If yes, please checkout the comparison results - #14723 (comment) - I do see a good improvement in reduction of search and indexing latencies, increase in majority of indexing/search throughputs.
One catch I see is that the indices+aliases size is relatively low in the OSB benchmark, but I guess the slight increase in performance we see can be accounted for the reduction in conversions back and forth to arrays/lists. What do you think?
Check warning on line 169 in server/src/main/java/org/opensearch/index/recovery/RemoteStoreRestoreService.java
Codecov / codecov/patch
server/src/main/java/org/opensearch/index/recovery/RemoteStoreRestoreService.java#L169