Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script: Metadata validateMetadata optimization #93333

Merged

Conversation

joegallo
Copy link
Contributor

Map.copyOf

The first load-bearing bit here is the change from this.properties = Collections.unmodifiableMap(properties); to this.properties = Map.copyOf(properties);. Since all non-test callers pass in a properties map that was constructed via a call to Map.of, it's already an "immutable map" and Map.copyOf ends up being a no-op.

The precise details of this are very annoying, but if you trace Map.copyOf and Collections.unmodifiableMap through, both do have shortcutting behavior for multiple calls to themselves, but they are not mutually shortcutting -- so even though all our non-test properties maps are constructed as immutable via a call to Map.of, they were still wrapped in another layer on unmodifiability via the call to Collections.unmodifiableMap. On its own that's not so bad, it's just an allocation, but the wrapping also extends to another immutable wrapper being generated around each individual Entry when you loop over one of these maps via entrySet, which we do on construction for all of these maps in the form of validateMetadata().

That is, prior to this PR, we were at O(N*M) allocations here -- N on the number of Metadata being constructed, and M on the number of entries in their associated properties. After this PR there are no additional allocations, just N no-oping method calls.

getOrDefault

The second important change is that rather than calling .containsKey(key) and then immediately calling .get(key) after that, we can call getOrDefault(key, NOT_FOUND) just once and avoid an extra hashmap lookup. It's only a factor of 2 (so not actually speedup in terms of big O, right?), but it is still a small speedup in reality.


Overall, this changes the picture from this flamegraph before:

Screen Shot 2023-01-29 at 4 11 24 PM

To this flamegraph after:

Screen Shot 2023-01-29 at 4 11 47 PM

I've normalized the Matched: x.xx% to be as compared to all of the time spent in the IngestService, so these changes together are shaving 0.5% off our overall ingest time. Not an enormous win, but the changes themselves are pretty tidy and self-contained, so I think this is a nice small speedup as low-hanging fruit.

The safety guarantees are essentially the same, but in practice this
prevents us from wrapping an Map.of(...) -generated map (i.e. an
ImmutableMap) in another layer of unmodifiability (i.e. an
UnmodifiableMap).
We only call it from within this package or in subclasses via calls to
`super(...)`.
@joegallo joegallo added >enhancement :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache Team:Core/Infra Meta label for core/infra team v8.7.0 labels Jan 29, 2023
@elasticsearchmachine
Copy link
Collaborator

Hi @joegallo, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

/**
* Check that all metadata map contains only valid metadata and no extraneous keys
*/
protected void validateMetadata() {
int numMetadata = 0;
for (Map.Entry<String, FieldProperty<?>> entry : properties.entrySet()) {
String key = entry.getKey();
if (map.containsKey(key)) {
Object value = map.getOrDefault(key, NOT_FOUND);
if (value != NOT_FOUND) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you flip this condition? I know why it's written this way atm but not NOT_FOUND doesn't read nicely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, it's way better as the reverse. Thanks! fdd1f27

numMetadata++;
// check whether it's permissible to have the value for the property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can be dropped, the other is sufficient. Consider adding a comment of the form "getOrDefault is faster than hasKey + get".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++, fdd1f27

// we can't tell the compiler that properties must be a java.util.ImmutableCollections.AbstractImmutableMap, but
// we can use this copyOf + assert to verify that at runtime.
this.properties = Map.copyOf(properties);
assert this.properties == properties : "properties map must be constructed via Map.of(...) or Map.copyOf(...)";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for this assert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gah, yeah, I fall upon my sword. 0c97aa3

Copy link
Contributor

@stu-elastic stu-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good change.

@joegallo joegallo requested a review from thecoop January 30, 2023 15:46
@joegallo joegallo merged commit 37c510c into elastic:main Jan 30, 2023
@joegallo joegallo deleted the script-metadata-properties-optimization branch January 30, 2023 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement Team:Core/Infra Meta label for core/infra team v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants