Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Immutable cluster state controller #88224

Closed
wants to merge 25 commits into from

Conversation

grcevski
Copy link
Contributor

@grcevski grcevski commented Jun 30, 2022

[Making this PR obsolete, splitting further]

This PR adds the immutable cluster state controller, which part of the File based settings PR (#86224).

This is a reusable component that has number of Immutable State Handler components that know how to update parts of the Elasticsearch state. This component is only used by the File Settings Service at the moment, but the intent is for modules and plugins to be able to call it to save settings and entities in the cluster state that cannot be changed by the end user. For example work on using this component to update state from plugins/modules, please see #88341

Relates to #86224.

@grcevski grcevski marked this pull request as ready for review July 6, 2022 00:55
@grcevski grcevski added :Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team labels Jul 6, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@@ -44,53 +42,6 @@ public TransformState transform(Object source, TransformState prevState) throws
);
}

public void testAsMapAndFromMap() throws IOException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed anymore. After I refactored the parsing and typed the handlers by the type they receive in transform, I don't need the mapTo methods anymore

return;
}

// Do we need to retry this, or it retries automatically?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an outstanding question if there's anything else I need to do here to ensure the cluster state update is retried if it fails.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are no retries here, this state update will be attempted just once. Typically you would be calling this from a TransportMasterNodeAction which does handle certain retryable failures (i.e. NotMasterException and FailedToCommitClusterStateException) up to a timeout (defaulting to 30s). This particular update can also fail in ways that don't make sense to retry. At the moment this doesn't seem to be called from any production code, retrying or otherwise.

@grcevski grcevski requested a review from DaveCTurner July 6, 2022 18:39
@grcevski
Copy link
Contributor Author

grcevski commented Jul 6, 2022

@DaveCTurner I hope you don't mind me tagging you for review here, I want to make sure I did the updating of the cluster state correctly, and if I need to do anything special to retry.


Map<String, ?> source = asMap(input);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this PR, I'm splitting the parsing of XContent from the transform step, so that we can construct immutable cluster state updates without having to supply JSON, e.g. we can send plain Java objects. Example of this can be found here: https://github.com/elastic/elasticsearch/pull/88224/files#diff-2c79b9f3f906ff489178070b9a25efc8db03fe98234156ffb8ab19ab05b8c7d7R384. The ILM plugin will use this to save plugin state packages.

@@ -49,11 +49,12 @@ public String name() {
private ClusterUpdateSettingsRequest prepare(Object input, Set<String> previouslySet) {
final ClusterUpdateSettingsRequest clusterUpdateSettingsRequest = Requests.clusterUpdateSettingsRequest();

Map<String, ?> source = asMap(input);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor change here in the interface, so that I can split the XContent parsing from the state transformation.

@@ -95,4 +93,18 @@ default void validate(MasterNodeRequest<?> request) {
throw new IllegalStateException("Validation error", exception);
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting off the parsing part of handling immutable cluster state, so that we can reuse the immutable cluster state update controller to work with plain Java objects. This will come in handy when plugins/modules will want to save state on initialization.

@grcevski grcevski mentioned this pull request Jul 6, 2022
@grcevski grcevski removed the WIP label Jul 10, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @grcevski, I've created a changelog YAML for you.

}
),
ClusterStateTaskConfig.build(Priority.URGENT),
new ImmutableStateUpdateStateTask.ImmutableUpdateStateTaskExecutor(namespace, clusterService.getRerouteService())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a new executor instance each time will mean that these tasks are unbatched. Probably not a big deal in this context but as a rule we should re-use executor instances so that state updates are always batched.

return;
}

// Do we need to retry this, or it retries automatically?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are no retries here, this state update will be attempted just once. Typically you would be calling this from a TransportMasterNodeAction which does handle certain retryable failures (i.e. NotMasterException and FailedToCommitClusterStateException) up to a timeout (defaulting to 30s). This particular update can also fail in ways that don't make sense to retry. At the moment this doesn't seem to be called from any production code, retrying or otherwise.

@grcevski
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/part-1

@grcevski
Copy link
Contributor Author

This PR was split off and the remaining bits missing are now part of the FileService PR #88329.

@grcevski grcevski closed this Jul 20, 2022
@grcevski grcevski deleted the operator/controller branch July 20, 2022 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label >enhancement Team:Core/Infra Meta label for core/infra team v8.4.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants