Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reload secret store #28244

Closed

Conversation

albertzaharovits
Copy link
Contributor

@albertzaharovits albertzaharovits commented Jan 16, 2018

This is another take on the 'reloadable keystore' issue, after a discussion with @simonw .
I will link to #28001 for consistency, although they are independent from each other.

This change adds the secretStorePassword field to the ClusterUpdateSettingsRequest (in addition to the persistent and transient existing ones). If this field is present on the request, the TransportClusterUpdateSettingsAction will send a newly defined node transport action to each node (conveying the secret store password), after the cluster state has been acknowledged. The newly defined node action handler has the responsibility to 'reinitialize' the components with 'reloadabale' SecureSettings (this latter part is not included in this PR, the handler only re-opens and decrypts the KeyStoreWrapper). Note that there is a slim delay between plain text settings push (via cluster state apply) and secure settings re-read (via node actions). It works on simple manual tests, but it is sketchy and does not have tests, see below.

The rationale behind this was that it is simpler to have a one-stop API to update all cluster settings and, more importantly, to account for the case of dependencies between clear text and secure settings. If this is the case, one would require transactional update of all settings to prevent inconsistencies.
Nevertheless the semantics for the secure settings update is different: you don't specify values or keys and you don't get them in response either; what you should get back in response is an acknowledgement from each node, coupled with some aggregation safety check that all settings on all the nodes are consistent (same key-value hashes). Moreover, true transactional setting updates would require pushing the decrypt password via the cluster state update, together with all the other settings (I haven't given much thought on this). Given that we don't allow dependencies between 'persistent' and 'transient' it might be easier to forbid dependencies between secure and plain text ones in a similar fashion.
For these reasons, and after getting through with this implementation sketch, I think this API should stand alone as a specialization of the TransportNodesAction, leaving TransportClusterUpdateSettingsAction to handle only clear text settings. Dependencies between plain text and secure settings should not be permitted. Transactional setting updates are not available across plain text and secure settings. This proposed API would be similar to a node reinit/refresh/restart flavor.

What do you think?

@albertzaharovits
Copy link
Contributor Author

I had another discussion on the topic with @simonw .

It appears that the 'reinit secret store' operation should stand on it's own, as a specialization of the TransportNodesAction. The new node operation will convey the secret store password to nodes. The updated secret store file should already be present on disk before the operation hits the node.
If not for the more neat code, in this way, we are explicitly acknowledging the transient inconsistency between settings applied by the cluster state update and the secret settings loaded by the node transport operation handler. To clarify, the inconsistency is perceived by the settings consumer when only plain text settings have been updated and there are dependencies between plain text setting and secure ones which have not yet been updated.

To get around this perceived inconsistency, when dealing with secure settings the cluster state update will include a marker value (such as _keystore_) for such settings. The setting consumer, in the process cluster update handler, will not immediately apply setting sets which include the marker, because those settings are not yet accessible. Instead, it caches the update and defers the apply for when the secure settings are available, that is when the new transport node action hits the node with the password to decrypt the local secret store.

I will create another branch to track this progress with more granular commits.

@colings86 colings86 added the :Core/Infra/Core Core issues without another label label Apr 24, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@jaymode
Copy link
Member

jaymode commented Apr 27, 2018

@albertzaharovits can we close this?

@albertzaharovits
Copy link
Contributor Author

yes, sorry for letting it linger.
The progress on this is tracked by #29135 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants