Create new secure setting type that is stored (encrypted) in cluster state #32727

Closed
talevy opened this issue Aug 9, 2018 · 5 comments
Labels
:Core/Infra/Settings Settings infrastructure and APIs >feature Team:Core/Infra Meta label for core/infra team

Comments

@talevy
Contributor

talevy commented Aug 9, 2018

Goal:
There are some features (one example) that could make use of a secure setting that is consistent across nodes.

Problem with existing secure settings:
Existing secure settings live in each node's keystore, and nothing enforces that they are consistent across nodes. Keystore settings are primarily intended for secure values needed at node start-up.
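
To make the consistency problem concrete, here is a minimal sketch of one way drift between nodes could be detected without publishing the secret itself: one node publishes a salted PBKDF2 hash of its value, and every other node compares the hash of its own keystore value against it. This only illustrates the idea and is not taken from Elasticsearch; the class name, method names, iteration count and salt handling are all assumptions.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;

// Hypothetical helper for illustration only; not an Elasticsearch class.
final class SecureSettingFingerprint {
    private static final int ITERATIONS = 10_000; // illustrative work factor (assumption)
    private static final int HASH_BITS = 256;

    // Derives a salted hash that can be shared openly without revealing the secret value.
    static byte[] fingerprint(char[] secretValue, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(secretValue, salt, ITERATIONS, HASH_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not hash secure setting", e);
        }
    }

    // A node hashes its local keystore value and compares it with the published hash.
    static boolean matchesPublished(char[] localValue, byte[] salt, byte[] publishedFingerprint) {
        // MessageDigest.isEqual compares the arrays in constant time.
        return MessageDigest.isEqual(fingerprint(localValue, salt), publishedFingerprint);
    }
}
```

A check like this can only report that two nodes disagree; it does not distribute the value, so it does not remove the need to write the secret into every node's keystore.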

Features that would enable such a setting:
To store an arbitrary set of settings securely in the cluster state, which is otherwise readable in the open, there must be a way to encrypt and decrypt their values. One solution would be to introduce a kind of system key that is stored in the secure keystore; all of the new cluster-state secure settings would use this key to encrypt and decrypt their values when needed. These settings probably won't change, and it may make sense to enforce that by disallowing updates to these settings.
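
As a rough sketch of that idea, the snippet below encrypts a setting value with a system key using AES-GCM from the standard Java crypto API and encodes the result as a string that could be placed in the cluster state. The class `ClusterSecretCodec`, the choice of AES-GCM and the nonce-prefixed Base64 layout are assumptions made for illustration; the proposal does not specify a cipher or a format.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper for illustration only; not an Elasticsearch class.
final class ClusterSecretCodec {
    private static final int NONCE_BYTES = 12; // commonly recommended nonce size for GCM
    private static final int TAG_BITS = 128;
    private static final SecureRandom RANDOM = new SecureRandom();

    private final SecretKeySpec systemKey;

    // systemKeyBytes stands in for a key read from each node's keystore (16, 24 or 32 bytes).
    ClusterSecretCodec(byte[] systemKeyBytes) {
        this.systemKey = new SecretKeySpec(systemKeyBytes, "AES");
    }

    // Returns Base64(nonce || ciphertext || tag), safe to store where others can read it.
    String encrypt(String plaintext) {
        try {
            byte[] nonce = new byte[NONCE_BYTES];
            RANDOM.nextBytes(nonce); // a fresh nonce for every encryption
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, systemKey, new GCMParameterSpec(TAG_BITS, nonce));
            byte[] sealed = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = Arrays.copyOf(nonce, nonce.length + sealed.length);
            System.arraycopy(sealed, 0, out, nonce.length, sealed.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    // Reverses encrypt(); fails if the key is wrong or the stored value was modified.
    String decrypt(String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, systemKey,
                    new GCMParameterSpec(TAG_BITS, in, 0, NONCE_BYTES));
            byte[] plain = cipher.doFinal(in, NONCE_BYTES, in.length - NONCE_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed (wrong key or tampered value)", e);
        }
    }
}
```

An authenticated mode is used in this sketch so that a node holding the wrong system key gets an error instead of silently reading garbage.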

Rotated system key:
Although these secure settings may not need key rotation themselves, the master system key used to encrypt them should have a way to be rotated. This means having a hand-off mechanism in which the old key decrypts each value and the new key re-encrypts it before it is re-inserted into the cluster state.
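
The hand-off could look roughly like the sketch below: every stored value is opened with the old key and sealed again with the new one, and the result is collected in a fresh map so that a failure part-way through leaves the original entries untouched. `SystemKeyRotation`, the map standing in for the cluster state, and the use of AES-GCM are hypothetical choices for illustration, not part of the proposal.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not an Elasticsearch class.
final class SystemKeyRotation {
    private static final int NONCE_BYTES = 12;
    private static final int TAG_BITS = 128;

    // Encrypts plaintext and returns nonce || ciphertext || tag.
    static byte[] seal(byte[] key, byte[] plaintext) {
        try {
            byte[] nonce = new byte[NONCE_BYTES];
            new SecureRandom().nextBytes(nonce);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, nonce));
            byte[] body = c.doFinal(plaintext);
            byte[] out = new byte[NONCE_BYTES + body.length];
            System.arraycopy(nonce, 0, out, 0, NONCE_BYTES);
            System.arraycopy(body, 0, out, NONCE_BYTES, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("seal failed", e);
        }
    }

    // Decrypts a value produced by seal(); throws if the key does not match.
    static byte[] open(byte[] key, byte[] sealed) {
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, sealed, 0, NONCE_BYTES));
            return c.doFinal(sealed, NONCE_BYTES, sealed.length - NONCE_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("open failed: wrong key or corrupted value", e);
        }
    }

    // Old key decrypts, new key encrypts; the caller then re-inserts the returned map.
    static Map<String, byte[]> rotate(Map<String, byte[]> encryptedSettings,
                                      byte[] oldKey, byte[] newKey) {
        Map<String, byte[]> rotated = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : encryptedSettings.entrySet()) {
            rotated.put(entry.getKey(), seal(newKey, open(oldKey, entry.getValue())));
        }
        return rotated;
    }
}
```

In a real cluster the harder part is coordination: every node needs the new key in its keystore before the re-encrypted values are published, which this sketch does not attempt to model.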

I've left further implementation details out of this description, since those can be hashed out upon investigation.

cc @elastic/es-distributed @elastic/es-security

@talevy talevy added :Core/Infra/Settings Settings infrastructure and APIs >feature labels Aug 9, 2018
@DaveCTurner
Contributor

> These settings probably won't change, and it may make sense to enforce that by disallowing updates to these settings.

This kind of setting would seem like a useful place to store credentials to third-party services that aren't needed for node start-up (e.g. snapshot repositories, Watcher actions) which are themselves subject to rotation. I think it'd be useful if they were updatable.

@albertzaharovits
Contributor

The problem with secure settings not being replicated on nodes is a limitation of the secret store "backend" we currently use.

Storing hidden and encrypted data in the cluster state achieves consistency (a single source of truth) and fail-over (each node has a copy of the secret). However, the encryption key has to be conveyed in a different manner. Having data encrypted with the key next to it adds no value to only hiding the plain text value. We should keep the way password is currently broadcasted away from the cluster state. But then do we need a rotate-key API? And a new update-settings API, since you can only update secret values if you know the password?

I also think we could use HashiCorp Vault as an alternative secret store "backend". This would provide a single consistent source of secret values. In this case, nodes would use access tokens, instead of decryption passwords, to reach the "vault" service. This external service handles key rotation, time-limited credentials, and secret generation (for example, generating S3 access tokens that can be revoked once the snapshot/restore has completed). The access token can be propagated with the existing node broadcast API.

In essence, having an external service makes reasoning easier. The external service can also be used to store settings required before cluster formation. Another, more conceptual, advantage of an external service is that we avoid storing encrypted state: apart from fail-over, there is no point in keeping encrypted state around, because it cannot be used. No one reads this state until the password is supplied. As a node, I could instead request the encrypted content when I need it, rather than keeping it around and constantly updating it in case I need it at some point. But I do need to keep it for fail-over. It has a different lifecycle altogether compared to the unencrypted state.

I think each alternative improves on the existing one. But the devil is in the details, and each comes with its own complications. Are they worth it? A simple Ansible playbook can rotate and update settings reliably, and it is easy for the administrator to understand.

@jasontedor
Member

> Having data encrypted with the key next to it adds no value to only hiding the plain text value. We should keep the way password is currently broadcasted away from the cluster state.

I think there is a misunderstanding of the proposal here. The key would be stored in the keystore, not in the cluster state.

> I also think we could use HashiCorp Vault as an alternative secret store "backend".

We should avoid this as this will complicate our stack for the vast majority of our users. We should seek a solution that does not rely on third-party infrastructure.

jkakavas added a commit to jkakavas/elasticsearch that referenced this issue Aug 14, 2018
Elasticsearch versions earlier than 6.4.0 cannot properly run in a
FIPS 140 JVM. This commit ensures that we do not try to run bwc
tests that entail spinning up < 6.4.0 nodes when CI is run in a FIPS
140 JVM.
It also reverts e497173 and e64bb48 as the workarounds inserted there
are no longer required.

Resolves elastic#32727
@danielkasen

So I want to +1 this and explain our current issue. We use the PagerDuty integration for over 200 services. This means that each node has to include 200+ values in its keystore, and when your cluster is 400+ nodes this takes a long time to execute. Basically, it doesn't scale well. If we could store these in an encrypted fashion in the cluster settings or in a subset of the .security index, this would drastically increase the speed of our rollout.

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@rjernst rjernst added the needs:triage Requires assignment of a team area label label Dec 3, 2020
@gwbrown
Contributor

gwbrown commented Dec 4, 2020

This issue has been addressed by the introduction of consistent secure settings in #40416. While that PR did not address all concerns raised in this discussion (such as the high number of keystore entries causing slow rollouts), the original goal stated in the issue has now been unblocked. As such, I'm going to close this issue.

If you have a problem which is not addressed by consistent secure settings, please open a new issue on that topic with details of the problem.

@gwbrown gwbrown closed this as completed Dec 4, 2020
@gwbrown gwbrown removed the needs:triage Requires assignment of a team area label label Dec 4, 2020
7 participants