Bug: HA cluster failover breaks AWS static role rotation #21935

Closed
isometry opened this issue Jul 19, 2023 · 5 comments · Fixed by #28775
Labels: bug, secret/aws

Comments

@isometry

Describe the bug

AWS static role rotation stops when the active node changes within an HA cluster.

To Reproduce

Steps to reproduce the behavior:

  1. Configure a three-node HA Vault cluster with Raft storage backend.
  2. Configure AWS secrets engine with root user credentials.
  3. Configure an AWS static role with a rotation period (e.g. 5m).
    vault write aws/static-roles/foo username=foo rotation_period=300
  4. Confirm static role credentials are returned correctly and rotated according to the defined rotation period.
    vault read aws/static-creds/foo && sleep 310 && vault read aws/static-creds/foo
  5. Force a change of the active node.
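
The steps above can be sketched as a small shell script. This is a minimal sketch, not the reporter's actual procedure. It assumes `VAULT_ADDR` and `VAULT_TOKEN` are exported, that the engine is mounted at `aws/`, that the static-creds response exposes an `access_key` field, and that `vault operator step-down` is an acceptable way to force the active node to change. The function names, the 10-second slack, and the 15-second election pause are all hypothetical choices.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that mirror steps 3-5 of the report.

ROLE="${ROLE:-foo}"                        # static role name used in the report
ROTATION_PERIOD="${ROTATION_PERIOD:-300}"  # seconds, matching rotation_period=300

# Print the access key ID currently served for the static role.
# Assumption: the response contains an "access_key" field.
current_access_key() {
  vault read -field=access_key "aws/static-creds/${ROLE}"
}

# Return 0 if the served key changes after one rotation period (plus 10s slack),
# 1 if it stays the same, 2 if a read fails.
rotates_within_period() {
  local before after
  before="$(current_access_key)" || return 2
  sleep "$((ROTATION_PERIOD + 10))"
  after="$(current_access_key)" || return 2
  [ "$before" != "$after" ]
}

# Configure the role, confirm rotation, force a failover, check rotation again.
reproduce() {
  vault write "aws/static-roles/${ROLE}" \
    username="${ROLE}" rotation_period="${ROTATION_PERIOD}" || return 2

  rotates_within_period || { echo "rotation not working before failover"; return 2; }

  # Assumption: stepping down the active node is how the failover is forced.
  vault operator step-down || return 2
  sleep 15  # arbitrary pause to let a new active node take over

  local rc=0
  rotates_within_period || rc=$?
  case "$rc" in
    0) echo "rotation continued after failover" ;;
    1) echo "rotation stopped after failover"; return 1 ;;
    *) echo "could not read credentials after failover"; return 2 ;;
  esac
}
```

Calling `reproduce` against a disposable cluster would take a little over two rotation periods. On an affected version the report suggests it would end with "rotation stopped after failover".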

Expected behavior

Static role credentials continue to be rotated according to the configured rotation period.

Actual behavior

Static role credentials completely stop being rotated.
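
One way to notice this failure mode on a running cluster is to track when the served access key last changed and flag the role once that age exceeds the rotation period. The following is a rough sketch under stated assumptions: `rotation_overdue` and `watch_rotation` are hypothetical names, the 60-second grace and 30-second poll interval are arbitrary, and the `access_key` field name is assumed rather than taken from the report.

```shell
# Return 0 (overdue) if more than period+grace seconds have passed since the
# key was last seen to change. All arguments are integers in seconds.
rotation_overdue() {
  local last_change="$1" now="$2" period="$3" grace="${4:-60}"
  [ "$((now - last_change))" -gt "$((period + grace))" ]
}

# Poll the static role and report when rotation appears to have stalled.
# Assumption: "access_key" is the field name in the static-creds response.
watch_rotation() {
  local role="$1" period="$2" interval="${3:-30}"
  local last_key="" last_change key now
  last_change="$(date +%s)"
  while true; do
    now="$(date +%s)"
    if key="$(vault read -field=access_key "aws/static-creds/${role}")"; then
      # A new key means a rotation happened since the previous poll.
      if [ "$key" != "$last_key" ]; then
        last_key="$key"
        last_change="$now"
      fi
    fi
    if rotation_overdue "$last_change" "$now" "$period"; then
      echo "static role ${role}: no rotation observed for $((now - last_change))s"
      return 1
    fi
    sleep "$interval"
  done
}
```

For the role in this report, `watch_rotation foo 300` would keep polling while rotation is healthy and return non-zero once no key change has been seen for longer than the period plus the grace window.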

Environment:

  • Vault Server Version (retrieve with vault status): v1.14.0
  • Vault CLI Version (retrieve with vault version): v1.14.0
  • Server Operating System/Architecture: Debian 12 / amd64

Vault server configuration file(s):

cluster_name = "vault"

api_addr = "https://vault.example.com:443"

# Listener: https://www.vaultproject.io/docs/configuration/listener/tcp
listener "tcp" {
  address = "0.0.0.0:8200"

  tls_cert_file   = "/etc/vault.d/vault.example.com.crt"
  tls_key_file    = "/etc/vault.d/vault.example.com.key"
  tls_min_version = "tls13"

  tls_disable_client_certs = "true"

  x_forwarded_for_authorized_addrs = ["10.12.0.0/24", "10.13.0.0/24"]
}

# UI: https://www.vaultproject.io/docs/configuration/ui/index.html
ui = true

cluster_addr = "https://vault-3.example.com:8201"

storage "raft" {
  path = "/opt/vault/"
  node_id = "vault-3"

  leader_ca_cert_file = "/etc/vault.d/raft_ca.crt"
  leader_client_cert_file = "/etc/vault.d/vault-3_raft.crt"
  leader_client_key_file = "/etc/vault.d/vault-3_raft.key"

  performance_multiplier = 1

  retry_join {
    leader_api_addr = "https://vault-1.example.com:8200"
  }
  retry_join {
    leader_api_addr = "https://vault-2.example.com:8200"
  }
}

log_level = "info"

plugin_directory = "/usr/local/lib/vault"
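
Because the `node_id` values in the `raft` stanza are what the cluster reports for its peers, the active node before and after a forced change could be compared with a helper like the one below. This is a sketch only. It assumes the default table output of `vault operator raft list-peers` (two header lines, then whitespace-separated columns Node, Address, State, Voter) and that the Raft leader is the active Vault node. The function names are hypothetical.

```shell
# Read `vault operator raft list-peers` table output on stdin and print the
# node ID whose State column is "leader".
# Assumption: two header lines, then columns Node, Address, State, Voter.
leader_from_peers() {
  awk 'NR > 2 && $3 == "leader" { print $1 }'
}

# Print the node ID of the current Raft leader.
current_leader() {
  vault operator raft list-peers | leader_from_peers
}

# Return 0 if a leader exists and differs from the node ID given as $1.
leader_changed_from() {
  local previous="$1" now
  now="$(current_leader)"
  [ -n "$now" ] && [ "$now" != "$previous" ]
}
```

Recording `current_leader` before step 5 and calling `leader_changed_from` afterwards would confirm that the failover actually took place before drawing conclusions about rotation.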
@mkushakov

We have the same issue.

@gneveu

gneveu commented Oct 23, 2024

We have the same issue here. It is breaking our entire credential propagation and putting our operations at risk. Please consider prioritizing a fix.

@heatherezell added the bug and secret/aws labels on Oct 23, 2024
@heatherezell
Contributor

Hi there! Which version are you using? Thanks!

@isometry
Author

@heatherezell: we're still experiencing this behaviour on v1.16.3, and I haven't seen any recent changes to the relevant code.

This needs a separate issue, but we've also recently started seeing a related problem: whenever Vault does rotate static-creds on schedule, it fails to store the just-rotated credentials, which effectively invalidates the credentials it continues to serve to clients. This started when we reached around 70 static-creds roles :-/
(It bombs out on https://github.com/hashicorp/vault/blob/main/builtin/logical/aws/rotation.go#156)

@heatherezell
Contributor

Is it related to your other reported issue? Please feel free to ping me on these directly, so we can get them all correlated and don't end up playing whack-a-mole. :) Thanks!
