Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[backport -> release/3.8.x] fix(vault): let shdict secret vault cache presist enough time during resurrect_ttl #13673

Closed

Conversation

team-gateway-bot
Copy link
Collaborator

Automated backport to release/3.8.x, triggered by a label in #13471.

Original description

Summary

This PR fixes an issue that rotate_secret may flush a secret value with NEGATIVE_CACHED_VALUE when vault backend is down and a secret value stored in the shared dict has passed its ttl and hasn't finished consuming its resurrect_ttl.

TLDR; this issue happens easily when a reference is being used via the vault PDK function in custom codes(serverless functions, custom plugins, etc.), and some of the worker processes may not be triggered via the service/routes that use these custom codes, and these worker processes do not hold a valid LRU cache for the secret value

The issue was first reported in FTI-6137.

Checklist

  • The Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

FTI-6137

…resurrect_ttl (#13471)

This PR fixes an issue that rotate_secret may flush a secret value with NEGATIVE_CACHED_VALUE when vault backend is down and a secret value stored in the shared dict has passed its ttl and hasn't finished consuming its resurrect_ttl.

TLDR; this issue happens easily when a reference is being used via the vault PDK function in custom codes(serverless functions, custom plugins, etc.), and some of the worker processes may not be triggered via the service/routes that use these custom codes, and these worker processes do not hold a valid LRU cache for the secret value

The issue was first reported in FTI-6137.

---------

Signed-off-by: Aapo Talvensaari <aapo.talvensaari@gmail.com>
Co-authored-by: Aapo Talvensaari <aapo.talvensaari@gmail.com>
(cherry picked from commit 9269195)
@ms2008
Copy link
Contributor

ms2008 commented Sep 18, 2024

@windmgc Is this a duplicate of #13561?

@windmgc
Copy link
Member

windmgc commented Sep 18, 2024

@ms2008 Yes I think so. There seems to be some kind of bug last week that caused the bot to create duplicate PRs(on both CE/EE)

@windmgc windmgc closed this Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee core/pdk size/L size/XS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants