Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

agent: allow templating to indefinitely retry if explicitly set to zero #11711

Closed
wants to merge 4 commits into from

Conversation

calvn
Copy link
Contributor

@calvn calvn commented May 26, 2021

This PR re-introduces the ability for agent's template subsystem to perform indefinite retries that was removed in 1.6.0 through #9670. Indefinite retry can be toggled on by setting vault.retry.num_retries = 0.

Since vault.retry.num_retries is used by both the cache client and the template config, the changeset moves setting these values closer to where the update is called. This reduces the behavioral change to the bare minimum, but introduces the difference on what 0 means for template vs cache.

Templating:

vault.retry.num_retries Old New
-1 Disabled Disabled
0 Default (12x) ∞ retries
N > 0 N retries N retries
Absent Default (12x) Default (12x)

Cache:

vault.retry.num_retries Old New
-1 Disabled Disabled
0 Default (12x) API Client default (2x)
N > 0 N retries N retries
Absent Default (12x) Default (12x)

Related to #9670
Related to #11113

Backport note:
We need to backport #11696 if/when we backport this PR.

TODO:

  • Update/add tests
  • Update docs
  • Include changelog

@calvn calvn added this to the 1.8 milestone May 26, 2021
@vercel vercel bot temporarily deployed to Preview – vault-storybook May 26, 2021 23:27 Inactive
@calvn
Copy link
Contributor Author

calvn commented May 27, 2021

One case that may look odd -- if vault.retry.num_retries = 0 and persist is non-nil, it means that the client will try 2x per request. Functionally, this doesn't matter too much since the number of attempts by template is infinite, but does make each template attempt be performed 12x by the cache client.

command/agent.go Outdated
case -1:
retries = 0
case 0:
retries = defaultClientRetryAttempts
Copy link
Contributor Author

@calvn calvn May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tempted to change this to be api.Client default of 2 retries here (for a total of 3 calls). Even though 12 is the current value, the client actually times out before making all 12 attempts due to the default request timeout value of 60s, thus the failed to renew: context deadline exceeded errors per attempt.

[WARN] (view) vault.read(database/creds/readonly): vault.read(database/creds/readonly): context deadline exceeded (retry attempt 1 after "250ms")
[TRACE] (view) vault.read(database/creds/readonly) starting fetch
[TRACE] vault.read(database/creds/readonly): starting renewer
2021-05-26T15:54:39.425-0700 [INFO]  cache: received request: method=PUT path=/v1/sys/leases/renew
2021-05-26T15:54:39.425-0700 [DEBUG] cache.leasecache: forwarding request: method=PUT path=/v1/sys/leases/renew
2021-05-26T15:54:39.425-0700 [INFO]  cache.apiproxy: forwarding request: method=PUT path=/v1/sys/leases/renew
[WARN] vault.read(database/creds/readonly): failed to renew: context deadline exceeded
[WARN] vault.read(database/creds/readonly): renewer done (maybe the lease expired)
[TRACE] vault.read(database/creds/readonly): GET /v1/database/creds/readonly
2021-05-26T15:55:39.428-0700 [INFO]  cache: received request: method=GET path=/v1/database/creds/readonly
2021-05-26T15:55:39.428-0700 [DEBUG] cache.leasecache: forwarding request: method=GET path=/v1/database/creds/readonly
2021-05-26T15:55:39.428-0700 [INFO]  cache.apiproxy: forwarding request: method=GET path=/v1/database/creds/readonly
[WARN] (view) vault.read(database/creds/readonly): vault.read(database/creds/readonly): context deadline exceeded (retry attempt 2 after "500ms")
[TRACE] (view) vault.read(database/creds/readonly) starting fetch
[TRACE] vault.read(database/creds/readonly): starting renewer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with this change

Copy link
Contributor Author

@calvn calvn Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone ahead and made this change on 03d248f, and also updated the table on the PR description.

@calvn
Copy link
Contributor Author

calvn commented May 27, 2021

Another suggestion that was made was to break retry values by the cache's client and the template to be different values, e.g. vault.retry.cache_retries and vault.retry.template_retries. We'd still have to support the existing num_retries even if we deprecate it though.

@calvn calvn requested a review from tomhjp June 1, 2021 16:25
@vercel vercel bot temporarily deployed to Preview – vault-storybook June 2, 2021 20:47 Inactive
@vercel vercel bot temporarily deployed to Preview – vault June 2, 2021 20:47 Inactive
@calvn
Copy link
Contributor Author

calvn commented Jun 10, 2021

Closing in favor of #11775.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants