Permit configuring IMDS client behavior on timeout #1233
Labels
feature-request
needs-triage
Describe the feature
The IMDS client in aws-config, both for internal use (e.g., credential fetching) and as a public-facing client (e.g., resolving instance metadata in user programs), should support an "expected to exist" mode in which, for example, TCP connection attempts are retried (unlike the current default).
Use Case
On EC2 instances, IMDS is not 100% available; as with any other service, we periodically see short blips of unavailability in production. Retries keep these blips from surfacing as, for example, service launch failures caused by an inability to provision credentials or discover local identity (instance ID).
Proposed Solution
At minimum, the client should expose a knob to enable retrying TCP failures (connect and read timeouts). Ideally, the solution would let us state that we do in fact expect a response, so that the normal SDK behaviors apply, rather than having to chase those behaviors over time with more knobs.
The SDK defaults may make less sense for cases where IMDS may not be available, but explicit usage of the IMDS client seems like a good indicator of "I expect it to work" (at least as a default).
We would want to configure this for the implicit IMDS client created within credentials provider chains, but we are OK with explicitly passing a configured client into those chains if needed.
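To make the request concrete, here is a minimal standard-library sketch of the behavior we have in mind. It is not the aws-config API: `ImdsExpectation`, `is_retryable`, and `call_with_retries` are hypothetical names invented for illustration, and treating only `ErrorKind::TimedOut` as retryable is a simplifying assumption standing in for connect and read timeouts.

```rust
use std::io::{self, ErrorKind};
use std::thread;
use std::time::Duration;

/// Hypothetical knob (not part of aws-config): does the caller expect IMDS to be reachable?
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ImdsExpectation {
    /// IMDS may be absent, so a timeout is surfaced immediately (no retry).
    MayBeAbsent,
    /// IMDS is expected to exist, so a timeout is treated as a transient blip and retried.
    ExpectedToExist,
}

/// Assumption for this sketch: connect and read timeouts both show up as `ErrorKind::TimedOut`.
pub fn is_retryable(mode: ImdsExpectation, err: &io::Error) -> bool {
    mode == ImdsExpectation::ExpectedToExist && err.kind() == ErrorKind::TimedOut
}

/// Runs `op` up to `max_attempts` times, sleeping with exponential backoff
/// (`base_backoff * 2^(attempt - 1)`) between retryable failures.
pub fn call_with_retries<T>(
    mode: ImdsExpectation,
    max_attempts: u32,
    base_backoff: Duration,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut attempt: u32 = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if attempt < max_attempts && is_retryable(mode, &err) => {
                // Saturating arithmetic keeps a large attempt count from overflowing.
                let factor = 2u32.saturating_pow(attempt - 1);
                thread::sleep(base_backoff.saturating_mul(factor));
                attempt += 1;
            }
            // Non-retryable error, or attempts exhausted: return the last error.
            Err(err) => return Err(err),
        }
    }
}
```

With `ExpectedToExist`, a request that times out twice and then succeeds returns the successful response on the third attempt; with `MayBeAbsent`, the first timeout is returned to the caller unchanged. A single mode switch like this, rather than one flag per failure type, is what we mean by not having to chase behaviors with more knobs.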