Unexpected breaking change in RunInstances (not idempotent anymore. Even with client token) #4406

Closed
1 task
arianvp opened this issue Jan 26, 2025 · 5 comments
Labels
bug This issue is a confirmed bug. response-requested Waiting on additional information or feedback.

Comments

@arianvp

arianvp commented Jan 26, 2025

Describe the bug

Calling RunInstances with the client token of an instance that has since been terminated used to succeed, and that is also the documented behaviour in the EC2 docs.

The operation started failing with an (IdempotentInstanceTerminated) exception, which means the idempotency token no longer behaves idempotently. It is a documented feature of the EC2 API that if the request succeeds once, any subsequent call of the API with the same idempotency token also succeeds. That promise is now broken.

This is a pretty bad breaking change, as it breaks the idempotency promise of the EC2 API, which hasn't changed in about a decade.

It's causing our automation to fail after working for years.

https://github.com/NixOS/amis/actions/runs/12969672545/job/36174001261

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

Don't throw (IdempotentInstanceTerminated). Instead, return the instance ID of the terminated instance, with the instance state reported as terminated.

Current Behavior

It throws

botocore.exceptions.ClientError: An error occurred (IdempotentInstanceTerminated) when calling the RunInstances operation: The client token you have provided is associated with a terminated instance. Please use a different client token.

Reproduction Steps

Call RunInstances with the same client token after the instance it launched has been terminated.
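
A minimal sketch of these steps in Python, assuming `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`). The helper names `reproduce` and `error_code` are hypothetical, and the default instance type is a placeholder. The error is read from the `response` attribute that botocore's `ClientError` carries, so the sketch itself does not import botocore. Whether the error appears immediately after the terminate call or only once the instance reaches the terminated state is not confirmed here, so the sketch waits for termination first.

```python
import uuid


def error_code(exc):
    """Return the AWS error code from a ClientError-like exception, or None."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code")


def reproduce(ec2, image_id, instance_type="t3.micro"):
    """Launch, terminate, then relaunch with the same client token.

    Returns the error code raised by the second RunInstances call,
    or None if that call succeeded.
    """
    params = dict(
        ImageId=image_id,
        InstanceType=instance_type,  # placeholder, pick any valid type
        MinCount=1,
        MaxCount=1,
        ClientToken=str(uuid.uuid4()),
    )

    # First launch: succeeds and binds the token to an instance.
    first = ec2.run_instances(**params)
    instance_id = first["Instances"][0]["InstanceId"]

    # Terminate that instance and wait until it is fully terminated.
    ec2.terminate_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_terminated").wait(InstanceIds=[instance_id])

    # Second launch with identical parameters and the same token.
    try:
        ec2.run_instances(**params)
    except Exception as exc:
        code = error_code(exc)
        if code is None:
            raise  # not an AWS service error, let it propagate
        return code
    return None
```

With the behaviour described in this issue, `reproduce(ec2, image_id)` would be expected to return `"IdempotentInstanceTerminated"`.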

Possible Solution

No response

Additional Information/Context

No response

SDK version used

1.35.30

Environment details (OS name and version, etc.)

Ubuntu 24

@arianvp arianvp added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Jan 26, 2025
@arianvp
Author

arianvp commented Jan 26, 2025

Our integration test suite relies heavily on the idempotency guarantees.

The behaviour I'm relying on has been documented here: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/Run_Instance_Idempotency.html

Idempotency ensures that an API request completes no more than one time. With an idempotent request, if the original request completes successfully, any subsequent retries complete successfully without performing any further actions. However, the result might contain updated information, such as the current creation status

The returned error also isn't documented in
https://docs.aws.amazon.com/AWSEC2/latest/APIReference/errors-overview.html#CommonErrors

Adapting our code to this non-idempotent behaviour will require significant refactoring. For one, we rely on RunInstances idempotency to retrieve the instance ID of the terminated instance.

Please undo whatever change was made on the server, or update the docs to say that RunInstances is not idempotent anymore. But I expect a lot of downstream fallout: a lot of tooling relies on the idempotency of the RunInstances API.
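
For code that used RunInstances only to recover the instance ID behind a client token, one possible alternative is to ask DescribeInstances instead, using its `client-token` filter. The sketch below is an assumption about how that could look, not something proposed in this thread: `find_instance_by_client_token` is a hypothetical helper, `ec2` is assumed to be a boto3 EC2 client, and pagination is ignored because a single token should match at most one launch. Terminated instances stay visible to DescribeInstances only for a limited time, so this is not a full replacement for the old behaviour.

```python
def find_instance_by_client_token(ec2, client_token):
    """Return (instance_id, state_name) for the given client token, or None.

    Looks the instance up with DescribeInstances rather than relying on a
    repeated RunInstances call.
    """
    response = ec2.describe_instances(
        Filters=[{"Name": "client-token", "Values": [client_token]}]
    )
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            # First match wins; one token should map to one launch request.
            return instance["InstanceId"], instance["State"]["Name"]
    return None
```

A caller could then treat a result whose state is `"terminated"` the same way it previously treated the RunInstances response for a terminated instance.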

@arianvp
Author

arianvp commented Jan 27, 2025

Update: seems the error code is documented here now: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/errors-overview.html

@SamRemis
Contributor

Hi @arianvp,

Thank you for reporting the issue. At first glance, there's a bit of a contradiction in the EC2 docs. As you linked above, the docs do state that subsequent requests will succeed as long as they use the same input parameters and the same client token:

Idempotency ensures that an API request completes no more than one time. With an idempotent request, if the original request completes successfully, any subsequent retries complete successfully without performing any further actions.

Looking at the docs for the error, they imply that the above does not apply if the instance is terminated:

The request to launch an instance uses the same client token as a previous request for which the instance has been terminated.

If I'm reading this correctly, the first docs should be updated to include this caveat. You mentioned that this has been running for years without issue. Can you confirm that nothing else has changed on your side that might cause the instance to now terminate before this call?

I've also reached out to the EC2 team to confirm the expected behavior - (internal ticket ID: V1656514867)

@SamRemis SamRemis added the response-requested Waiting on additional information or feedback. label Jan 27, 2025
@khushail khushail removed the needs-triage This issue or PR still needs to be triaged. label Jan 27, 2025
@SamRemis
Contributor

The EC2 team confirmed that this is not a behavioral change and is working as expected. Repeated calls with the same token can result in errors if the instance is terminated, if the idempotency token expires, or if the request parameters change between the two calls.

I've requested a docs update on the paragraph you quoted to clarify when idempotency does not apply.

If you're looking to get rid of this error, please update the token after termination of your instance.
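
As a rough sketch of that suggestion in Python: catch the error, generate a new token, and retry once. `run_with_fresh_token` and its `new_token` parameter are hypothetical names, `ec2` is assumed to be a boto3 EC2 client, and `params` must not already contain `ClientToken`. The error code is read from the `response` attribute that botocore's `ClientError` exposes. Note that the retry launches a new instance, so the caller needs to store the returned token for later calls.

```python
import uuid


def run_with_fresh_token(ec2, params, client_token,
                         new_token=lambda: str(uuid.uuid4())):
    """Call RunInstances, rotating the token if its instance was terminated.

    Returns (token_used, response). Any other error is re-raised unchanged.
    """
    try:
        return client_token, ec2.run_instances(ClientToken=client_token, **params)
    except Exception as exc:
        response = getattr(exc, "response", None) or {}
        code = response.get("Error", {}).get("Code")
        if code != "IdempotentInstanceTerminated":
            raise

    # The old token is tied to a terminated instance: retry with a new one.
    fresh = new_token()
    return fresh, ec2.run_instances(ClientToken=fresh, **params)
```

Retrying only once is a deliberate choice here: a second failure with a brand-new token would point to a different problem and is left to propagate.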

@arianvp
Author

arianvp commented Jan 28, 2025

Thanks for the clarification! I'll update our code to handle the exception.

@arianvp arianvp closed this as completed Jan 28, 2025