Terraform AWS provider unable to assume role using profile that assumes a role itself #8052

Closed
rekahsoft opened this issue Mar 22, 2019 · 31 comments
Labels
bug Addresses a defect in current functionality. provider Pertains to the provider itself, rather than any interaction with AWS.
Comments

@rekahsoft

rekahsoft commented Mar 22, 2019

When using a chain of AWS CLI profiles, one of which assumes a role, the AWS provider fails to assume roles, as there are no credentials in ~/.aws/credentials for the corresponding profile. That is, given two profiles, A and R, where:

  • A is an IAM user, and thus credentials for this profile exist within ~/.aws/credentials
  • R is a role assumed using the profile A. Note this means there are no credentials available for this profile in ~/.aws/credentials

Finally, there exists a role T which can be assumed by R.

When using terraform with the profile R, the aws provider is unable to assume role T. However, when using the awscli, this works with the following configuration:

[profile A]
region=<region>

[profile R]
source_profile=A
role_arn=arn:aws:iam::xxxxxxxxxxxx:role/Role-A

[profile T]
source_profile=R
role_arn=arn:aws:iam::xxxxxxxxxxxx:role/Role-T

All of the following calls succeed and use the correct role/identity, implying that the A profile can assume the role arn:aws:iam::xxxxxxxxxxxx:role/Role-A via the profile R which can then assume the role arn:aws:iam::xxxxxxxxxxxx:role/Role-T via the profile T.

aws --profile A sts get-caller-identity
aws --profile R sts get-caller-identity
aws --profile T sts get-caller-identity

This issue can be worked around by using the profile A after allowing it to assume the role T; however, this greatly increases our maintenance overhead and is not acceptable.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.11.11
+ provider.aws v2.3.0

Your version of Terraform is out of date! The latest version
is 0.11.13. You can update by downloading from www.terraform.io/downloads.html

Affected Resource(s)

Unable to provision resources as role cannot be assumed by the aws provider.

Terraform Configuration Files

variable "region" {
  default = "us-west-2"
}

variable "cluster_master_role" {
  default = "arn:aws:iam::xxxxxxxxxxxx:role/Role-T"
}

provider "aws" {
  version = "~> 2.0"
  region = "${var.region}"
}

provider "aws" {
  version = "~> 2.0"
  alias   = "eks_master"
  region  = "${var.region}"

  assume_role = [{
    role_arn = "${var.cluster_master_role}"
  }]
}

Debug Output

I'm not providing debug output as it contains private information, however here are a few small snippets that seem relevant:

2019/03/22 23:12:04 [DEBUG] [aws-sdk-go] <ErrorResponse xmlns="https://iam.amazonaws.com/doc/2010-05-08/">
  <Error>
    <Type>Sender</Type>
    <Code>ValidationError</Code>
    <Message>Must specify userName when calling with non-User credentials</Message>
  </Error>
  <RequestId>xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx</RequestId>
</ErrorResponse>
2019-03-22T23:12:16.824-0400 [DEBUG] plugin.terraform-provider-aws_v2.3.0_x4: 2019/03/22 23:12:16 [INFO] assume_role configuration set: (ARN: "arn:aws:iam::xxxxxxxxxxxx:role/Role-T", SessionID: "", ExternalID: "", Policy: "")
2019-03-22T23:12:16.824-0400 [DEBUG] plugin.terraform-provider-aws_v2.3.0_x4: 2019/03/22 23:12:16 [INFO] Building AWS auth structure
2019-03-22T23:12:16.824-0400 [DEBUG] plugin.terraform-provider-aws_v2.3.0_x4: 2019/03/22 23:12:16 [INFO] Setting AWS metadata API timeout to 100ms
2019-03-22T23:12:17.494-0400 [DEBUG] plugin.terraform-provider-aws_v2.3.0_x4: 2019/03/22 23:12:17 [INFO] Ignoring AWS metadata API endpoint at default location as it doesn't return any instance-id
2019-03-22T23:12:17.630-0400 [DEBUG] plugin.terraform-provider-aws_v2.3.0_x4: 2019/03/22 23:12:17 [INFO] Ignoring AWS metadata API endpoint at default location as it doesn't return any instance-id
2019-03-22T23:12:17.631-0400 [DEBUG] plugin.terraform-provider-aws_v2.3.0_x4: 2019/03/22 23:12:17 [INFO] Attempting to AssumeRole arn:aws:iam::xxxxxxxxxxxx:role/Role-T (SessionName: "", ExternalId: "", Policy: "")
2019/03/22 23:12:17 [ERROR] root: eval: *terraform.EvalConfigProvider, err: No valid credential sources found for AWS Provider.
  Please see https://terraform.io/docs/providers/aws/index.html for more information on
  providing credentials for the AWS Provider

Panic Output

N/A

Expected Behavior

Terraform aws provider assumes the role arn:aws:iam::xxxxxxxxxxxx:role/Role-T using the profile R.

Actual Behavior

Terraform fails to assume the role, failing with the following error message:

Error: Error running plan: 1 error(s) occurred:

* provider.aws.eks_master: No valid credential sources found for AWS Provider.
  Please see https://terraform.io/docs/providers/aws/index.html for more information on
  providing credentials for the AWS Provider

Steps to Reproduce

When using terraform, the role with arn arn:aws:iam::xxxxxxxxxxxx:role/Role-T cannot be assumed by the provider:

export AWS_SDK_LOAD_CONFIG="true"
export AWS_PROFILE=R
terraform init

terraform plan

Important Factoids

N/A

References

@bflad bflad added the provider Pertains to the provider itself, rather than any interaction with AWS. label Mar 23, 2019
@andywirv

andywirv commented Mar 26, 2019

I see similar behaviour with the latest version of Terraform, with the roles defined in ~/.aws/credentials and the AWS provider config specifying profile = rather than assume_role. It works fine with the AWS CLI, but Terraform fails with this error:

provider.aws.dev: Error creating AWS session: SharedConfigAssumeRoleError: failed to load assume role for arn:aws:iam::[******]:role/Operations, source profile has no shared credentials

terraform --version
Terraform v0.11.13
+ provider.aws v2.3.0

@rekahsoft
Author

So I have determined why this is occurring. terraform-provider-aws uses the library aws-sdk-go-base, which takes care of retrieving credentials for the provider. Within aws-sdk-go-base, the aws-sdk-go credentials package is used to obtain credentials for the provider via a ChainProvider. You would think that the EnvProvider used in the ChainProvider would behave the same as the aws-sdk-go session package and respect the environment variable AWS_SDK_LOAD_CONFIG. It does not, and because of this, any profile that doesn't have credentials in the shared credentials file (by default ~/.aws/credentials) will not work with the Terraform AWS provider's assume_role or profile options.
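To make the failure mode concrete, here is a minimal Python sketch (not the provider's actual Go code) of a lookup that consults only the shared credentials file. The file contents and the helper name `static_keys_for` are illustrative assumptions. The point is that a role profile such as R has no entry in that file, so a credentials-file-only lookup comes back empty.

```python
import configparser

# Hypothetical stand-in for ~/.aws/credentials: only the IAM user profile A has keys.
CREDENTIALS_TEXT = """
[A]
aws_access_key_id = AKIAEXAMPLE
aws_secret_access_key = example-secret
"""


def static_keys_for(profile, credentials_text):
    """Return (access_key, secret_key) for a profile, or None if the
    credentials file holds no static keys for it."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return None
    section = parser[profile]
    access_key = section.get("aws_access_key_id")
    secret_key = section.get("aws_secret_access_key")
    if not access_key or not secret_key:
        return None
    return access_key, secret_key


if __name__ == "__main__":
    # Profile R is defined only in ~/.aws/config, so nothing is found for it here.
    for name in ("A", "R"):
        found = static_keys_for(name, CREDENTIALS_TEXT)
        print(name, "has static keys" if found else "has no static keys")
```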

I'm happy to submit a PR to fix this, but I feel the PR would be better suited to aws-sdk-go than to terraform-provider-aws or aws-sdk-go-base, as this issue will occur for any user of the aws-sdk-go credentials package.

See:

@YakDriver
Member

YakDriver commented May 9, 2019

I believe this is fixed by the PR hashicorp/aws-sdk-go-base#5. It sounds very similar.

@ianwsperber

I'm encountering what I believe to be the same issue, using an AWS profile with a source_profile, e.g.:

[profile me]
role_arn = ROLE_ARN
source_profile = foobar

I first noticed this when trying to add a provider which used an assume_role to access a resource in another AWS account, but I have noticed this happens even when I do not provide the assume_role part. All I need to do is provide a second AWS provider to encounter the error:

provider "aws" {
  region = "${var.region}"
}

provider "aws" {
  alias   = "my_alias"
  region = "${var.region}"
}

The error I see:

No valid credential sources found for AWS Provider.
  Please see https://terraform.io/docs/providers/aws/index.html for more information on
  providing credentials for the AWS Provider

@seddarj

seddarj commented Jun 14, 2019

Same thing happening to me with a configuration similar to @ianwsperber's except instead of using 2 providers it happens with one provider and an S3 bucket as the backend. Works fine without the backend.

@bflad
Contributor

bflad commented Jun 20, 2019

Please note that #8987, which was just merged and will release in version 2.16.0 of the Terraform AWS Provider later today, included this upstream fix aws/aws-sdk-go#2579, which is listed in the AWS Go SDK CHANGELOG as:

Adds support chaining assume role credentials from the shared config/credentials files. This change allows you to create an assume role chain of multiple levels of assumed IAM roles. The config profile the deepest in the chain must use static credentials, or credential_source. If the deepest profile doesn't have either of these the session will fail to load.
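The rule in the changelog entry above can be sketched in Python. This is only an illustration of the rule, not the SDK's implementation: the file contents, account ID, and the helper names `load_profiles` and `resolve_chain` are assumptions made for the example. It follows source_profile links and checks that the deepest profile has static credentials or credential_source.

```python
import configparser

# Hypothetical file contents mirroring a two-level assume-role chain.
CONFIG_TEXT = """
[profile R]
source_profile = A
role_arn = arn:aws:iam::111111111111:role/Role-A

[profile T]
source_profile = R
role_arn = arn:aws:iam::111111111111:role/Role-T
"""

CREDENTIALS_TEXT = """
[A]
aws_access_key_id = AKIAEXAMPLE
aws_secret_access_key = example-secret
"""


def load_profiles(config_text, credentials_text):
    """Merge both files into {profile_name: {key: value}}."""
    profiles = {}
    config = configparser.ConfigParser()
    config.read_string(config_text)
    for section in config.sections():
        # The config file prefixes named profiles with "profile ".
        name = section[len("profile "):] if section.startswith("profile ") else section
        profiles.setdefault(name, {}).update(config[section])
    credentials = configparser.ConfigParser()
    credentials.read_string(credentials_text)
    for section in credentials.sections():
        profiles.setdefault(section, {}).update(credentials[section])
    return profiles


def resolve_chain(profiles, start):
    """Follow source_profile links from `start` and return the chain.
    Raise ValueError if the chain is broken or the deepest profile
    has neither static credentials nor credential_source."""
    chain, current = [], start
    while True:
        if current in chain:
            raise ValueError(f"cycle detected at profile {current!r}")
        if current not in profiles:
            raise ValueError(f"profile {current!r} is not defined")
        chain.append(current)
        settings = profiles[current]
        if "source_profile" not in settings:
            break
        current = settings["source_profile"]
    deepest = profiles[chain[-1]]
    has_static = "aws_access_key_id" in deepest and "aws_secret_access_key" in deepest
    if not (has_static or "credential_source" in deepest):
        raise ValueError(
            f"deepest profile {chain[-1]!r} has no static credentials or credential_source"
        )
    return chain
```

With the sample data, `resolve_chain(load_profiles(CONFIG_TEXT, CREDENTIALS_TEXT), "T")` returns `["T", "R", "A"]`. Removing the keys for A makes it raise `ValueError`, which corresponds to the "session will fail to load" case.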

Hopefully this will help here. I also submitted this in Terraform Core to ensure the S3 Backend gets this update as well: hashicorp/terraform#21815

@aeschright aeschright added the needs-triage Waiting for first response or review from a maintainer. label Jun 24, 2019
@aeschright aeschright added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. labels Jul 1, 2019
@bflad
Contributor

bflad commented Jul 4, 2019

Hi folks 👋

This should be resolved in the S3 Backend as of Terraform version 0.12.3 and in the Terraform AWS Provider as of version 2.16.0.

I verified this locally via this configuration:

provider "aws" {
  region  = "us-east-2"
  version = "2.17.0"
}

data "aws_caller_identity" "current" {}

output "caller_arn" {
  value = "${data.aws_caller_identity.current.arn}"
}

This setup of AWS credentials and configuration files locally:

$ cat ~/.aws/config

[profile tf-acc-assume-role]
role_arn = arn:aws:iam::--OMITTED--:role/tf-acc-assume-role
source_profile = tf-acc

[profile tf-acc-assume-role-2]
role_arn = arn:aws:iam::--OMITTED--:role/tf-acc-assume-role-2
source_profile = tf-acc-assume-role

$ cat ~/.aws/credentials

[tf-acc]
aws_access_key_id = --OMITTED--
aws_secret_access_key = --OMITTED--

And running:

$ export AWS_SDK_LOAD_CONFIG=1
$ export AWS_PROFILE=tf-acc-assume-role-2
$ terraform apply
data.aws_caller_identity.current: Refreshing state...

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

caller_arn = arn:aws:sts::--OMITTED--:assumed-role/tf-acc-assume-role-2/1562206728701794000

For future bug reports or feature requests relating to provider authentication, even if they look similar to the error messages reported here, please submit new GitHub issues following the bug report and feature request issue templates for further triage. These types of issues tend to be very environment specific. 👍

@bflad bflad closed this as completed Jul 4, 2019
@bflad bflad added this to the v2.16.0 milestone Jul 4, 2019
@ianwsperber

ianwsperber commented Jul 7, 2019

@bflad Unfortunately I'm still encountering this issue. Note that my validation method was slightly different. I am using a profile with only a single layer of assumed roles (tf-acc-assume-role, in your example above), and am receiving an error on the below provider block, which itself assumes a role:

provider "aws" { # This works fine
  version             = "~> 2.17.0"
  region              = var.region
}

provider "aws" { # Error happens here
  version = "~> 2.17.0"
  alias   = "my_alias"
  region  = var.region

  assume_role {
    role_arn = var.my_arn_var
  }
}
› terraform --version 
Terraform v0.12.3

Error

Error: No valid credential sources found for AWS Provider.
  Please see https://terraform.io/docs/providers/aws/index.html for more information on
  providing credentials for the AWS Provider

  on main.tf line 8, in provider "aws":
   8: provider "aws" {

I believe this is closer to the use case in the original comment than the one you provided. Could we reopen the issue?

@rekahsoft
Author

@ianwsperber, did you set AWS_SDK_LOAD_CONFIG to some non-empty string before running terraform?

@ianwsperber

ianwsperber commented Jul 7, 2019

@rekahsoft I did! I've included details below. Was your original problem fixed by this release? It closely resembles my own, so if it fixed yours I'd expect it to fix mine :/

› echo $AWS_SDK_LOAD_CONFIG 
1
› echo $AWS_PROFILE
my.assume.role

I've quadruple-checked that my config files are set up correctly:

$ cat ~/.aws/config

[profile base]
region = us-east-1

[profile my.assume.role]
role_arn = arn:aws:iam::--OMITTED--:role/OrganizationAccountAccessRole
source_profile = base

$ cat ~/.aws/credentials

[base]
aws_access_key_id = --OMITTED--
aws_secret_access_key = --OMITTED--

Moreover, aws sts get-caller-identity succeeds, so I know that I am authenticated.

@timoguin
Contributor

This is failing for me as well with Terraform v0.12.5 and provider 2.20.0. These are roles that work fine with TF 0.11.

Both of these scenarios fail for me:

  • Running Terraform locally using AWS credentials set via environment variables with aws-vault
  • Running Terraform via CI/CD from an ECS service with a task role

Interestingly in my case, the Terraform plan works fine. It's only the apply that fails. This is especially odd because the remote state backend is configured to assume the same role, and that part seems to be working since Terraform can read the remote state during the plan.

@shots-crazy

@timoguin did you ever find out how to fix this? I’m running Terraform via CI/CD and credentials are set via environment variables as well.
I still cannot assume a role and I have tried everything.
I tested whether I can assume a role with those same credentials via the CLI, and it works, but not with Terraform.

@timoguin
Contributor

timoguin commented Aug 1, 2019

@shots-crazy No, I've not figured it out. I'm running all my 0.12 Terraform by manually assuming roles into each account after establishing an MFA session with aws-vault. Before 0.12, Terraform would use those credentials from the environment variables to actually assume the role defined in the assume_role block for the provider. It seems like Terraform is ignoring the environment variables and trying to assume the role without them, which fails because we force MFA for everything.

Our CI/CD system is completely broken by this. This is the error I get trying to apply plans:

Error: The role "arn:aws:iam::redacted:role/RoleName" cannot be assumed.

  There are a number of possible causes of this - the most common are:
    * The credentials used in order to assume the role are invalid
    * The credentials do not have appropriate permission to assume the role
    * The role ARN is not valid

  on providers.tf line 1, in provider "aws":
   1: provider "aws" {

@shots-crazy

@timoguin I am getting the same error when running via CI/CD.
I resorted to having keys in every account instead of trying to assume a role into those accounts.
I still have multiple providers, but I have to specify a secret key and access key for each provider.

@ianwsperber

Has anyone been able to try @YakDriver's solution? I promised to try it out but have been too busy to do this work :/ If we can validate that it works, hopefully the TF team can iterate on a fix more quickly: hashicorp/aws-sdk-go-base#5 (comment)

@jgartrel

I have tried @YakDriver's solution, but it does not seem to work for me:

$ export AWS_SDK_LOAD_CONFIG=1
$ export AWS_PROFILE=base
$ terraform apply
data.aws_caller_identity.current: Refreshing state...

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

caller_arn = arn:aws:sts::--OMITTED--:assumed-role/--OMITTED--/--OMITTED--

$ export AWS_SDK_LOAD_CONFIG=1
$ export AWS_PROFILE=tf-acc-assume-role
$ terraform apply

Error: No valid credential sources found for AWS Provider.
	Please see https://terraform.io/docs/providers/aws/index.html for more information on
	providing credentials for the AWS Provider

  on main.tf line 1, in provider "aws":
   1: provider "aws" {
$ terraform -v
Terraform v0.12.6
+ provider.aws v2.23.0
$ tail -3 ~/go/src/github.com/terraform-providers/terraform-provider-aws/go.mod 
)

replace github.com/hashicorp/aws-sdk-go-base => github.com/YakDriver/aws-sdk-go-base v0.0.0-20190503174753-82bd97734e8f
$ cat main.tf 
provider "aws" {
  region  = "us-west-2"
}

data "aws_caller_identity" "current" {}

output "caller_arn" {
  value = "${data.aws_caller_identity.current.arn}"
}
$ cat ~/.aws/config 
[default]
region=us-west-2
output=json

[profile base]
region=us-west-2
output=json

[profile tf-acc-assume-role]
role_arn = arn:aws:iam::--OMITTED--:role/aws-reserved/sso.amazonaws.com/us-west-2/--OMITTED--
source_profile = base
$ cat ~/.aws/credentials 
[default]
aws_access_key_id=--OMITTED--
aws_secret_access_key=--OMITTED--

[base]
aws_access_key_id = --OMITTED--
aws_secret_access_key = --OMITTED--
aws_session_token = --OMITTED--

I followed YakDriver's instructions posted above to do the build with the addition of:

$ brew install go
$ brew install bzr
$ export GOPATH=~/go
$ export PATH=$PATH:$GOPATH/bin

@jgartrel

@bflad Still encountering this issue, can we reopen it?

@rekahsoft
Author

@bflad I second @jgartrel, I still can reproduce this problem as originally described 😢

@woodcockjosh

Still broken

@timoguin
Contributor

It's worth noting that, in my case, the S3 backend is configured to assume the same role as the provider is. It reads the remote state just fine. It can run a plan just fine. It's only the apply it fails on.

My configuration is simply having AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN set as environment variables, and those credentials have IAM permissions to assume the role(s) defined in the Terraform.

I also tried building everything with the patched aws-sdk-go. I had the same unsuccessful result as @jgartrel.

@woodcockjosh

Use this tool https://github.com/remind101/assume-role

@YakDriver
Member

YakDriver commented Aug 22, 2019

Two big issues remain. Also, I suggest moving this conversation to hashicorp/aws-sdk-go-base#4, which is still open.

1 - Testing framework

Credentials being key to everything, the maintainers are hesitant to move forward without automated regression tests. The code in question is very old, moved from place to place. Even still, everyone knows what to expect. They don't want to fix a 3% issue and break 97%. Help creating regression tests would be welcome.

2 - Fixing all the issues

My fix seems to have fixed some but not all of the issues. We need to figure out what else remains. The feedback on this issue is very helpful in that regard.

@YakDriver
Member

@rekahsoft If you have a minute, can you contribute this to my collection of credential tests? I'm trying to get an easily reproducible set of problems together: https://github.com/YakDriver/terraform-cred-tests

@rekahsoft
Author

@YakDriver will do. Sorry for the late response, I've been on vacation. I'm back next week and will send a PR to your repo. Thanks for putting this together.

@udayanms

I have credentials in environment variables, and I have set the credentials and config environment variables.
Here is my scenario:

  1. User tfdev (account A) assumes the role org_admin in the payer's account B; the provider alias is B_org_admin
  2. The module "setup" is called with the provider alias B_org_admin
  3. Inside the setup module, a new provider alias "C_org_admin" is created, which tries to switch to "org_admin" in account C
  4. The provider cannot assume the role org_admin in account C

I could verify (using caller identity) that while executing the setup module the role is org_admin under account C.

But I see in CloudTrail under account A that it failed to assume the role org_admin under account C.

Should it not try to assume the role from account B into account C? Why is the provider still trying to assume it from account A -> account C when the provider was created in the setup module, which was invoked with the provider B_org_admin?

Questions

  1. Is the provider always trying to switch from the default provider?
  2. I have also created profiles and set up roles under them, but TF isn't picking them up.

@j0nathontayl0r

From what I'm reading, this ticket is outstanding and we're not able to assume roles from a primary provider using an alias? Why is the ticket closed?

I'm getting a similar issue. Output:

Error: No valid credential sources found for AWS Provider.
  Please see https://terraform.io/docs/providers/aws/index.html for more information on
  providing credentials for the AWS Provider

  on create-users-groups-and-apps.tf line 73, in provider "aws":
  73: provider "aws" {

My code:

provider "aws" {
  version    = "~> 2.8"
  region     = "ap-southeast-2"
  access_key = var.Secrets.AccessKey
  secret_key = var.Secrets.SecretKey
}

provider "aws" {
  version = "~> 2.8"
  alias   = "AnAccount-SuperAdmin-AssumedRole"
  region  = "ap-southeast-2"

  assume_role {
    role_arn     = "arn:aws:iam::1111111111111:role/SuperAdmin"
    session_name = "JDTAY_SuperAdmin"
  }
}

@woodcockjosh

Some project owners have a policy of closing tickets when they are too hard to fix so that it doesn't run up their median time for opened tickets

@udayanms

udayanms commented Oct 4, 2019

Actually this worked for me. I used a better strategy although this is not documented anywhere.

What I learned is to remove the access key and secret key credentials from the environment variables. If they are not removed, TF does not behave as expected.

Set the config and credentials environment variables.

AWS_SHARED_CREDENTIALS_FILE – Specifies the location of the file that the AWS CLI uses to store access keys. The default path is ~/.aws/credentials.

AWS_CONFIG_FILE – Specifies the location of the file that the AWS CLI uses to store configuration profiles. The default path is ~/.aws/config.

AWS_SDK_LOAD_CONFIG="true"
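The environment setup described in this comment can be checked with a small script before running Terraform. The following Python sketch is an assumption-laden illustration: the function name `check_environment` and the warning wording are made up for this example, and it only mirrors the advice given here (no static key variables, AWS_SDK_LOAD_CONFIG set, both files present).

```python
import os

# Variables this comment recommends removing from the environment.
STATIC_KEY_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def check_environment(env):
    """Return a list of human-readable warnings for the given environment mapping."""
    warnings = []
    for name in STATIC_KEY_VARS:
        if env.get(name):
            warnings.append(f"{name} is set; this comment suggests unsetting it")
    if not env.get("AWS_SDK_LOAD_CONFIG"):
        warnings.append("AWS_SDK_LOAD_CONFIG is not set; the config file may be ignored")
    file_vars = (
        ("AWS_SHARED_CREDENTIALS_FILE", "~/.aws/credentials"),
        ("AWS_CONFIG_FILE", "~/.aws/config"),
    )
    for name, default in file_vars:
        # Fall back to the default path when the variable is unset or empty.
        path = os.path.expanduser(env.get(name) or default)
        if not os.path.isfile(path):
            warnings.append(f"{path} does not exist")
    return warnings


if __name__ == "__main__":
    for line in check_environment(os.environ):
        print("warning:", line)
```

An empty result only means the environment matches the advice in this comment; it says nothing about whether the role itself can be assumed.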

Instead of assuming roles as stated above, set them under config. You can chain role assumption to any depth; all you have to do is set the profile in the provider definition and use it as an alias (if required).

e.g. config:

[default]

[profile AnAccount]
role_arn=arn:aws:iam::1111111111111:role/SuperAdmin
source_profile=default

The code above changes to this:

provider "aws" {
  version = "~> 2.8"
  region  = "ap-southeast-2"
}

provider "aws" {
  version = "~> 2.8"
  profile = "AnAccount"
  alias   = "AnAccount_ap2"
  region  = "ap-southeast-2"
}

# How to use it
module "create_account" {
  source = "./account"
  params = local.params
  providers = {
    aws = aws.AnAccount_ap2
  }
}

Read about providers when used with modules and aliases.

This logic works perfectly.

@aeschright
Contributor

Hi folks, the fix @YakDriver described above is scheduled to be released with v2.32.0 next week. If you upgrade and the problem you had is still happening, please open a new issue so we can address the errors separately. Thanks!

@jurajseffer
Contributor

In my case the problem with role assumption was that Terraform could not talk to AWS at all, because the Docker container (Alpine) didn't have CA certificates installed (I noticed it because the Terraform version checker call failed as well). This doesn't show up even in trace logs. Installing the ca-certificates package fixed it.
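A quick way to rule out this cause inside a container is to check whether a CA bundle file exists at all. The Python sketch below is only a rough proxy: the candidate paths are assumptions (two conventional Linux locations, not an exhaustive list), the helper name `find_ca_bundle` is hypothetical, and Terraform's own runtime decides where it actually looks.

```python
import os

# Assumed candidate locations; distribution-specific and not exhaustive.
CANDIDATE_CA_BUNDLES = (
    "/etc/ssl/certs/ca-certificates.crt",
    "/etc/ssl/cert.pem",
)


def find_ca_bundle(candidates=CANDIDATE_CA_BUNDLES):
    """Return the first candidate path that is a non-empty file, else None."""
    for path in candidates:
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return path
    return None


if __name__ == "__main__":
    bundle = find_ca_bundle()
    if bundle:
        print("CA bundle found at", bundle)
    else:
        print("no CA bundle found; consider installing ca-certificates")
```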

@ghost

ghost commented Nov 1, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Nov 1, 2019