Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scale ETCD authentication for >10k V3 client hosts #9615

Closed
jxuan opened this issue Apr 24, 2018 · 23 comments
Closed

Scale ETCD authentication for >10k V3 client hosts #9615

jxuan opened this issue Apr 24, 2018 · 23 comments

Comments

@jxuan
Copy link

jxuan commented Apr 24, 2018

We are using ETCD for grpc service discovery for over 1000 hosts. In the future, I want to be able to scale up to 10 times more hosts using one ETCD server cluster (7 hosts). One issue concerning is the speed of the authentication process on the ETCD server module. It is using the default BCryptCost which is 10, which makes bcrypt the bottleneck to authenticate 10k hosts at the same time, especially after a leader re-election in the ETCD cluster. The slow speed of the default bcrypt makes it very difficult to authenticate 10k hosts at the same time.

We do not have strong security requirement for ETCD right now, as we only uses ETCD in our internal network. And we have building end-to-end TLS for internal traffic. So we are ok to reduce the bcrypt cost factor, sacrifice a little security in order to gain significantly on scalability.

Can you make BCryptCost configurable? Right now you are using the default value which is 10 in server code, the minimum value which is 4 in test code. We want to use 6 or 7 which based on our test is good for our use case.

Thanks

@gyuho
Copy link
Contributor

gyuho commented Apr 24, 2018

@mitake @yudai The cost is hard-coded in server side, and not configurable.

Should we make it configurable?

ref. https://github.com/coreos/etcd/blob/3da4c8a585884fa0cc3d42c82f113bf6669d41e0/auth/store.go#L70-L71

@jxuan
Copy link
Author

jxuan commented Apr 24, 2018

More ref.

auth/store_test.go
37:func init() { BcryptCost = bcrypt.MinCost }

clientv3/main_test.go
32:func init() { auth.BcryptCost = bcrypt.MinCost }

vendor/golang.org/x/crypto/bcrypt/bcrypt.go
21 const (
22 MinCost int = 4 // the minimum allowable cost as passed in to GenerateFromPassword
23 MaxCost int = 31 // the maximum allowable cost as passed in to GenerateFromPassword
24 DefaultCost int = 10 // the cost that will actually be set if a cost below MinCost is passed into GenerateFromPassword
25 )

@jxuan
Copy link
Author

jxuan commented Apr 24, 2018

@gyuho if you do not have time for this, I can make the change for you. But need your review and approval.

@gyuho
Copy link
Contributor

gyuho commented Apr 25, 2018

@jxuan Sure, I would add a flag etcd --bcrypt-cost "min/max/default". Please let me know if you need help on that. Start from making auth.NewAuthStore take bcrypt cost, as a parameter. /cc @mitake

@jxuan
Copy link
Author

jxuan commented Apr 25, 2018

Sure.

@mitake
Copy link
Contributor

mitake commented Apr 25, 2018

@jxuan @gyuho I agree with the idea of configurable bcrypt cost. But I'm not sure just providing min/max/default is a good idea. Why not add the integer flag for the config?

@gyuho
Copy link
Contributor

gyuho commented Apr 25, 2018

@jxuan @mitake Integer sounds good (within the range of min <= x <= max, ref https://godoc.org/golang.org/x/crypto/bcrypt#GenerateFromPassword).

@jxuan
Copy link
Author

jxuan commented Apr 25, 2018

I like integer as well. In case the value is out of the valid range (between min and max), the default value will be used.

@gyuho
Copy link
Contributor

gyuho commented Apr 25, 2018

Sounds good.

In case the value is out of the valid range

We can return an error from auth package, as well.

Thanks.

@ericchiang
Copy link
Contributor

ericchiang commented Apr 25, 2018

Has etcd ever considered generating random tokens for users instead of using a password? You bcrypt passwords to prevent leaking sensitive data during a database breach. If you're using a randomly generated token, that data is no longer sensitive, in the sense that if I breach the database, I can't use that username/credential combination in some other domain.

E.g. expand the user definition to:

// User is a single entry in the bucket authUsers
message User {
  bytes name = 1;
  bytes password = 2;
  repeated string roles = 3;

  // Randomly generated token
  string token = 4;
}

Then creating a user also creates a random token server side:

$ etcdctl user add myusername
( prints random token )

The returned token is some combination of the username and random token generated by etcd.

@gyuho
Copy link
Contributor

gyuho commented Apr 25, 2018

^ @mitake

@jxuan
Copy link
Author

jxuan commented Apr 25, 2018

@ericchiang, if you breach the database, you can get other people's token and represent them. It does not matter whether the token is generated randomly or from the password.

In case for JWT tokens, there is no need to store all issued tokens in the database. Only verify the signature is supposed to be enough to verify the identity. If the database is breached but the private key is stored elsewhere, it is still not easy to represent others.

@ericchiang
Copy link
Contributor

ericchiang commented Apr 25, 2018

In case for JWT tokens, there is no need to store all issued tokens in the database. Only verify the signature is supposed to be enough to verify the identity. If the database is breached but the private key is stored elsewhere, it is still not easy to represent others.

Sure, you could still have the returned token be a JWT with one of the claims also stored as part of the user object to do revocation.

I'm trying to argue that we should question why bcrypt is used in this process at all. Instead of making the bcrypt cost configurable, getting away from storing passwords might be a more holistic solution.

@jxuan
Copy link
Author

jxuan commented Apr 26, 2018

Username+Password is the way (and probably the only way) to authenticate a user when there is no 3rd-party identity token. Once the user provides the correct username+password, the backend can issue tokens to identify the user. In future requests, token will replace username+password as user identity. So I think hashing is still needed at the beginning of the authentication flow to verify whether the password client provides is correct.

@jxuan
Copy link
Author

jxuan commented Apr 26, 2018

bcrypt actually store the bcrypt-cost along with the hashed version of the password, as bcrypt-cost is not a parameter of https://godoc.org/golang.org/x/crypto/bcrypt#CompareHashAndPassword.

Thus if anybody wants to change bcrypt-cost for their ETCD cluster, not only do they need to restart the ETCD cluster with the new bcrypt-cost parameter, but also go through the change password API to recalculate the bcrypt hash of each account using the new cost. The code in ChangePassword does not check whether the old password is the same as the new one. Thus you can just change password to the same password as before. However the hash will change.

@jxuan
Copy link
Author

jxuan commented May 3, 2018

The squashed commit can be found in #9687

@gyuho
Copy link
Contributor

gyuho commented May 4, 2018

@jxuan Your PR has just been merged. Can you try the master branch and close this? Thanks!

@jxuan
Copy link
Author

jxuan commented May 7, 2018

The commit is merged into master. Closing now.

@jxuan jxuan closed this as completed May 7, 2018
@jxuan
Copy link
Author

jxuan commented May 7, 2018

Two things to notice.

First, the bcrypt-cost of a hash is stored along with the hash. If you change bcrypt-cost when launching ETCD server but did not change password hash in storage, all existing password validation will go through the same bcrypt-cost value instead of the new value. Also all password verification will succeed. To truly use the new bcrypt-cost, use change password command to re-calculate new hash and store the new hash in storage. Note that you can use the same password in the change password command.

Second, using bcrypt-cost 6 or 7, the same etcd host can handle about 20x more authentication requests at the same time, which means that you can have about 20 times more clients connecting to your ETCD cluster.

@jxuan
Copy link
Author

jxuan commented May 16, 2018

@gyuho When do you think that this change will be released? It is not in 3.3.5.

@gyuho
Copy link
Contributor

gyuho commented May 16, 2018

@jxuan Won't be backported to v3.3 since it's a new feature. It will be released with v3.4, which is still a few months away.

@jxuan
Copy link
Author

jxuan commented May 16, 2018

I see. Thanks for the reply, gyuho.

@JamesXNelson
Copy link

First, the bcrypt-cost of a hash is stored along with the hash. If you change bcrypt-cost when launching ETCD server but did not change password hash in storage, all existing password validation will go through the same bcrypt-cost value instead of the new value. Also all password verification will succeed.

Um, I may be misunderstanding, so can I request a little more information here?

Does this mean if we change the brcypt cost when running etcd, but don't rehash the passwords, then any password used will succeed?

And, if true, does this change apply in all situations?

We recently had a customer who was upgrading from 3.3 to 3.5 and were simply restoring the old snapshot to the new system. During that process, they also added bcrypt-cost: 4 (I'm not sure how much worse that is for security, but all our security-conscious customers already run etcd peer-to-peer in isolated network). When I asked them to test, using an incorrect password correctly failed. Maybe the restore process redid the hashes... but I feel like I should at least test how this behaves on a running cluster w/o the snapshot restore.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

5 participants