Proxy authentication issues with auto-tls #9301

Closed
YeruchamB opened this issue Feb 8, 2018 · 11 comments

Comments

@YeruchamB

Hey, I'm having problems similar to #7930.

I'm currently just trying to run a simple single-node cluster with one proxy as a POC.
I need the traffic between the proxy and server to be encrypted but not between the client and the proxy.

Cluster configuration:
#!/usr/bin/env bash

THIS_IP="$1"
THIS_NAME=infra-${THIS_IP}

TOKEN=token-02
CLUSTER_STATE=new
etcd --data-dir=data.etcd --name ${THIS_NAME} \
  --auto-tls --peer-auto-tls \
  --initial-advertise-peer-urls https://${THIS_IP}:2380 --listen-peer-urls https://${THIS_IP}:2380 \
  --advertise-client-urls https://${THIS_IP}:2379 --listen-client-urls https://${THIS_IP}:2379 \
  --discovery https://discovery.etcd.io/48f750c4f2254d71e7726b45abef4379 \
  --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}

Proxy Configuration:
etcd --proxy on \
  --listen-client-urls http://127.0.0.1:2379 \
  --peer-auto-tls \
  --discovery https://discovery.etcd.io/48f750c4f2254d71e7726b45abef4379

Trying to run a simple etcdctl command results in the following error:
ETCDCTL_API=2 ./etcdctl member list
2018-02-04 09:52:57.871465 I | proxy/httpproxy: failed to direct request to https://10.40.4.112:2379: x509: certificate signed by unknown authority
2018-02-04 09:52:57.871492 I | proxy/httpproxy: marked endpoint https://10.40.4.112:2379 unavailable
2018-02-04 09:52:57.871513 I | proxy/httpproxy: unable to get response from 1 endpoint(s)
2018-02-04 09:52:57.872778 I | proxy/httpproxy: zero endpoints currently available
client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:4001: getsockopt: connection refused
; error #1: client: etcd member http://127.0.0.1:2379 has no leader

@hexfusion
Contributor

hexfusion commented Feb 8, 2018

x509: certificate signed by unknown authority

@YeruchamB I believe you also need to set auto-tls on the proxy; see @heyitsanthony's note about InsecureSkipVerify in the issue you referenced.

https://github.com/coreos/etcd/blob/b309bc6403097bf0984d5ccba792411cde86091e/etcdmain/etcd.go#L203-L204

@YeruchamB
Author

@hexfusion Wouldn't that mean that I'd have to use authentication between the client and the proxy?
Or is that configuration just for the communication between proxy and server?

@hexfusion
Contributor

hexfusion commented Feb 8, 2018

@YeruchamB Yeah, but I was able to get that to work using curl. I think the issue I run into is that v2 etcdctl lacks an --insecure-skip-tls-verify flag. I set up the following proxy <-> server configuration:

https://github.com/hexfusion/etcd-compose-examples/blob/master/issues/9301/docker-compose.yml

Then I ran curl --insecure https://172.16.26.42:2379/v2/members against the proxy, and it works.

I think that if the cluster is set up with auto-tls, the proxy must be as well, or proxied client connections will fail. But since etcdctl doesn't support --insecure-skip-tls-verify with v2, it is kind of moot. This is how I read it at least. Perhaps we could add this, but why not use grpc-proxy in v3?

@YeruchamB
Author

@hexfusion Can you explain the --insecure-skip-tls-verify flag a bit? I'm really just looking for a way to have all communication be encrypted without having to deal directly with certificates.

If I decide to forgo the proxy and use a Go client to communicate directly with the cluster, am I able to:

  1. Configure the cluster to auto-tls
  2. Have the communication between client and server be encrypted
  3. Skip the authentication in the client-server communication

If so, what configuration/flags do I need to set?

@hexfusion
Contributor

hexfusion commented Feb 9, 2018

@hexfusion Can you explain the --insecure-skip-tls-verify flag a bit?

Sure. In the case you are outlining, you have:
1.) A self-signed cert that is not signed by a trusted Certificate Authority, i.e. auto-tls.
2.) A connection where you are not passing the CA that signed the certs directly (e.g. curl --cacert /path/to/ca.crt). You are basically encrypting data but not using TLS for authentication because of the lack of trust, which is your goal anyway per the above.

etcd and the Go crypto/tls package will tell you about this situation in the error message. Below I have the proxy set up with auto-tls, along with the resulting error.

ETCDCTL_API=2 etcdctl --endpoints=https://172.16.26.42:2379 member list
client: etcd cluster is unavailable or misconfigured; error #0: x509: certificate signed by unknown authority

That is client <-> proxy. We can disable this check in various ways: --insecure (-k) with curl, and --insecure-skip-tls-verify in v3 etcdctl. Doing so explicitly says: I understand the trust issues, I just want my data encrypted.

Now on the proxy <-> server level we have the same problem. This was addressed in the commit d5a0d4d

https://github.com/coreos/etcd/blob/b309bc6403097bf0984d5ccba792411cde86091e/etcdmain/etcd.go#L203-L204

What this says is that we basically pass the Go crypto/tls package the equivalent of the --insecure (-k) option outlined above if the --auto-tls and --peer-auto-tls flags are set.

So you can't have an etcd server with auto-tls and expect the proxy to connect to it without that flag. etcd forces you to use auto-tls on the proxy to set it. Best I can tell, this was just a way to promote end-to-end encryption, which puts the final part of the trust on your client.
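
As a rough sketch of the idea described above (not the code behind the link), the mapping from the two flags to TLS settings might look like the following. The `proxyOptions` type, its fields, and `tlsConfigs` are hypothetical names made up for this illustration.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// proxyOptions is a hypothetical stand-in for the parsed command-line
// flags; the field names are illustrative and not etcd's own.
type proxyOptions struct {
	AutoTLS     bool // --auto-tls
	PeerAutoTLS bool // --peer-auto-tls
}

// tlsConfigs returns the TLS settings used when dialing the cluster's
// client URLs and peer URLs. Verification is skipped only where the
// matching auto-tls flag was set, because auto-generated certificates
// cannot be checked against a trusted CA.
func tlsConfigs(o proxyOptions) (toClientURLs, toPeerURLs *tls.Config) {
	toClientURLs = &tls.Config{InsecureSkipVerify: o.AutoTLS}
	toPeerURLs = &tls.Config{InsecureSkipVerify: o.PeerAutoTLS}
	return toClientURLs, toPeerURLs
}

func main() {
	// Only --peer-auto-tls set, as in the proxy command from the first post.
	c, p := tlsConfigs(proxyOptions{PeerAutoTLS: true})
	fmt.Println("client URLs skip verification:", c.InsecureSkipVerify)
	fmt.Println("peer URLs skip verification:  ", p.InsecureSkipVerify)
}
```

Under these assumptions, a proxy started with only --peer-auto-tls would still verify certificates on the cluster's client URLs, which would be consistent with the "certificate signed by unknown authority" error in the original report.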

So basically, if you want to use the v2 proxy and v2 etcdctl in the above manner, the way I see it you are currently out of luck. That is why I brought up the v3 grpc-proxy.

If I decide to forgo the proxy and use a Go client to communicate directly with the cluster, am I able to:

  1. Configure the cluster to auto-tls
     • YES
  2. Have the communication between client and server be encrypted
     • YES
  3. Skip the authentication in the client-server communication
     • YES, assuming clientv3 you could use https://godoc.org/google.golang.org/grpc#WithInsecure, which seems to be the default? See https://github.com/coreos/etcd/blob/f5d02f02791716ed99c489ffa2441b8cc7925457/clientv3/client.go#L278-L282

If so, what configuration/flags do I need to set?

Please read over https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md; it outlines a lot of what I just said. But from what I can tell, your bash script does what you want.

Sorry to be long-winded. I hope this is useful.

@hexfusion
Contributor

@gyuho mind reading this over please?

@YeruchamB
Author

@hexfusion I've read the security.md guide previously, but it doesn't explain how the client is expected to communicate with the servers when they're configured with auto-tls.

Anyway, I took your advice and am now running into a different error.
I spun up a single-node cluster the same way I defined it earlier, and when using curl --insecure I'm able to successfully contact etcd from the client. But if I try to connect directly through etcdctl (version 3), I receive "Error: context deadline exceeded".

I ran it with --debug and got:
ETCDCTL_API=3 ./etcdctl --endpoints=10.0.0.55:2379 --insecure-skip-tls-verify --debug member list
ETCDCTL_KEY=
ETCDCTL_USER=
ETCDCTL_WATCH_KEY=
ETCDCTL_WATCH_RANGE_END=
ETCDCTL_WRITE_OUT=simple
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: updating state and picker called by balancer: IDLE, 0xc4201da180
INFO: 2018/02/11 14:37:27 dialing to target with scheme: ""
INFO: 2018/02/11 14:37:27 could not get resolver for scheme: ""
INFO: 2018/02/11 14:37:27 balancerWrapper: is pickfirst: false
INFO: 2018/02/11 14:37:27 balancerWrapper: got update addr from Notify: [{10.0.0.55:2379 }]
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: new subconn: [{10.0.0.55:2379 0 }]
INFO: 2018/02/11 14:37:27 balancerWrapper: handle subconn state change: 0xc4201cc700, CONNECTING
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: updating state and picker called by balancer: CONNECTING, 0xc4201da180
INFO: 2018/02/11 14:37:27 balancerWrapper: handle subconn state change: 0xc4201cc700, READY
INFO: 2018/02/11 14:37:27 clientv3/balancer: pin "10.0.0.55:2379"
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: updating state and picker called by balancer: READY, 0xc4201da180
INFO: 2018/02/11 14:37:27 balancerWrapper: got update addr from Notify: [{10.0.0.55:2379 }]
INFO: 2018/02/11 14:37:27 clientv3/retry: error "rpc error: code = Unavailable desc = transport is closing" on pinned endpoint "10.0.0.55:2379"
INFO: 2018/02/11 14:37:27 clientv3/balancer: "10.0.0.55:2379" is marked unhealthy ("rpc error: code = Unavailable desc = transport is closing")
INFO: 2018/02/11 14:37:27 balancerWrapper: got update addr from Notify: []
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: removing subconn
INFO: 2018/02/11 14:37:27 balancerWrapper: got update addr from Notify: [{10.0.0.55:2379 }]
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: new subconn: [{10.0.0.55:2379 0 }]
INFO: 2018/02/11 14:37:27 balancerWrapper: handle subconn state change: 0xc4201cc700, SHUTDOWN
INFO: 2018/02/11 14:37:27 clientv3/balancer: "10.0.0.55:2379" is marked unhealthy ("grpc: the connection is closing")
INFO: 2018/02/11 14:37:27 clientv3/balancer: unpin "10.0.0.55:2379" ("grpc: the connection is closing")
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: updating state and picker called by balancer: TRANSIENT_FAILURE, 0xc4201da180
INFO: 2018/02/11 14:37:27 balancerWrapper: handle subconn state change: 0xc4201cc980, CONNECTING
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: updating state and picker called by balancer: CONNECTING, 0xc4201da180
INFO: 2018/02/11 14:37:27 clientv3/retry: switching from "10.0.0.55:2379" due to error "rpc error: code = Unavailable desc = transport is closing"
INFO: 2018/02/11 14:37:27 balancerWrapper: got update addr from Notify: [{10.0.0.55:2379 }]
INFO: 2018/02/11 14:37:27 balancerWrapper: handle subconn state change: 0xc4201cc980, READY
INFO: 2018/02/11 14:37:27 clientv3/balancer: pin "10.0.0.55:2379"
INFO: 2018/02/11 14:37:27 ccBalancerWrapper: updating state and picker called by balancer: READY, 0xc4201da180
INFO: 2018/02/11 14:37:27 balancerWrapper: got update addr from Notify: [{10.0.0.55:2379 }]
INFO: 2018/02/11 14:37:27 clientv3/retry: error "rpc error: code = Unavailable desc = transport is closing" on pinned endpoint "10.0.0.55:2379"
......
And this continues until it ultimately fails

@hexfusion
Contributor

hexfusion commented Feb 11, 2018

@hexfusion I've read the security.md guide previously, but it doesn't explain how the client is expected to communicate with the servers when they're configured with auto-tls.

Good point, so I did a bit more digging.

@YeruchamB, it seems some of the assumptions I made about the client connection were false. As noted in issue #7654, auto-tls generates a local key/cert. These are still required to connect your client, even with --insecure-skip-tls-verify. That issue also notes that auto-tls should be used for testing only. So you would want to make sure your etcd is listening on localhost as well for this to work. For production you would still generate your own certs.

# ETCDCTL_API=3 etcdctl \
>     --insecure-skip-tls-verify \
>     --endpoints localhost:2379 \
>     --cert etcd-01.etcd/fixtures/client/cert.pem \
>     --key etcd-01.etcd/fixtures/client/key.pem \
>     member list
6aca7ad94cfe3058, started, etcd-01, https://172.16.26.41:2280, https://172.16.26.41:2379

@YeruchamB
Author

@hexfusion Thanks. I would suggest changing this: if I'm using --insecure-skip-tls-verify, it seems unnecessary to require a cert and key.

Moving on, I generated a self-signed key pair and was able to run etcdctl successfully. I then tried writing a simple Go binary that just connects and watches "foo", and I'm getting connection errors.

Go code:
package main

import (
	"context"
	"crypto/tls"
	"flag"
	"fmt"
	"io/ioutil"
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	certPath := flag.String("cert", "", "path to certificate")
	keyPath := flag.String("key", "", "path to key")
	flag.Parse()

	// Load the PEM-encoded client certificate and key from disk.
	cert, err := ioutil.ReadFile(*certPath)
	if err != nil {
		log.Fatal(err)
	}

	key, err := ioutil.ReadFile(*keyPath)
	if err != nil {
		log.Fatal(err)
	}

	certificate, err := tls.X509KeyPair(cert, key)
	if err != nil {
		log.Fatal(err)
	}

	// Present the client key pair, but skip verification of the server certificate.
	cfg := clientv3.Config{
		TLS:         &tls.Config{InsecureSkipVerify: true, Certificates: []tls.Certificate{certificate}},
		Endpoints:   []string{"127.0.0.1:2379"},
		DialTimeout: 30 * time.Second,
	}
	client, err := clientv3.New(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Watch the key "foo" and print every event received.
	rch := client.Watch(context.Background(), "foo")
	for wresp := range rch {
		for _, ev := range wresp.Events {
			fmt.Printf("%s %q : %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}

Error:
./simple-etcd-client --cert cert.pem --key key2.pem
2018-02-12 16:33:58.210907 I | dial tcp 127.0.0.1:2379: getsockopt: connection refused

Can you tell me what I'm doing wrong?

@hexfusion
Contributor

@hexfusion Thanks. I would suggest changing this: if I'm using --insecure-skip-tls-verify, it seems unnecessary to require a cert and key.

Yeah, I need to review that further. It seems to be that way for a reason, probably to stop folks from using auto-tls in production?

Can you tell me what I'm doing wrong?

@YeruchamB I would love to help you, but we are moving off topic. Would you mind closing this issue and opening a new one re: clientv3? Maybe just back-reference this ticket? Thanks!

@idleyoungman

idleyoungman commented Aug 24, 2023

I recently found that you can connect to the v2 API with etcdctl if you use the self-signed cert both as the cert file and the CA file:

ETCDCTL_API=2 etcdctl --endpoints https://0.0.0.0:2379/ --cert-file /var/lib/etcd/fixtures/client/cert.pem --key-file /var/lib/etcd/fixtures/client/key.pem --ca-file /var/lib/etcd/fixtures/client/cert.pem --no-sync

I believe this works because of https://pkg.go.dev/crypto/x509#CreateCertificate:

The certificate is signed by parent. If parent is equal to template then the certificate is self-signed. The parameter pub is the public key of the certificate to be generated and priv is the private key of the signer.

In our use case (inside Kubernetes pods) the --no-sync flag is also required, since otherwise etcdctl will try to contact the other etcd hosts on hostnames that don't match the 0.0.0.0 hostname in the self-signed cert:

client: etcd cluster is unavailable or misconfigured; error #0: x509: certificate is not valid for any names, but wanted to match etcd-2.etcd
; error #1: x509: certificate is not valid for any names, but wanted to match etcd-0.etcd
; error #2: x509: certificate is not valid for any names, but wanted to match etcd-1.etcd

Our working example:

# ETCDCTL_API=2 etcdctl --endpoints https://0.0.0.0:2379 --cert-file /var/lib/etcd/fixtures/client/cert.pem --key-file /var/lib/etcd/fixtures/client/key.pem --ca-file /var/lib/etcd/fixtures/client/cert.pem --no-sync member list
739c8a514be42732: name=etcd-1 peerURLs=https://etcd-1.etcd:2380 clientURLs=https://etcd-1.etcd:2379 isLeader=false
9eba678f83b54c43: name=etcd-0 peerURLs=https://etcd-0.etcd:2380 clientURLs=https://etcd-0.etcd:2379 isLeader=false
c5c173d6ddab43a5: name=etcd-2 peerURLs=https://etcd-2.etcd:2380 clientURLs=https://etcd-2.etcd:2379 isLeader=true
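
Both observations in this comment (a self-signed certificate acting as its own CA, and the name mismatch when other hosts are contacted) can be illustrated with Go's crypto/x509 package. This is a minimal sketch under stated assumptions: the certificate is generated in memory rather than read from etcd's fixtures directory, the host names are made up, and the template marks the certificate as a CA, which has not been checked against what auto-tls actually generates.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newSelfSigned creates a certificate whose template and parent are the
// same, which is how x509.CreateCertificate produces a self-signed cert.
// The subject and DNS name are invented for this example.
func newSelfSigned(dnsName string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{Organization: []string{"example"}},
		DNSNames:              []string{dnsName},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true, // assumption for the sketch, not verified against etcd's cert
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyAgainstItself trusts cert as the only root, mirroring the use of
// the same file as both the certificate and the CA file.
func verifyAgainstItself(cert *x509.Certificate, host string) error {
	roots := x509.NewCertPool()
	roots.AddCert(cert)
	_, err := cert.Verify(x509.VerifyOptions{Roots: roots, DNSName: host})
	return err
}

func main() {
	cert, err := newSelfSigned("etcd-0.example")
	if err != nil {
		panic(err)
	}
	fmt.Println("matching name:", verifyAgainstItself(cert, "etcd-0.example")) // expected to succeed
	fmt.Println("other name:   ", verifyAgainstItself(cert, "etcd-1.example")) // expected hostname error
}
```

The second call fails on the host name alone, which is the same class of error that --no-sync avoids by keeping etcdctl on the single endpoint it was given.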
