SSL connection not working when using the lets-encrypt public cert as CA #1814

Closed
1 of 4 tasks
randomcatgamer opened this issue May 22, 2018 · 11 comments

Comments

@randomcatgamer

randomcatgamer commented May 22, 2018

Description

Hi. We ran into this problem when trying to connect our Node.js service to Kafka over SSL using the node-rdkafka library, which wraps librdkafka. Since we could not see any visible error there, we tried connecting directly with librdkafka and its examples. When we run the librdkafka examples with the lets-encrypt-x3-cross-signed.pem.txt file as ssl.ca.location, the librdkafka consumer fails to connect, while our Java consumer connects using the same certificate.

This is what we ran in the examples folder:

./rdkafka_consumer_example -X debug='all' -X security.protocol='ssl' -X ssl.ca.location='/kafka/ssl/lets-encrypt-x3-cross-signed.pem.txt' -b localhost:19093 test.topic.ouch

This is some of the output:

1527005162.440 RDKAFKA-7-SSL: rdkafka#consumer-1: [thrd:app]: Loading CA certificate(s) from file /kafka/ssl/lets-encrypt-x3-cross-signed.pem.txt
1527005162.441 RDKAFKA-7-MEMBERID: rdkafka#consumer-1: [thrd:app]: Group "rdkafka_consumer_example": updating member id "(not-set)" -> ""
1527005162.441 RDKAFKA-7-WAKEUPFD: rdkafka#consumer-1: [thrd:app]: ssl://localhost:19093/bootstrap: Enabled low-latency partition queue wake-ups
1527005162.441 RDKAFKA-7-WAKEUPFD: rdkafka#consumer-1: [thrd:app]: ssl://localhost:19093/bootstrap: Enabled low-latency ops queue wake-ups
1527005162.441 RDKAFKA-7-BROKER: rdkafka#consumer-1: [thrd:app]: ssl://localhost:19093/bootstrap: Added new broker with NodeId -1
% Subscribing to 1 topics
1527005162.441 RDKAFKA-7-BRKMAIN: rdkafka#consumer-1: [thrd::0/internal]: :0/internal: Enter main broker thread
1527005162.441 RDKAFKA-7-STATE: rdkafka#consumer-1: [thrd::0/internal]: :0/internal: Broker changed state INIT -> UP
1527005162.441 RDKAFKA-7-BROADCAST: rdkafka#consumer-1: [thrd::0/internal]: Broadcasting state change
1527005162.441 RDKAFKA-7-BRKMAIN: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: Enter main broker thread
1527005162.441 RDKAFKA-7-CONNECT: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: broker in state INIT connecting
1527005162.442 RDKAFKA-7-CONNECT: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: Connecting to ipv4#127.0.0.1:19093 (ssl) with socket 7
1527005162.442 RDKAFKA-7-STATE: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: Broker changed state INIT -> CONNECT
1527005162.442 RDKAFKA-7-BROADCAST: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: Broadcasting state change
1527005162.442 RDKAFKA-7-CONNECT: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: Connected to ipv4#127.0.0.1:19093
1527005162.444 RDKAFKA-7-BRKREASSIGN: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example" management reassigned from broker (none) to :0/internal
1527005162.444 RDKAFKA-7-CGRPSTATE: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example" changed state init -> wait-broker (v1, join-state init)
1527005162.444 RDKAFKA-7-BROADCAST: rdkafka#consumer-1: [thrd:main]: Broadcasting state change
1527005162.444 RDKAFKA-7-BRKASSIGN: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example" management assigned to broker :0/internal
1527005162.444 RDKAFKA-7-CGRPOP: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example" received op SUBSCRIBE (v0) in state wait-broker (join state init, v1 vs 0)
1527005162.444 RDKAFKA-7-SUBSCRIBE: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example": subscribe to new subscription of 1 topics (join state init)
1527005162.444 RDKAFKA-7-UNSUBSCRIBE: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example": unsubscribe from current unset subscription of 0 topics (leave group=no, join state init, v1)
1527005162.444 RDKAFKA-7-GRPLEADER: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example": resetting group leader info: unsubscribe
1527005162.444 RDKAFKA-7-CGRPJOINSTATE: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example" changed join state init -> wait-unassign (v1, state wait-broker)
1527005162.444 RDKAFKA-7-UNASSIGN: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example": unassign done in state wait-broker (join state wait-unassign): without new assignment: unassign (no previous assignment)
1527005162.444 RDKAFKA-7-CGRPJOINSTATE: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example" changed join state wait-unassign -> init (v1, state wait-broker)
1527005162.445 RDKAFKA-7-CGRPQUERY: rdkafka#consumer-1: [thrd:main]: Group "rdkafka_consumer_example": no broker available for coordinator query: intervaled in state wait-broker
1527005162.493 RDKAFKA-7-BROKERFAIL: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: failed: err: Local: SSL error: (errno: Success)
1527005162.493 RDKAFKA-3-FAIL: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: Failed to verify broker certificate: unable to get issuer certificate
1527005162.493 RDKAFKA-3-ERROR: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: Failed to verify broker certificate: unable to get issuer certificate
1527005162.493 RDKAFKA-7-STATE: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: ssl://localhost:19093/bootstrap: Broker changed state CONNECT -> DOWN
1527005162.493 RDKAFKA-3-ERROR: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: 1/1 brokers are down
1527005162.493 RDKAFKA-7-BROADCAST: rdkafka#consumer-1: [thrd:ssl://localhost:19093/bootstrap]: Broadcasting state change

What could be the issue? For the Java client, all we did was import the certificate into a truststore:

keytool -import -alias CARoot -file lets-encrypt-x3-cross-signed.pem.txt -keystore /tmp/kafka.client.truststore.jks -storepass changeit -noprompt

followed by:

/tmp/confluent-4.1.0/bin/kafka-console-consumer --bootstrap-server localhost:19093 --consumer.config /tmp/client.config --topic test.topic.ouch --from-beginning

The client.config looks like this:

security.protocol=SSL
ssl.truststore.location=/tmp/kafka.client.truststore.jks
ssl.truststore.password=changeit

Also, when we configure Kafka for SSL using either of these scripts to generate the CAs and certificates, https://github.com/confluentinc/cp-docker-images/blob/master/examples/kafka-cluster-ssl/secrets/create-certs.sh or https://github.com/edenhill/librdkafka/blob/master/tests/gen-ssl-certs.sh, both our Node.js and librdkafka clients connect.

How to reproduce

Use the public Let's Encrypt certificate as the root CA for the Kafka certificates.

Checklist

Please provide the following information:

  • librdkafka version (release number or git tag): v0.11.4
  • Apache Kafka version: Confluent 4.0.0
  • Operating system: [Amazon Machine Image 2018.03](https://aws.amazon.com/amazon-linux-ami/2018.03-release-notes/)
  • Critical issue
@randomcatgamer
Author

randomcatgamer commented May 22, 2018

This might be because Java already has DST Root CA X3 (the issuer of the Let's Encrypt cert) in the JDK. Will check and follow up.

@randomcatgamer
Author

So after concatenating the Let's Encrypt certificate to https://curl.haxx.se/ca/cacert.pem and using the combined file as ssl.ca.location, it connected. Hope this information helps anyone who runs into the same conundrum 😸
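The concatenation step can be sketched in Python as below. This is only an illustration: `build_ca_bundle` and the file names in the usage note are hypothetical, and the function does no more than join PEM files and count the certificate headers. It does not validate the certificates themselves.

```python
from pathlib import Path

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def build_ca_bundle(sources, dest):
    """Concatenate PEM files into one bundle and return how many certificates it holds."""
    blocks = []
    for src in sources:
        # cacert.pem carries UTF-8 comments, so do not assume plain ASCII.
        text = Path(src).read_text(encoding="utf-8").strip()
        if PEM_HEADER not in text:
            raise ValueError(f"{src} contains no PEM certificate")
        blocks.append(text)
    # One newline between files keeps every BEGIN/END marker on its own line.
    bundle = "\n".join(blocks) + "\n"
    Path(dest).write_text(bundle, encoding="utf-8")
    return bundle.count(PEM_HEADER)
```

For example, `build_ca_bundle(["cacert.pem", "lets-encrypt-x3-cross-signed.pem.txt"], "combined-ca.pem")` would write a file that could then be passed as ssl.ca.location.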

@MrMoronIV

Are you willing to elaborate a bit on this?

I'm not sure whether there is a way to avoid refreshing the client cert files every 60 days. The last thing I want is to make manual changes every 2-3 months just to keep all my consumers and producers working.

Is setting the validity days to 50 years or something with self-signed certs the only way to avoid needing to refresh files on the consumer end?

Or how does the Let's Encrypt approach work in this case? I'm a bit confused as to why the Java approach is so complicated in comparison to any other language.

@lifeofguenter

I was having a similar issue but with aws acm-pca. With the help of https://medium.com/datamindedbe/aws-msk-secure-python-kafka-client-1d25dae39207 I figured that I needed to add the following config (on an alpine docker with ca-certificates installed):

ssl.ca.location=/etc/ssl/cert.pem

then it worked!

What is strange: why do I need to add a public ca for a private ca?

@rolandyoung

For two applications to communicate using TLS (the protocol used by SSL features), at least one of them (usually the server) has to present a certificate and the other has to trust it. To trust a certificate, either its public key or the public key of the CA that signed it has to be available. For librdkafka, that means "found in the location given by ssl.ca.location". Some applications will also trust a self-signed certificate presented over the wire by the server; librdkafka has a config option for this, but it should hardly ever need to be used.

Note that trusting a self-signed certificate presented over the wire by the server is risky, but trusting a certificate because it has been signed by another certificate which you trust is not. If the "certificate you trust" is a CA certificate, it may well be self-signed, but your client only needs access to its public key in its local trust store (e.g. ssl.ca.location).

@lifeofguenter

Sorry, I forgot to mention, @rolandyoung: I did set the server CA as ssl.ca.location, but it did not work. Since this is a private CA, I am curious why a public CA works, though.

@rolandyoung

Public CA vs private CA is not the same as public key vs private key. Every TLS certificate has a public key and a matching private key. You can see this if you open a PEM file in a text editor. The private key should be available only to the server that uses the certificate to authenticate itself or, if the certificate is used as a CA signing certificate, to the process used to sign certificates. The public key is provided by the server each time a client connects. If (and only if) the client has access to the public key of the CA signing certificate (whether a public CA or private CA), it can verify the server certificate.
So there are four things here:

  • CA-pub: the CA's public key, available to all interested parties.
  • CA-private: available only to the CA and used to sign server certificates.
  • Server-pub: the server's public key, signed by CA-private.
  • Server-private: the server's private key, used by the server to prove that it is the owner of Server-pub.

If the CA is a private CA, then "all interested parties" is a fairly small group; if it is a public CA, then "all interested parties" may be "all devices anywhere that trust the CA". Either way, there is always a public key for any CA.

@rolandyoung

I should also mention the concept of "chain of trust". Most private CAs have a self-signed signing certificate, so the public key of this certificate needs to be provided to all interested parties. But if the signing certificate is itself signed by a public CA, then it will be trusted by any party that trusts that CA. If AWS MSK uses such a signing certificate, then the certificate of that public CA is what you need to set as ssl.ca.location. Using one of the AWS MSK server certificates as ssl.ca.location would not work, because the brokers may well use separate server certificates, and you need to trust the CA, not an individual server.
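As a toy model of the chain-of-trust walk, the sketch below follows issuer names from a leaf certificate until it reaches a self-signed entry. It is a simplification under stated assumptions: real verification checks signatures rather than names, `find_trust_anchor` and the broker name are hypothetical, and it assumes a verifier that insists on reaching a self-signed certificate held locally. That is one plausible reading of the "unable to get issuer certificate" error in the original report.

```python
def find_trust_anchor(leaf, issuer_of):
    """Follow issuer links from `leaf` until a self-signed certificate is reached.

    `issuer_of` maps each certificate we hold (by subject name) to its issuer's name.
    """
    current = leaf
    visited = set()
    while current not in visited:
        visited.add(current)
        if current not in issuer_of:
            # We were told `current` issued something, but we do not hold its certificate.
            raise LookupError(f"unable to get issuer certificate: {current!r} is not available")
        issuer = issuer_of[current]
        if issuer == current:
            # Self-signed: this is the root the client has to trust.
            return current
        current = issuer
    raise ValueError(f"issuer loop detected at {current!r}")


# Hypothetical chain: only the intermediate is in the local CA file.
held = {
    "kafka-broker.example": "Let's Encrypt Authority X3",
    "Let's Encrypt Authority X3": "DST Root CA X3",
}
```

With only the intermediate available, `find_trust_anchor("kafka-broker.example", held)` raises `LookupError`. Adding the entry `"DST Root CA X3": "DST Root CA X3"` lets the walk finish at the root, which is roughly what appending a public CA bundle achieves.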

@lifeofguenter

lifeofguenter commented Apr 29, 2020

@rolandyoung this is a private CA (e.g. AWS ACM-PCA). It has no public CA above it. It is as if I had created my own CA.

I then generate a CSR + Private Key, and receive from ACM-PCA a certificate back. I also receive the complete cert-chain (which only involves the private CA and no public CA from the outside).

I assumed, I could do the following:

ssl.key.location=my-private-key.pem
ssl.certificate.location=cert-i-got-from-private-ca-after-my-csr.pem
ssl.ca.location=cert-chain-of-private-ca.pem

That did not work; I get an SSL error. When I change ssl.ca.location to a public one, like the one installed by ca-certificates, the SSL errors go away.

At no point was any public-ca involved. The private-ca is root - there is no CA above it.

@edenhill
Contributor

It sounds really weird that it is able to verify your private CA-based broker certificate with a public CA.

What librdkafka and OpenSSL versions are you on? (set debug=security and you'll see both versions reported during startup).

You can also troubleshoot this further with the openssl s_client .. command, it provides detailed handshake information.
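A rough Python counterpart to that check, using only the standard library, is sketched below. It is not a substitute for openssl s_client: `describe_broker_cert` and `flatten_name` are hypothetical helper names, Python's default context also checks the hostname, and the handshake raises an error if the chain cannot be verified against `ca_file`.

```python
import socket
import ssl


def flatten_name(rdns):
    """Turn getpeercert()'s nested name tuples into 'key=value, ...' text."""
    return ", ".join(f"{key}={value}" for rdn in rdns for key, value in rdn)


def describe_broker_cert(host, port, ca_file=None, timeout=10.0):
    """Handshake with host:port and report who the broker certificate is for and who issued it."""
    # With ca_file set, only the CAs in that file are trusted; with None, the system defaults are used.
    ctx = ssl.create_default_context(cafile=ca_file)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Raises ssl.SSLCertVerificationError if the chain or hostname does not verify.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return {
        "subject": flatten_name(cert["subject"]),
        "issuer": flatten_name(cert["issuer"]),
    }
```

Calling `describe_broker_cert("localhost", 19093, ca_file="/etc/ssl/cert.pem")` against a broker would show the issuer of the certificate it actually presents, which is the piece of information that resolves the confusion in this thread.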

@lifeofguenter

I am so stupid, sorry.. and thanks for the pointer @edenhill

I still have no clue how it works, but at least I can see part of what my wrong assumptions were.

As a reminder, this is Amazon MSK with encryption-in-transit, and TLS Auth via Amazon ACM-PCA.

When I connect to one of the brokers via openssl s_client, it indeed shows me a publicly issued Amazon cert, and my PCA is not part of that chain at all. That is why ssl.ca.location requires a public CA file. They most probably do that so the random AWS hostname matches up (i.e. verifies).

I am guessing that for auth it's just a "fancy" way of exchanging a password in Kafka, which then makes full use of my PCA but does not really play a part in the transport encryption?
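The split described above can be summarised as a client configuration sketch. The property names are librdkafka settings already quoted in this thread; `mtls_client_config` and the bootstrap address and file paths in the usage note are hypothetical, and the assumption is that the resulting dict is handed to a librdkafka-based client such as the confluent-kafka Python package.

```python
def mtls_client_config(bootstrap, ca_bundle, client_cert, client_key):
    """Build librdkafka-style properties for TLS with client certificate authentication."""
    return {
        "bootstrap.servers": bootstrap,
        "security.protocol": "ssl",
        # Used by the client to verify the certificate the broker presents.
        "ssl.ca.location": ca_bundle,
        # Presented by the client so that the broker can authenticate it.
        "ssl.certificate.location": client_cert,
        "ssl.key.location": client_key,
    }
```

For the MSK setup in this thread that would mean something like `mtls_client_config("broker.example:9094", "/etc/ssl/cert.pem", "cert-i-got-from-private-ca-after-my-csr.pem", "my-private-key.pem")`: the public bundle verifies the broker, while the certificate issued by the private CA identifies the client.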
