Configure external Cassandra instance on cloud platform #1179

Closed
MarianZoll opened this issue Sep 2, 2020 · 9 comments · Fixed by #1243
Labels
cassandra The issues related to Cassandra storage needs-info We need some info from you! If not provided after a few weeks, we'll close this issue. wontfix This will not be worked on

Comments

@MarianZoll

Bug description
To increase operational efficiency, we want to host Cassandra as a storage backend outside the cluster, on Azure. Cassandra on Azure provides a host, port, username and password, which should be everything we need to create a connection for Jaeger.

Unfortunately, it seems that Jaeger no longer takes CQLSH_PORT into consideration. Instead, the host is passed as servers, but the port is ignored. Furthermore, we are no longer able to require TLS on this connection. Both should have been possible with the Helm installation, which is deprecated.

We checked the storage definition to see whether we had missed something in the official docs:
https://github.com/jaegertracing/jaeger-operator/blob/master/pkg/storage/cassandra_dependencies.go

Expected behavior
It would be nice to have a configuration option for supplying an external Cassandra service outside of the cluster. It should be possible to define Cassandra's host and port as well as TLS options.

Steps to reproduce the bug
We tried both of the following operator YAMLs:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
  namespace: istio-system
spec:
  strategy: allInOne
  storage:
    type: cassandra
    options:
      cassandra:
        servers: {{NAME}}.cassandra.cosmos.azure.com:10350 
        keyspace: "jaeger-tracing-cassandra-keyspace"
        username: {{USERNAME}}
        password: "{{PASSWORD}}"

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
  namespace: istio-system
spec:
  strategy: allInOne
  storage:
    type: cassandra
    options:
      cassandra:
        servers: {{NAME}}.cassandra.cosmos.azure.com
        keyspace: "jaeger-tracing-cassandra-keyspace"
        username: {{USERNAME}}
        password: "{{PASSWORD}}"

We would really appreciate knowing whether this is a bug, or whether it works as intended and we have to rely on cluster-internal storage.

Thanks in advance!

@ghost ghost added the needs-triage New issues, in need of classification label Sep 2, 2020
@jpkrohling jpkrohling added cassandra The issues related to Cassandra storage needs-info We need some info from you! If not provided after a few weeks, we'll close this issue. and removed needs-triage New issues, in need of classification labels Sep 2, 2020
@jpkrohling
Contributor

Are you able to configure Jaeger on your local machine (bare metal or Docker) to connect to your target Cassandra cluster? If so, it should also be possible with the operator. If it's a limitation in Jaeger itself, I'll move this issue to that repository.

@MarianZoll
Author

MarianZoll commented Sep 3, 2020

After some additional investigation, I think it is a bit of both. I am able to reach the external Cassandra using the following Helm installation on AKS:
helm install jaeger -n istio-system jaegertracing/jaeger -f helm-yamls/jaeger-values.yml

where the values are:

provisionDataStore:
  cassandra: false

storage:
  cassandra:
    host: {{NAME}}.cassandra.cosmos.azure.com
    port: 10350
    user: {{USERNAME}}
    password: "{{PASSWORD}}"
    keyspace: "jaeger-tracing-cassandra-keyspace"
    env:
      CQLSH_PORT: 10350

This spins up the required pods, but the connection fails, presumably because the SSL connection cannot be established:

$ kubectl logs -n istio-system jaeger-cassandra-schema-5mwdt                                                                                                                    
Connection error: ('Unable to connect to any servers', {'52.169.219.183': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})
Cassandra is still not up at ecedevweubugsvtops.cassandra.cosmos.azure.com. Waiting 1 second.

This should be fixed by proper TLS certificate injection (ideally, we would also be able to configure TLS easily for certs from well-known CAs without providing the secret at all). For the regular TLS setup with a TLS secret, there is the following discussion and PR open in the Jaeger project:
jaegertracing/helm-charts#15
jaegertracing/helm-charts#145

Nevertheless, the server does respond but the database connection cannot be created.

I also found that storage.cassandra.port is ignored entirely by the Cassandra client. Only adding the port directly through custom env vars solved the issue.

So I guess we have to figure out a couple of things:

  • Fix the port setup in Jaeger
  • Enable easy TLS configuration
  • Adapt the operator to take on the configuration

To detail the easier TLS configuration: we might want to offer a configuration option that sets up the connection similarly to the Python snippet below. It is Python because that was the example at hand on the Azure Quick Start tab.

    from ssl import PROTOCOL_TLSv1_2, CERT_REQUIRED

    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    # TLS options: validate the server certificate against the given CA bundle
    ssl_opts = {
        'ca_certs': '<path-to-your-.pem-file>',
        'ssl_version': PROTOCOL_TLSv1_2,
        'cert_reqs': CERT_REQUIRED  # Certificates are required and validated
    }

    auth_provider = PlainTextAuthProvider(username="{{USERNAME}}", password="{{PASSWORD}}")
    # Contact points are passed as a list; the port is a separate argument
    cluster = Cluster(["{{NAME}}.cassandra.cosmos.azure.com"], port=10350, auth_provider=auth_provider, ssl_options=ssl_opts)
    session = cluster.connect("jaeger-tracing-cassandra-keyspace")

If one of you can point us to the right files, we might be able to add a couple of lines of code :)
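
As a rough illustration of the "well-known CAs without a secret" idea, here is a minimal sketch using only Python's standard ssl module. The helper name build_tls_context is hypothetical and not part of Jaeger or the Cassandra driver. With no CA file given, it falls back to the platform's default trust store, which is the behaviour requested above.

```python
import ssl


def build_tls_context(ca_file=None):
    """Return a client-side TLS context that verifies the server certificate.

    With ca_file=None the platform's default trust store is loaded, so
    certificates issued by well-known CAs validate without a custom bundle.
    (Hypothetical helper, for illustration only.)
    """
    # create_default_context enables certificate and hostname verification.
    context = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = build_tls_context()
# Assumed usage, not executed here: recent versions of the Python driver
# accept an SSLContext via Cluster(..., ssl_context=context).
```

Whether the driver version in use accepts such a context directly, or needs extra options for hostname checking, would have to be verified against its documentation.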

@jpkrohling
Contributor

Fix the port setup in Jaeger

This is only about the create-schema job, right? If so, you'll probably need something like this to obtain the port:

host := jaeger.Spec.Storage.Options.Map()["cassandra.servers"]
if host == "" {
	jaeger.Logger().Info("Cassandra hostname not specified. Using 'cassandra' for the cassandra-create-schema job.")
	host = "cassandra" // this is the default in the image
}

And then, set the env var to the job, like:

Env: []corev1.EnvVar{{
	Name:  "CQLSH_HOST",
	Value: host,
}, {
	Name:  "MODE",
	Value: jaeger.Spec.Storage.CassandraCreateSchema.Mode,
}, {
	Name:  "DATACENTER",
	Value: jaeger.Spec.Storage.CassandraCreateSchema.Datacenter,
}, {
	Name:  "TRACE_TTL",
	Value: traceTTLSeconds,
}, {
	Name:  "KEYSPACE",
	Value: keyspace,
}, {
	Name:  "CASSANDRA_USERNAME",
	Value: username,
}, {
	Name:  "CASSANDRA_PASSWORD",
	Value: password,
}},

Finally, you'll need to add support for the new env var here:

https://github.com/jaegertracing/jaeger/blob/043d00cc9402378c886aeace116d85e578a44834/plugin/storage/cassandra/schema/docker.sh#L6-L15
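
The port handling described above could look roughly like the following sketch. It is written in Python only to match the earlier client snippet; the operator itself is Go, and this thread does not show how the eventual fix derives the port. The function names are hypothetical. CQLSH_HOST and CQLSH_PORT are the env var names mentioned in this issue, "cassandra" is the image default quoted above, and 9042 is Cassandra's default CQL port.

```python
def split_host_port(server, default_port=9042):
    """Split 'host[:port]' into (host, port).

    IPv6 literals are not handled; this is only a sketch.
    """
    host, sep, port = server.rpartition(":")
    if not sep:
        # No colon present: the whole string is the host.
        return server, default_port
    return host, int(port)


def schema_job_env(servers, default_host="cassandra"):
    """Build the host/port env vars for the create-schema job.

    Only the first entry of a comma-separated `servers` value is used.
    """
    first = servers.split(",")[0].strip() if servers else ""
    if first:
        host, port = split_host_port(first)
    else:
        host, port = default_host, 9042
    return {"CQLSH_HOST": host, "CQLSH_PORT": str(port)}


env = schema_job_env("{{NAME}}.cassandra.cosmos.azure.com:10350")
```

An alternative design would read the port from a dedicated option instead of parsing it out of the servers string; the sketch only shows that the information needed for CQLSH_PORT is available either way.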

Enable easy TLS configuration

I suppose this is again only for the create-schema job, as it should be relatively easy to add volumes/volume mounts with the TLS certs and use them as part of --cassandra.tls.ca. We don't seem to honor that for create-schema either, and apparently the create-schema script supports an env var named CQLSH_SSL. Do you know what this is about? Hopefully we can use the same solution as for the port above. You mentioned recognizing well-known CAs by default: a good place to add this would be the same script I linked above. If CQLSH_SSL isn't set, you could try to look up the certs from a well-known place. We use cassandra:3.11 as the base image; I'm not sure what the "well-known" place is for that, but it shouldn't be that hard to figure out.

@MarianZoll
Author

OK, we will create a PR for it. We'll keep you posted.

@mergify mergify bot closed this as completed in #1243 Oct 12, 2020
mergify bot pushed a commit that referenced this issue Oct 12, 2020
Signed-off-by: Ashmita Bohara <ashmita.bohara152@gmail.com>

Since jaegertracing/jaeger#2472 is merged, adding support for custom port here.

Partially Fixes: #1179
@jpkrohling
Contributor

jpkrohling commented Oct 12, 2020

Reopening, as the "well-known TLS" part is still pending.

@jpkrohling jpkrohling reopened this Oct 12, 2020
@jpkrohling
Contributor

@Ashmita152, I think there are two things that could be pursued here:

  1. the --cassandra.tls.ca flag should be honored by the create-schema script
  2. if there are certs available in a well-known place, use them. I'm not sure what's a well-known place, though: perhaps take a look at the Cassandra images and see if it ships with certs somewhere?

Each one of those should be a different issue/PR, and I would focus on the 1st only, at the moment.

@Ashmita152
Contributor

Hi @jpkrohling

Thank you for the detailed response. I looked at the Cassandra image earlier. It is based on an Ubuntu image, where the standard location for CA certificates is /etc/ssl/certs/ca-certificates.crt. I will work on it and raise two PRs accordingly.
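
To make the proposed lookup order concrete, here is a small sketch, assuming an explicitly configured CA path always wins and the bundle path named above is only a fallback. The function name resolve_ca_bundle is hypothetical; the real change would live in the create-schema shell script rather than in Python.

```python
import os

# Location of the CA bundle mentioned above for the base image.
WELL_KNOWN_CA_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"


def resolve_ca_bundle(explicit_path=None, well_known=WELL_KNOWN_CA_BUNDLE):
    """Pick the CA bundle to use for the TLS connection.

    An explicitly configured path takes precedence; otherwise the
    well-known bundle is used if it exists; otherwise None is returned
    so the caller can decide how to proceed.
    """
    if explicit_path:
        return explicit_path
    if os.path.isfile(well_known):
        return well_known
    return None
```

Returning None instead of raising keeps the decision about connecting without a bundle with the caller.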

@jpkrohling
Contributor

Hold off on the second one. We might need a better understanding of the whole situation there: it might not be appropriate to load a set of certificates for create-schema without loading the same set in the other components.

@stale

stale bot commented Dec 12, 2020

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the wontfix This will not be worked on label Dec 12, 2020
@stale stale bot closed this as completed Dec 20, 2020
3 participants