Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SM can't make alternator query ping when authorization is required #4036

Open
Michal-Leszczynski opened this issue Sep 16, 2024 · 5 comments
Assignees
Labels
alternator bug Something isn't working

Comments

@Michal-Leszczynski
Copy link
Collaborator

Up to this point, we were testing alternator without enforced authorization:

alternator_enforce_authorization: false

We should test the enabled version instead, but it looks like SM does not handle this case well.
Simply enforcing authorization results in:

=== RUN   TestPingIntegration/query
    dynamoping_integration_test.go:34: MissingAuthenticationTokenException: Authorization header is mandatory 

Here is the dynamodb config used for pinging alternator:

// Config specifies the ping configuration, note that timeout is mandatory and
// zero timeout will result in errors.
type Config struct {
	Addr                   string
	RequiresAuthentication bool
	Username               string
	Password               string
	Timeout                time.Duration
	TLSConfig              *tls.Config
}

Even though it contains the Password and Username fields, they are never set.
The same goes for the RequiresAuthentication field.
They are only used here:

	sess := session.Must(session.NewSessionWithOptions(session.Options{
		Config: aws.Config{
			Endpoint:   aws.String(config.Addr),
			Region:     aws.String("scylla"),
			HTTPClient: httpClient(config),
		},
	}))

	if config.RequiresAuthentication && config.Username != "" && config.Password != "" {
		sess.Config.Credentials = credentials.NewStaticCredentials(config.Username, config.Password, "")
	} else {
		sess.Config.Credentials = credentials.AnonymousCredentials
	}

Which means that we always use the anonymous credentials.

The problem here is not about simply setting Password, Username, RequiresAuthentication fields to known values.
According to alternator docs, the password used for authenticating alternator queries is not the CQL password, but rather its salted hash kept in system.roles table. This means that even when user specifies CQL credentials for the cluster, we are still missing the alternator password.

The short workaround would be to only use QueryPing when RequiresAuthentication = false.
The proper fix would be to add a new cluster fields --alternator-user, --alternator-password and use them for alternator healthcheck.

I was also thinking about SM retrieving the salted hash from Scylla itself when the CQL credentials are set (by querying the system.roles table), but this seems like something easy to break in the future. Not to say that SM might not have the right permissions to read from system.roles table in the first place.

@Michal-Leszczynski Michal-Leszczynski added bug Something isn't working alternator labels Sep 16, 2024
Michal-Leszczynski added a commit that referenced this issue Sep 16, 2024
@Michal-Leszczynski
Copy link
Collaborator Author

FYI @tzach @karol-kokoszka

@Michal-Leszczynski Michal-Leszczynski self-assigned this Sep 16, 2024
@Michal-Leszczynski
Copy link
Collaborator Author

Is it even important for use to use the query ping on alternator healthcheck instead of the simple ping?
Maybe we are fine with a short workaround that does not require any additional cluster flags?
Query ping:

// QueryPing checks if host is available, it returns RTT and error. Special errors
// are ErrTimeout and ErrUnauthorised. Ping is based on executing
// a real query.

Simple ping:

// SimplePing sends GET request to alternator port and expects 200 response code.

@karol-kokoszka
Copy link
Collaborator

@Michal-Leszczynski
The short-term workaround with calling simple ping on alternator is fine.
But It's not only the healthcheck service that is affected. Credentials are needed for backup and restore as well.
I'm pretty sure we are not testing these services with the alternator at all.

So, the short term is not only to change the alternator ping to simple ping when authorization is required, but document this fact in the documentation + test backup and restore against alternator.

The long term is to change the API and include the flags you mentioned.

@karol-kokoszka
Copy link
Collaborator

OK, thanks for the explanation @Michal-Leszczynski on slack.
This is just alternator API password. Backup and restore use CQL API, so the CQL password is enough.

So, let's use the simple ping at the moment.

@Michal-Leszczynski
Copy link
Collaborator Author

Here is some proof that enforcing alternator authentication results in failed alternator healthcheck (even when CQL creds are provided):

miles@fedora:~/scylla-manager$ ./sctool.dev cluster add --host 192.168.200.11 --auth-token token -u cassandra -p cassandra --name myc
351bc3d3-1f05-4deb-9eba-8a5cddd50624
 __  
/  \     Cluster added! You can set it as default, by exporting its name or ID as env variable:
@  @     $ export SCYLLA_MANAGER_CLUSTER=351bc3d3-1f05-4deb-9eba-8a5cddd50624
|  |     $ export SCYLLA_MANAGER_CLUSTER=myc
|| |/    
|| ||    Now run:
|\_/|    $ sctool status -c myc
\___/    $ sctool tasks -c myc

miles@fedora:~/scylla-manager$ ./sctool.dev status -c myc
Datacenter: dc1
╭────┬────────────┬───────────┬───────────┬────────────────┬──────────┬──────┬─────────┬────────┬──────────┬──────────────────────────────────────╮
│    │ Alternator │ CQL       │ REST      │ Address        │ Uptime   │ CPUs │ Memory  │ Scylla │ Agent    │ Host ID                              │
├────┼────────────┼───────────┼───────────┼────────────────┼──────────┼──────┼─────────┼────────┼──────────┼──────────────────────────────────────┤
│ UN │ DOWN (9ms) │ UP (7ms)  │ UP (2ms)  │ 192.168.200.11 │ 2h36m33s │ 16   │ 31.069G │ 6.0.1  │ Snapshot │ e7240eff-d6ef-4972-a137-d5bad1d6a02c │
│ UN │ DOWN (9ms) │ UP (16ms) │ UP (3ms)  │ 192.168.200.12 │ 2h36m33s │ 16   │ 31.069G │ 6.0.1  │ Snapshot │ a78f1f32-c3f5-4497-bfcb-95101599970b │
│ UN │ DOWN (8ms) │ UP (15ms) │ UP (20ms) │ 192.168.200.13 │ 2h36m33s │ 16   │ 31.069G │ 6.0.1  │ Snapshot │ 103304a2-6df0-4da3-9bc6-4cf8a83f43ec │
╰────┴────────────┴───────────┴───────────┴────────────────┴──────────┴──────┴─────────┴────────┴──────────┴──────────────────────────────────────╯
Errors:
- 192.168.200.11 alternator: MissingAuthenticationTokenException: Authorization header is mandatory for signature verification
        status code: 400, request id: 
- 192.168.200.12 alternator: MissingAuthenticationTokenException: Authorization header is mandatory for signature verification
        status code: 400, request id: 
- 192.168.200.13 alternator: MissingAuthenticationTokenException: Authorization header is mandatory for signature verification
        status code: 400, request id: 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
alternator bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants