Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KIP 368: SASL Reauthentication #3754

Closed

Conversation

vctoriawu
Copy link

Added SASL reauthentication for the following feature: KIP-368

@showuon
Copy link

showuon commented Mar 12, 2022

@edenhill , could you please take a look? Thank you.

@showuon showuon mentioned this pull request Mar 12, 2022
7 tasks
@showuon
Copy link

showuon commented Mar 12, 2022

@julesbovet , please also take a look. Thank you.

Copy link

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vctoriawu , thanks for the PR. This is a very important feature to get supported! Sorry, I'm not quite familiar with the code base. Left some minor comments. Thanks.

@@ -2274,6 +2373,13 @@ static void rd_kafka_broker_connect_auth(rd_kafka_broker_t *rkb) {
: RD_KAFKA_BROKER_STATE_AUTH_LEGACY);
rd_kafka_broker_unlock(rkb);

// Check if reauth is in progress, if so then client close
if (rkb->rkb_rk->rk_conf.sasl.enable_refresh) {
if (rkb->rkb_sasl.session_lifetime_ms > 0) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if (rkb->rkb_rk->rk_conf.sasl.enable_refresh && rkb->rkb_sasl.session_lifetime_ms > 0)

_RK(sasl.enable_refresh),
"Enable token refresh for SASL connections. "
"This verification can be extended by the application by "
"implementing a oauthbearer_token_refresh_cb.",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: by implementing oauthbearer_token_refresh_cb callback. (remove a, add callback)

@@ -926,6 +926,12 @@ static const struct rd_kafka_property rd_kafka_properties[] = {
"This builtin handler should only be used for development "
"or testing, and not in production.",
0, 1, 0, _UNSUPPORTED_OAUTHBEARER},
{_RK_GLOBAL, "enable.sasl.token.refresh", _RK_C_BOOL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this property needed? I don't see anything like that in the Java client.

It is also not clear to me if SASL re-auth is useful for the other SASL mechanisms, such as PLAIN and SCRAM, since there is no way for the application to actually update the sasl.username/password

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've no insight into the need for the flag but I do think re-authentication is generally valid.

The intention of re-auth is to ensure that the client permissions are still valid so it is useful across all SASL auth types.

Every re-auth period the PLAIN and SCRAM tokens would be re-presented (unchanged on the client side) to allow the server to re-validate the session. If the account represented by the token had been suspended the re-auth call will fail and thus the session will be terminated.

Copy link
Contributor

@edenhill edenhill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

I think fiddling with the response buffer queue is a bit risky, and I believe a simpler approach is to add a new broker state (e.g., PREPARE_REAUTH) that simply waits for all oustanding requests to be completed and a new oauthbearer token to be set (triggered by refresh callback), and then transition to one of the existing AUTH states.

It is also not clear how re-auth is supposed to work with something other than OAUTHBEARER since there is currently no mechanism to update SASL credentials.

But do hold off on updating the PR, there are additional requirements inbound.

Thanks

@vctoriawu
Copy link
Author

@edenhill any updates on the additional requirements?

@showuon
Copy link

showuon commented Jun 8, 2022

@edenhill , any update? Thanks.

@showuon
Copy link

showuon commented Sep 27, 2022

@edenhill , any update for the requirements?

@k-wall
Copy link

k-wall commented Oct 27, 2022

Any way we can help with this feature?

@edenhill
Copy link
Contributor

edenhill commented Nov 1, 2022

I think the proposed solution is overly complex, and I don't think the special handling of the reauth requests is really needed.

Can't we get away with just sending the SaslHandshakeRequest upon reauth timer expiry and handling the results on response; if reauth succeeds nothing needs to be done (just restart the timer), if it fails we close the broker connection and perhaps raise a fatal error.

There's also the addition of sasl_set_credentials() in #4033 that is also needed for reauth to be meaningful.

@vctoriawu
Copy link
Author

I agree that that could be a potential solution, but reading the description of the KIP leads me to believe that this is a requirement to hold back other requests:

"the broker will communicate the session expiration time as part of the final SASL_AUTHENTICATE response. If this value is positive, then the client will automatically re-authenticate before anything else unrelated to re-authentication is sent beyond that expiration point."

also:

"Note also that the client cannot queue up additional send requests beyond the one that triggers re-authentication to occur until re-authentication succeeds and the triggering one is sent."

Not sure if I am misinterpreting

@milindl
Copy link
Contributor

milindl commented May 30, 2023

Thanks a lot @vctoriawu for starting this.

We discussed this quite a bit internally, and we decided to do it with the new broker state as Magnus suggested in the comment above. The approach does take care of the case you mention here..

The PR #4301 is a draft of that approach. I've taken the commits from this PR and built upon those. I've also added a few tests. (The credential refresh callback is yet to be implemented, though, it's only the reauthentication bit -- any credentials, if the user wants to reset them, have to be done on their own for SASL PLAIN and SCRAM.).

If you have any comment or question about this approach, please send them to that PR.

milindl added a commit that referenced this pull request Jun 14, 2023
Here's how we're doing the reauthentication:

1. In case we get a non-zero `session_lifetime_ms` in the SaslAuthenticate response for a broker `rkb`, start a timer for that broker at 90% of that.
2. [main thread] The timer is hit and the callback triggered. The callback enqueues an op on the broker `rkb`.
3. [broker thread] We get this op in `rd_kafka_broker_op_serve`, and we set max_inflight request to 1, and change the broker state into one of reauth.
4.  [broker thread] When we encounter the reauth state in `rd_kafka_broker_thread_main`, we do some cleanup for the preexisting SASL state, and just do Auth exactly the same as the normal way (when we do it the first time around). This takes care of resetting max_inflight to the correct value, too.

As the KIP and the discussion in #3754 points out, we can't send anything between the auth requests. 
Setting max_inflight to 1 means that only one request may be in flight, and since the Sasl* requests have a high priority in the queue (RD_KAFKA_PRIO_FLASH), they will actually hold the other requests back till authentication is complete. Setting it to 1 also means that any requests already in flight will await responses before the auth sequence starts.

For OAUTHBEARER, the token itself has an expiry time. There are two cases here:
1. broker's connections.max.reauth.ms > time left to token's expiry:
 In this case, the session_lifetime_ms in the SaslAuthenticate response is set to the time left for the token's expiry. Since our OAUTHBEARER callback runs at 80% of (time left to token's expiry) and our reauth runs at 90% of (time left to token's expiry), we'll refresh the token before the reauth.
 It's somewhat trickier than that, because `next_token_refresh_time := client_time + 0.8*(token_expiry - client_time)` and `reauth_time := client_time + 0.9*session_lifetime_ms`. Since session_lifetime_ms is calculated on the server, it might have some drift between the clocks, as well as well as discount the time it takes for the communication between the client/server. But it's expected that even if `token_expiry - client_time != session_lifetime_ms`, the 0.8/0.9 factors will make up for it, as typical token refresh/re-authentication times are on the order of hours (and not seconds).
2. broker's connections.max.reauth.ms < time left to token's expiry:
The broker returns connections.max.reauth.ms as the session_lifetime_ms. Since the token has a later expiry, we just use the same token to reauthenticate. 

---------

Co-authored-by: Victoria Wu <victoriawu@microsoft.com>
Co-authored-by: Emanuele Sabellico <esabellico@confluent.io>
@milindl
Copy link
Contributor

milindl commented Jul 13, 2023

Closing this with the 2.2.0 release which has this feature. Thanks

@milindl milindl closed this Jul 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants