Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the use of metric.reporters in OAuth metrics, use strimzi.metric.reporters instead #193

Merged
merged 6 commits into from
Jul 7, 2023

Conversation

mstruk
Copy link
Contributor

@mstruk mstruk commented Jun 28, 2023

This PR removes the use of metric.reporters in OAuth metrics to configure the reporters to instantiate.
The reason is that OAuth metrics integrates with the reporters by creating new instances of them, resulting in one instance being created by Kafka broker, and another instance created by OAuth metrics. Not all reporters work properly when multiple instances are instantiated.

With this PR OAuth metrics, when enabled, will use strimzi.oauth.metric.reporters as a list of reporters to instantiate.

If strimzi.oauth.metric.reporters is not set, org.apache.kafka.common.metrics.JmxReporter will automatically be instantiated, otherwise only the specified reporters will be instantiated. The config option metric.reporters will be ignored by OAuth metrics.

In order to migrate existing broker configurations use strimzi.oauth.metric.reporters as the configuration option rather than metric.reporters.

STRIMZI_OAUTH_METRIC_REPORTERS=org.apache.kafka.common.metrics.JmxReporter,org.some.package.SomeReporter

See README.md for more information.

…etric.reporters` instead.

JmxReporter has to be listed explicitly to be instantiated.

Signed-off-by: Marko Strukelj <marko.strukelj@gmail.com>
@mstruk mstruk added this to the 0.13.0 milestone Jun 28, 2023
Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where will this new option strimzi.metrics.reporter be used? If I got it rght and it is in the general configuration of Kafka, should it rather be a strimzi.oauth.metrics.reporter?

@scholzj scholzj requested review from tombentley and mimaison June 28, 2023 20:18
@mstruk
Copy link
Contributor Author

mstruk commented Jun 29, 2023

Where will this new option strimzi.metrics.reporter be used? If I got it rght and it is in the general configuration of Kafka, should it rather be a strimzi.oauth.metrics.reporter?

This option is global configuration for the Kafka broker yes, so it should not be passed as part of a specific listener configuration, rather as part of client.properties on the client, or server.properties on the broker (except as I describe in the README it should really be set as env var or a system property rather than in server.properties.

Generally I use 'oauth.' properties for JAAS config options, and 'strimzi.authorization.' as globals that are specific to KeycloakAuthorizer. This option works across the board for the whole Kafka instance thus I opted for 'strimzi.*' prefix.

But it is true that it's the OAuth library that uses this option exclusively so your suggestion makes sense.

@scholzj
Copy link
Member

scholzj commented Jun 29, 2023

I do not think the strimzi. itself is an issue. Its more without the .oauth behind it, it possibly collides with other Strimzi options. So if it is something only Oauth reads as far as I understood, I think strimzi.oauth... would be better if we stick with the strimzi. part. Not sure if @tombentley for example has some thoughts about it.

@tombentley
Copy link
Member

strimzi.oauth... seems better to me.

Comment on lines 99 to 101
Pattern pattern = Pattern.compile(prepareRegex(regex));
for (String line: log) {
if (pattern.matcher(line).matches()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit weird to have prepareRegex() and then use Matcher.matches(), rather than just using Matcher.find(). Using find() would be the preferred way to do this, but if there's a good reason for it then I think it at least deserves a comment.

Copy link
Contributor Author

@mstruk mstruk Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. That's a dumb code. I'll remove the redundant prepareRegex() and use find()

README.md Outdated

Metrics are exposed through JMX managed beans. They can also be exposed as Prometheus metrics by using the Prometheus JMX Exporter agent, and mapping the JMX metrics names to prometheus metrics names.
NOTE: This breaks compatibility with OAuth versions preceding 0.13.0 where `metric.reporters` was used to configure reporters which were consequently automatically instantiated twice.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this compatibility break unavoidable?

Copy link
Contributor Author

@mstruk mstruk Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not breaking compatibility means to keep checking if 'metric.reporters' is set and if it is set instantiating the configured reporters. That is what happens when we configure CruiseControl metric reporter which results in exceptions at start time. We could say 'if strimzi.oauth.metric.reporters is NOT set, behave as before - if metric.reporters was set instantiate the reporters mentioned there, if JmxReporter is not mentioned instantiate it anyway, except if running on Kafka 3.5.0 in which case check if auto.include.jmx.reporter is set to false in which case don't instantiate it (but in the future with Kafka 4.0 this check will have to be modified again as auto.include.jmx.reporter will no longer be taken into account, and JmxReporter will not be automatically instantiated if not mentioned in metric.reporters.
The result of us maintaining compatibility is that we don't address automatic double instantiation and people may face exceptions and startup issues as a result of using oauth.enable.metrics or strimzi.authorization.enable.metrics, like we face it in Strimzi when CruiseControl is enabled.
By breaking this backwards compatibility we can ensure that no automatic double instantiation occurs as user has to not only enable oauth metrics, but also specify the reporters to instantiate. Also we can immediately move on to use Kafka 4.0 configuration model for 'our' metric reporters which is where Kafka is moving.
If we keep providing backwards support the question is then do we also ensure that in the future Kafka 4.0 the behaviour will mimic that of Kafka.
I clearly lean towards breaking backwards support but if it's deemed a 'no-no' in that we'll do whatever it takes to not break backwards compatibility then it is possible to keep it.

README.md Outdated
Comment on lines 1236 to 1247
The OAuth metrics ignores the Kafka `metric.reporters` option in order to prevent automatically instantiating double instances of reporters. Most reporters may expect that they are singleton object and may not function properly in multiple copies.
Instead, there is `strimzi.metric.reporters` option where the reporters that support multiple copies can be specified for the purpose of metrics integration:
- `strimzi.metric.reporters` (e.g.: "org.apache.kafka.common.metrics.JmxReporter", use ',' as a separator to enable multiple reporters)

The configuration option `strimzi.metric.reporters` has to be configured as an env variable or a system property. Using it on the Kafka broker inside `server.properties` does not work reliably due to multiple pluggability mechanisms that can be used (authorizer, authentication callback handler, client).
Some of these mechanisms get the `server.properties` filtered so only configuration recognised by Kafka makes it through. However, this is a global OAuth Metrics configuration, and it is initialized on first use by any of the components, using the configuration provided to that component.
Specifically, the inter-broker client using OAUTHBEARER might be the first to trigger OAuth Metrics initialisation on the broker, and does not see this config option.

In order to reliably configure `strimzi.metric.reporters` one of the following options should be used when starting a Kafka broker:
- `STRIMZI_METRIC_REPORTERS` env variable
- `strimzi.metric.reporters` env variable or system property

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of verbose description which isn't completely easy to understand, but which people would need to follow in order to make use of this feature. I don't see anywhere actual examples they they could copy and paste and expect to work. Could we add some examples somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are different ways to configure it.

The simplest one would be:

# Enable OAuth metrics for all listeners, Keycloak authorizer, and inter-broker clients:
export OAUTH_ENABLE_METRICS=true
export STRIMZI_OAUTH_METRIC_REPORTERS=org.apache.kafka.common.metrics.JmxReporter
bin/kafka-server-start.sh config/server.properties

I wouldn't go more complex with it. Let people read the docs. If the docs are not clear enough we'll improve.

You can have more complex config for different contexts. For example for the Kafka client you can have:

~/client.properties:
...
sasl.jaas.config=org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
  oauth.token.endpoint.uri="https://server/token-endpoint" oauth.client.id="clientId" oauth.client.secret="client-secret" oauth.enable.metrics="true";

strimzi.oauth.metric.reporters=org.apache.kafka.common.metrics.JmxReporter
...

Start client:

bin/kafka-console-producer.sh --broker-list kafka:9092 --topic my-topic  --producer.config=$HOME/client.properties

But it could also simply be:

export OAUTH_ENABLE_METRICS=true
export STRIMZI_OAUTH_METRIC_REPORTERS=org.apache.kafka.common.metrics.JmxReporter
bin/kafka-console-producer.sh --broker-list kafka:9092 --topic my-topic  --producer.config=$HOME/client.properties

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added examples to README.

if (configMap.get(STRIMZI_METRIC_REPORTERS) != null) {
reporters = kafkaConfig.getConfiguredInstances(STRIMZI_METRIC_REPORTERS, MetricsReporter.class);
} else {
log.warn("Configuration option '{}' is not set, OAuth metrics will not be exported to JMX", STRIMZI_METRIC_REPORTERS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We define the config as low importance below and without a default value but if it's not set we emit a warning. Should we increase the importance or downgrade that log level?

Copy link
Contributor Author

@mstruk mstruk Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how you decide on the importance of the particular configuration option. In this case the warning is most welcome because it is almost certainly an omission that STRIMZI_METRIC_REPORTERS was not configured. With metrics enabled and not exported to JMX there is no way to collect metrics by any mechanism (as far as I know).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not setting it generates warnings, it looks pretty important to me. That said I'm not sure what conventions are used in other parts of the Strimzi code so if none of the maintainers pick on this, maybe we can ignore it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree with Mickael that this seems a bit weird. I guess when you put it that enabling metrics but not configuring the metrics reporter would be wrong, that makes sense. But why does the user need to:

  • Enable the metrics
  • And configure the metrics reporter

Why not do just one and have the other done automatically by the Oauth library? I guess I'm missing some obvious reason why it is not possible, but it would be good to understand it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For OAuth metrics I didn't want them to be enabled by default in order to avoid potential issues (performance or memory consumption for example). And that did allow us to very easily work around CruiseControl issue - everything was fine until someone enabled OAuth metrics. The solution was - don't enable OAuth metrics.

OAuth metrics also has finer control on metrics collection. You can enable metrics globally, or for just particular listeners, or just for KeycloakAuthorizer.

Kafka metrics are enabled by default and which of them are enabled is controlled by metrics.recording.level. But as far as I understand there is no way to completely turn them off.

For OAuth we could change the default for the configuration (enabled by default and you can disable it), but if we otherwise keep backwards compatibility around metric.reporters this will result in things crashing / acting up by default when non-compatible metric reporters (such as CruiseControl metric reporter) are configured, as we would double-instantiate them.
We could then enable metrics gathering only when some reporters are configured via 'strimzi.oauth.metric.reporters'. However, if we made this change then absence of strimzi.oauth.metric.reporters if we want to maintain backwards compatibility with metrics.reporters would mean that we have to instantiate JmxReporter each and every time even if metric.reporters is not set.

I would prefer for now to NOT have OAuth metrics enabled by default. Let's have it opt-in until we're confident that there are no issues when using them (like the issue this PR is addressing of things crashing when a reporter not compatible with OAuth metrics is configured).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand why OAuth metrics should not be enabled by default. But why does the user need to configure the strimzi.oauth.metric.reporters? Isn't it Strimzi OAuth who consumes this option anyway? So why not set it automatically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes that's a great point. I was motivated by aligning with where Kafka's handling of metric.reporters is headed.
With Kafka 4.0 there will be no exporting of (automatically enabled) metrics to JMX unless a specific reporter is configured that provides such an export.

We can opt to not go this way to avoid needing to set two configurations (Kafka does not have that problem since its metrics are always enabled). We could for example handle omission of strimzi.oauth.metric.reporters by automatically creating JmxReporter. If user specifies strimzi.oauth.metric.reporters we could only instantiate the configured reporters, but we would not check if JmxReporter is part of the config, and we would not automatically instantiate it if not present in the config (that's different from current metric.reporters handling when we automatically instantiate JmxReporter if it is not part of the metric.reporters list).

That probably makes more sense in our case as then it is not necessary to always set it, and user should consult the documentation anyway to determine the behaviour and that it is not the same as for Kafka metrics configured via metric.reporters.
Also the presence of strimzi.oauth.metric.reporters= (configured and set to empty value) could then mean to NOT instantiate a JmxReporter if for some reason one really wants to have no reporters attached to the OAuth Metrics object (and thus no export to JMX).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also keep in mind that hopefully that should all disappear in the near future once KIP-877 gets in Kafka. So maybe we want to keep things simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll modify the logic as described above. It means no backwards compatibility in the sense that metric.reporters will be ignored. But if strimzi.oauth.metric.reporters is set it will be used, if not set, JmxReporter will automatically be instantiated.

* Close any configured Services so they can be reinitialised again
*/
public static synchronized void close() {
services = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really closing services

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one day it will be. There are at the moment no threads running in there so I simply release the memory. You think this should be called differently or the docs should be different or you think the implementation should be different?

mstruk added 4 commits July 4, 2023 09:40
Signed-off-by: Marko Strukelj <marko.strukelj@gmail.com>
Signed-off-by: Marko Strukelj <marko.strukelj@gmail.com>
Signed-off-by: Marko Strukelj <marko.strukelj@gmail.com>
Signed-off-by: Marko Strukelj <marko.strukelj@gmail.com>
README.md Outdated
@@ -1225,21 +1225,73 @@ Configuring the metrics

By default, the gathering and exporting of metrics is disabled. Metrics are available to get an insight into the performance and failures during token validation, authorization operations and client authentication to the authorization server. You can also monitor the authorization server requests by background services such as refreshing of JWKS keys and refreshing of grants when `KeycloakAuthorizer` is used.

You can enable metrics for token validation on the Kafka broker or for client authentication on the client by setting the following JAAS option to `true`:
You can enable metrics for token validation, and keycloak authorization on the Kafka broker or for client authentication on the client by setting the following JAAS option to `true`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be Keycloak? Or should it be marked as a code to indicate some configuration option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to: KeycloakAuthorizer

README.md Outdated
- `oauth.enable.metrics` (e.g.: "true")

You can enable metrics for `KeycloakAuthorizer` by setting an analogous option in Kafka broker's `server.properties` file:
You can also enable metrics only for keycloak authorization by setting an analogous option in Kafka broker's `server.properties` file:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to: KeycloakAuthorizer

…m it. Address README comments.

Signed-off-by: Marko Strukelj <marko.strukelj@gmail.com>
@mstruk
Copy link
Contributor Author

mstruk commented Jul 5, 2023

@tombentley @mimaison Any more unaddressed issues here?

Copy link
Member

@tombentley tombentley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Marko.


The OAuth metrics also honor the `metric.reporters`, `metrics.num.samples`, `metrics.recording.level` and `metrics.sample.window.ms` configurations of Kafka runtimes. When OAuth metrics are enabled the OAuth layer creates duplicate instances of `JmxReporter` and the configured `MetricReporter`s since at the moment there is no other way to integrate with the existing metrics system of Kafka runtimes.
The configuration option `strimzi.oauth.metric.reporters` on the Kafka broker has to be configured as an env variable or a system property. Using it on the Kafka broker inside `server.properties` does not work reliably due to multiple pluggability mechanisms that can be used (authorizer, authentication callback handler, inter-broker client).
Some of these mechanisms get the `server.properties` filtered so only configuration recognised by Kafka makes it through. However, this is a global OAuth Metrics configuration, and it is initialized on first use by any of the components, using the configuration provided to that component.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these mechanisms get the server.properties filtered so only configuration recognised by Kafka makes it through.

Not a comment about this PR, really an observation about Kafka being inconsistent/the Configurable contract being rather weak. If Configurable itself isn't going to specify whether configure() is getting the originals, or some filtered subset then it should be documented on each subclass what is passed. Thoughts @mimaison?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the "contract" was to always pass originals to plugins so they can have custom configurations.
https://issues.apache.org/jira/browse/KAFKA-7588

Copy link
Member

@tombentley tombentley Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then @mstruk which plugin did you find where the originals weren't propagated to configure()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the inter-broker communication client that gets initialized with the already parsed Configuration object, which then no longer contains Map<String, String> but Map<String, ?> and the unrecognised config options are missing.

@mstruk mstruk merged commit a0c4c34 into strimzi:main Jul 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants