diff --git a/pip/pip-338.md b/pip/pip-338.md new file mode 100644 index 0000000000000..f3acce5c063de --- /dev/null +++ b/pip/pip-338.md @@ -0,0 +1,172 @@ + +# PIP-338: Add default lookup listener and fix inconsistency with listener's usage between different protocols + +# Background knowledge +Currently, the `internalListenerName` specified in the broker config assists in selecting a listener from the `advertisedListeners` list to define the service URL for broker-to-broker communication. +On the client side, for lookup requests, there is an option to provide the listener name either as a query parameter or in the header [X-Pulsar-ListenerName]. This choice indicates which listener should be chosen from the list of advertised listeners as a response to the client's request. + +The expectation is that, in the event of a 200/307 response, the broker service address corresponding to the listener passed in the request will be returned, as outlined in PIP-95 (Pull Request: https://github.com/apache/pulsar/pull/12072). + +# Motivation +## Issue in existing Code flow +Even if the client sends a `listenerName` (PIP-91/PIP-95) corresponding to the http protocol broker’s addresses, it is only used in the `pulsar` and `pulsar+ssl` protocols and is not consistent for the other protocols. + +See [code](https://github.com/apache/pulsar/blob/a77fb6ed165ac8ad6968558077d80ea3edaf9a7e/pulsar-broker/src/main/java/org/apache/pulsar/broker/namespace/NamespaceService.java#L482) +``` + if (options.hasAdvertisedListenerName()) { + AdvertisedListener listener = + nsData.get().getAdvertisedListeners().get(options.getAdvertisedListenerName()); + if (listener == null) { + future.completeExceptionally( + new PulsarServerException("the broker do not have " + + options.getAdvertisedListenerName() + " listener")); + } else { + URI url = listener.getBrokerServiceUrl(); + URI urlTls = listener.getBrokerServiceUrlTls(); + future.complete(Optional.of(new LookupResult(nsData.get(), + url == null ? null : url.toString(), + urlTls == null ? null : urlTls.toString()))); + } + return; + } else { + future.complete(Optional.of(new LookupResult(nsData.get()))); + } + + } +``` +As we can see, in the existing flow, the listenerName is not acknowledged in case of http protocol. +Here, it is trying to obtain the broker service url. If the listener has service url/tls (pulsar protocols) then they are used in the response, but the HTTP urls are from the internal listener only (when configured). Later in the call chain, the http urls are used for redirect location - which may be non resolvable for the client. + +See [code](https://github.com/apache/pulsar/blob/a77fb6ed165ac8ad6968558077d80ea3edaf9a7e/pulsar-broker/src/main/java/org/apache/pulsar/broker/lookup/TopicLookupBase.java#L103) +```java + if (result.isRedirect()) { + boolean newAuthoritative = result.isAuthoritativeRedirect(); + URI redirect; + try { + String redirectUrl = isRequestHttps() ? result.getLookupData().getHttpUrlTls() + : result.getLookupData().getHttpUrl(); + if (redirectUrl == null) { + log.error("Redirected cluster's service url is not configured"); + throw new RestException(Response.Status.PRECONDITION_FAILED, + "Redirected cluster's service url is not configured."); + } + String lookupPath = topicName.isV2() ? LOOKUP_PATH_V2 : LOOKUP_PATH_V1; + String path = String.format("%s%s%s?authoritative=%s", + redirectUrl, lookupPath, topicName.getLookupName(), newAuthoritative); + path = listenerName == null ? path : path + "&listenerName=" + listenerName; + redirect = new URI(path); + } + +``` + +## Need to have lookupListener +The Pulsar broker has the concept of advertised listeners representing broker endpoints that are discoverable by, and accessible to, Pulsar clients. In addition, one listener may be designated as the internal listener, to be used for broker-to-broker communication. Listeners may be understood as named routes to the broker. If a listener name is not specified, the response contains the endpoint for the internal listener, or the first listener for the protocol. + +But the challenge arises when it considers the same internalListener for lookup requests redirects. This may result in returning an unresolvable broker service address to the clients. +It may not be possible to pass the listenerName from the clients, which consequently may lead to cluster downtime. + +it might not be feasible to have the listerner config at the client side in every tech stack or connector. +Also, Currently, there is no option to pass the listener while using the admin APIs. As admin APIs can be called from an external network as well, the use of the internal listener’s broker service URL can lead all admin operations to get affected. + +Moreover, as per current code provided above, even if the client provides a listenerName/listener header, the redirect urls are taken from the internal listener or the first listener with http protocol which may not be a client-resolvable http address. + +# Proposed Solution +To fix inconsistency with all protocol’s service url, we need to include all broker addresses associated with the listener name while returning the lookupResult, rather than solely the service URL + +Also, Introduce a broker-level property to determine the appropriate listener for lookup calls if listenerName is missing from the client request. This can help deal with the side effects of not having listerName on the client side when multiple advertisedListeners are being used. + +Also, this can help to determine the broker service URL that needs to be returned in case of lookup redirects and also for using Pulsar admin API calls from outside the network. + +This will help in having a more transparent listener identifier for internal and external communications. + + +# Design & Implementation Details +## Approach +**1. Fix Code Flow - Inconsistency with listener usage with protocols other than pulsar and pulsar+ssl** + +Currently, lookup result return the broker service url corresponding to the internal listerner in case of http or https protocol urls. To address the issue in the workflow, it is necessary to include all broker addresses associated with the listener name, rather than solely the service URL. There can be a check if all broker addresses corresponding to the given listener is null, then use the default listener(discussed below) + +**2. Only return the service URL corresponding to the lookupListener** + +This approach introduces one new configuration `lookupListenerName` in the broker.conf. The `lookupListenerName` is used to specify which service URL should be returned to the clients from the lookup requests or redirects. `lookupListenerName` should be present in the `advertisedListeners` list. +Users can set up the `lookupListenerName`. for example: + +``` +existing configs: +advertisedListeners=internal:pulsar://xyz-broker:6660,internal:pulsar+ssl://192.168.1.11:6651,external:http://192.168.1.11:8080 +internalListenerName=internal + +new config required: +lookupListenerName=external +``` + +In the approach, when the client sends a lookup request to the broker to get the owner broker without the listnerName, the broker only returns one service URL that is with the same listener name for which the default value of the lookupListenerName is given in the broker configuration. Therefore, we must allow the client to get the corresponding service URL with the same advertised listener name as present in the broker config. + +This approach's purpose is to keep client-side simple and handle situations when not having listenerName information can lead to breaking scenarios. By having the default value, clients should be able to connect with the broker even if the request lands to the broker who is not the owner and redirects to some other broker. + +## Changes +1. Fix the lookup result formation with all broker addresses rather than only service url + in NamespaceService.java in following methods: +```java +- CompletableFuture> findBrokerServiceUrl(NamespaceBundle bundle, LookupOptions options) + +- void searchForCandidateBroker(NamespaceBundle bundle, CompletableFuture> lookupFuture, LookupOptions options) + +- CompletableFuture createLookupResult(String candidateBroker, boolean authoritativeRedirect, final String advertisedListenerName) +``` +2. The pulsar broker should return the right listener for a given topic according to the lookup listener name which the broker returns to the clients. So, we can have a check in the PulsarWebResource +``` +PulsarWebResource { + if(clientListener is not present) + use lookupListener from broker conf + else return listener from client configs +} +``` +This check can be applied in all the below-mentioned classes which are extending the PulsarWebResource. This can be added while creating the lookupOptionsBuilder: +``` +TopicLookupBase.java +TopicBase.java +PulsarWebResource.java +``` + +Since we are supplying the listener solely for creating the lookupOptionsBuilder, it will exclusively apply to lookup requests and can be used for admin API calls because admin calls can come from outside the network as well. To note here, This config will not be utilized or override the internalListener usage for internal communication. +### Configuration +Add one configuration in the broker.conf +``` + +# The Default value is absent, the broker uses the internal listener as the lookup listener redirects. +# This config can be used to specify the listener needs to be considered from the list of advertisedListeners for lookups requests if clients are not passing the listener information. This will be overriden in case the listener value is provided by the clients. +# lookupListenerName= + +#for backward compatibility, by default the value will be set as false +preferHttpClientListenerOverInternalListener=false +``` +### Test Plan +The behaviour after adding a lookupListenerConfig and preferHttpClientListenerOverInternalListener can be validated by having a test cases where + +CASE I: +The lookup requests should return the broker address corresponding to the lookup listener if listener not passed in the lookup request. + +CASE II: +The lookup requests with listenerName in the query parameter. It should return the broker address corresponding to the listener mentioned in the query param. This is to verify the bug fix and also priority of listener in the request over the config + +CASE III: +The lookup request with listener corresponding to the http and https protocol broker’s address, should return the same in the lookup result to validate the bug fix + +CASE IV: +To test backward compatibility, we can have a test where the service url corresponding to the internal listener should be returned if the `preferHttpClientListenerOverInternalListener` +is set as false and no listener is passed in the request. + +## Public-facing Changes +### Backward Compatibility +The above mentioned approach(2) of having a new config as lookupListener is backward compatible as there should not be any impact on the existing users if this configuration would not be set. Users can set this configuration for listener selection for the external communication as per their usecase. The same listener would be picked up for the admin APIs as well if admin calls are coming outside the network. + +Approach(1), Fixing the code flow for http protocol might be a breaking change for the clients where we are returning the all broker addresses corresponding to the listener rather than just having service url. This can be breaking for the clients who are used to this inconsistency of having internal listener’s service url as lookup request. To deal with this, we can have a flag - `preferHttpClientListenerOverInternalListener` which would indicate whether this change should be enabled for clients or not. +By Default, the value of this flag can be set as false for backward compatibility. + +## Revert +This fix can be controlled by a config - `preferHttpClientListenerOverInternalListener` which we are adding for a backward compatibility. So there would not be a need to revert this change. + +# Links +* Mailing List discussion thread: +* Mailing List voting thread: