Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIFI-14104 allow setting of HTTP Request Headers for Elasticsearch requests #9606

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ChrisSamo632
Copy link
Contributor

Summary

NIFI-14104 allow setting of HTTP Request Headers for Elasticsearch requests

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • [ ] New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • [ ] New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • [ ] Documentation formatting appears as expected in rendered files

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together @ChrisSamo632.

The general approach seems fairly straightforward in terms of mapping prefixed dynamic properties from Processors through to Controller Services. It is notable that this impacts the method signatures of all methods, which raises a general question: should support for custom request headers be at the Processor level, or is it sufficient to be configured at the Controller Service level? In other words, do you expect the need to influence request headers for every call to Elasticsearch, or is this something that could be set on the Controller Service and applied for all requests? The Run As User example mentioned in the Jira seems like it could go either way depending on the flow design. However, if it is sufficient to support custom headers at the Controller Service level, that would reduce the complexity and scope significantly.

@ChrisSamo632
Copy link
Contributor Author

Thanks for the question @exceptionfactory.

I thought this was worth sticking with when thinking about a use case such as using NiFi as a webserver that can receive files/data through a HandleHttpRequest or ListenHttp processor where the user to be used in Elasticsearch for storage is presented as a request header/attribute that is extracted from the request to a FlowFile attribute, then used when sending the document to Elasticsearch.

A similar use case would be the same sort of thing but with a JWT presented per request that could then be used for authentication with Elasticsearch.

These use cases would need careful thought by the Flow designer so as to ensure that only valid/applicable users were passed in the request. An alternative may be that the Flow contains logic to change the "run as user" based upon some other runtime attributes or FlowFile content.

The problem with the Controller Service-only approach is that it's limited to a single setup per service instance, i.e. it can be configured to connect/run as one single user. The per-processor/FlowFile approach means that multiple users could be used through the same service instance.

There are potentially other request headers that one might want to control on a per-request basis to Elasticsearch, but "run as user" (or JWT auth) in the one that seemed like it might make this change worthwhile, alongside #9605 for NIFI-10831 that allows the simpler configuration of a single user/JWT provider for an alternative static authentication mechanism between NiFi and Elasticsearch.

@ChrisSamo632
Copy link
Contributor Author

@exceptionfactory what do you think about a change to the Elasticsearch Service's API to replace the existing requestParams and new requestHeaders maps with a more generic combined requestOptions object?

Thinking that, in the future, there may well be other per-request options that would be useful, and agreeing that it would be best to avoid wholesale changes to all methods on the Service API in future tickets, such a change here would help reduce the impact of future changes in the same area.

@exceptionfactory
Copy link
Contributor

@exceptionfactory what do you think about a change to the Elasticsearch Service's API to replace the existing requestParams and new requestHeaders maps with a more generic combined requestOptions object?

Thinking that, in the future, there may well be other per-request options that would be useful, and agreeing that it would be best to avoid wholesale changes to all methods on the Service API in future tickets, such a change here would help reduce the impact of future changes in the same area.

That sounds like a good approach. That would support passing with parameters and headers for this set of changes, and make it easier to introduce any other changes down the road.

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisSamo632, returning to this pull request, does it still seem necessary in light of #9605 supporting the Run As User property?

The tests use Accept as an example, but that is actually a good example of potentially breaking changes that could be introduced through dynamic headers, making the request invalid. Dynamic properties provide a useful way to cover a large number of custom configuration scenarios without introducing many explicit properties. In the case of Elasticsearch, however, the HTTP request seems more narrowly defined in terms of possible options that Elasticsearch supports.

I think the general implementation approach looks good, but if there isn't a strong case for a variety of custom HTTP request headers, it seems better to avoid introducing this level of complexity for now.

Comment on lines 28 to 29
this.requestParameters = requestParameters == null ? new HashMap<>() : requestParameters;
this.requestHeaders = requestHeaders == null ? new HashMap<>() : requestHeaders;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend wrapping the input Maps in unmodifiableMap.

@ChrisSamo632
Copy link
Contributor Author

@exceptionfactory thanks for continuing to look at this.

I think the general implementation approach looks good, but if there isn't a strong case for a variety of custom HTTP request headers, it seems better to avoid introducing this level of complexity for now.

The use of request headers for Elasticsearch interaction is somewhat limited, but becomes a bit more relevant if people are using Elasticsearch-like services and/or proxies that require/allow extra request headers in-between.

You're right that the use of Accept headers in the tests does highlight the possibility of someone introducing a configuration that would break NiFi's interaction with Elasticsearch, but the same could be said for other combinations of properties across the Elasticearch components within NiFi.

The main use case I was raising this for is to allow for per-request (i.e. FlowFile of data coming through an Elasticsearch processor) setting of some request headers, namely the run-as-user. But I then chose not to put that in the tests as it may cause issues/confusion if not paired with other required headers that such an authentication mechanism needs within Elasticsearch.

There are Flows where one may wish to index data as different users, but wouldn't necessarily know those users upfront, nor wish to create an ElasticsearchClientService controller and matching *Elasticsearch processor combination for every user that will be used for data ingest. The main example of this kind of approach that I can think of would be if the NiFi Flow were fronted by a HandleHttpRequest/ListenHTTP processor that is acting as a RESTful API endpoint - a setup that I've wanted to create in the past, but have had to agree for an alternative workaround in the meantime (which can complicate RBAC/ABAC controls within Elasticsearch itself, rather than relying on per-user access configuration).

A more generic version of this would be if NiFi were to enable the use of FlowFile attributes at the Controller level, e.g. a property is evaluated at runtime from FlowFile attributes as they are within Processors. That, of course, is a much bigger framework-level change and comes with its own set of difficulties for things like this around connection pooling, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants