-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NIFI-14104 allow setting of HTTP Request Headers for Elasticsearch requests #9606
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together @ChrisSamo632.
The general approach seems fairly straightforward in terms of mapping prefixed dynamic properties from Processors through to Controller Services. It is notable that this impacts the method signatures of all methods, which raises a general question: should support for custom request headers be at the Processor level, or is it sufficient to be configured at the Controller Service level? In other words, do you expect the need to influence request headers for every call to Elasticsearch, or is this something that could be set on the Controller Service and applied for all requests? The Run As User example mentioned in the Jira seems like it could go either way depending on the flow design. However, if it is sufficient to support custom headers at the Controller Service level, that would reduce the complexity and scope significantly.
Thanks for the question @exceptionfactory. I thought this was worth sticking with when thinking about a use case such as using NiFi as a webserver that can receive files/data through a A similar use case would be the same sort of thing but with a JWT presented per request that could then be used for authentication with Elasticsearch. These use cases would need careful thought by the Flow designer so as to ensure that only valid/applicable users were passed in the request. An alternative may be that the Flow contains logic to change the "run as user" based upon some other runtime attributes or FlowFile content. The problem with the Controller Service-only approach is that it's limited to a single setup per service instance, i.e. it can be configured to connect/run as one single user. The per-processor/FlowFile approach means that multiple users could be used through the same service instance. There are potentially other request headers that one might want to control on a per-request basis to Elasticsearch, but "run as user" (or JWT auth) in the one that seemed like it might make this change worthwhile, alongside #9605 for NIFI-10831 that allows the simpler configuration of a single user/JWT provider for an alternative static authentication mechanism between NiFi and Elasticsearch. |
@exceptionfactory what do you think about a change to the Elasticsearch Service's API to replace the existing Thinking that, in the future, there may well be other per-request options that would be useful, and agreeing that it would be best to avoid wholesale changes to all methods on the Service API in future tickets, such a change here would help reduce the impact of future changes in the same area. |
That sounds like a good approach. That would support passing with parameters and headers for this set of changes, and make it easier to introduce any other changes down the road. |
7d33f21
to
afa6bb2
Compare
afa6bb2
to
5b19405
Compare
...ient-service/src/main/java/org/apache/nifi/elasticsearch/ElasticSearchClientServiceImpl.java
Outdated
Show resolved
Hide resolved
5b19405
to
0ff75a0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ChrisSamo632, returning to this pull request, does it still seem necessary in light of #9605 supporting the Run As User property?
The tests use Accept
as an example, but that is actually a good example of potentially breaking changes that could be introduced through dynamic headers, making the request invalid. Dynamic properties provide a useful way to cover a large number of custom configuration scenarios without introducing many explicit properties. In the case of Elasticsearch, however, the HTTP request seems more narrowly defined in terms of possible options that Elasticsearch supports.
I think the general implementation approach looks good, but if there isn't a strong case for a variety of custom HTTP request headers, it seems better to avoid introducing this level of complexity for now.
this.requestParameters = requestParameters == null ? new HashMap<>() : requestParameters; | ||
this.requestHeaders = requestHeaders == null ? new HashMap<>() : requestHeaders; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend wrapping the input Maps in unmodifiableMap
.
@exceptionfactory thanks for continuing to look at this.
The use of request headers for Elasticsearch interaction is somewhat limited, but becomes a bit more relevant if people are using Elasticsearch-like services and/or proxies that require/allow extra request headers in-between. You're right that the use of The main use case I was raising this for is to allow for per-request (i.e. FlowFile of data coming through an Elasticsearch processor) setting of some request headers, namely the run-as-user. But I then chose not to put that in the tests as it may cause issues/confusion if not paired with other required headers that such an authentication mechanism needs within Elasticsearch. There are Flows where one may wish to index data as different users, but wouldn't necessarily know those users upfront, nor wish to create an ElasticsearchClientService controller and matching *Elasticsearch processor combination for every user that will be used for data ingest. The main example of this kind of approach that I can think of would be if the NiFi Flow were fronted by a HandleHttpRequest/ListenHTTP processor that is acting as a RESTful API endpoint - a setup that I've wanted to create in the past, but have had to agree for an alternative workaround in the meantime (which can complicate RBAC/ABAC controls within Elasticsearch itself, rather than relying on per-user access configuration). A more generic version of this would be if NiFi were to enable the use of FlowFile attributes at the Controller level, e.g. a property is evaluated at runtime from FlowFile attributes as they are within Processors. That, of course, is a much bigger framework-level change and comes with its own set of difficulties for things like this around connection pooling, etc. |
…ters and headers in Elasticsearch operations
Summary
NIFI-14104 allow setting of HTTP Request Headers for Elasticsearch requests
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
NIFI-00000
NIFI-00000
Pull Request Formatting
main
branchVerification
Please indicate the verification steps performed prior to pull request creation.
Build
mvn clean install -P contrib-check
Licensing
[ ] New dependencies are compatible with the Apache License 2.0 according to the License Policy[ ] New dependencies are documented in applicableLICENSE
andNOTICE
filesDocumentation
[ ] Documentation formatting appears as expected in rendered files