NIFI-5051 Created ElasticSearch lookup service. #2615
Reviewing..
@MikeThomsen Extremely sorry.. Unable to spend the planned time on this one. I'll try to get to it, if no one else does.
</dependency>
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-avro-record-utils</artifactId>
This is included twice
Still twice, I can remove on merge
Comment is still valid @MikeThomsen
Done.
    .displayName("Index")
    .description("The name of the index to read from")
    .required(true)
    .expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
Since the lookup is performed on an incoming flow file, is there any reason the Index, Type, etc. properties couldn't support attributes coming from the flow file? If it is this way because the ES Client Service CS can't use them, perhaps we should write up a separate improvement Jira to do something like NIFI-5121.
I could definitely see some value to that, but since this is a LookupService implementation, we should discuss it in that context. NIFI-5121 only describes one particular interface, and LookupService is more expansive in use.
    .identifiesControllerService(SchemaRegistry.class)
    .build();

public static final PropertyDescriptor RECORD_SCHEMA_NAME = new PropertyDescriptor.Builder()
Most record-based processors (usually because they're in the same NAR) extend from SchemaRegistryService and thus offer the same schema-related properties. I realize you can't do that here (currently), but I think we should do one of two things: 1) Move SchemaRegistryService into the API, or 2) Offer the same properties as other record-based processors, including Schema Access Strategy, Schema Text, Schema Version, etc. I think the first one would be easier and more helpful for anyone creating a record-based processor in another bundle, what do you think?
I put up #2661 to address option 1 above
Now that #2661 has been merged, could you extend SchemaRegistryService instead of AbstractControllerService (which you'll still get as SchemaRegistryService's parent)? That way you'll have immediate access to the same properties and logic as other schema-registry-aware processors to give a consistent UX.
Ok, I'll give it a shot.
@mattyb149 I got started on that, but I'm not sure how this is supposed to work. The only entry point that is standard for LookupService is lookup(Map). How did you envision communicating the info for the other lookup strategies? The schema name one makes sense; it can be a property on the service or simply schema.name in the Map passed to lookup. The rest, I'm not sure about.
    }
}

private RecordSchema convertSchema(Map<String, Object> result) {
Should this be in a utilities class (if there isn't already such a method in one)? Seems pretty helpful for JSON-to-schema conversions (or any Map-to-Schema) in general.
I agree. I just wrote up a Jira ticket for this. Let's table it for now because we need to think about things like date strings.
import java.util.Map;
import java.util.Optional;

public class ElasticSearchLookupService_IT {
It would be nice to have some unit tests as well, since the integration tests do not get run as part of any automated build. I think you could mock the ES Client Service or something? There appears to be something similar in TestFetchElasticsearchHttp (and the other ES processor unit tests).
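To make the suggestion concrete, here is a minimal sketch of testing lookup logic against a stubbed client instead of a live Elasticsearch node. Everything in it is an assumption for illustration: SearchClient and Lookup are hypothetical stand-ins, not the real ElasticSearchClientService or LookupService interfaces, and the real signatures may differ.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins; these are NOT the NiFi interfaces.
final class StubClientSketch {

    // Assumed minimal shape of a search client: a query in, a list of hits out.
    interface SearchClient {
        List<Map<String, Object>> search(Map<String, Object> query);
    }

    // Simplified lookup logic: return the first hit, or empty when nothing matches.
    static final class Lookup {
        private final SearchClient client;

        Lookup(SearchClient client) {
            this.client = client;
        }

        Optional<Map<String, Object>> lookup(Map<String, Object> coordinates) {
            List<Map<String, Object>> hits = client.search(coordinates);
            return hits.isEmpty() ? Optional.empty() : Optional.of(hits.get(0));
        }
    }

    public static void main(String[] args) {
        // The stub returns a canned hit, so no running cluster is needed.
        SearchClient stub = query -> Collections.singletonList(
                Collections.<String, Object>singletonMap("username", "john.smith"));
        Map<String, Object> keys = Collections.<String, Object>singletonMap("username", "john.smith");
        System.out.println(new Lookup(stub).lookup(keys).isPresent());
    }
}
```

Because the client is an interface, a lambda or a mocking library can supply canned results, which keeps such tests runnable in the normal unit-test phase.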
Ok, I'll add some.
<dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro</artifactId>
    <version>1.8.2</version>
Won't nifi-avro-record-utils bring in the Avro library?
TBH I think that might have just been IntelliJ acting up.
@mattyb149 Done.
@mattyb149 Can you comment on the schema detection strategy issue I raised here?
@mattyb149 I converted it over to be a subclass of
Reviewing...
    .build();

public static final PropertyDescriptor RECORD_SCHEMA_NAME = new PropertyDescriptor.Builder()
You don't need this property anymore, as you get one from SchemaRegistryService
Thought I removed that.
Do you want to remove it @MikeThomsen ?
Done.
}

@Override
public Optional lookup(Map coordinates) throws LookupFailureException {
I kinda thought this LookupService would behave a bit like the Mongo one, where you could give it multiple keys and it would do the query based on that (the fields and the values for each record). This one seems a bit awkward to me, as the user would have to build up their own query field in each record, putting the value they want to match inside a JSON query body.
Is there a different use case here, or could/should we make it more consistent with the other "NoSQL" lookup service(s)? We'd have to generate the query body but that shouldn't be too hard. Also you'd only be able to query top-level fields for lookup, but that seems like it would cover most use cases. If there is a way to specify a nested field for lookup (such as a qualified name with period delimiters), we could do that (although we'd likely have to use a "nested" operator in the generated query), seems like a good (but separate) improvement. Thoughts?
Yeah, we can do that. I think what it would look like is this:
{
  "bool": {
    "must": [
      {
        "match": {
          "username": "john.smith"
        }
      },
      {
        "match": {
          "email": "john.smith@company.com"
        }
      }
    ]
  }
}
For input:
{
  "username": "john.smith",
  "email": "john.smith@company.com"
}
as lookup keys.
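The mapping from lookup keys to the bool/must body above could be sketched roughly as follows. This is an illustrative assumption, not the query builder that ended up in the PR: BoolQuerySketch and buildBoolQuery are hypothetical names, and the result is a plain Map that a JSON library would still have to serialize.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not the code from this PR.
final class BoolQuerySketch {

    // Turns each lookup key/value pair into one "match" clause under bool/must.
    static Map<String, Object> buildBoolQuery(Map<String, Object> coordinates) {
        List<Map<String, Object>> must = new ArrayList<>();
        for (Map.Entry<String, Object> entry : coordinates.entrySet()) {
            Map<String, Object> field = Collections.singletonMap(entry.getKey(), entry.getValue());
            must.add(Collections.<String, Object>singletonMap("match", field));
        }
        Map<String, Object> bool = Collections.<String, Object>singletonMap("must", must);
        return Collections.<String, Object>singletonMap("bool", bool);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the clauses in the order the keys were supplied.
        Map<String, Object> keys = new LinkedHashMap<>();
        keys.put("username", "john.smith");
        keys.put("email", "john.smith@company.com");
        System.out.println(buildBoolQuery(keys));
    }
}
```

Building the body as nested maps rather than concatenated strings leaves quoting and escaping of values to the JSON serializer.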
For nested, it could be trickier. I think this would work:
/user/email => "user.contact.email"
{
  "query": {
    "nested": {
      "path": "user.contact",
      "query": {
        "match": {
          "email": "john.smith@company.com"
        }
      }
    }
  }
}
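A rough sketch of deriving the nested clause from a dotted key, under stated assumptions: NestedQuerySketch and buildNestedQuery are hypothetical names, the nested path is taken to be everything before the last dot, and the match clause uses the full dotted field name. That last choice differs from the shorthand "email" in the example above; Elasticsearch's nested query documentation references inner fields by their full path, so the shorthand may need adjusting.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the code from this PR.
final class NestedQuerySketch {

    // Builds {"nested": {"path": <prefix>, "query": {"match": {<full field>: <value>}}}}.
    static Map<String, Object> buildNestedQuery(String dottedField, Object value) {
        int lastDot = dottedField.lastIndexOf('.');
        if (lastDot <= 0) {
            throw new IllegalArgumentException("Not a nested field name: " + dottedField);
        }
        // Assumption: the nested path is everything before the last dot.
        String path = dottedField.substring(0, lastDot);
        Map<String, Object> match = Collections.<String, Object>singletonMap(dottedField, value);
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("path", path);
        nested.put("query", Collections.<String, Object>singletonMap("match", match));
        return Collections.<String, Object>singletonMap("nested", nested);
    }

    public static void main(String[] args) {
        System.out.println(buildNestedQuery("user.contact.email", "john.smith@company.com"));
    }
}
```

A clause like this only works when the path is mapped with the nested type in the index; against a plain object field Elasticsearch rejects the query.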
@mattyb149 I'm going to work on knocking the basic query builder changes out tonight. Could be a big change, so I apologize in advance :)
Did those changes make it into the latest PR?
TBH, I'm not sure. It's been a while. I'll look and let you know today. Got most of your feedback squared away.
Yeah, didn't make it in. Good news is that after kicking the tires with Kibana, it looks like it'll be pretty easy to do if I get some time after work.
@mattyb149 changed the query model per the discussion above and changed the tests to be Groovy so that the inline JSON, etc. would be a lot cleaner to read.
@mattyb149 @pvillard31 Changed the query model as requested and it's ready for final review AFAICT.
Few minor comments after a quick pass over the code, will try to find time to test it but I probably won't be able to do it before next week :(
    </execution>
</executions>
<configuration>
    <source>1.8</source>
Do we want to have this kind of configuration in low-level poms? Wondering if it'd be an issue with current modifications to support Java 9/10
Probably not. Removed.
@OnEnabled
public void onEnabled(final ConfigurationContext context) throws InitializationException {
    clientService = context.getProperty(CLIENT_SERVICE).asControllerService(ElasticSearchClientService.class);
    index = context.getProperty(INDEX).getValue();
.evaluateExpressionLanguage()?
Done.
public void onEnabled(final ConfigurationContext context) throws InitializationException {
    clientService = context.getProperty(CLIENT_SERVICE).asControllerService(ElasticSearchClientService.class);
    index = context.getProperty(INDEX).getValue();
    type = context.getProperty(TYPE).getValue();
.evaluateExpressionLanguage()?
Done.
@pvillard31 @mattyb149 changes checked in.
@@ -212,6 +212,31 @@
        </execution>
    </executions>
</plugin>

<plugin>
    <groupId>org.jacoco</groupId>
If we were to add a code coverage plugin to Maven, this is probably something that should be added to the root-level pom (and disabled by default?) What was the impetus for including it in a single bundle?
I'm not sure how well that would work at root level because there are plenty of integration tests that have to be run to get a full sense of code coverage. So maybe I should back this out or one of you can drop it when rebasing for a merge if you think it makes more sense to add a root level profile for code coverage.
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-avro-record-utils</artifactId>
    <version>1.7.0-SNAPSHOT</version>
This one should be 1.8.0-SNAPSHOT now, sorry it's taken so long to get through
Done.
</dependencies>

<build>
Probably best to remove the groovy and Jacoco stuff, let's get a discussion going on the dev mailing list about code coverage?
I'm leaving in the helper plugin for now because for some reason, it won't even detect the groovy test source without it. I'll remove it if you have any suggestions on how to fix that.
If there is nothing in src/test/java, the Groovy tests won't be detected unless a plugin references them directly. In this case, the build-helper-maven-plugin is accomplishing that. In other locations, the maven-compiler-plugin is set to use groovy-eclipse-compiler to achieve the same result.
@alopresto thanks for explaining that. I just added a .gitignore file into src/test/java and that did the trick.
@Override
protected List<PropertyDescriptor> getSupportedPropertyDescriptors() {
    List<PropertyDescriptor> _desc = new ArrayList<>();
Just a style nitpick, these can be set up in the constructor or a static block (I think the former is preferred?). Unless they're dynamic the list only needs to be created once, where this method gets called often IIRC.
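One way to read this suggestion, sketched with a stand-in type so it runs without NiFi on the classpath. Descriptor is a hypothetical placeholder for PropertyDescriptor, and the two property names are only examples; the point is that the list is built once in the constructor and the getter just returns it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only; Descriptor is a stand-in for NiFi's PropertyDescriptor.
final class DescriptorListSketch {

    static final class Descriptor {
        final String name;

        Descriptor(String name) {
            this.name = name;
        }
    }

    private final List<Descriptor> descriptors;

    DescriptorListSketch() {
        // Build the list once instead of on every getter call.
        List<Descriptor> desc = new ArrayList<>();
        desc.add(new Descriptor("Index"));
        desc.add(new Descriptor("Type"));
        descriptors = Collections.unmodifiableList(desc);
    }

    // Called often by the framework, so it only hands back the prebuilt list.
    List<Descriptor> getSupportedPropertyDescriptors() {
        return descriptors;
    }

    public static void main(String[] args) {
        DescriptorListSketch service = new DescriptorListSketch();
        System.out.println(service.getSupportedPropertyDescriptors().size());
    }
}
```

Wrapping the list as unmodifiable is an extra precaution in this sketch, so callers cannot alter the shared instance.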
Done.
import org.junit.Before
import org.junit.Test

import static groovy.json.JsonOutput.*
This shouldn't pass CheckStyle as we don't allow star imports in Java, we probably just don't have an existing (or complete) CheckStyle rule for Groovy files.
Manually fixed that.
…k client service.
…Groovy to make them easier to read with all of the inline JSON.
@mattyb149 Refactored the query builder.
Should be all good to go now.
@MikeThomsen I would maybe put a comment in the
@mattyb149 can we merge?
@mattyb149 can we close this out?
Reviewing...
private final List<PropertyDescriptor> DESCRIPTORS;

ElasticSearchLookupService() {
I think this has to be public for ServiceLoader to find it, I'm getting errors when trying to load it into NiFi.
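The precondition can be checked with plain reflection, as in the sketch below. It does not reproduce NiFi's loading path or the exact ServiceLoader error; it only shows that Class.getConstructor() finds public constructors and nothing else, which is the property a package-private constructor fails. All class names here are hypothetical.

```java
// Illustrative check; these nested classes are hypothetical examples.
final class ConstructorVisibilitySketch {

    public static class PublicCtorService {
        public PublicCtorService() {
        }
    }

    public static class PackagePrivateCtorService {
        // No "public" modifier, like the constructor under review.
        PackagePrivateCtorService() {
        }
    }

    // True only when the class declares a public zero-argument constructor.
    static boolean hasPublicNoArgConstructor(Class<?> type) {
        try {
            type.getConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasPublicNoArgConstructor(PublicCtorService.class));
        System.out.println(hasPublicNoArgConstructor(PackagePrivateCtorService.class));
    }
}
```

A check like this could sit in a unit test so the problem shows up before the NAR is deployed.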
facepalm
One fix, coming right up...
Done. NiFi is able to load it and assign it as the lookupservice for lookuprecord.
put("bool", new HashMap<String, Object>(){{
    put("must", coordinates.entrySet().stream()
        .map(e -> new HashMap<String, Object>(){{
            if (e.getKey().contains(".")) {
I've run the unit and integration tests and the code LGTM, but I'd feel better if I could get an example going where I do the lookup on a field that's not at the top level. I have a document containing a "user" field, which contains other fields such as "name", and "name" contains other fields like "first" and "last". I tried using this with a simple CSV input containing an id and a first name, and tried to use the lookup service to match "user.name.first" and return the value of "user.name.last", but got an error saying I was trying to do a nested query on a field that wasn't nested. I didn't add an explicit mapping for the index, just put the complex JSON docs into ES. Am I configuring it wrong, or is this not supported, or could there be a bug?
I'll look into that. Should be able to get something resolved this weekend.
"but got an error saying I was trying to do a nested query on a field that wasn't nested."
I think you are. ES can be weird about detecting nested documents. I've only had consistent good results when explicitly defining them. I'll try to set up a test example.
So, I got it working and will share some artifacts tomorrow if I get a chance so you can watch them in action. I'm thinking some of the behavior still needs a second opinion on the flexibility/user-friendliness.
I'm still getting the same error (nested object under path [user.name] is not of nested type) on my flow. I tried yours, but I don't have any documents/mappings in ES (such as a doc with "subfield.longfield"). Can you share an example doc I can put in there? I have my own ES so I didn't start up the Docker Compose stuff you attached.
See my comment below. It has a sample flow, the commands for Kibana and a docker compose file.
…the result set into the fields they want.
+1 LGTM, ran full build with unit tests, tried the lookup service with a nested record and everything worked fine. Thanks for the improvement! Merging to master
commit 7cb39d6 Author: Jeff Storck <jtswork@gmail.com> Date: Fri Oct 12 16:57:15 2018 -0400 NIFI-5696 Update references to default value for nifi.cluster.node.load.load.balance.port This closes apache#3071. Signed-off-by: Koji Kawamura <ijokarumawak@apache.org> commit 0229a5c Author: zenfenan <sivaprasanna246@gmail.com> Date: Sun Oct 14 13:18:25 2018 +0530 NIFI-5698: Fixed DeleteAzureBlobStorage bug This closes apache#3073. Signed-off-by: Koji Kawamura <ijokarumawak@apache.org> commit e30a21c Author: Brad Hards <bradh@frogmouth.net> Date: Sat Oct 13 19:25:43 2018 +1100 [NIFI-5697] Trivial description fix for GenerateFlowFile processor This closes apache#3072. Signed-off-by: Aldrin Piri <aldrin@apache.org> commit 270ce85 Author: Mark Payne <markap14@hotmail.com> Date: Fri Oct 12 15:27:10 2018 -0400 NIFI-5695: Fixed bug that caused ports to not properly map to their correct child group on Flow Import if the child group is independently versioned This closes apache#3070. Signed-off-by: Bryan Bende <bbende@apache.org> commit 5eb5e96 Author: thenatog <thenatog@gmail.com> Date: Mon Oct 8 12:58:20 2018 -0400 NIFI-5665 - Changed netty versions to more closely match the original netty dependency version. NIFI-5665 - Fixed version for nifi-spark-bundle. NIFI-5665 - Fixing copy and paste error. This closes apache#3067 commit 02e0a16 Author: Bryan Bende <bbende@apache.org> Date: Thu Oct 11 15:58:55 2018 -0400 NIFI-5680 Handling trailing slashes on URLs of registry clients This closes apache#3065. Signed-off-by: Mark Payne <markap14@hotmail.com> commit 0f88805 Author: Matt Gilman <matt.c.gilman@gmail.com> Date: Fri Oct 12 10:23:47 2018 -0400 NIFI-5691: - Overriding the version of jackson in aws java sdk. This closes apache#3066. Signed-off-by: Aldrin Piri <aldrin@apache.org> commit e25b26e Author: joewitt <joewitt@apache.org> Date: Fri Oct 12 11:27:48 2018 -0400 Revert "NIFI-5448 Added failure relationship to UpdateAttributes to handle bad expression language logic." 
This reverts commit 32ee552. commit 6b77e7d Author: joewitt <joewitt@apache.org> Date: Fri Oct 12 11:08:22 2018 -0400 Revert "NIFI-5448 Changed from 'stop' to 'penalize' in allowablevalue field to make the popup more consistent." This reverts commit 9d2b698. commit a6b9364 Author: Carl Gieringer <carl.gieringer@snagajob.com> Date: Thu Oct 4 12:50:08 2018 -0400 NIFI-5664 Support ArrayList in DataTypeUtils#toArray NIFI-5664 Generalize to handling List This closes apache#3049 Signed-off-by: Mike Thomsen <mikerthomsen@gmail.com> commit 5aa4263 Author: Endre Zoltan Kovacs <ekovacs@hortonworks.com> Date: Mon Oct 8 13:10:37 2018 +0200 NIFI-1490: better field naming / displayname and description mix up fix This closes apache#2994. Signed-off-by: Mark Payne <markap14@hotmail.com> commit c81a135 Author: Endre Zoltan Kovacs <andrewsmith87@protonmail.com> Date: Thu Sep 6 17:33:33 2018 +0200 NIFI-1490: multipart/form-data support for ListenHTTP processor - introducing a in-memory-file-size-threashold, above which the incoming file is written to local disk - using java.io.tmpdir for such file writes - enhancing documentation commit 8398ea7 Author: Mark Payne <markap14@hotmail.com> Date: Thu Oct 11 14:57:31 2018 -0400 NIFI-5688: Ensure that when we map our flow to a VersionedProcessGroup that we include the connections' Load Balance Compression flag This closes apache#3064 commit 8da403c Author: Matt Gilman <matt.c.gilman@gmail.com> Date: Thu Oct 11 13:21:20 2018 -0400 NIFI-5661: - Allowing load balance settings to be applied during creation. - Clearing the load balance settings when the dialog is closed. commit 79c03ca Author: Matt Gilman <matt.c.gilman@gmail.com> Date: Thu Oct 11 12:23:53 2018 -0400 NIFI-5661: - Allowing the load balance configuration to be shown/edited in both clustered and standalone mode. commit 64de5c7 Author: thenatog <thenatog@gmail.com> Date: Fri Sep 7 12:39:18 2018 -0400 NIFI-5479 - Supressed the AnnotationParser logs using the logback.xml. 
Dependency changes can be look at in future. NIFI-5479 - Updated comment. This closes apache#3034 commit 8a751e8 Author: Koji Kawamura <ijokarumawak@apache.org> Date: Fri Sep 14 21:18:04 2018 +0900 NIFI-5661: Adding Load Balance config UI Incorporated review comments. Move combo options to a common place. This closes apache#3046 commit a6f7222 Author: Koji Kawamura <ijokarumawak@apache.org> Date: Fri Sep 28 17:37:34 2018 +0900 NIFI-5645: Auto reconnect ConsumeWindowsEventLog This commit also contains following refactoring: - Catch URISyntaxException inside subscribe when constructing provenance URI as it does not affect the core responsibility of this processor. Even if it fails to be a proper URI, if the query works for consuming logs, the processor should proceed forward. Upgrade JNA version. Do not update lastActivityTimestamp when subscribe failed. This closes apache#3037 commit 97afa4e Author: Mark Payne <markap14@hotmail.com> Date: Tue Oct 9 14:54:21 2018 -0400 NIFI-5585: Addressed bug in calculating swap size of a queue partition when rebalancing This closes apache#3010. 
Signed-off-by: Mark Payne <markap14@hotmail.com> commit a1a4c99 Author: Mark Payne <markap14@hotmail.com> Date: Mon Oct 8 09:53:14 2018 -0400 NIFI-5585: Adjustments to the Connection Load Balancing to ensure that node offloading works smoothly Signed-off-by: Jeff Storck <jtswork@gmail.com> commit 01e2098 Author: Jeff Storck <jtswork@gmail.com> Date: Tue Sep 25 15:17:19 2018 -0400 NIFI-5585 A node that was previously offloaded can now be reconnected to the cluster and queue flowfiles again Added Spock test for NonLocalPartitionPartitioner Updated NOTICE files for FontAwesome with the updated version (4.7.0) and URL to the free license Updated package-lock.json with the updated version of FontAwesome (4.7.0) Added method to FlowFileQueue interface to reset an offloaded queue Queues that are now immediately have the offloaded status reset once offloading finishes SocketLoadBalancedFlowFileQueue now ignores back-pressure when offloading flowfiles Cleaned up javascript in nf-cluster-table.js when creating markup for the node operation icons Fixed incorrect handling of a heartbeat from an offloaded node. Heartbeats from offloading or offloaded nodes will now be reported as an event, the heartbeat will be removed and ignored. 
Added unit tests and integration tests to cover offloading nodes Updated Cluster integration test class with accessor for the current cluster coordinator Updated Node integration test class's custom NiFiProperties implementation to return the load balancing port and a method to assert an offloaded node Added exclusion to top-level pom for ITSpec.class commit be2c24c Author: Mark Payne <markap14@hotmail.com> Date: Mon Sep 24 09:17:22 2018 -0400 NIFI-5585: Fixed bug that arised when multiple nodes were decommissioning at same time; could get into state where the nodes queued up data for one another so the data just stayed put commit 04d8da8 Author: Jeff Storck <jtswork@gmail.com> Date: Tue Sep 18 17:09:13 2018 -0400 NIFI-5585 Added capability to offload a node that is disconnected from the cluster. Updated NodeClusterCoordinator to allow idempotent requests to offload a cluster Added capability to connect/delete/disconnect/offload a node from the cluster to the Toolkit CLI Added capability to get the status of nodes from the cluster to the Toolkit CLI Upgraded FontAwesome to 4.7.0 (from 4.6.1) Added icon "fa-upload" for offloading nodes in the cluster table UI commit 83ca676 Author: Kotaro Terada <koterada@yahoo-corp.jp> Date: Tue Oct 9 18:31:41 2018 +0900 NIFI-5681: Fix a locale-dependent test in TestVersionedFlowSnapshotMetadataResult Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com> This closes apache#3061. commit 6c17685 Author: Kotaro Terada <koterada@yahoo-corp.jp> Date: Fri Oct 5 16:43:44 2018 +0900 NIFI-5675: Fix some locale-dependent tests in ConvertExcelToCSVProcessorTest Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com> This closes apache#3058. commit fc5c8ba Author: Kotaro Terada <koterada@yahoo-corp.jp> Date: Tue Oct 9 14:12:53 2018 +0900 NIFI-5676: Fix a timezone-dependent test in PutORCTest Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com> This closes apache#3059. 
commit dd50322 Author: Matt Gilman <matt.c.gilman@gmail.com> Date: Tue Oct 9 12:49:31 2018 -0400 NIFI-5600: Recalculating the available columns for the queue listing and component state because they contain conditions which need to be re-evaluated. Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com> This closes apache#3055. commit 9dfc668 Author: Mark Payne <markap14@hotmail.com> Date: Tue Oct 9 12:19:24 2018 -0400 NIFI-5672: Do not compare Load Balancing address/port for logical equivalence of Node Identifiers. Added more details to logging of Node Identifiers This closes apache#3054 commit 77edddd Author: joewitt <joewitt@apache.org> Date: Mon Oct 8 13:35:01 2018 -0400 NIFI-5666 Updated all usages of Spring, beanutils, collections to move beyond deps with cves This closes apache#3052 commit 117e60c Author: Mark Payne <markap14@hotmail.com> Date: Tue Oct 9 12:23:44 2018 -0400 Empty commit to force Github sync commit c425bd2 Author: Mark Payne <markap14@hotmail.com> Date: Fri Aug 17 14:08:14 2018 -0400 NIFI-5533: Be more efficient with heap utilization - Updated FlowFile Repo / Write Ahead Log so that any update that writes more than 1 MB of data is written to a file inside the FlowFile Repo rather than being buffered in memory - Update SplitText so that it does not hold FlowFiles that are not the latest version in heap. Doing them from being garbage collected, so while the Process Session is holding the latest version of the FlowFile, SplitText is holding an older version, and this results in two copies of the same FlowFile object NIFI-5533: Checkpoint NIFI-5533: Bug Fixes Signed-off-by: Matthew Burgess <mattyb149@apache.org> This closes apache#2974 commit c87d791 Author: Mark Payne <markap14@hotmail.com> Date: Fri Oct 5 12:06:39 2018 -0400 NIFI-5663: Ensure that when sort Node Identifiers that we use both the node's API Address as well as API Port, in case 2 nodes are running on same host. 
Also ensure that when Local Node ID is determined that we update all Load Balancing Partitions, if necessary
This closes apache#3048.
Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>

commit 768bcfb
Author: Pierre Villard <pierre.villard.fr@gmail.com>
Date: Tue Sep 25 22:53:28 2018 +0200
NIFI-5635 - Description PutEmail properties with multiple senders/recipients
This closes apache#3031
Signed-off-by: Mike Moser <mosermw@apache.org>

commit 246c090
Author: thenatog <thenatog@gmail.com>
Date: Thu Sep 13 21:45:00 2018 -0400
NIFI-5595 - Added the CORS filter to the templates/upload endpoint using a URL matcher. Explicitly allow methods GET, HEAD. These are the Spring defaults when the allowedMethods is empty but now it is explicit. This will require other methods like POST etc to be from the same origin (for the template/upload URL).
This closes apache#3024.
Signed-off-by: Andy LoPresto <alopresto@apache.org>

commit c6572f0
Author: Matthew Burgess <mattyb149@apache.org>
Date: Fri Aug 10 16:49:25 2018 -0400
NIFI-4517: Added ExecuteSQLRecord and QueryDatabaseTableRecord processors
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>
This closes apache#2945.

commit b4810b8
Author: Mark Payne <markap14@hotmail.com>
Date: Fri Oct 5 12:08:55 2018 -0400
Empty commit to force sync with mirrors

commit 619f1ff
Author: Mark Payne <markap14@hotmail.com>
Date: Thu Jun 14 11:57:21 2018 -0400
NIFI-5516: Implement Load-Balanced Connections
Refactoring StandardFlowFileQueue to have an AbstractFlowFileQueue
Refactored more into AbstractFlowFileQueue
Added documentation, cleaned up code some
Refactored FlowFileQueue so that there is SwappablePriorityQueue
Several unit tests written
Added REST API Endpoint to allow PUT to update connection to use load balancing or not. When enabling load balancing, though, I saw the queue size go from 9 to 18. Then was only able to process 9 FlowFiles.
Bug fixes
Code refactoring
Added integration tests, bug fixes
Refactored clients to use NIO
Bug fixes. Appears to finally be working with NIO Client!!!!!
NIFI-5516: Refactored some code from NioAsyncLoadBalanceClient to LoadBalanceSession
Bug fixes and allowed load balancing socket connections to be reused
Implemented ability to compress Nothing, Attributes, or Content + Attributes when performing load-balancing
Added flag to ConnectionDTO to indicate Load Balance Status
Updated Diagnostics DTO for connections
Store state about cluster topology in NodeClusterCoordinator so that the state is known upon restart
Code cleanup
Fixed checkstyle and unit tests
NIFI-5516: Updating logic for Cluster Node Firewall so that the node's identity comes from its certificate, not from whatever it says it is.
NIFI-5516: FIxed missing License headers
NIFI-5516: Some minor code cleanup
NIFI-5516: Adddressed review feedback; Bug fixes; some code cleanup. Changed dependency on nifi-registry from SNAPSHOT to official 0.3.0 release
NIFI-5516: Take backpressure configuration into account
NIFI-5516: Fixed ConnectionDiagnosticsSnapshot to include node identifier
NIFI-5516: Addressed review feedback
This closes apache#2947

commit 5872eb3
Author: Mark Payne <markap14@hotmail.com>
Date: Wed Aug 15 10:23:49 2018 -0400
NIFI-5331: When checkpointing SequentialAccessWriteAheadLog, if the journal is not healthy, ensure that we roll it over and ensure that if an Exception is thrown when attempting to fsync() or close() the journal, we continue creating a new one.
This closes apache#2952.
Signed-off-by: Brandon Devries <devriesb@apache.org>

commit 8f4d13e
Author: Koji Kawamura <ijokarumawak@apache.org>
Date: Thu Oct 4 13:48:26 2018 +0900
NIFI-5581: Fix replicate request timeout
This closes apache#3044
- Revert 87cf474 to enable connection pooling
- Changes the expected HTTP status code for the 1st request of a two-phase commit transaction from 150 (NiFi custom) to 202 Accepted
- Corrected RevisionManager Javadoc about revision varidation protocol

commit f65286b
Author: Andy LoPresto <alopresto@apache.org>
Date: Fri Sep 21 19:26:10 2018 -0700
NIFI-5622 Updated test resource keystores and truststores with SubjectAlternativeNames to be compliant with RFC 6125.
Refactored some test code to be clearer.
Renamed some resources to be consistent across modules.
Changed passwords to meet new minimum length requirements.
This closes apache#3018

commit 8e233ca
Author: joewitt <joewitt@apache.org>
Date: Thu Sep 20 23:24:17 2018 -0400
NIFI-4806 updated tika and a ton of other deps as found by dependency versions plugin
This closes apache#3028

commit de685a7
Author: pepov <peterwilcsinszky@gmail.com>
Date: Tue Oct 2 15:21:36 2018 +0200
NIFI-5656 Handly empty "Node Group" property in FileAccessPolicyProvider consistently, add some logs to help with debugging, add test for the invalid group name and for the empty case.
This closes apache#3043.
Signed-off-by: Kevin Doran <kdoran@apache.org>

commit b4c8e01
Merge: 895323f 76a9f98
Author: Brandon Devries <devriesb@apache.org>
Date: Tue Oct 2 11:08:43 2018 -0400
Merge branch 'pr2931'

commit 76a9f98
Author: Mike Moser <mosermw@apache.org>
Date: Wed Sep 5 15:49:44 2018 -0400
NIFI-3531 Catch and rethrow generic Exception to handle RuntimeExceptions, and allow test to pass
This closes apache#2931.
Signed-off-by: Brandon Devries <devriesb@apache.org>

commit 895323f
Merge: 813cc1f 4f538f1
Author: Brandon Devries <devriesb@apache.org>
Date: Tue Oct 2 09:40:36 2018 -0400
Merge branch 'pr2949'

commit 4f538f1
Author: Mike Moser <mosermw@apache.org>
Date: Tue Aug 14 18:55:10 2018 +0000
NIFI-3672 updated PublishJMS message property docs
This closes apache#2949
Signed-off-by: Brandon Devries <devriesb@apache.org>

commit 813cc1f
Author: Matthew Burgess <mattyb149@apache.org>
Date: Mon Oct 1 10:23:44 2018 -0400
NIFI-5650: Added Xerces to scripting bundle for Jython 2.7.1
This closes apache#3042
Signed-off-by: Mike Thomsen <mikerthomsen@gmail.com>

commit b1478cd
Author: Mike Thomsen <mikerthomsen@gmail.com>
Date: Fri Apr 6 21:38:07 2018 -0400
NIFI-5051 Created ElasticSearch lookup service.
NIFI-5051 Fixed checkstyle issue.
NIFI-5051 Converted ES lookup service to use a SchemaRegistry.
NIFI-5051 Cleaned up POM and added a simple unit test that uses a mock client service.
NIFI-5051 Added change; waiting for feedback.
NIFI-5051 Changed query setup based on code review. Changed tests to Groovy to make them easier to read with all of the inline JSON.
NIFI-5051 fixed a checkstyle issue.
NIFI-5051 Rebased to cleanup merge issues
NIFI-5051 Added changes from a code review.
NIFI-5051 Fixed a checkstyle issue.
NIFI-5051 Added coverage generator for tests. Rebased.
NIFI-5051 Updated service and switched it over to JsonInferenceSchemaRegistryService.
NIFI-5051 Removed dead code.
NIFI-5051 Fixed checkstyle errors.
NIFI-5051 Refactored query builder.
NIFI-5051 Added placeholder gitignore to force test compile.
NIFI-5051 Added note explaining why the .gitignore file was needed.
NIFI-5051 Made constructor public.
NIFI-5051 Fixed path issue in client service integration tests.
NIFI-5051 Added additional mapping capabilities to let users massage the result set into the fields they want.
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes apache#2615

commit 748cf74
Author: Andy LoPresto <alopresto@apache.org>
Date: Wed Sep 26 18:18:22 2018 -0700
NIFI-5628 Added content length check to OkHttpReplicationClient.
Added unit tests.
This closes apache#3035

commit 0dd3823
Author: Colin Dean <colin.dean@arcadia.io>
Date: Wed Sep 19 20:27:47 2018 -0400
NIFI-5612: Support JDBC drivers that return Long for unsigned ints
Refactors tests in order to share code repeated in tests and to enable some parameterized testing.
MySQL Connector/J 5.1.x in conjunction with MySQL 5.0.x will return a Long for ResultSet#getObject when the SQL type is an unsigned integer. This change prevents that error from occurring while implementing a more informational exception describing what the failing object's POJO type is in addition to its string value.
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes apache#3032

commit e24388a
Author: Jeff Storck <jtswork@gmail.com>
Date: Tue Sep 25 18:30:19 2018 -0400
NIFI-5557 Added test in PutHDFSTest for IOException with a nested GSSException
Resolved most of the code warnings in PutHDFSTest
This closes apache#2971.

commit 0f55cbf
Author: Endre Zoltan Kovacs <ekovacs@hortonworks.com>
Date: Tue Aug 28 10:47:59 2018 +0200
NIFI-5557: handling expired ticket by rollback and penalization

commit 2e1005e
Author: Mark Payne <markap14@hotmail.com>
Date: Thu Sep 27 10:10:48 2018 -0400
NIFI-5640: Improved efficiency of Avro Reader and some methods of AvroTypeUtil. Also switched ServiceStateTransition to using read/write locks instead of synchronized blocks because profiling showed that significant time was spent in determining state of a Controller Service when attempting to use it. Switching to a ReadLock should provide better performance there.
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes apache#3036

commit ad4c886
Author: Mark Payne <markap14@hotmail.com>
Date: Tue Sep 25 09:05:06 2018 -0400
NIFI-5634: When merging RPG entities, ensure that we only send back the ports that are common to all nodes - even if that means sending back no ports
This closes apache#3030

commit 66eeb48
Author: Mike Moser <mosermw@apache.org>
Date: Mon Aug 13 17:40:54 2018 +0000
NIFI-3672 Add support for strongly typed message properties in PublishJMS

commit 8309747
Author: Mike Moser <mosermw@apache.org>
Date: Wed Aug 1 20:11:35 2018 +0000
NIFI-3531 Moved session.recover in JMSConsumer to exceptional situations
NIFI-5051 Fixed checkstyle issue.
NIFI-5051 Converted ES lookup service to use a SchemaRegistry.
NIFI-5051 Cleaned up POM and added a simple unit test that uses a mock client service.
NIFI-5051 Added change; waiting for feedback.
NIFI-5051 Changed query setup based on code review. Changed tests to Groovy to make them easier to read with all of the inline JSON.
NIFI-5051 fixed a checkstyle issue.
NIFI-5051 Rebased to cleanup merge issues
NIFI-5051 Added changes from a code review.
NIFI-5051 Fixed a checkstyle issue.
NIFI-5051 Added coverage generator for tests. Rebased.
NIFI-5051 Updated service and switched it over to JsonInferenceSchemaRegistryService.
NIFI-5051 Removed dead code.
NIFI-5051 Fixed checkstyle errors.
NIFI-5051 Refactored query builder.
NIFI-5051 Added placeholder gitignore to force test compile.
NIFI-5051 Added note explaining why the .gitignore file was needed.
NIFI-5051 Made constructor public.
NIFI-5051 Fixed path issue in client service integration tests.
NIFI-5051 Added additional mapping capabilities to let users massage the result set into the fields they want.
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes apache#2615
Thank you for submitting a contribution to Apache NiFi.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically master)?
Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.