NIFI-5051 Created ElasticSearch lookup service. #2615

Closed
wants to merge 21 commits into from

Contributor

@MikeThomsen MikeThomsen commented Apr 7, 2018

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically master)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
  • Have you written or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
  • If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

@MikeThomsen MikeThomsen force-pushed the NIFI-5051 branch 2 times, most recently from a2a4c3b to f3f1cca Compare April 10, 2018 13:26
Contributor

@zenfenan zenfenan left a comment

Reviewing..

Contributor

@zenfenan zenfenan left a comment

@MikeThomsen Extremely sorry.. Unable to spend the planned time on this one. I'll try to get to it, if no one else does.

</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-avro-record-utils</artifactId>
Contributor

This is included twice

Contributor

Still twice, I can remove on merge

Contributor

Comment is still valid @MikeThomsen

Contributor Author

Done.

.displayName("Index")
.description("The name of the index to read from")
.required(true)
.expressionLanguageSupported(ExpressionLanguageScope.VARIABLE_REGISTRY)
Contributor

Since the lookup is performed on an incoming flow file, is there any reason the Index, Type, etc. properties couldn't support attributes coming from the flow file? If it is this way because the ES Client Service CS can't use them, perhaps we should write up a separate improvement Jira to do something like NIFI-5121.

Contributor Author

I could definitely see some value to that, but since this is a LookupService implementation, we should discuss it in that context. NIFI-5121 only describes one particular interface, and LookupService is more expansive in use.

.identifiesControllerService(SchemaRegistry.class)
.build();

public static final PropertyDescriptor RECORD_SCHEMA_NAME = new PropertyDescriptor.Builder()
Contributor

Most record-based processors (usually because they're in the same NAR) extend from SchemaRegistryService and thus offer the same schema-related properties. I realize you can't do that here (currently), but I think we should do one of two things: 1) Move SchemaRegistryService into the API, or 2) Offer the same properties as other record-based processors, including Schema Access Strategy, Schema Text, Schema Version, etc. I think the first one would be easier and more helpful for anyone creating a record-based processor in another bundle, what do you think?

Contributor

I put up #2661 to address option 1 above

Contributor

Now that #2661 has been merged, could you extend SchemaRegistryService instead of AbstractControllerService (which you'll still get as SchemaRegistryService's parent)? That way you'll have immediate access to the same properties and logic as other schema-registry-aware processors to give a consistent UX.

Contributor Author

Ok, I'll give it a shot.

Contributor Author

@mattyb149 I got started on that, but I'm not sure how this is supposed to work. The only entry point that is standard for LookupService is lookup(Map). How did you envision communicating the info for the other lookup strategies? The schema name one makes sense; it can be a property on the service or simply schema.name in the Map passed to lookup. The rest, I'm not sure about.

}
}

private RecordSchema convertSchema(Map<String, Object> result) {
Contributor

Should this be in a utilities class (if there isn't already such a method in one)? Seems pretty helpful for JSON-to-schema conversions (or any Map-to-Schema) in general.

Contributor Author

I agree. I just wrote up a Jira ticket for this. Let's table it for now because we need to think about things like date strings.

import java.util.Map;
import java.util.Optional;

public class ElasticSearchLookupService_IT {
Contributor

It would be nice to have some unit tests as well, since the integration tests do not get run as part of any automated build. I think you could mock the ES Client Service or something? There appears to be something similar in TestFetchElasticsearchHttp (and the other ES processor unit tests).

Contributor Author

Ok, I'll add some.

<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.8.2</version>
Contributor

Won't nifi-avro-record-utils bring in the Avro library?

Contributor Author

TBH I think that might have just been IntelliJ acting up.

@MikeThomsen
Contributor Author

@mattyb149 Done.

@MikeThomsen
Contributor Author

@mattyb149 Can you comment on the schema detection strategy issue I raised here?

@MikeThomsen
Contributor Author

@mattyb149 I converted it over to be a subclass of SchemaRegistryService. Let me know if it needs anything else.

@mattyb149
Contributor

Reviewing...

.build();


public static final PropertyDescriptor RECORD_SCHEMA_NAME = new PropertyDescriptor.Builder()
Contributor

You don't need this property anymore, as you get one from SchemaRegistryService

Contributor Author

Thought I removed that.

Contributor

Do you want to remove it @MikeThomsen ?

Contributor Author

Done.

}

@Override
public Optional lookup(Map coordinates) throws LookupFailureException {
Contributor

I kinda thought this LookupService would behave a bit like the Mongo one, where you could give it multiple keys and it would do the query based on that (the fields and the values for each record). This one seems a bit awkward to me, as the user would have to build up their own query field in each record, putting the value they want to match inside a JSON query body.

Is there a different use case here, or could/should we make it more consistent with the other "NoSQL" lookup service(s)? We'd have to generate the query body, but that shouldn't be too hard. Also, you'd only be able to query top-level fields for lookup, but that seems like it would cover most use cases. If there is a way to specify a nested field for lookup (such as a qualified name with period delimiters), we could do that, although we'd likely have to use a "nested" operator in the generated query. That seems like a good (but separate) improvement. Thoughts?

Contributor Author

Yeah, we can do that. I think what it would look like is this:

{
    "bool": {
        "must": [
            {
                "match": {
                    "username": "john.smith"
                }
            },
            {
                "match": {
                    "email": "john.smith@company.com"
                }
            }
        ]
    }
}

For input:

{
    "username": "john.smith",
    "email": "john.smith@company.com"
}

as lookup keys.
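
The mapping from lookup keys to the query above can be sketched in plain Java. This is only an illustration under assumptions: the class `BoolQuerySketch` and the method `buildBoolQuery` are hypothetical names, not code from this PR, and the sketch builds only the `bool` structure shown in the comment (an Elasticsearch search request body would normally carry it under a top-level `"query"` key).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BoolQuerySketch {
    // Hypothetical helper (not from the PR): emits one "match" clause per
    // lookup coordinate and collects them under "bool" -> "must".
    static Map<String, Object> buildBoolQuery(Map<String, Object> coordinates) {
        List<Map<String, Object>> must = new ArrayList<>();
        for (Map.Entry<String, Object> e : coordinates.entrySet()) {
            Map<String, Object> match = new LinkedHashMap<>();
            match.put(e.getKey(), e.getValue());
            Map<String, Object> clause = new LinkedHashMap<>();
            clause.put("match", match);
            must.add(clause);
        }
        Map<String, Object> bool = new LinkedHashMap<>();
        bool.put("must", must);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("bool", bool);
        return root;
    }

    public static void main(String[] args) {
        // Same lookup keys as in the comment above.
        Map<String, Object> coordinates = new LinkedHashMap<>();
        coordinates.put("username", "john.smith");
        coordinates.put("email", "john.smith@company.com");
        System.out.println(buildBoolQuery(coordinates));
    }
}
```

Serializing the returned map with any JSON library would give the structure in the comment. `LinkedHashMap` is used so the clause order follows the order of the lookup keys.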

Contributor Author

For nested, it could be trickier. I think this would work:

/user/email => "user.contact.email"
{
    "query": {
        "nested": {
            "path": "user.contact",
            "query": {
                "match": {
                    "email": "john.smith@company.com"
                }
            }
        }
    }
}
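
A rough sketch of how a dotted lookup key could be turned into such a clause follows. The names `NestedQuerySketch` and `buildClause` are hypothetical, and the rule "everything before the last dot is the nested path" is an assumption made for illustration, not necessarily what the PR ended up doing. One deliberate difference from the example above: the Elasticsearch documentation for the `nested` query uses fully qualified field names inside the inner query, so the sketch matches on `user.contact.email` instead of `email`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NestedQuerySketch {
    // Hypothetical helper: a key without dots becomes a plain "match" clause;
    // a dotted key is wrapped in a "nested" clause whose path is assumed to be
    // everything before the last dot.
    static Map<String, Object> buildClause(String key, Object value) {
        Map<String, Object> match = new LinkedHashMap<>();
        match.put(key, value);
        Map<String, Object> matchClause = new LinkedHashMap<>();
        matchClause.put("match", match);

        int lastDot = key.lastIndexOf('.');
        if (lastDot < 0) {
            return matchClause;
        }
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("path", key.substring(0, lastDot));
        nested.put("query", matchClause);
        Map<String, Object> clause = new LinkedHashMap<>();
        clause.put("nested", nested);
        return clause;
    }

    public static void main(String[] args) {
        System.out.println(buildClause("username", "john.smith"));
        System.out.println(buildClause("user.contact.email", "john.smith@company.com"));
    }
}
```

A clause built this way only succeeds if the path is mapped with the `nested` type in the index. Later in this thread the reviewer reports the error "nested object under path [user.name] is not of nested type" for an index without an explicit mapping.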

Contributor Author

@mattyb149 I'm going to work on knocking the basic query builder changes out tonight. Could be a big change, so I apologize in advance :)

Contributor

Did those changes make it into the latest PR?

Contributor Author

TBH, I'm not sure. It's been a while. I'll look and let you know today. Got most of your feedback squared away.

Contributor Author

Yeah, didn't make it in. The good news is that, after kicking the tires with Kibana, it looks like it'll be pretty easy to do if I get some time after work.

@MikeThomsen
Contributor Author

@mattyb149 changed the query model per the discussion above and changed the tests to be Groovy so that the inline JSON, etc. would be a lot cleaner to read.

@MikeThomsen
Contributor Author

@mattyb149 @pvillard31 Changed the query model as requested and it's ready for final review AFAICT.

Contributor

@pvillard31 pvillard31 left a comment

Few minor comments after a quick pass over the code, will try to find time to test it but I probably won't be able to do it before next week :(

</execution>
</executions>
<configuration>
<source>1.8</source>
Contributor

Do we want to have this kind of configuration in low-level poms? Wondering if it'd be an issue with current modifications to support Java 9/10

Contributor Author

Probably not. Removed.

@OnEnabled
public void onEnabled(final ConfigurationContext context) throws InitializationException {
clientService = context.getProperty(CLIENT_SERVICE).asControllerService(ElasticSearchClientService.class);
index = context.getProperty(INDEX).getValue();
Contributor

.evaluateExpressionLanguage() ?

Contributor Author

Done.

public void onEnabled(final ConfigurationContext context) throws InitializationException {
clientService = context.getProperty(CLIENT_SERVICE).asControllerService(ElasticSearchClientService.class);
index = context.getProperty(INDEX).getValue();
type = context.getProperty(TYPE).getValue();
Contributor

.evaluateExpressionLanguage() ?

Contributor Author

Done.

@MikeThomsen
Contributor Author

@pvillard31 @mattyb149 changes checked in.

@@ -212,6 +212,31 @@
</execution>
</executions>
</plugin>

<plugin>
<groupId>org.jacoco</groupId>
Contributor

If we were to add a code coverage plugin to Maven, this is probably something that should be added to the root-level pom (and disabled by default?) What was the impetus for including it in a single bundle?

Contributor Author

I'm not sure how well that would work at root level because there are plenty of integration tests that have to be run to get a full sense of code coverage. So maybe I should back this out or one of you can drop it when rebasing for a merge if you think it makes more sense to add a root level profile for code coverage.

<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-avro-record-utils</artifactId>
<version>1.7.0-SNAPSHOT</version>
Contributor

This one should be 1.8.0-SNAPSHOT now, sorry it's taken so long to get through

Contributor Author

Done.

</dependencies>

<build>
Contributor

Probably best to remove the groovy and Jacoco stuff, let's get a discussion going on the dev mailing list about code coverage?

Contributor Author

I'm leaving in the helper plugin for now because for some reason, it won't even detect the groovy test source without it. I'll remove it if you have any suggestions on how to fix that.

Contributor

@alopresto alopresto Aug 14, 2018

If there is nothing in src/test/java, the Groovy tests won't be detected unless a plugin references them directly. In this case, the build-helper-maven-plugin is accomplishing that. In other locations, the maven-compiler-plugin is set to use groovy-eclipse-compiler to achieve the same result.

Contributor Author

@alopresto thanks for explaining that. I just added a .gitignore file into src/test/java and that did the trick.


@Override
protected List<PropertyDescriptor> getSupportedPropertyDescriptors() {
List<PropertyDescriptor> _desc = new ArrayList<>();
Contributor

Just a style nitpick, these can be set up in the constructor or a static block (I think the former is preferred?). Unless they're dynamic the list only needs to be created once, where this method gets called often IIRC.
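
The suggestion can be illustrated with a stripped-down sketch. To keep it runnable without NiFi on the classpath, plain strings stand in for `PropertyDescriptor`, and the class name and property names are hypothetical. The point is only that the list is built once in the constructor and the getter returns the same immutable instance on every call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DescriptorListSketch {
    // Built once per instance; String is a stand-in for PropertyDescriptor.
    private final List<String> descriptors;

    public DescriptorListSketch() {
        List<String> desc = new ArrayList<>();
        desc.add("client-service"); // hypothetical property names
        desc.add("index");
        desc.add("type");
        this.descriptors = Collections.unmodifiableList(desc);
    }

    // Called often by the framework, so it only hands back the cached list.
    public List<String> getSupportedPropertyDescriptors() {
        return descriptors;
    }

    public static void main(String[] args) {
        DescriptorListSketch service = new DescriptorListSketch();
        System.out.println(service.getSupportedPropertyDescriptors());
    }
}
```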

Contributor Author

Done.

import org.junit.Before
import org.junit.Test

import static groovy.json.JsonOutput.*
Contributor

This shouldn't pass CheckStyle as we don't allow star imports in Java, we probably just don't have an existing (or complete) CheckStyle rule for Groovy files.

Contributor Author

Manually fixed that.

@MikeThomsen
Contributor Author

@mattyb149 Refactored the query builder.

@MikeThomsen
Contributor Author

Should be all good to go now.

@alopresto
Contributor

@MikeThomsen I would maybe put a comment in the src/test/java/.gitignore file explaining why it's there so someone in the future doesn't see it as a superfluous tooling artifact and remove it, and then your test is silently no longer executed and we don't catch regressions. We've had similar occurrences in some of the other modules.

@MikeThomsen
Contributor Author

@mattyb149 can we merge?

@MikeThomsen
Contributor Author

@mattyb149 can we close this out?

@mattyb149
Contributor

Reviewing...


private final List<PropertyDescriptor> DESCRIPTORS;

ElasticSearchLookupService() {
Contributor

I think this has to be public for ServiceLoader to find it, I'm getting errors when trying to load it into NiFi.
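
The visibility issue can be reproduced without NiFi. `java.util.ServiceLoader` instantiates providers through a zero-argument constructor that it must be able to access, which in practice means a public one. The sketch below does not use `ServiceLoader` itself; as a stand-in it uses `Class.getConstructor()`, which only returns public constructors. The class names are hypothetical.

```java
public class PublicConstructorSketch {
    // Mirrors the reviewed code: the constructor has no access modifier.
    public static class PackagePrivateCtor {
        PackagePrivateCtor() { }
    }

    // Mirrors the fix: the constructor is declared public.
    public static class PublicCtor {
        public PublicCtor() { }
    }

    // Class.getConstructor() throws NoSuchMethodException when there is
    // no public constructor with the requested (here: empty) signature.
    static boolean hasPublicNoArgConstructor(Class<?> type) {
        try {
            type.getConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasPublicNoArgConstructor(PackagePrivateCtor.class));
        System.out.println(hasPublicNoArgConstructor(PublicCtor.class));
    }
}
```

A check like this in a unit test could catch the problem before the service is loaded into a running NiFi instance.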

Contributor Author

facepalm

One fix, coming right up...

Contributor Author

Done. NiFi is able to load it and assign it as the LookupService for LookupRecord.

put("bool", new HashMap<String, Object>(){{
put("must", coordinates.entrySet().stream()
.map(e -> new HashMap<String, Object>(){{
if (e.getKey().contains(".")) {
Contributor

I've run the unit and integration tests and the code LGTM, but I'd feel better if I could get an example going where I do the lookup on a field that's not at the top level. I have a document containing a "user" field, which contains other fields such as "name", and "name" contains other fields like "first" and "last". I tried using this with a simple CSV input containing an id and a first name, and tried to use the lookup service to match "user.name.first" and return the value of "user.name.last", but got an error saying I was trying to do a nested query on a field that wasn't nested. I didn't add an explicit mapping for the index, just put the complex JSON docs into ES. Am I configuring it wrong, or is this not supported, or could there be a bug?

Contributor Author

I'll look into that. Should be able to get something resolved this weekend.

Contributor Author

> but got an error saying I was trying to do a nested query on a field that wasn't nested.

I think you are. ES can be weird about detecting nested documents. I've only had consistent good results when explicitly defining them. I'll try to set up a test example.

Contributor Author

So, I got it working and will share some artifacts tomorrow if I get a chance so you can watch them in action. I'm thinking some of the behavior still needs a second opinion on the flexibility/user-friendliness.

Contributor

I'm still getting the same error (nested object under path [user.name] is not of nested type) on my flow. I tried yours, but I don't have any documents/mappings in ES (such as a doc with "subfield.longfield"). Can you share an example doc I can put in there? I have my own ES, so I didn't start up the Docker Compose stuff you attached.

Contributor Author

See my comment below. It has a sample flow, the commands for Kibana and a docker compose file.

@MikeThomsen
Contributor Author

@mattyb149
Contributor

+1 LGTM, ran full build with unit tests, tried the lookup service with a nested record and everything worked fine. Thanks for the improvement! Merging to master

@asfgit asfgit closed this in b1478cd Oct 1, 2018
adyoun2 added a commit to adyoun2/nifi that referenced this pull request Oct 15, 2018
commit 7cb39d6
Author: Jeff Storck <jtswork@gmail.com>
Date:   Fri Oct 12 16:57:15 2018 -0400

    NIFI-5696 Update references to default value for nifi.cluster.node.load.load.balance.port

    This closes apache#3071.

    Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>

commit 0229a5c
Author: zenfenan <sivaprasanna246@gmail.com>
Date:   Sun Oct 14 13:18:25 2018 +0530

    NIFI-5698: Fixed DeleteAzureBlobStorage bug

    This closes apache#3073.

    Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>

commit e30a21c
Author: Brad Hards <bradh@frogmouth.net>
Date:   Sat Oct 13 19:25:43 2018 +1100

    [NIFI-5697] Trivial description fix for GenerateFlowFile processor

    This closes apache#3072.

    Signed-off-by: Aldrin Piri <aldrin@apache.org>

commit 270ce85
Author: Mark Payne <markap14@hotmail.com>
Date:   Fri Oct 12 15:27:10 2018 -0400

    NIFI-5695: Fixed bug that caused ports to not properly map to their correct child group on Flow Import if the child group is independently versioned

    This closes apache#3070.

    Signed-off-by: Bryan Bende <bbende@apache.org>

commit 5eb5e96
Author: thenatog <thenatog@gmail.com>
Date:   Mon Oct 8 12:58:20 2018 -0400

    NIFI-5665 - Changed netty versions to more closely match the original netty dependency version.
    NIFI-5665 - Fixed version for nifi-spark-bundle.
    NIFI-5665 - Fixing copy and paste error.

    This closes apache#3067

commit 02e0a16
Author: Bryan Bende <bbende@apache.org>
Date:   Thu Oct 11 15:58:55 2018 -0400

    NIFI-5680 Handling trailing slashes on URLs of registry clients

    This closes apache#3065.

    Signed-off-by: Mark Payne <markap14@hotmail.com>

commit 0f88805
Author: Matt Gilman <matt.c.gilman@gmail.com>
Date:   Fri Oct 12 10:23:47 2018 -0400

    NIFI-5691:
    - Overriding the version of jackson in aws java sdk.

    This closes apache#3066.

    Signed-off-by: Aldrin Piri <aldrin@apache.org>

commit e25b26e
Author: joewitt <joewitt@apache.org>
Date:   Fri Oct 12 11:27:48 2018 -0400

    Revert "NIFI-5448 Added failure relationship to UpdateAttributes to handle bad expression language logic."

    This reverts commit 32ee552.

commit 6b77e7d
Author: joewitt <joewitt@apache.org>
Date:   Fri Oct 12 11:08:22 2018 -0400

    Revert "NIFI-5448 Changed from 'stop' to 'penalize' in allowablevalue field to make the popup more consistent."

    This reverts commit 9d2b698.

commit a6b9364
Author: Carl Gieringer <carl.gieringer@snagajob.com>
Date:   Thu Oct 4 12:50:08 2018 -0400

    NIFI-5664 Support ArrayList in DataTypeUtils#toArray
    NIFI-5664 Generalize to handling List

    This closes apache#3049

    Signed-off-by: Mike Thomsen <mikerthomsen@gmail.com>

commit 5aa4263
Author: Endre Zoltan Kovacs <ekovacs@hortonworks.com>
Date:   Mon Oct 8 13:10:37 2018 +0200

    NIFI-1490: better field naming / displayname and description mix up fix

    This closes apache#2994.

    Signed-off-by: Mark Payne <markap14@hotmail.com>

commit c81a135
Author: Endre Zoltan Kovacs <andrewsmith87@protonmail.com>
Date:   Thu Sep 6 17:33:33 2018 +0200

    NIFI-1490: multipart/form-data support for ListenHTTP processor
    - introducing a in-memory-file-size-threashold, above which the incoming file is written to local disk
    - using java.io.tmpdir for such file writes
    - enhancing documentation

commit 8398ea7
Author: Mark Payne <markap14@hotmail.com>
Date:   Thu Oct 11 14:57:31 2018 -0400

    NIFI-5688: Ensure that when we map our flow to a VersionedProcessGroup that we include the connections' Load Balance Compression flag

    This closes apache#3064

commit 8da403c
Author: Matt Gilman <matt.c.gilman@gmail.com>
Date:   Thu Oct 11 13:21:20 2018 -0400

    NIFI-5661:
    - Allowing load balance settings to be applied during creation.
    - Clearing the load balance settings when the dialog is closed.

commit 79c03ca
Author: Matt Gilman <matt.c.gilman@gmail.com>
Date:   Thu Oct 11 12:23:53 2018 -0400

    NIFI-5661:
    - Allowing the load balance configuration to be shown/edited in both clustered and standalone mode.

commit 64de5c7
Author: thenatog <thenatog@gmail.com>
Date:   Fri Sep 7 12:39:18 2018 -0400

    NIFI-5479 - Supressed the AnnotationParser logs using the logback.xml. Dependency changes can be look at in future.
    NIFI-5479 - Updated comment.

    This closes apache#3034

commit 8a751e8
Author: Koji Kawamura <ijokarumawak@apache.org>
Date:   Fri Sep 14 21:18:04 2018 +0900

    NIFI-5661: Adding Load Balance config UI
    Incorporated review comments.
    Move combo options to a common place.

    This closes apache#3046

commit a6f7222
Author: Koji Kawamura <ijokarumawak@apache.org>
Date:   Fri Sep 28 17:37:34 2018 +0900

    NIFI-5645: Auto reconnect ConsumeWindowsEventLog

    This commit also contains following refactoring:
    - Catch URISyntaxException inside subscribe when constructing provenance
    URI as it does not affect the core responsibility of this processor.
    Even if it fails to be a proper URI, if the query works for consuming
    logs, the processor should proceed forward.

    Upgrade JNA version.

    Do not update lastActivityTimestamp when subscribe failed.

    This closes apache#3037

commit 97afa4e
Author: Mark Payne <markap14@hotmail.com>
Date:   Tue Oct 9 14:54:21 2018 -0400

    NIFI-5585: Addressed bug in calculating swap size of a queue partition when rebalancing

    This closes apache#3010.

    Signed-off-by: Mark Payne <markap14@hotmail.com>

commit a1a4c99
Author: Mark Payne <markap14@hotmail.com>
Date:   Mon Oct 8 09:53:14 2018 -0400

    NIFI-5585: Adjustments to the Connection Load Balancing to ensure that node offloading works smoothly

    Signed-off-by: Jeff Storck <jtswork@gmail.com>

commit 01e2098
Author: Jeff Storck <jtswork@gmail.com>
Date:   Tue Sep 25 15:17:19 2018 -0400

    NIFI-5585 A node that was previously offloaded can now be reconnected to the cluster and queue flowfiles again
    Added Spock test for NonLocalPartitionPartitioner
    Updated NOTICE files for FontAwesome with the updated version (4.7.0) and URL to the free license
    Updated package-lock.json with the updated version of FontAwesome (4.7.0)
    Added method to FlowFileQueue interface to reset an offloaded queue
    Queues that are now immediately have the offloaded status reset once offloading finishes
    SocketLoadBalancedFlowFileQueue now ignores back-pressure when offloading flowfiles
    Cleaned up javascript in nf-cluster-table.js when creating markup for the node operation icons
    Fixed incorrect handling of a heartbeat from an offloaded node.  Heartbeats from offloading or offloaded nodes will now be reported as an event, the heartbeat will be removed and ignored.
    Added unit tests and integration tests to cover offloading nodes
    Updated Cluster integration test class with accessor for the current cluster coordinator
    Updated Node integration test class's custom NiFiProperties implementation to return the load balancing port and a method to assert an offloaded node
    Added exclusion to top-level pom for ITSpec.class

commit be2c24c
Author: Mark Payne <markap14@hotmail.com>
Date:   Mon Sep 24 09:17:22 2018 -0400

    NIFI-5585: Fixed bug that arose when multiple nodes were decommissioning at the same time; they could get into a state where the nodes queued up data for one another, so the data just stayed put

commit 04d8da8
Author: Jeff Storck <jtswork@gmail.com>
Date:   Tue Sep 18 17:09:13 2018 -0400

    NIFI-5585 Added capability to offload a node that is disconnected from the cluster.
    Updated NodeClusterCoordinator to allow idempotent requests to offload a cluster
    Added capability to connect/delete/disconnect/offload a node from the cluster to the Toolkit CLI
    Added capability to get the status of nodes from the cluster to the Toolkit CLI
    Upgraded FontAwesome to 4.7.0 (from 4.6.1)
    Added icon "fa-upload" for offloading nodes in the cluster table UI

commit 83ca676
Author: Kotaro Terada <koterada@yahoo-corp.jp>
Date:   Tue Oct 9 18:31:41 2018 +0900

    NIFI-5681: Fix a locale-dependent test in TestVersionedFlowSnapshotMetadataResult

    Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

    This closes apache#3061.

commit 6c17685
Author: Kotaro Terada <koterada@yahoo-corp.jp>
Date:   Fri Oct 5 16:43:44 2018 +0900

    NIFI-5675: Fix some locale-dependent tests in ConvertExcelToCSVProcessorTest

    Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

    This closes apache#3058.

commit fc5c8ba
Author: Kotaro Terada <koterada@yahoo-corp.jp>
Date:   Tue Oct 9 14:12:53 2018 +0900

    NIFI-5676: Fix a timezone-dependent test in PutORCTest

    Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

    This closes apache#3059.

commit dd50322
Author: Matt Gilman <matt.c.gilman@gmail.com>
Date:   Tue Oct 9 12:49:31 2018 -0400

    NIFI-5600: Recalculating the available columns for the queue listing and component state because they contain conditions which need to be re-evaluated.

    Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

    This closes apache#3055.

commit 9dfc668
Author: Mark Payne <markap14@hotmail.com>
Date:   Tue Oct 9 12:19:24 2018 -0400

    NIFI-5672: Do not compare Load Balancing address/port for logical equivalence of Node Identifiers. Added more details to logging of Node Identifiers

    This closes apache#3054

commit 77edddd
Author: joewitt <joewitt@apache.org>
Date:   Mon Oct 8 13:35:01 2018 -0400

    NIFI-5666 Updated all usages of Spring, beanutils, collections to move beyond deps with CVEs

    This closes apache#3052

commit 117e60c
Author: Mark Payne <markap14@hotmail.com>
Date:   Tue Oct 9 12:23:44 2018 -0400

    Empty commit to force Github sync

commit c425bd2
Author: Mark Payne <markap14@hotmail.com>
Date:   Fri Aug 17 14:08:14 2018 -0400

    NIFI-5533: Be more efficient with heap utilization
     - Updated FlowFile Repo / Write Ahead Log so that any update that writes more than 1 MB of data is written to a file inside the FlowFile Repo rather than being buffered in memory
     - Update SplitText so that it does not hold FlowFiles that are not the latest version in heap. Holding them prevents them from being garbage collected, so while the Process Session is holding the latest version of the FlowFile, SplitText is holding an older version, and this results in two copies of the same FlowFile object

    NIFI-5533: Checkpoint

    NIFI-5533: Bug Fixes

    Signed-off-by: Matthew Burgess <mattyb149@apache.org>

    This closes apache#2974

commit c87d791
Author: Mark Payne <markap14@hotmail.com>
Date:   Fri Oct 5 12:06:39 2018 -0400

    NIFI-5663: Ensure that when sorting Node Identifiers we use both the node's API Address and API Port, in case 2 nodes are running on the same host. Also ensure that when the Local Node ID is determined we update all Load Balancing Partitions, if necessary

    This closes apache#3048.

    Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>

commit 768bcfb
Author: Pierre Villard <pierre.villard.fr@gmail.com>
Date:   Tue Sep 25 22:53:28 2018 +0200

    NIFI-5635 - Description PutEmail properties with multiple senders/recipients

    This closes apache#3031

    Signed-off-by: Mike Moser <mosermw@apache.org>

commit 246c090
Author: thenatog <thenatog@gmail.com>
Date:   Thu Sep 13 21:45:00 2018 -0400

    NIFI-5595 - Added the CORS filter to the templates/upload endpoint using a URL matcher.
    Explicitly allow methods GET, HEAD. These are the Spring defaults when the allowedMethods is empty but now it is explicit. This will require other methods like POST etc to be from the same origin (for the template/upload URL).

    This closes apache#3024.

    Signed-off-by: Andy LoPresto <alopresto@apache.org>

commit c6572f0
Author: Matthew Burgess <mattyb149@apache.org>
Date:   Fri Aug 10 16:49:25 2018 -0400

    NIFI-4517: Added ExecuteSQLRecord and QueryDatabaseTableRecord processors

    Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

    This closes apache#2945.

commit b4810b8
Author: Mark Payne <markap14@hotmail.com>
Date:   Fri Oct 5 12:08:55 2018 -0400

    Empty commit to force sync with mirrors

commit 619f1ff
Author: Mark Payne <markap14@hotmail.com>
Date:   Thu Jun 14 11:57:21 2018 -0400

    NIFI-5516: Implement Load-Balanced Connections
    Refactoring StandardFlowFileQueue to have an AbstractFlowFileQueue
    Refactored more into AbstractFlowFileQueue
    Added documentation, cleaned up code some
    Refactored FlowFileQueue so that there is SwappablePriorityQueue
    Several unit tests written
    Added REST API Endpoint to allow PUT to update connection to use load balancing or not. When enabling load balancing, though, I saw the queue size go from 9 to 18. Then was only able to process 9 FlowFiles.
    Bug fixes
    Code refactoring
    Added integration tests, bug fixes
    Refactored clients to use NIO
    Bug fixes. Appears to finally be working with NIO Client!!!!!
    NIFI-5516: Refactored some code from NioAsyncLoadBalanceClient to LoadBalanceSession
    Bug fixes and allowed load balancing socket connections to be reused
    Implemented ability to compress Nothing, Attributes, or Content + Attributes when performing load-balancing
    Added flag to ConnectionDTO to indicate Load Balance Status
    Updated Diagnostics DTO for connections
    Store state about cluster topology in NodeClusterCoordinator so that the state is known upon restart
    Code cleanup
    Fixed checkstyle and unit tests
    NIFI-5516: Updating logic for Cluster Node Firewall so that the node's identity comes from its certificate, not from whatever it says it is.
    NIFI-5516: Fixed missing License headers
    NIFI-5516: Some minor code cleanup
    NIFI-5516: Addressed review feedback; Bug fixes; some code cleanup. Changed dependency on nifi-registry from SNAPSHOT to official 0.3.0 release
    NIFI-5516: Take backpressure configuration into account
    NIFI-5516: Fixed ConnectionDiagnosticsSnapshot to include node identifier
    NIFI-5516: Addressed review feedback

    This closes apache#2947

commit 5872eb3
Author: Mark Payne <markap14@hotmail.com>
Date:   Wed Aug 15 10:23:49 2018 -0400

    NIFI-5331: When checkpointing SequentialAccessWriteAheadLog, if the journal is not healthy, ensure that we roll it over and ensure that if an Exception is thrown when attempting to fsync() or close() the journal, we continue creating a new one.
    This closes apache#2952.
    Signed-off-by: Brandon Devries <devriesb@apache.org>

commit 8f4d13e
Author: Koji Kawamura <ijokarumawak@apache.org>
Date:   Thu Oct 4 13:48:26 2018 +0900

    NIFI-5581: Fix replicate request timeout

    This closes apache#3044

    - Revert 87cf474 to enable connection
    pooling
    - Changes the expected HTTP status code for the 1st request of a
    two-phase commit transaction from 150 (NiFi custom) to 202 Accepted
    - Corrected RevisionManager Javadoc about revision validation protocol

commit f65286b
Author: Andy LoPresto <alopresto@apache.org>
Date:   Fri Sep 21 19:26:10 2018 -0700

    NIFI-5622 Updated test resource keystores and truststores with SubjectAlternativeNames to be compliant with RFC 6125.
    Refactored some test code to be clearer.
    Renamed some resources to be consistent across modules.
    Changed passwords to meet new minimum length requirements.

    This closes apache#3018

commit 8e233ca
Author: joewitt <joewitt@apache.org>
Date:   Thu Sep 20 23:24:17 2018 -0400

    NIFI-4806 updated tika and a ton of other deps as found by dependency versions plugin

    This closes apache#3028

commit de685a7
Author: pepov <peterwilcsinszky@gmail.com>
Date:   Tue Oct 2 15:21:36 2018 +0200

    NIFI-5656 Handle empty "Node Group" property in FileAccessPolicyProvider consistently, add some logs to help with debugging, add test for the invalid group name and for the empty case.

    This closes apache#3043.

    Signed-off-by: Kevin Doran <kdoran@apache.org>

commit b4c8e01
Merge: 895323f 76a9f98
Author: Brandon Devries <devriesb@apache.org>
Date:   Tue Oct 2 11:08:43 2018 -0400

    Merge branch 'pr2931'

commit 76a9f98
Author: Mike Moser <mosermw@apache.org>
Date:   Wed Sep 5 15:49:44 2018 -0400

    NIFI-3531 Catch and rethrow generic Exception to handle RuntimeExceptions, and allow test to pass
    This closes apache#2931.
    Signed-off-by: Brandon Devries <devriesb@apache.org>

commit 895323f
Merge: 813cc1f 4f538f1
Author: Brandon Devries <devriesb@apache.org>
Date:   Tue Oct 2 09:40:36 2018 -0400

    Merge branch 'pr2949'

commit 4f538f1
Author: Mike Moser <mosermw@apache.org>
Date:   Tue Aug 14 18:55:10 2018 +0000

    NIFI-3672 updated PublishJMS message property docs

    This closes apache#2949

    Signed-off-by: Brandon Devries <devriesb@apache.org>

commit 813cc1f
Author: Matthew Burgess <mattyb149@apache.org>
Date:   Mon Oct 1 10:23:44 2018 -0400

    NIFI-5650: Added Xerces to scripting bundle for Jython 2.7.1

    This closes apache#3042

    Signed-off-by: Mike Thomsen <mikerthomsen@gmail.com>

commit b1478cd
Author: Mike Thomsen <mikerthomsen@gmail.com>
Date:   Fri Apr 6 21:38:07 2018 -0400

    NIFI-5051 Created ElasticSearch lookup service.

    NIFI-5051 Fixed checkstyle issue.

    NIFI-5051 Converted ES lookup service to use a SchemaRegistry.

    NIFI-5051 Cleaned up POM and added a simple unit test that uses a mock client service.

    NIFI-5051 Added change; waiting for feedback.

    NIFI-5051 Changed query setup based on code review. Changed tests to Groovy to make them easier to read with all of the inline JSON.

    NIFI-5051 fixed a checkstyle issue.

    NIFI-5051 Rebased to cleanup merge issues

    NIFI-5051 Added changes from a code review.

    NIFI-5051 Fixed a checkstyle issue.

    NIFI-5051 Added coverage generator for tests.

    Rebased.

    NIFI-5051 Updated service and switched it over to JsonInferenceSchemaRegistryService.

    NIFI-5051 Removed dead code.

    NIFI-5051 Fixed checkstyle errors.

    NIFI-5051 Refactored query builder.

    NIFI-5051 Added placeholder gitignore to force test compile.

    NIFI-5051 Added note explaining why the .gitignore file was needed.

    NIFI-5051 Made constructor public.

    NIFI-5051 Fixed path issue in client service integration tests.

    NIFI-5051 Added additional mapping capabilities to let users massage the result set into the fields they want.

    Signed-off-by: Matthew Burgess <mattyb149@apache.org>

    This closes apache#2615

commit 748cf74
Author: Andy LoPresto <alopresto@apache.org>
Date:   Wed Sep 26 18:18:22 2018 -0700

    NIFI-5628 Added content length check to OkHttpReplicationClient.
    Added unit tests.

    This closes apache#3035

commit 0dd3823
Author: Colin Dean <colin.dean@arcadia.io>
Date:   Wed Sep 19 20:27:47 2018 -0400

    NIFI-5612: Support JDBC drivers that return Long for unsigned ints

    Refactors tests in order to share code repeated in tests and to enable
    some parameterized testing.

    MySQL Connector/J 5.1.x in conjunction with MySQL 5.0.x will return
    a Long for ResultSet#getObject when the SQL type is an unsigned integer.
    This change prevents that error from occurring while implementing a more
    informational exception describing what the failing object's POJO type
    is in addition to its string value.

    Signed-off-by: Matthew Burgess <mattyb149@apache.org>

    This closes apache#3032

commit e24388a
Author: Jeff Storck <jtswork@gmail.com>
Date:   Tue Sep 25 18:30:19 2018 -0400

    NIFI-5557 Added test in PutHDFSTest for IOException with a nested GSSException
    Resolved most of the code warnings in PutHDFSTest

    This closes apache#2971.

commit 0f55cbf
Author: Endre Zoltan Kovacs <ekovacs@hortonworks.com>
Date:   Tue Aug 28 10:47:59 2018 +0200

    NIFI-5557: handling expired ticket by rollback and penalization

commit 2e1005e
Author: Mark Payne <markap14@hotmail.com>
Date:   Thu Sep 27 10:10:48 2018 -0400

    NIFI-5640: Improved efficiency of Avro Reader and some methods of AvroTypeUtil. Also switched ServiceStateTransition to using read/write locks instead of synchronized blocks because profiling showed that significant time was spent in determining state of a Controller Service when attempting to use it. Switching to a ReadLock should provide better performance there.

    Signed-off-by: Matthew Burgess <mattyb149@apache.org>

    This closes apache#3036

commit ad4c886
Author: Mark Payne <markap14@hotmail.com>
Date:   Tue Sep 25 09:05:06 2018 -0400

    NIFI-5634: When merging RPG entities, ensure that we only send back the ports that are common to all nodes - even if that means sending back no ports

    This closes apache#3030

commit 66eeb48
Author: Mike Moser <mosermw@apache.org>
Date:   Mon Aug 13 17:40:54 2018 +0000

    NIFI-3672 Add support for strongly typed message properties in PublishJMS

commit 8309747
Author: Mike Moser <mosermw@apache.org>
Date:   Wed Aug 1 20:11:35 2018 +0000

    NIFI-3531 Moved session.recover in JMSConsumer to exceptional situations
bdesert pushed a commit to bdesert/nifi that referenced this pull request Oct 15, 2018
@MikeThomsen MikeThomsen deleted the NIFI-5051 branch August 14, 2024 21:13