Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[UDF] JSON Array Contains function #608

Merged
merged 9 commits into from
Jan 25, 2018

Conversation

satybald
Copy link
Contributor

@satybald satybald commented Jan 9, 2018

UDF function that searches if the value exists in JSON Array. Function accepts two arguments a JSON string, and a searchValue. SearchValue can be any of the BOOLEAN, INTEGER, BIGINT, DOUBLE and VARCHAR. Fixes #607

@bluemonk3y
Copy link

@satybald - there is an internal nexus issue - it will be resolved in the next couple of hours

KsqlFunction jsonArrayContainsString = new KsqlFunction(
Schema.BOOLEAN_SCHEMA, Arrays.asList(array, Schema.STRING_SCHEMA),
"JSON_ARRAY_CONTAINS", JsonArrayContainsKudf.class);
addFunction(jsonArrayContainsString);
Copy link
Contributor Author

@satybald satybald Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an issue registering my function passing arguments to it. I'm expecting that 'evaluate' method passes json string as first argment of the array, and the second argument is search value.

@satybald
Copy link
Contributor Author

satybald commented Jan 9, 2018

thanks @bluemonk3y for information. Is there any way of getting the result of Jenkins build?

sorry @satybald - ATM its behind a confluent firewall. We are looking at other ways of notifying build failures

@hjafarpour
Copy link
Contributor

hjafarpour commented Jan 9, 2018

@satybald I would suggest to change this function to search general array type regardless if the serialization mode is JSON or AVRO. This way this can be used in broader use cases.

@satybald
Copy link
Contributor Author

@hjafarpour thanks for the suggestion. I added support for the general array type.

jsonUdf.init();
}

public void testEmptyArray()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'@ Test` annotation?

}

public void testIfJsonArrayContainsLongValue()
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, missing Test annotation.

}

public void testIfArrayContainsDoubleValue()
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, missing Test annotation.

assertEquals(false, jsonUdf.evaluate("[[2.0], 3.0]", 2.0));
}

public void testIfArrayContainsStringValue()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, missing Test annotation.
The same for all tests methods.

@satybald
Copy link
Contributor Author

@hjafarpour thanks for the feedback. Added @test annotation.

@satybald satybald changed the title [WIP] Add JSON Array Contains function [UDF] JSON Array Contains function Jan 11, 2018
@@ -151,6 +153,27 @@ private void init() {
"EXTRACTJSONFIELD", JsonExtractStringKudf.class);
addFunction(getStringFromJson);

KsqlFunction jsonArrayContainsString = new KsqlFunction(
Schema.BOOLEAN_SCHEMA, Arrays.asList(Schema.STRING_SCHEMA, Schema.STRING_SCHEMA),
"JSON_ARRAY_CONTAINS", JsonArrayContainsKudf.class);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have doubts should name contains underscore or not? For consistency with EXTRACTJSONFIELD it should be without, but for readability, I argue that underscore is needed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@satybald - it needs to be consistent for this PR. EXTRACTJSONFIELD - changing formatting will require doc changes etc - which could be done in another PR

Copy link
Contributor Author

@satybald satybald Jan 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bluemonk3y good point about the doc. I completely forgot to document function 😄

Copy link

@bluemonk3y bluemonk3y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@satybald
Copy link
Contributor Author

@hjafarpour @dguy what do you think about PR? I compile and test branch locally build passes all the tests.

Copy link
Contributor

@dguy dguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @satybald, left a few comments. Generally i think the tests need to be named better, i.e, shouldReturnFalseWhenNullArray etc. Also use assertTrue() and assertFalse() when testing boolean conditions

}
Object searchValue = args[1];
if(args[0] instanceof String) {
String jsonString = args[0].toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps just cast it to string and inline in ifJsonStringArrayContains

throw new KsqlFunctionException("Invalid type parameters for " + Arrays.toString(args));
}

private Object ifArrayContains(Object[] array, Object searchValue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return boolean and rename to arrayContains


private Object ifArrayContains(Object[] array, Object searchValue) {
//TODO: Refactor this to ArrayUtil.containsValue from TopK PR#575
for (Object value : array) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return Arrays.stream(array).anyMatch(value -> Objects.equals(o, searchValue))

return false;
}

private boolean ifJsonStringArrayContains(String json, Object searchValue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to stringArrayContains

while (parser.currentToken() != null) {
JsonToken token = parser.nextToken();
if (token == null) {
return token == searchValue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return searchValue == null ?

@Test
public void testEmptyArray()
{
assertEquals(false, jsonUdf.evaluate("[]", true));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertFalse for all of these

}

@Test
public void testNullArray()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split this into two tests, i.e., one for when it should be true and on for when it should be false. Also naming as proposed above

@Test
public void testNullArray()
{
assertEquals(true, jsonUdf.evaluate("[null]", null));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertTrue(...) etc

}

@Test
public void testIntegerJsonArrayContainsValues()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

}

@Test
public void testIfJsonArrayContainsLongValue()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@satybald
Copy link
Contributor Author

@dguy thanks for the review. I've applied your feedback. However, I cannot use assertTrue() and assertFalse() as jsonUdf.evaluate returns Object. I need explicitly state against which object I'm evaluating.

@dguy
Copy link
Contributor

dguy commented Jan 15, 2018

@satybald with your latest updates it looks like you have bought in a bunch of changes that aren't yours, i.e, Dockerfile, pom.xml etc

@satybald
Copy link
Contributor Author

@dguy I did rebase with a master to add ArraysUtill.containsValue. However, I forgot that there's a lot of changes since then has been merged to master.

@dguy
Copy link
Contributor

dguy commented Jan 16, 2018

@satybald - those changes shouldn't show up in this diff, though. They should already be in master

@satybald satybald force-pushed the feature-array-contains branch 3 times, most recently from dbc6b82 to 866fdce Compare January 17, 2018 11:30
@satybald satybald force-pushed the feature-array-contains branch from 866fdce to ca6af94 Compare January 17, 2018 11:33
@satybald
Copy link
Contributor Author

retest this please

@satybald
Copy link
Contributor Author

@dguy I fixed that now and cherry-picked need me commit with ArraysUtill.containsValue changes. Could you please look once more?

@bluemonk3y
Copy link

@satybald ..engine/src/main/java/io/confluent/ksql/function/udf/json/ArrayContainsKudf.java:29:8: Unused import - java.util.Objects. [UnusedImports]

@satybald satybald force-pushed the feature-array-contains branch from 542154e to c0208ad Compare January 20, 2018 06:04
@satybald
Copy link
Contributor Author

thanks, @bluemonk3y. fixed the issue.

Copy link
Contributor

@dguy dguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @satybald, LGTM!

@satybald
Copy link
Contributor Author

@dguy @bluemonk3y @hjafarpour build again failed 😢

@dguy dguy merged commit 15dcbc7 into confluentinc:master Jan 25, 2018
@dguy
Copy link
Contributor

dguy commented Jan 25, 2018

Thanks @satybald, the build has passed so i've merged this PR. Thanks for the contribution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants