Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] jsonNode array index access for EXTRACTJSONFIELD #610

Merged

Conversation

robert2d
Copy link
Contributor

@robert2d robert2d commented Jan 10, 2018

Allows access to retrieval of JSON node when in array or complex nested objects.

CREATE STREAM sample_2 AS SELECT EXTRACTJSONFIELD(message, '$.log.logs[0].entry') FROM sample;

Previously the above query returned null because jsonNode would only access an index of an array node when using an int which was not being catered for in the current release.

Two tests have been updated to check this change both "tokenises" correctly using an array accessor and that it integrates into the jsonFormatTest when used in an actual query.

@ghost
Copy link

ghost commented Jan 10, 2018

It looks like @robert2d hasn't signed our Contributor License Agreement, yet.

The purpose of a CLA is to ensure that the guardian of a project's outputs has the necessary ownership or grants of rights over all contributions to allow them to distribute under the chosen licence.
Wikipedia

You can read and sign our full Contributor License Agreement here.

Once you've signed reply with [clabot:check] to prove it.

Appreciation of efforts,

clabot

@robert2d
Copy link
Contributor Author

[clabot:check]

@ghost
Copy link

ghost commented Jan 10, 2018

@confluentinc It looks like @robert2d just signed our Contributor License Agreement. 👍

Always at your service,

clabot

@robert2d
Copy link
Contributor Author

related to: #523

("{\"log\":{\"@timestamp\":\"2017-05-30T16:44:22.175Z\",\"@version\":\"1\","
+ "\"caasVersion\":\"0.0.2\",\"cloud\":\"aws\",\"clusterId\":\"cp99\",\"clusterName\":\"kafka\",\"cpComponentId\":\"kafka\",\"host\":\"kafka-1-wwl0p\",\"k8sId\":\"k8s13\",\"k8sName\":\"perf\",\"level\":\"ERROR\",\"logger\":\"kafka.server.ReplicaFetcherThread\",\"message\":\"Found invalid messages during fetch for partition [foo512,172] offset 0 error Record is corrupt (stored crc = 1321230880, computed crc = 1139143803)\",\"networkId\":\"vpc-d8c7a9bf\",\"region\":\"us-west-2\",\"serverId\":\"1\",\"skuId\":\"sku5\",\"source\":\"kafka\",\"tenantId\":\"t47\",\"tenantName\":\"perf-test\",\"thread\":\"ReplicaFetcherThread-0-2\",\"zone\":\"us-west-2a\"},\"stream\":\"stdout\",\"time\":2017}"));
("{\"log\":{\"@timestamp\":\"2017-05-30T16:44:22.175Z\",\"@version\":\"1\","
+ "\"caasVersion\":\"0.0.2\",\"cloud\":\"aws\",\"logs\":[{\"entry\":\"first\"}],\"clusterId\":\"cp99\",\"clusterName\":\"kafka\",\"cpComponentId\":\"kafka\",\"host\":\"kafka-1-wwl0p\",\"k8sId\":\"k8s13\",\"k8sName\":\"perf\",\"level\":\"ERROR\",\"logger\":\"kafka.server.ReplicaFetcherThread\",\"message\":\"Found invalid messages during fetch for partition [foo512,172] offset 0 error Record is corrupt (stored crc = 1321230880, computed crc = 1139143803)\",\"networkId\":\"vpc-d8c7a9bf\",\"region\":\"us-west-2\",\"serverId\":\"1\",\"skuId\":\"sku5\",\"source\":\"kafka\",\"tenantId\":\"t47\",\"tenantName\":\"perf-test\",\"thread\":\"ReplicaFetcherThread-0-2\",\"zone\":\"us-west-2a\"},\"stream\":\"stdout\",\"time\":2017}"));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you have both previous tests and this one? The UDF should work for both cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem I thought this might come up. Reason I modified the existing is that it essentially performs both tests when accessing the nested nodes anyway. But I'll add back in the original spec unmodified and duplicate it into a new one for array access now 😄

@robert2d robert2d force-pushed the bug/extract-json-field-with-index branch 2 times, most recently from 329a0e2 to 6f9ea5b Compare January 19, 2018 02:19
@robert2d
Copy link
Contributor Author

@hjafarpour Updated to include original test back in. Also rebased upstream/master to ensure no changes between original work and updates have any effect.

Copy link
Contributor

@hjafarpour hjafarpour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

Copy link
Contributor

@dguy dguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @robert2d, just a couple of minor comments


assertThat(results, equalTo(expectedResults));

ksqlEngine.terminateQuery(queryMetadata.getId(), true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be done in a `try{...}finally{...}' otherwise it won't run when the test fails

Assert.assertTrue(tokenList.get(0).equalsIgnoreCase("log"));
Assert.assertTrue(tokenList.get(1).equalsIgnoreCase("cloud"));
Assert.assertTrue(tokenList.get(2).equalsIgnoreCase("region"));
Assert.assertTrue(tokenList.size() == 4);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know these were previously assertTrue but it would be good to change this to use assertThat(..) instead

@dguy
Copy link
Contributor

dguy commented Jan 31, 2018

@robert2d are you able to update this PR with the feedback?

Copy link
Contributor

@apurvam apurvam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to merge this so that we can include it in the 0.4 release. We can address the test related issues later.

@apurvam apurvam merged commit 1b0382c into confluentinc:master Jan 31, 2018
@robert2d
Copy link
Contributor Author

robert2d commented Feb 1, 2018

Hi I have addressed the issues you mentioned @dguy and will submit a seperate pull request for it in the interest of code quality 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants