Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-23094] Fix invalid character handling in JsonDataSource #20302

Closed
wants to merge 1 commit into from

Conversation

brkyvz
Copy link
Contributor

@brkyvz brkyvz commented Jan 18, 2018

What changes were proposed in this pull request?

There were two related fixes regarding from_json, get_json_object and json_tuple (Fix #1,
Fix #2), but they weren't comprehensive it seems. I wanted to extend those fixes to all the parsers, and add tests for each case.

How was this patch tested?

Regression tests

@brkyvz
Copy link
Contributor Author

brkyvz commented Jan 18, 2018

cc @hvanhovell @cloud-fan

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Jan 18, 2018

Test build #86304 has finished for PR 20302 at commit 1d998fb.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

retest this please

@HyukjinKwon
Copy link
Member

LGTM

}
}

test("invalid json with leading nulls - from dataset") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this case looks already passing tho, and I thought we should put this in JsonSuite.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole suite should be moved. Not related to this PR. I will take this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think usually we encourage to put the test cases in a better suite, in-place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test suite is still in org.apache.spark.sql.sources. We should move these test suite to /sql/core

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the PR #20331

@SparkQA
Copy link

SparkQA commented Jan 18, 2018

Test build #86312 has finished for PR 20302 at commit 1d998fb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Copy link
Contributor

LGTM

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

asfgit pushed a commit that referenced this pull request Jan 18, 2018
## What changes were proposed in this pull request?

There were two related fixes regarding `from_json`, `get_json_object` and `json_tuple` ([Fix #1](c8803c0),
 [Fix #2](86174ea)), but they weren't comprehensive it seems. I wanted to extend those fixes to all the parsers, and add tests for each case.

## How was this patch tested?

Regression tests

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #20302 from brkyvz/json-invfix.

(cherry picked from commit e01919e)
Signed-off-by: hyukjinkwon <gurwls223@gmail.com>
@HyukjinKwon
Copy link
Member

Merged to master and branch-2.3.

@asfgit asfgit closed this in e01919e Jan 18, 2018
asfgit pushed a commit that referenced this pull request Feb 15, 2018
## What changes were proposed in this pull request?
This PR is to revert the PR #20302, because it causes a regression.

## How was this patch tested?
N/A

Author: gatorsmile <gatorsmile@gmail.com>

Closes #20614 from gatorsmile/revertJsonFix.

(cherry picked from commit 95e4b49)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
asfgit pushed a commit that referenced this pull request Feb 15, 2018
## What changes were proposed in this pull request?
This PR is to revert the PR #20302, because it causes a regression.

## How was this patch tested?
N/A

Author: gatorsmile <gatorsmile@gmail.com>

Closes #20614 from gatorsmile/revertJsonFix.
@brkyvz brkyvz deleted the json-invfix branch February 3, 2019 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants