Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert [SPARK-23094] Fix invalid character handling in JsonDataSource #20614

Closed
wants to merge 4 commits into from

Conversation

gatorsmile
Copy link
Member

@gatorsmile gatorsmile commented Feb 14, 2018

What changes were proposed in this pull request?

This PR is to revert the PR #20302, because it causes a regression.

How was this patch tested?

N/A

@gatorsmile
Copy link
Member Author

@brkyvz
Copy link
Contributor

brkyvz commented Feb 14, 2018

LGTM, my initial assumption that files had to be UTF-8 encoded was a wrong one :(

@HyukjinKwon
Copy link
Member

Wait .. do we formally support UTF-16 encoded files in line by line parsing via JSON? I think this accidentally works because the newline character in UTF-8 is the part of UTF-16.

I think I am fine to revert it but how about we avoid adding a test?

@SparkQA
Copy link

SparkQA commented Feb 15, 2018

Test build #87456 has finished for PR 20614 at commit d4015d0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

The failure seems to be irrelevant to this.

org.apache.spark.sql.execution.streaming.RateSourceV2Suite.basic microbatch execution

@dongjoon-hyun
Copy link
Member

Retest this please

@gatorsmile
Copy link
Member Author

Saw the history. Our UTF-16 support is pretty weak. Let me revert the test case.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we should remove the UTF-16 file too.

LGTM too.

@SparkQA
Copy link

SparkQA commented Feb 15, 2018

Test build #87467 has finished for PR 20614 at commit d4015d0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member Author

Since this is just to remove the file and the previous test already passed, I would merge it to master/2.3. Thanks!

@gatorsmile
Copy link
Member Author

BTW, the ongoing test will be killed by -9 at midnight

asfgit pushed a commit that referenced this pull request Feb 15, 2018
## What changes were proposed in this pull request?
This PR is to revert the PR #20302, because it causes a regression.

## How was this patch tested?
N/A

Author: gatorsmile <gatorsmile@gmail.com>

Closes #20614 from gatorsmile/revertJsonFix.

(cherry picked from commit 95e4b49)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
@asfgit asfgit closed this in 95e4b49 Feb 15, 2018
@SparkQA
Copy link

SparkQA commented Feb 15, 2018

Test build #87470 has finished for PR 20614 at commit 86c88ae.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 15, 2018

Test build #87473 has finished for PR 20614 at commit 0bb86c6.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants