[SPARK-20549] 'java.io.CharConversionException: Invalid UTF-32' in JsonToStructs #17826

Closed
wants to merge 2 commits into from

Conversation

brkyvz
Contributor

@brkyvz brkyvz commented May 1, 2017

What changes were proposed in this pull request?

A fix for the same problem was made in #17693, but it did not cover JsonToStructs. This PR applies the same fix to JsonToStructs.

How was this patch tested?

Regression test

@brkyvz
Contributor Author

brkyvz commented May 1, 2017

@cloud-fan
Contributor

LGTM

@SparkQA

SparkQA commented May 2, 2017

Test build #76366 has finished for PR 17826 at commit 6d59636.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master/2.2!

asfgit pushed a commit that referenced this pull request May 2, 2017
…nToStructs

## What changes were proposed in this pull request?

A fix for the same problem was made in #17693, but it did not cover `JsonToStructs`. This PR applies the same fix to `JsonToStructs`.

## How was this patch tested?

Regression test

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #17826 from brkyvz/SPARK-20549.

(cherry picked from commit 86174ea)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@asfgit asfgit closed this in 86174ea May 2, 2017
@HyukjinKwon
Member

HyukjinKwon commented May 2, 2017

Hi @cloud-fan and @brkyvz, while checking related issues out of curiosity, I realised that the XML-related expressions throw an exception on malformed input, as shown below:

scala> sql("SELECT xpath_string('<a><b>b</b><c>cc</c></','a/c')").show()
...
java.lang.RuntimeException: Invalid XML document: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 23; XML document structures must start and end within the same entity.
<a><b>b</b><c>cc</c></
  at org.apache.spark.sql.catalyst.expressions.xml.UDFXPathUtil.eval(UDFXPathUtil.java:72)
  at org.apache.spark.sql.catalyst.expressions.xml.UDFXPathUtil.evalString(UDFXPathUtil.java:81)
  at org.apache.spark.sql.catalyst.expressions.xml.XPathString.nullSafeEval(xpath.scala:186)
...

Should we also fix these expressions to return null for malformed input, or leave them out for now?
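
To make the question concrete, here is a minimal sketch of what "return null for malformed input" could look like, written against the plain JDK XPath API rather than Spark's UDFXPathUtil. The object name `XPathNullOnMalformed` and the helper `xpathStringOrNull` are hypothetical and exist only for illustration; this is not the change made in this PR or in #17693.

```scala
import java.io.StringReader
import javax.xml.xpath.{XPathExpressionException, XPathFactory}
import org.xml.sax.InputSource

// Hypothetical helper, not part of Spark.
object XPathNullOnMalformed {

  // Evaluates `path` against `xml` and returns the string value of the match.
  // Returns null instead of throwing when the evaluation fails.
  def xpathStringOrNull(xml: String, path: String): String = {
    if (xml == null || path == null) {
      null
    } else {
      try {
        val xpath = XPathFactory.newInstance().newXPath()
        // The JDK parses the document lazily inside evaluate(); a malformed
        // document surfaces as an XPathExpressionException.
        xpath.evaluate(path, new InputSource(new StringReader(xml)))
      } catch {
        // Assumption of this sketch: an invalid XPath expression is also
        // mapped to null here, because the JDK reports both failures with
        // the same exception type. A real fix might want to separate them.
        case _: XPathExpressionException => null
      }
    }
  }
}
```

With this helper, the malformed document from the example above would yield null rather than a RuntimeException, while a well-formed document is evaluated as usual. Whether null is the right semantics for xpath_string is exactly the open question here.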

@cloud-fan
Contributor

I'm not sure what the semantics of xpath_string are...

@HyukjinKwon
Member

Let me leave this out for now, until someone raises an issue about it. Thank you for your response, @cloud-fan.

@brkyvz brkyvz deleted the SPARK-20549 branch February 3, 2019 20:54