[SPARK-33432][SQL] SQL parser should use active SQLConf #30357

Closed
wants to merge 3 commits into from

Contributor

@luluorta luluorta commented Nov 12, 2020

What changes were proposed in this pull request?

This PR makes the SQL parser use the active SQLConf instead of the one passed in through its constructor parameters.

Why are the changes needed?

In ANSI mode, schema string parsing should fail if the schema uses an ANSI reserved keyword as an attribute name:

spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql("""select from_json('{"time":"26/10/2015"}', 'time Timestamp', map('timestampFormat',  'dd/MM/yyyy'));""").show

output:

Cannot parse the data type:
no viable alternative at input 'time'(line 1, pos 0)

== SQL ==
time Timestamp
^^^

However, this query may accidentally succeed in certain cases, because the DataType parser sticks to the configs of the first session created in the current thread:

DataType.fromDDL("time Timestamp")
val newSpark = spark.newSession()
newSpark.conf.set("spark.sql.ansi.enabled", "true")
newSpark.sql("""select from_json('{"time":"26/10/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'));""").show

output:

+--------------------------------+
|from_json({"time":"26/10/2015"})|
+--------------------------------+
|            {2015-10-26 00:00...|
+--------------------------------+
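
The difference between the two behaviours can be sketched in miniature without Spark. The snippet below is only an illustration under stated assumptions: `Conf`, `Parsers`, `CapturingParser` and `ActiveConfParser` are hypothetical names, the thread-local holder is a simplification of how an "active" conf might be tracked, and the reserved-word set is made up for the demo. It is not the code changed by this PR.

```scala
// Hypothetical stand-in for a session configuration; this is NOT Spark's SQLConf.
final case class Conf(ansiEnabled: Boolean)

object Conf {
  // Simplified "active conf" holder, modelled here as a thread-local.
  private val active = new ThreadLocal[Conf] {
    override def initialValue(): Conf = Conf(ansiEnabled = false)
  }

  // Returns whatever conf is active on the calling thread right now.
  def get: Conf = active.get()

  // Runs `body` with `conf` active, then restores the previous conf.
  def withActive[T](conf: Conf)(body: => T): T = {
    val previous = active.get()
    active.set(conf)
    try body finally active.set(previous)
  }
}

object Parsers {
  // Illustrative reserved-word set, chosen only to mirror the example above.
  private val reservedInAnsi = Set("time")

  def check(name: String, conf: Conf): Either[String, String] =
    if (conf.ansiEnabled && reservedInAnsi(name.toLowerCase))
      Left(s"no viable alternative at input '$name'")
    else
      Right(name)
}

// Captures the conf once, at construction time, and never looks again.
class CapturingParser(conf: Conf) {
  def parseFieldName(name: String): Either[String, String] =
    Parsers.check(name, conf)
}

// Looks up the active conf on every call.
class ActiveConfParser {
  def parseFieldName(name: String): Either[String, String] =
    Parsers.check(name, Conf.get)
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Both parsers are created while ANSI mode is off.
    val capturing = new CapturingParser(Conf.get)
    val activeAware = new ActiveConfParser

    Conf.withActive(Conf(ansiEnabled = true)) {
      println(capturing.parseFieldName("time"))   // Right(time): stale conf, succeeds by accident
      println(activeAware.parseFieldName("time")) // Left(no viable alternative at input 'time')
    }
  }
}
```

In this sketch the capturing parser keeps accepting `time` after ANSI mode is switched on, which mirrors the accidental success shown above, while the parser that reads the active conf on each call rejects it.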

Does this PR introduce any user-facing change?

No.

How was this patch tested?

New and updated unit tests.

@github-actions github-actions bot added the SQL label Nov 12, 2020

SparkQA commented Nov 12, 2020

Test build #131011 has finished for PR 30357 at commit a166c40.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging
      • abstract class AbstractSqlParser extends ParserInterface with Logging
      • class CatalystSqlParser extends AbstractSqlParser
      • class SparkSqlParser extends AbstractSqlParser
      • class SparkSqlAstBuilder extends AstBuilder
      • class VariableSubstitution

SparkQA commented Nov 13, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35637/

SparkQA commented Nov 13, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35637/

@HyukjinKwon
Member

cc @cloud-fan

SparkQA commented Nov 13, 2020

Test build #131031 has finished for PR 30357 at commit 6d02b6c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging
      • abstract class AbstractSqlParser extends ParserInterface with Logging
      • class CatalystSqlParser extends AbstractSqlParser
      • class SparkSqlParser extends AbstractSqlParser
      • class SparkSqlAstBuilder extends AstBuilder
      • class VariableSubstitution

Contributor

@cloud-fan cloud-fan left a comment

LGTM except one comment

SparkQA commented Nov 13, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35653/

SparkQA commented Nov 13, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35653/

SparkQA commented Nov 13, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35658/

SparkQA commented Nov 13, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35658/

SparkQA commented Nov 13, 2020

Test build #131047 has finished for PR 30357 at commit 8464da0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Nov 13, 2020

Test build #131049 has finished for PR 30357 at commit 4139f6d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Nov 13, 2020

Test build #131053 has finished for PR 30357 at commit acb19f2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

assertEqual("a rlike '^\\x20[\\x20-\\x23]+$'", 'a rlike "^\\x20[\\x20-\\x23]+$", parser)
assertEqual("a rlike 'pattern\\\\'", 'a rlike "pattern\\\\", parser)
assertEqual("a rlike 'pattern\\t\\n'", 'a rlike "pattern\\t\\n", parser)
withSQLConf(SQLConf.ESCAPED_STRING_LITERALS.key -> "true") {
Member

This style change looks like another benefit. Thanks.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @luluorta and @cloud-fan.
I also verified this interactively in spark-shell.

Merged to master for Apache Spark 3.1.

@dongjoon-hyun
Member

If this is required on branch-3.0 as described in the JIRA, please open a backport PR, @luluorta.
