
[SPARK-38534][SQL][TESTS] Disable to_timestamp('366', 'DD') test case #35825

Closed
wants to merge 1 commit into from

Conversation

@dongjoon-hyun (Member) commented Mar 12, 2022

What changes were proposed in this pull request?

This PR aims to disable the to_timestamp('366', 'DD') test case to recover the ANSI test suite on Java 11+.

Why are the changes needed?

Currently, Daily Java 11 and 17 GitHub Action jobs are broken.

Java 8

$ bin/spark-shell --conf spark.sql.ansi.enabled=true
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/03/12 00:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://172.16.0.31:4040
Spark context available as 'sc' (master = local[*], app id = local-1647075572229).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.0-SNAPSHOT
      /_/

Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 1.8.0_322)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sql("select to_timestamp('366', 'DD')").show
java.time.format.DateTimeParseException: Text '366' could not be parsed, unparsed text found at index 2. If necessary set spark.sql.ansi.enabled to false to bypass this error.

Java 11+

$ bin/spark-shell --conf spark.sql.ansi.enabled=true
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/03/12 01:00:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://172.16.0.31:4040
Spark context available as 'sc' (master = local[*], app id = local-1647075607932).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.0-SNAPSHOT
      /_/

Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 11.0.12)
Type in expressions to have them evaluated.
Type :help for more information.

scala> sql("select to_timestamp('366', 'DD')").show
java.time.DateTimeException: Invalid date 'DayOfYear 366' as '1970' is not a leap year. If necessary set spark.sql.ansi.enabled to false to bypass this error.
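The underlying JDK difference can be reproduced without Spark. The sketch below (the class name DayOfYearParse is illustrative) defaults the missing year to 1970 the way Spark's formatter does; parsing '366' with pattern 'DD' fails on both JDK lines, just at different stages and with different messages:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

public class DayOfYearParse {
    public static void main(String[] args) {
        // The pattern "DD" alone carries no year, so default it to 1970,
        // mirroring Spark's behavior for to_timestamp('366', 'DD').
        DateTimeFormatter fmt = new DateTimeFormatterBuilder()
            .appendPattern("DD")
            .parseDefaulting(ChronoField.YEAR, 1970)
            .toFormatter();
        try {
            System.out.println(LocalDate.parse("366", fmt));
        } catch (DateTimeParseException e) {
            // Java 8:   "DD" parses exactly 2 digits, so the trailing '6'
            //           fails with "unparsed text found at index 2".
            // Java 11+: "DD" accepts 2-3 digits, so parsing succeeds but
            //           resolving day-of-year 366 in non-leap 1970 fails.
            System.out.println(e.getMessage());
        }
    }
}
```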

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Tested with Java 11+.

BEFORE

$ java -version
openjdk version "17.0.2" 2022-01-18 LTS
OpenJDK Runtime Environment Zulu17.32+13-CA (build 17.0.2+8-LTS)
OpenJDK 64-Bit Server VM Zulu17.32+13-CA (build 17.0.2+8-LTS, mixed mode, sharing)

$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z ansi/datetime-parsing-invalid.sql"
...
[info] SQLQueryTestSuite:
01:23:00.219 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
01:23:05.209 ERROR org.apache.spark.sql.SQLQueryTestSuite: Error using configs:
[info] - ansi/datetime-parsing-invalid.sql *** FAILED *** (267 milliseconds)
[info]   ansi/datetime-parsing-invalid.sql
[info]   Expected "java.time.[format.DateTimeParseException
[info]   Text '366' could not be parsed, unparsed text found at index 2]. If necessary set s...", but got "java.time.[DateTimeException
[info]   Invalid date 'DayOfYear 366' as '1970' is not a leap year]. If necessary set s..." Result did not match for query #8
[info]   select to_timestamp('366', 'DD') (SQLQueryTestSuite.scala:476)
...
[info] Run completed in 7 seconds, 389 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 0, failed 1, canceled 0, ignored 0, pending 0
[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.sql.SQLQueryTestSuite
[error] (sql / Test / testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 21 s, completed Mar 12, 2022, 1:23:05 AM

AFTER

$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z ansi/datetime-parsing-invalid.sql"
...
[info] SQLQueryTestSuite:
[info] - ansi/datetime-parsing-invalid.sql (390 milliseconds)
...
[info] Run completed in 7 seconds, 673 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 20 s, completed Mar 12, 2022, 1:24:52 AM

@github-actions github-actions bot added the SQL label Mar 12, 2022
@dongjoon-hyun (Member Author)

Also, cc @MaxGekk because Apache Spark 3.3.0 Java11/Java17 tests are broken.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-38534][SQL][TESTS] Disable to_timestamp('366', 'DD') test case [SPARK-38534][SQL][TESTS] Disable ANSI to_timestamp('366', 'DD') test case Mar 12, 2022
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-38534][SQL][TESTS] Disable ANSI to_timestamp('366', 'DD') test case [SPARK-38534][SQL][TESTS] Disable to_timestamp('366', 'DD') test case Mar 12, 2022
@gengliangwang (Member) left a comment

Thanks for the fix. I also verified it locally with Java 11.

@dongjoon-hyun (Member Author)

Thank you, @gengliangwang and @MaxGekk . Merged to master!

@dongjoon-hyun dongjoon-hyun deleted the SPARK-38534 branch March 12, 2022 10:48
@LuciferYang (Contributor)

LGTM +1

@LuciferYang (Contributor)

@dongjoon-hyun @gengliangwang @MaxGekk

Similar to the current PR, the following SQL also behaves differently between Java 8 and Java 11/17:

"select to_timestamp('365', 'DD')"

Java 8 fails with:

You may get a different result due to the upgrading to Spark >= 3.0: Fail to parse '365' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
org.apache.spark.SparkUpgradeException: You may get a different result due to the upgrading to Spark >= 3.0: Fail to parse '365' in the new parser. You can set spark.sql.legacy.timeParserPolicy to LEGACY to restore the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.
   at org.apache.spark.sql.errors.QueryExecutionErrors$.failToParseDateTimeInNewParserError(QueryExecutionErrors.scala:1008)
   at org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper$$anonfun$checkParsedDiff$1.applyOrElse(DateTimeFormatterHelper.scala:148)
   at org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper$$anonfun$checkParsedDiff$1.applyOrElse(DateTimeFormatterHelper.scala:141)
   at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
   at org.apache.spark.sql.catalyst.util.Iso8601TimestampFormatter.parse(TimestampFormatter.scala:133)
   at org.apache.spark.sql.catalyst.expressions.ToTimestamp.eval(datetimeExpressions.scala:1211)

Java 11/17 succeeds and returns 1970-12-31 00:00:00.0.

The reason for the difference may be that 'DD' has the following definition in newer Java versions:

D       1      appendValue(ChronoField.DAY_OF_YEAR)
DD      2      appendValue(ChronoField.DAY_OF_YEAR, 2, 3, SignStyle.NOT_NEGATIVE)
DDD     3      appendValue(ChronoField.DAY_OF_YEAR, 3)
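The definition above can be exercised directly with java.time; a minimal sketch (the class name is illustrative, and the printed result depends on the running JDK). On Java 11+, where 'DD' accepts 2-3 digits, '365' resolves to 1970-12-31; on Java 8 it raises a parse error:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

public class DayOfYear365 {
    public static void main(String[] args) {
        // Same setup as Spark: pattern "DD" with the year defaulted to 1970.
        DateTimeFormatter fmt = new DateTimeFormatterBuilder()
            .appendPattern("DD")
            .parseDefaulting(ChronoField.YEAR, 1970)
            .toFormatter();
        try {
            // Java 11+: "DD" is appendValue(DAY_OF_YEAR, 2, 3, NOT_NEGATIVE),
            // so all three digits are consumed and this prints 1970-12-31.
            System.out.println(LocalDate.parse("365", fmt));
        } catch (DateTimeParseException e) {
            // Java 8: "DD" is appendValue(DAY_OF_YEAR, 2); only "36" is
            // consumed and the leftover '5' makes parsing fail.
            System.out.println(e.getMessage());
        }
    }
}
```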

@dongjoon-hyun (Member Author)

@LuciferYang That doesn't sound like a regression. :)

@dongjoon-hyun (Member Author)

If you want, you can make a separate documentation JIRA for that, but I don't think Java 8 users will care about that.

@LuciferYang (Contributor)

@LuciferYang That doesn't sound like a regression. :)

OK ~ got it ~
