-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48348][SPARK-48376][SQL] Introduce LEAVE
and ITERATE
statements
#47973
Conversation
// Stop the iteration. | ||
stopIteration = true | ||
|
||
// TODO: Variable cleanup (once we add SQL script execution logic). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be done once we introduce execution. For now, we are adding DropVariable
statements at the end of each block implicitly. However, introduction of LEAVE
and ITERATE
statements means that these implicit statements will be skipped and we need to add logic to execute them immediately. Currently, we cannot do so, because execution nodes don't have execution logic yet.
c9b8372
to
b5b6b25
Compare
ctx: RuleContext, label: String, isLeave: Boolean): Boolean = { | ||
ctx match { | ||
case c: BeginEndCompoundBlockContext | ||
if isLeave && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we have a dedicated error for using ITERATE on the label of BEGIN END compound
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea, will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
sql/core/src/main/scala/org/apache/spark/sql/scripting/SqlScriptingExecutionNode.scala
Show resolved
Hide resolved
@@ -35,6 +35,10 @@ class SqlScriptingExecutionNodeSuite extends SparkFunSuite with SharedSparkSessi | |||
override def reset(): Unit = () | |||
} | |||
|
|||
case class TestLeaveStatement(labelText: String) extends LeaveStatementExec(labelText) | |||
|
|||
case class TestIterateStatement(labelText: String) extends IterateStatementExec(labelText) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we use the real Leave/IterateStatementExec
in this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because I'm dumb :) Used these in extractStatementValue
to match on labelText
directly instead of matching by type only and using Leave/IterateStatementExec
... will change in the next commit to use regular exec nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
LEAVE
and ITERATE
statements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM.
cc @yaooqinn , too.
@@ -2453,6 +2453,12 @@ | |||
}, | |||
"sqlState" : "42K0K" | |||
}, | |||
"INVALID_ITERATE_LABEL_USAGE_FOR_COMPOUND" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these error classes be put together with subclasses? like
INVALID_LABEL.ITERATE_FOR_COMPOUND / INVALID_LABEL.IN_STATEMENT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// Handle LEAVE or ITERATE statement if it has been encountered. | ||
retStmt match { | ||
case leaveStatementExec: LeaveStatementExec if !leaveStatementExec.hasBeenMatched => | ||
if (label.contains(leaveStatementExec.label)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we use contains
instead of equals
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not contains
like in Strings, it's from the perspective of Option
, i.e. it means whether the value within the Option
is leaveStatementExec.label
LGTM |
@cloud-fan @yaooqinn @dongjoon-hyun can someone merge this now, all tests are fine? also, could you please close both JIRA items, because it seems that only the first one from the title gets linked to the PR. |
thanks, merging to master! |
…nSqlScript in SQL Scripting Interpreter test suite ### What changes were proposed in this pull request? Previous [pull request](#47973) introduced new tests to `SqlScriptingInterpreterSuite` (among others) where accidentally `parseScript` was used instead of `runSqlScript`. While the same exception would get thrown (since it happens in the parsing phase) it violates the consistency among the tests in this suite and adds unnecessary import, so it would be nice to change it. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? This patch alters tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #48016 from davidm-db/interpreter_test_suite_fix. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
…nSqlScript in SQL Scripting Interpreter test suite ### What changes were proposed in this pull request? Previous [pull request](apache/spark#47973) introduced new tests to `SqlScriptingInterpreterSuite` (among others) where accidentally `parseScript` was used instead of `runSqlScript`. While the same exception would get thrown (since it happens in the parsing phase) it violates the consistency among the tests in this suite and adds unnecessary import, so it would be nice to change it. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? This patch alters tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #48016 from davidm-db/interpreter_test_suite_fix. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
### What changes were proposed in this pull request? Previous [pull request](#47973) introduced new logic for `LEAVE` and `ITERATE` statements, but I missed to introduce minor part of `ITERATE` statement handling. While `ITERATE` statement is not allowed for compounds (i.e. user cannot use `LEAVE <label>` where label belongs to compound) it's still possible to use `ITERATE` statement within the compound (i.e. when using label of the surrounding `WHILE` loop for example). In such cases, it's required to drop variables defined in the compound (the same way as for `ITERATE` statement). We don't yet have execution logic (I will work on POC and design doc during the next week), but this is setting up stuff now, so we don't miss anything in future. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added small test, but not completely related to this change. Since we don't have execution logic yet, it's hard to test. I will add more tests together with the execution logic. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #48022 from davidm-db/missing_logic_for_leave_statement. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
…ments ### What changes were proposed in this pull request? This PR proposes introduction of `LEAVE` and `ITERATE` statement types to SQL Scripting language: - `LEAVE` statement can be used in loops, as well as in `BEGIN ... END` compound blocks. - `ITERATE` statement can be used only in loops. This PR introduces: - Logical operators for both statement types. - Execution nodes for both statement types. - Interpreter changes required to build execution plans that support new statement types. - New error if statements are not used properly. - Minor changes required to support new keywords. ### Why are the changes needed? Adding support for new statement types to SQL Scripting language. ### Does this PR introduce _any_ user-facing change? This PR introduces new statement types that will be available to users. However, script execution logic hasn't been done yet, so the new changes are not accessible by users yet. ### How was this patch tested? Tests are introduced to all test suites related to SQL scripting. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47973 from davidm-db/sql_scripting_leave_iterate. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…nSqlScript in SQL Scripting Interpreter test suite ### What changes were proposed in this pull request? Previous [pull request](apache#47973) introduced new tests to `SqlScriptingInterpreterSuite` (among others) where accidentally `parseScript` was used instead of `runSqlScript`. While the same exception would get thrown (since it happens in the parsing phase) it violates the consistency among the tests in this suite and adds unnecessary import, so it would be nice to change it. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? This patch alters tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#48016 from davidm-db/interpreter_test_suite_fix. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
### What changes were proposed in this pull request? Previous [pull request](apache#47973) introduced new logic for `LEAVE` and `ITERATE` statements, but I missed to introduce minor part of `ITERATE` statement handling. While `ITERATE` statement is not allowed for compounds (i.e. user cannot use `LEAVE <label>` where label belongs to compound) it's still possible to use `ITERATE` statement within the compound (i.e. when using label of the surrounding `WHILE` loop for example). In such cases, it's required to drop variables defined in the compound (the same way as for `ITERATE` statement). We don't yet have execution logic (I will work on POC and design doc during the next week), but this is setting up stuff now, so we don't miss anything in future. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added small test, but not completely related to this change. Since we don't have execution logic yet, it's hard to test. I will add more tests together with the execution logic. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#48022 from davidm-db/missing_logic_for_leave_statement. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
…ments ### What changes were proposed in this pull request? This PR proposes introduction of `LEAVE` and `ITERATE` statement types to SQL Scripting language: - `LEAVE` statement can be used in loops, as well as in `BEGIN ... END` compound blocks. - `ITERATE` statement can be used only in loops. This PR introduces: - Logical operators for both statement types. - Execution nodes for both statement types. - Interpreter changes required to build execution plans that support new statement types. - New error if statements are not used properly. - Minor changes required to support new keywords. ### Why are the changes needed? Adding support for new statement types to SQL Scripting language. ### Does this PR introduce _any_ user-facing change? This PR introduces new statement types that will be available to users. However, script execution logic hasn't been done yet, so the new changes are not accessible by users yet. ### How was this patch tested? Tests are introduced to all test suites related to SQL scripting. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47973 from davidm-db/sql_scripting_leave_iterate. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…nSqlScript in SQL Scripting Interpreter test suite ### What changes were proposed in this pull request? Previous [pull request](apache#47973) introduced new tests to `SqlScriptingInterpreterSuite` (among others) where accidentally `parseScript` was used instead of `runSqlScript`. While the same exception would get thrown (since it happens in the parsing phase) it violates the consistency among the tests in this suite and adds unnecessary import, so it would be nice to change it. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? This patch alters tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#48016 from davidm-db/interpreter_test_suite_fix. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
### What changes were proposed in this pull request? Previous [pull request](apache#47973) introduced new logic for `LEAVE` and `ITERATE` statements, but I missed to introduce minor part of `ITERATE` statement handling. While `ITERATE` statement is not allowed for compounds (i.e. user cannot use `LEAVE <label>` where label belongs to compound) it's still possible to use `ITERATE` statement within the compound (i.e. when using label of the surrounding `WHILE` loop for example). In such cases, it's required to drop variables defined in the compound (the same way as for `ITERATE` statement). We don't yet have execution logic (I will work on POC and design doc during the next week), but this is setting up stuff now, so we don't miss anything in future. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added small test, but not completely related to this change. Since we don't have execution logic yet, it's hard to test. I will add more tests together with the execution logic. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#48022 from davidm-db/missing_logic_for_leave_statement. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
…ments ### What changes were proposed in this pull request? This PR proposes introduction of `LEAVE` and `ITERATE` statement types to SQL Scripting language: - `LEAVE` statement can be used in loops, as well as in `BEGIN ... END` compound blocks. - `ITERATE` statement can be used only in loops. This PR introduces: - Logical operators for both statement types. - Execution nodes for both statement types. - Interpreter changes required to build execution plans that support new statement types. - New error if statements are not used properly. - Minor changes required to support new keywords. ### Why are the changes needed? Adding support for new statement types to SQL Scripting language. ### Does this PR introduce _any_ user-facing change? This PR introduces new statement types that will be available to users. However, script execution logic hasn't been done yet, so the new changes are not accessible by users yet. ### How was this patch tested? Tests are introduced to all test suites related to SQL scripting. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47973 from davidm-db/sql_scripting_leave_iterate. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…nSqlScript in SQL Scripting Interpreter test suite ### What changes were proposed in this pull request? Previous [pull request](apache#47973) introduced new tests to `SqlScriptingInterpreterSuite` (among others) where accidentally `parseScript` was used instead of `runSqlScript`. While the same exception would get thrown (since it happens in the parsing phase) it violates the consistency among the tests in this suite and adds unnecessary import, so it would be nice to change it. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? This patch alters tests. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#48016 from davidm-db/interpreter_test_suite_fix. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
### What changes were proposed in this pull request? Previous [pull request](apache#47973) introduced new logic for `LEAVE` and `ITERATE` statements, but I missed to introduce minor part of `ITERATE` statement handling. While `ITERATE` statement is not allowed for compounds (i.e. user cannot use `LEAVE <label>` where label belongs to compound) it's still possible to use `ITERATE` statement within the compound (i.e. when using label of the surrounding `WHILE` loop for example). In such cases, it's required to drop variables defined in the compound (the same way as for `ITERATE` statement). We don't yet have execution logic (I will work on POC and design doc during the next week), but this is setting up stuff now, so we don't miss anything in future. ### Why are the changes needed? Changes are minor, they improve consistency among test suites for SQL scripting. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added small test, but not completely related to this change. Since we don't have execution logic yet, it's hard to test. I will add more tests together with the execution logic. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#48022 from davidm-db/missing_logic_for_leave_statement. Authored-by: David Milicevic <david.milicevic@databricks.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>
What changes were proposed in this pull request?
This PR proposes introduction of
LEAVE
andITERATE
statement types to SQL Scripting language:LEAVE
statement can be used in loops, as well as inBEGIN ... END
compound blocks.ITERATE
statement can be used only in loops.This PR introduces:
Why are the changes needed?
Adding support for new statement types to SQL Scripting language.
Does this PR introduce any user-facing change?
This PR introduces new statement types that will be available to users. However, script execution logic hasn't been done yet, so the new changes are not accessible by users yet.
How was this patch tested?
Tests are introduced to all test suites related to SQL scripting.
Was this patch authored or co-authored using generative AI tooling?
No.