-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32258][SQL] NormalizeFloatingNumbers directly normalizes IF/CaseWhen/Coalesce child expressions #29061
Conversation
.../test/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingPointNumbersSuite.scala
Outdated
Show resolved
Hide resolved
Test build #125516 has started for PR 29061 at commit |
This comment has been minimized.
This comment has been minimized.
.../test/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingPointNumbersSuite.scala
Outdated
Show resolved
Hide resolved
.../test/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingPointNumbersSuite.scala
Outdated
Show resolved
Hide resolved
...talyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
Show resolved
Hide resolved
cc @cloud-fan |
...talyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
Show resolved
Hide resolved
Test build #125542 has finished for PR 29061 at commit
|
Test build #125706 has finished for PR 29061 at commit
|
retest this please |
Test build #125707 has finished for PR 29061 at commit
|
.../test/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingPointNumbersSuite.scala
Outdated
Show resolved
Hide resolved
...talyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
Show resolved
Hide resolved
Test build #125719 has finished for PR 29061 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @viirya and all.
Merged to master for Apache Spark 3.1.0 on December 2020.
|
||
case Coalesce(children) => | ||
Coalesce(children.map(normalize)) | ||
|
||
case _ if expr.dataType == FloatType || expr.dataType == DoubleType => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we put these new cases after this case? The main goal of this optimization is to avoid constructing a new CreateStruct
during normalization. If it's just a float/double type If/CashWhen/Coalesce, it's actually an overhead to duplicate the normalization work in each child.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay.
…double If/CaseWhen/Coalesce ### What changes were proposed in this pull request? This is followup to #29061. See #29061 (comment). Basically this moves If/CaseWhen/Coalesce case patterns after float/double case so we don't duplicate normalization on children for float/double If/CaseWhen/Coalesce. ### Why are the changes needed? Simplify expression tree. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Modify unit tests. Closes #29091 from viirya/SPARK-32258-followup. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
This patch proposes to let
NormalizeFloatingNumbers
rule directly normalizes on certain children expressions. It could simplify expression tree.Why are the changes needed?
Currently NormalizeFloatingNumbers rule treats some expressions as black box but we can optimize it a bit by normalizing directly the inner children expressions.
Also see #28962 (comment).
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests.