-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-20211][SQL] Fix the Precision and Scale of Decimal Values when the Input is BigDecimal between -1.0 and 1.0 #18244
Conversation
LGTM, pending test |
Test build #77826 has finished for PR 18244 at commit
|
Test build #77828 has finished for PR 18244 at commit
|
@@ -126,7 +126,15 @@ final class Decimal extends Ordered[Decimal] with Serializable { | |||
def set(decimal: BigDecimal): Decimal = { | |||
this.decimalVal = decimal | |||
this.longVal = 0L | |||
this._precision = decimal.precision | |||
if (decimal.compare(BigDecimal(1.0)) == -1 && decimal.compare(BigDecimal(-1.0)) == 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (decimal.presision <= decimal.scale) {
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just if (decimal.presision < decimal.scale) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! @davies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stanzhai That is not enough. 0.90
's precision is 2 and scale is also 2. When the scale and precision are identical, we still need to update the precision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we allow the case that precision equals to scale? Seems it works for the example expressions.
scala> sql("select floor(0.9)").show()
+----------+
|FLOOR(0.9)|
+----------+
| 0|
+----------+
scala> sql("select ceil(0.9)").show()
+---------+
|CEIL(0.9)|
+---------+
| 1|
+---------+
scala> sql("select 1 > 0.9").show()
+-----------------------+
|(CAST(1 AS BIGINT) > 0)|
+-----------------------+
| true|
+-----------------------+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to correct the precision to make it consistent with what the other database system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean 0.9 has precision 2 and scale 1 in other database systems? I just tried MySQL, it seems to be precision=1 and scale=1. Not sure if other database systems have the behavior as you said.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, seems that the numbers like 0.9 are the only cases that the precision equals to the scale. If we force precision as scale + 1, looks like we don't support precision=scale anymore for DecimalType
. Then we may need to update the related check and the document in DecimalType
which explicitly says that scale can be equal to precision. Maybe I miss anything about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can try DB2. 0.9
's precision is 2, scale is 1. 0.90
's precision is 3, scale is 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should still allow the case in which the precision equals to the scale. In the traditional database, .99
's precision is 2 and scale is 2; but 0.99
's precision is 3 and scale is 2
Test build #77838 has started for PR 18244 at commit |
lgtm |
@@ -126,7 +126,15 @@ final class Decimal extends Ordered[Decimal] with Serializable { | |||
def set(decimal: BigDecimal): Decimal = { | |||
this.decimalVal = decimal | |||
this.longVal = 0L | |||
this._precision = decimal.precision | |||
if (decimal.precision <= decimal.scale) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the comment is // For Decimal, we expect the precision is equal to or large than the scale
.
=
has been processed within the function floor
and ceil
.
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala#L387
This is reason that I think we should use if (decimal.precision < decimal.scale)
, and it works fine for 0.90
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix is not for floor
and ceil
only. We need to correct the precision if the value is from BigDecimal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks for the fix!
how about i think constant doube value should be parsed to double as old spark version, not to bigdecimal in antlr4 |
retest this please |
@discipleforteen Your question is not related to this PR. Converting to the decimal is pretty common for avoiding loss of precision. For example, the link shows how DB2 choose the type when users specify the constant. |
Test build #77848 has finished for PR 18244 at commit
|
LGTM |
Thanks! Merging to master. Will submit PRs for 2.2/2.1/2.0 |
…al Values when the Input is BigDecimal between -1.0 and 1.0 ### What changes were proposed in this pull request? This PR is to backport #18244 to 2.2 --- The precision and scale of decimal values are wrong when the input is BigDecimal between -1.0 and 1.0. The BigDecimal's precision is the digit count starts from the leftmost nonzero digit based on the [JAVA's BigDecimal definition](https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html). However, our Decimal decision follows the database decimal standard, which is the total number of digits, including both to the left and the right of the decimal point. Thus, this PR is to fix the issue by doing the conversion. Before this PR, the following queries failed: ```SQL select 1 > 0.0001 select floor(0.0001) select ceil(0.0001) ``` ### How was this patch tested? Added test cases. Author: gatorsmile <gatorsmile@gmail.com> Closes #18297 from gatorsmile/backport18244.
…al Values when the Input is BigDecimal between -1.0 and 1.0 ### What changes were proposed in this pull request? This PR is to backport #18244 to 2.2 --- The precision and scale of decimal values are wrong when the input is BigDecimal between -1.0 and 1.0. The BigDecimal's precision is the digit count starts from the leftmost nonzero digit based on the [JAVA's BigDecimal definition](https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html). However, our Decimal decision follows the database decimal standard, which is the total number of digits, including both to the left and the right of the decimal point. Thus, this PR is to fix the issue by doing the conversion. Before this PR, the following queries failed: ```SQL select 1 > 0.0001 select floor(0.0001) select ceil(0.0001) ``` ### How was this patch tested? Added test cases. Author: gatorsmile <gatorsmile@gmail.com> Closes #18297 from gatorsmile/backport18244. (cherry picked from commit 6265119) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…al Values when the Input is BigDecimal between -1.0 and 1.0 ### What changes were proposed in this pull request? This PR is to backport #18244 to 2.2 --- The precision and scale of decimal values are wrong when the input is BigDecimal between -1.0 and 1.0. The BigDecimal's precision is the digit count starts from the leftmost nonzero digit based on the [JAVA's BigDecimal definition](https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html). However, our Decimal decision follows the database decimal standard, which is the total number of digits, including both to the left and the right of the decimal point. Thus, this PR is to fix the issue by doing the conversion. Before this PR, the following queries failed: ```SQL select 1 > 0.0001 select floor(0.0001) select ceil(0.0001) ``` ### How was this patch tested? Added test cases. Author: gatorsmile <gatorsmile@gmail.com> Closes #18297 from gatorsmile/backport18244. (cherry picked from commit 6265119) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
… the Input is BigDecimal between -1.0 and 1.0 ### What changes were proposed in this pull request? The precision and scale of decimal values are wrong when the input is BigDecimal between -1.0 and 1.0. The BigDecimal's precision is the digit count starts from the leftmost nonzero digit based on the [JAVA's BigDecimal definition](https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html). However, our Decimal decision follows the database decimal standard, which is the total number of digits, including both to the left and the right of the decimal point. Thus, this PR is to fix the issue by doing the conversion. Before this PR, the following queries failed: ```SQL select 1 > 0.0001 select floor(0.0001) select ceil(0.0001) ``` ### How was this patch tested? Added test cases. Author: Xiao Li <gatorsmile@gmail.com> Closes apache#18244 from gatorsmile/bigdecimal.
…al Values when the Input is BigDecimal between -1.0 and 1.0 This PR is to backport apache#18244 to 2.2 --- The precision and scale of decimal values are wrong when the input is BigDecimal between -1.0 and 1.0. The BigDecimal's precision is the digit count starts from the leftmost nonzero digit based on the [JAVA's BigDecimal definition](https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html). However, our Decimal decision follows the database decimal standard, which is the total number of digits, including both to the left and the right of the decimal point. Thus, this PR is to fix the issue by doing the conversion. Before this PR, the following queries failed: ```SQL select 1 > 0.0001 select floor(0.0001) select ceil(0.0001) ``` Added test cases. Author: gatorsmile <gatorsmile@gmail.com> Closes apache#18297 from gatorsmile/backport18244. (cherry picked from commit 6265119) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
The precision and scale of decimal values are wrong when the input is BigDecimal between -1.0 and 1.0.
The BigDecimal's precision is the digit count starts from the leftmost nonzero digit based on the JAVA's BigDecimal definition. However, our Decimal decision follows the database decimal standard, which is the total number of digits, including both to the left and the right of the decimal point. Thus, this PR is to fix the issue by doing the conversion.
Before this PR, the following queries failed:
How was this patch tested?
Added test cases.