Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/decimal support #982

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open

Conversation

GoEddie
Copy link
Contributor

@GoEddie GoEddie commented Oct 14, 2021

We are excited to review your PR.

So we can do the best job, please check:

  • There's a descriptive title that will make sense to other developers some time from now.
  • There's associated issues. All PR's should have issue(s) associated - unless a trivial self-evident change such as fixing a typo. You can use the format Fixes #nnnn in your description to cause GitHub to automatically close the issue(s) when your PR is merged.
  • Your change description explains what the change does, why you chose your approach, and anything else that reviewers should know.
  • You have included any necessary tests in the same PR.

This implements #818

On the Apache Spark side decimal is implemented using a java.math.BigDecimal and the only way to construct that is using a int/long - there is no way to construct it with any value larger than max long unless you use the string constructor so I pass a string back and forth between .NET and the JVM, hope that is ok.

@GoEddie GoEddie changed the title [WIP] Feature/decimal support Feature/decimal support Oct 14, 2021
@GoEddie GoEddie marked this pull request as ready for review October 14, 2021 22:03
@Macromullet
Copy link
Contributor

Can we get some traction on this one? It looks like this PR is ready and we have a lot of code that uses decimals that will be difficult to write tests around without this.

/// </summary>
/// <param name="s">The stream to write</param>
/// <param name="value">The decimal to write</param>
public static void Write(Stream s, decimal value) => Write(s, value.ToString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use ToString(CultureInfo.InvariantCulture) if we are using a string on the wire.

@@ -267,6 +267,9 @@ private ISocketWrapper GetConnection()
case 'd':
returnValue = SerDe.ReadDouble(inputStream);
break;
case 'm':
returnValue = decimal.Parse(SerDe.ReadString(inputStream));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use decimal.Parse(SerDe.ReadString(inputStream), CultureInfo.InvariantCulture) to ensure we are using invariant culture on the wire.

Row row = df.Collect().First();
Assert.Equal(decimal.MinValue, row[0]);
Assert.Equal(decimal.MaxValue, row[1]);
Assert.Equal(decimal.Zero, row[2]);
Copy link
Contributor

@cutecycle cutecycle May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't gotten to dive deep into whether this is an issue yet, but want to bring it to attention just in case:

There was a time when we were comparing SQL Server output to Spark SQL output trying to migrate a pipeline to Synapse, and when attempting to diff two tables, found an issue with a double.

SQL Server uses, presumably, C#'s (and JavaScript, which the Python Notebook table preview in Synapse uses)'s conception of floats: -0.0 == 0.0, but the JVM/Spark in some cases compares by bit and differentiates because of the signed bit: -0.0 != 0.0.
image
image
image

It's resolved in later versions of Spark's DataFrames, and may not apply in the case of [decimal]String, so it may not be problematic.

Copy link
Contributor

@cutecycle cutecycle Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because internally, BigDecimal uses BigInteger, and BigInteger also only has a single concept of zero. A BigInteger behaves as a two's-complement integer, and two's-complement only has a single zero.

@Macromullet
Copy link
Contributor

Any updates on this PR?

@AFFogarty
Copy link
Contributor

Hey @GoEddie , are you still working on this?

@GoEddie
Copy link
Contributor Author

GoEddie commented Feb 1, 2023

Hi @AFFogarty,

I had given up really as no one seemed to be reviewing pr’s but am happy to get it up to date again.

ed

@wudanzy
Copy link
Collaborator

wudanzy commented Jan 3, 2025

Hi @GoEddie , thanks for the contribution! Are you still working on this? We recently get write permission of this repo and happy to move this forward. We just added support for 3.3 and 3.5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants