Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Add Span-based APIs to BigInteger #22689

Merged
merged 4 commits into from
Aug 2, 2017

Conversation

stephentoub
Copy link
Member

Adds:

public BigInteger(ReadOnlySpan<byte> value);
public int GetByteCount();
public static BigInteger Parse(ReadOnlySpan<char> value, NumberStyles style, IFormatProvider provider);
public static bool TryParse(ReadOnlySpan<char> value, out BigInteger result, NumberStyles style, IFormatProvider provider);
public bool TryWriteBytes(Span<byte> destination, out int bytesWritten);

Rather than duplicate a lot of logic, existing code paths were changed to use {ReadOnly}Span, with string and array-based overloads targeting them as well.

cc: @axelheer, @mellinoe, @bartonjs
Fixes https://github.com/dotnet/corefx/issues/22401

Copy link
Member

@bartonjs bartonjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable, though I read it between talks, so I was maybe not as thorough as I could have been

return TryParseBigInteger(AsReadOnlySpan(value), style, info, out result);
}

[SecuritySafeCritical]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why CAS?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply because it was on the other overload. I figure either a) we've not yet cleaned these out and so it can be removed when the rest are, or b) someone left it on purpose and so I should have it here as well.

@stephentoub stephentoub added the * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) label Jul 27, 2017
@stephentoub
Copy link
Member Author

I took a look at the perf impact of the changes; the ideal would be that they don't have a negative impact (I don't think we could hope for a positive one).

Parse/TryParse both appear to be largely unaffected, even though they represent the bulk of the consolidated code paths... I expect that's because a) the implementations are fairly meaty and b) the bulk of those implementations is using pointers, so it's just a matter of whether we're using fixed with a string/byte array or with a span.

ToByteArray is unfortunately impacted. Worst case is a single digit BigInteger, where the overhead (for the admittedly very fast operation) was huge, almost 50%, whereas at 150 digits, it was around ~5%. I'd really like to avoid duplicating this fairly chunky method essentially three times (ToByteArray, TryWriteBytes, and GetByteCount), but the extra overhead associated with handling the three different modes appears to be prohibitive. I'll revisit it.

The BigInteger(byte[]) ctor is the more interesting case:
https://github.com/dotnet/corefx/pull/22689/files#diff-a57a8c64886231e5bde022a3eec12e6fR252
It also regresses, but it's interesting because the body of the ctor didn't have to change syntactically at all, so the only difference here is the byte[] getting wrapped in a ReadOnlySpan<byte> and then the indexing/etc. of the span instead of the byte[]. For a 1000-digit BigInteger, the overhead of the new version on my machine is ~5-8%. I tried both profiling and looking at the asm to see if I could spot the difference, but I wasn't able to, and the asm is very similar between the two. @jkotas, @russellhadley, is this to be expected at this point? Do we expect gaps here to shrink? If you want to try it out, here's a simple change that doesn't involve everything else in this PR:
stephentoub@2a1db23
All that does is change the BigInteger(byte[]) implementation to use a span instead of an array. And with this test:

using System;
using System.Diagnostics;
using System.Linq;
using System.Numerics;

class Program
{
    static void Main()
    {
        byte[] data = BigInteger.Parse(string.Join("", Enumerable.Repeat(1, 1000))).ToByteArray();
        var sw = new Stopwatch();
        while (true)
        {
            sw.Restart();
            for (int i = 0; i < 5000000; i++) new BigInteger(data);
            sw.Stop();
            Console.WriteLine(sw.Elapsed.TotalMilliseconds);
        }
    }
}

I similarly see an ~5-10% regression from the changed version.

@stephentoub
Copy link
Member Author

I opened https://github.com/dotnet/coreclr/issues/13097 for at least some of the array vs span difference.

@stephentoub stephentoub removed the * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) label Aug 1, 2017
@stephentoub
Copy link
Member Author

@dotnet-bot test UWP NETNative x86 Release Build please ("Error fetching remote repo 'origin'"... FYI @dotnet/dnceng, been seeing a fair number of these today)

@stephentoub
Copy link
Member Author

@mellinoe, I added a perf test project to System.Runtime.Numerics (contributing to https://github.com/dotnet/corefx/issues/18248) and added tests for the relevant methods here on BigInteger.

Copy link
Contributor

@mellinoe mellinoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing this. I expect that we will see some more Span-related performance tweaks for the 2.1 release, given how far off it is, and how much of the BCL it will affect. Are there still "worst-case" effects around 50% for this, or were you able to optimize those out?

ValidateGetByteCountAndTryWriteBytes(new BigInteger(l), expectedBytes);

[Theory]
public void TryWriteBytes_FromStringTests(string str, byte[] expectedBigEndianBytes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this missing a [MemberData] ? "FromStringTests_MemberData"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, will fix, thanks.

Assert.Equal(expectedBytes.Length, count);

int bytesWritten;
for (int i = 0; i < 2; i++)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop feels a little silly. How about a local function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it silly? Regardless, sure, I can make it a local function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it's a matter of taste, but it feels odd to me to use a loop just to repeat the same logic twice with a different array length. It gives me the impression that the for loop is significant in some way, but it's just a clever trick to re-use logic.

@stephentoub
Copy link
Member Author

Are there still "worst-case" effects around 50% for this, or were you able to optimize those out?

No, at least not with what I tested. In all cases I looked at, BigInteger(byte[]) is now as good or better than it used to be. Parse remains effectively unchanged. ToByteArray is now worst case ~15% slower than it was; this appears to be the case for values <= int.MaxValue, as that's the threshold where the BigInteger switches to using an internal array. For values greater than that, ToByteArray now appears to be faster than it was, anywhere from 5-15% on my machine. That seems like a worthwhile trade-off, given a) the savings of duplicated code, and b) that BigInteger is for big integers :) And hopefully you're not using ToByteArray in a tight loop, anyway.

@mellinoe
Copy link
Contributor

mellinoe commented Aug 2, 2017

That seems like a worthwhile trade-off, given a) the savings of duplicated code, and b) that BigInteger is for big integers :) And hopefully you're not using ToByteArray in a tight loop, anyway.

Yeah, agreed.

Adds:
```C#
public BigInteger(ReadOnlySpan<byte> value);
public int GetByteCount();
public static BigInteger Parse(ReadOnlySpan<char> value, NumberStyles style, IFormatProvider provider);
public static bool TryParse(ReadOnlySpan<char> value, out BigInteger result, NumberStyles style, IFormatProvider provider);
public bool TryWriteBytes(Span<byte> destination, out int bytesWritten);
```

Rather than duplicate a lot of logic, existing code paths were changed to use {ReadOnly}Span, with string and array-based overloads targeting them as well.
We incurred some perf regressions in BigInteger(byte[]) and ToByteArray() due to the span changes.  This mostly fixes them.  In fact, BigInteger(byte[]) is now a bit faster than it was before the span changes.  There is still a small regression in ToByteArray() for values <= int.MaxValue, but for values larger than that, there's a similarly-sized improvement over the original version; since BigInteger is all about, well, big integers, this looks like the right trade-off, and it allows us to continue avoiding to triplicate the same code path for ToByteArray/GetByteCount/TryWriteBytes.
@smile21prc
Copy link
Contributor

smile21prc commented Aug 2, 2017

@stephentoub , thanks for reporting. I've created https://github.com/dotnet/core-eng/issues/1403 on "test UWP NETNative x86 Release Build please“ error and will take a look.

@stephentoub
Copy link
Member Author

Thanks, @smile21prc.

@smile21prc
Copy link
Contributor

@stephentoub this time "UWP NETNative x86 Release Build“ succeeded, do you know any PR has this leg failed for me to take a look?

@stephentoub
Copy link
Member Author

do you know any PR has this leg failed for me to take a look?

Not at the moment. Will let you know the next time I see it.

@smile21prc
Copy link
Contributor

Thanks. @stephentoub . Sometime GitHub has random slowness which might cause random failure like this. I'll resolve https://github.com/dotnet/core-eng/issues/1403 for now, and if you see it again, please feel free to re-open it, or create a new one.

@stephentoub stephentoub merged commit 91f993e into dotnet:master Aug 2, 2017
@stephentoub stephentoub deleted the biginteger_span branch August 2, 2017 18:30
@karelz karelz modified the milestone: 2.1.0 Aug 14, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants