Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ReadOnlySequence<T>.GetPosition(SequencePosition) #771

Merged
merged 12 commits into from
Aug 17, 2020
Merged

Implement ReadOnlySequence<T>.GetPosition(SequencePosition) #771

merged 12 commits into from
Aug 17, 2020

Conversation

MarcoRossignoli
Copy link
Member

@MarcoRossignoli MarcoRossignoli commented Dec 11, 2019

object? positionSequenceObject = position.GetObject();

// if sequence object is null we suppose start segment
if (!positionIsNotNull)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used same semantic of Slice I don't know if it's ok, let me know

Copy link
Contributor

@ahsonkhan ahsonkhan Jan 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the implementation too closely yet, but is there a benefit of the way GetOffset is implemented now instead of just doing the following:

GetOffset(SequencePosition position) => Slice(0, position).Length;

Are we doing different checks? Is it faster than doing all the checks that Slice/GetLength do that would be redundant here?

Probably creating an unnecessary/intermediary ReadOnlySequence would be too slow, but it might be worth measuring.

Copy link
Member Author

@MarcoRossignoli MarcoRossignoli Feb 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahsonkhan did some perf test and are similar(I need to understand from where that micro allocation comes)
No Slice

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
GetOffsetSingleSegment 73.78 ms 0.259 ms 0.230 ms - - - 138 B
GetOffsetMultiSegment 208.02 ms 0.669 ms 0.626 ms - - - 413 B

Slice

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
GetOffsetSingleSegment 87.31 ms 0.525 ms 0.491 ms - - - -
GetOffsetMultiSegment 211.03 ms 1.191 ms 1.056 ms - - - -
[MemoryDiagnoser]
    public class ReadOnlySequence
    {
        ReadOnlySequence<int> bufferSingleSegment;
        SequencePosition[] positionsSingleSegment;

        ReadOnlySequence<int> bufferMultiSegment;
        SequencePosition[] positionsMultiSegment;



        [GlobalSetup]
        public void Setup()
        {
            int n = 10_000_000;
            bufferSingleSegment = new ReadOnlySequence<int>(new int[n]);
            List<SequencePosition> positionsList = new List<SequencePosition>();
            for (int i = 0; i <= bufferSingleSegment.Length; i++)
            {
                positionsList.Add(bufferSingleSegment.GetPosition(i));
            }
            positionsSingleSegment = positionsList.ToArray();

            positionsList.Clear();

            var bufferSegment1 = new BufferSegment<int>(new int[n / 4]);
            BufferSegment<int> bufferSegment2 = bufferSegment1.Append(new int[n / 4]);
            BufferSegment<int> bufferSegment3 = bufferSegment2.Append(new int[n / 4]);
            BufferSegment<int> bufferSegment4 = bufferSegment3.Append(new int[n / 4]);
            bufferMultiSegment = new ReadOnlySequence<int>(bufferSegment1, 0, bufferSegment4, n / 4);
            for (int i = 0; i <= bufferMultiSegment.Length; i++)
            {
                positionsList.Add(bufferMultiSegment.GetPosition(i));
            }
            positionsMultiSegment = positionsList.ToArray();
            positionsList.Clear();
        }

        [Benchmark]
        public void GetOffsetSingleSegment()
        {
            for (int i = 0; i <= bufferSingleSegment.Length; i++)
            {
                bufferSingleSegment.GetOffset(positionsSingleSegment[i]);
            }
        }

        [Benchmark]
        public void GetOffsetMultiSegment()
        {
            for (int i = 0; i <= bufferMultiSegment.Length; i++)
            {
                bufferMultiSegment.GetOffset(positionsMultiSegment[i]);
            }
        }
    }

BTW the most important thing is that slice doesn't work as expected on boundary and test doesn't pass for
GetOffset_MultiSegment_PositionOutOfRange_SegmentNotFound
GetOffset_MultiSegment_BoundaryConditions
GetOffset_MultiSegment_SequencePositionSegment
GetOffset_MultiSegment_PositionOutOfRange_SegmentNotFound
GetOffset_MultiSegment_BoundaryConditions
GetOffset_MultiSegment_SequencePositionSegment

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I am surprised by the "no slice" method showing allocations as well. Could you measure just a single call to GetOffset rather than doing it in the loop (and let BDN deal with running it many times)?

I will take a closer look tomorrow at the tests failing if we chose to go with the simpler Slice approach. Maybe we make Slice more resilient to those edge cases.

Copy link
Member Author

@MarcoRossignoli MarcoRossignoli Feb 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Letting BDN to profile e chose right iteration count

        [Benchmark]
        public void GetOffsetSingleSegment()
        {
            bufferSingleSegment.GetOffset(positionsSingleSegment[9_000_000]);
        }

        [Benchmark]
        public void GetOffsetMultiSegment()
        {
            bufferMultiSegment.GetOffset(positionsMultiSegment[9_000_000]);
        }

No slice

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
GetOffsetSingleSegment 6.311 ns 0.0351 ns 0.0329 ns - - - -
GetOffsetMultiSegment 19.354 ns 0.0394 ns 0.0329 ns - - - -

Slice

Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
GetOffsetSingleSegment 15.61 ns 0.057 ns 0.053 ns - - - -
GetOffsetMultiSegment 25.62 ns 0.074 ns 0.065 ns - - - -

public long GetOffset(SequencePosition position)
{
bool positionIsNotNull = position.GetObject() != null;
BoundsCheck(position, positionIsNotNull);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did also a version with check inside method, but I've opted for clear code/code reuse, I don't think this is a "hot path" api.

@MarcoRossignoli
Copy link
Member Author

/azp

@MarcoRossignoli
Copy link
Member Author

/azp runtime-libraries

@azure-pipelines
Copy link

Command 'runtime-libraries' is not supported by Azure Pipelines.

Supported commands
  • help:
    • Get descriptions, examples and documentation about supported commands
    • Example: help "command_name"
  • list:
    • List all pipelines for this repository using a comment.
    • Example: "list"
  • run:
    • Run all pipelines or a specific pipeline for this repository using a comment. Use this command by itself to trigger all related pipelines, or specify a pipeline to run.
    • Example: "run" or "run pipeline_name"
  • where:
    • Report back the Azure DevOps orgs that are related to this repository and org
    • Example: "where"

See additional documentation.

@MarcoRossignoli
Copy link
Member Author

/azp list

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 771 in repo dotnet/runtime

@MarcoRossignoli
Copy link
Member Author

/azp run runtime-libraries

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 771 in repo dotnet/runtime

@MarcoRossignoli
Copy link
Member Author

MarcoRossignoli commented Dec 12, 2019

@terrajobst I've accepted the invite to external-ci-access but I get error.

cc: @safern @ViktorHofer

@ViktorHofer
Copy link
Member

/azp run runtime-libraries

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@terrajobst
Copy link
Contributor

@terrajobst I've accepted the invite to external-ci-access but I get error.

Sorry about that. I've asked @mmitche to take a look.

@MarcoRossignoli
Copy link
Member Author

No worries thanks a lot for quick response!

@MarcoRossignoli
Copy link
Member Author

ping @ahsonkhan @JeremyKuhne

@stephentoub
Copy link
Member

cc: @pakrym, @ahsonkhan

src/libraries/System.Memory/src/System/ThrowHelper.cs Outdated Show resolved Hide resolved
@@ -528,6 +528,56 @@ public SequencePosition GetPosition(long offset)
return Seek(offset);
}

/// <summary>
/// Translate opaque <see cref="System.SequencePosition"/> value to numerical position offset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is written, it gives the impression that any arbitrary SequencePosition can be passed in and is valid. Also, using translate seems strange to me.

We should try to be consistent with existing docs:
https://docs.microsoft.com/en-us/dotnet/api/system.buffers.readonlysequence-1.getposition?view=netcore-3.1

How about:

Suggested change
/// Translate opaque <see cref="System.SequencePosition"/> value to numerical position offset
/// Returns the offset of a <paramref name="position" /> within this sequence from the start.

We should also add a remarks section stating that callers should only pass in a SequencePosition that belong to this ReadOnlySequence. Otherwise, the behavior is undefined.

Copy link
Contributor

@ahsonkhan ahsonkhan Jan 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other thing that might be worth noting in the remarks (and testing for) is that the behavior is "exclusive", not "inclusive".

Let's say we have one of these sequences:
[50] -> [0]
[50] -> [50]

What does calling GetOffset(new SequencePosition(second segment, 0)) return? 50, correct? And then if I slice the ReadOnlySequence from 0 to that offset (i.e. 50), is the returned ReadOnlySequence, a single segment?

Something like:

ReadOnlySequence<byte> slice = ros.GetOffset(new SequencePosition(segment2, 0));
Assert.True(slice.IsSingleSegment);

slice = ros.GetOffset(new SequencePosition(segment1, 50)); // This would be the exclusive case.
Assert.True(slice.IsSingleSegment);

slice = ros.GetOffset(new SequencePosition(segment1, 51)); // This will throw, correct?

Copy link
Member Author

@MarcoRossignoli MarcoRossignoli Feb 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahsonkhan

slice = ros.Slice(new SequencePosition(segment1, 50)); // This would be the exclusive case.
Assert.True(slice.IsSingleSegment);

this is not a single segment...the returned sequence is a multi segment starting from 50th item.

ReadOnlySequence<byte> slice = ros.Slice(new SequencePosition(segment2, 0));  
Assert.True(slice.IsSingleSegment);

[50] -> [0]  returned new ROS startObj segment2 endObj segment2 so IsSingleSegment = true

slice = ros.Slice(new SequencePosition(segment1, 50)); // This would be the exclusive case.
Assert.True(slice.IsSingleSegment);

[50] -> [0]  returned new ROS startObj segment1 endObj segment2 so IsSingleSegment = false

slice = ros.Slice(new SequencePosition(segment1, 51)); // This will throw, correct? <-- Yes BoundsCheck()

// For single segment position bound check is implicit.
Debug.Assert(positionSequenceObject != null);

if (((ReadOnlySequenceSegment<T>)positionSequenceObject).Memory.Length - positionIndex < 0)
Copy link
Member Author

@MarcoRossignoli MarcoRossignoli Feb 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahsonkhan I don't know if this is ok, default BoundCheck method doesn't check for SequencePosition validity, but only if offset specified is valid against current ROSequence.
It's implicit for single segment because the object is the same(if sequence position is pertinent) and only indexes are checked.

Copy link
Contributor

@ahsonkhan ahsonkhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits.

src/libraries/System.Memory/ref/System.Memory.cs Outdated Show resolved Hide resolved
@adamsitnik
Copy link
Member

/azp run

@azure-pipelines
Copy link

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

@adamsitnik adamsitnik added this to the 5.0.0 milestone Aug 14, 2020
@adamsitnik
Copy link
Member

/azp run runtime-libraries

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

@adamsitnik adamsitnik closed this Aug 17, 2020
@adamsitnik adamsitnik reopened this Aug 17, 2020
Co-authored-by: Carlos Sanchez <1175054+carlossanlop@users.noreply.github.com>
Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcoRossignoli thank you for adding good tests and comments which made the review easy for me.

If CI passes today, it's going to be a part of 5.0. :shipit:

@adamsitnik adamsitnik merged commit 2d93df6 into dotnet:master Aug 17, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
MichalStrehovsky pushed a commit to MichalStrehovsky/runtime that referenced this pull request Mar 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: long ReadOnlySequence<T>.GetPosition(SequencePosition)
9 participants