Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[QUERY] AmqpReceiver DisposeMessagesAsync unnecessarily allocates #20166

Closed
danielmarbach opened this issue Apr 6, 2021 · 3 comments · Fixed by #20427
Closed

[QUERY] AmqpReceiver DisposeMessagesAsync unnecessarily allocates #20166

danielmarbach opened this issue Apr 6, 2021 · 3 comments · Fixed by #20427
Assignees
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Bus

Comments

@danielmarbach
Copy link
Contributor

danielmarbach commented Apr 6, 2021

Before

image

Potential after

image

Shows a lot of Guid.ToByteArray allocations. They are coming from

List<ArraySegment<byte>> deliveryTags = ConvertLockTokensToDeliveryTags(lockTokens);

I see two ways to optimize this away:

  1. The code only ever passes one lock token in so by getting rid of all the enumeration we could stackalloc the required memory for the guid and then marshal it into. This would also get rid of all the task lists etc and make the code in general more efficient or if the multiple lock token path needs to be preserved
  2. Either Array Rent and marshal or even better if we know somehow the number of elements
  3. Rent a whole buffer of number of GUIDs x 16, marshal into slices and then return the whole buffer at the end

Some samples

Here a benchmark of the effect

 [Config(typeof(Config))]
    public class GuidByteArrayVsRent
    {
        private Consumer consumer = new Consumer();

        private class Config : ManualConfig
        {
            public Config()
            {
                AddExporter(MarkdownExporter.GitHub);
                AddDiagnoser(MemoryDiagnoser.Default);
                AddJob(Job.ShortRun);
            }
        }

        [Params(1, 2, 4, 8, 16, 32, 64)]
        public int Elements { get; set; }

        [Benchmark(Baseline = true)]
        public void ToByteArray()
        {
            for (var i = 0; i < Elements; i++)
            {
                consumer.Consume(new ArraySegment<byte>(Guid.NewGuid().ToByteArray()));
            }
        }

        [Benchmark]
        public void Marshal()
        {
            for (var i = 0; i < Elements; i++)
            {
                var bufferForGuid = ArrayPool<byte>.Shared.Rent(16);
                try
                {
                    var guid = Guid.NewGuid();
                    if (!MemoryMarshal.TryWrite(bufferForGuid, ref guid))
                    {
                        guid.ToByteArray().AsSpan().CopyTo(bufferForGuid);
                    }
                    consumer.Consume( new ArraySegment<byte>(bufferForGuid, 0, 16));
                }
                finally
                {
                    ArrayPool<byte>.Shared.Return(bufferForGuid);
                }
            }
        }

        [Benchmark]
        public void RentWholeArray()
        {
            var bufferForGuid = ArrayPool<byte>.Shared.Rent(16*Elements);
            try
            {
                for (var i = 0; i < Elements; i++)
                {
                    var guid = Guid.NewGuid();
                    if (!MemoryMarshal.TryWrite(bufferForGuid.AsSpan(i * 16, 16), ref guid))
                    {
                        guid.ToByteArray().AsSpan().CopyTo(bufferForGuid);
                    }
                    consumer.Consume( new ArraySegment<byte>(bufferForGuid, i * 16, 16));
                }
            }
            finally
            {
                ArrayPool<byte>.Shared.Return(bufferForGuid);
            }
        }

        [Benchmark]
        public void TryWriteBytes()
        {
            for (var i = 0; i < Elements; i++)
            {
                var bufferForGuid = ArrayPool<byte>.Shared.Rent(16);
                try
                {
                    var guid = Guid.NewGuid();
                    if (!guid.TryWriteBytes(bufferForGuid))
                    {
                        guid.ToByteArray().AsSpan().CopyTo(bufferForGuid);
                    }
                    consumer.Consume( new ArraySegment<byte>(bufferForGuid, 0, 16));
                }
                finally
                {
                    ArrayPool<byte>.Shared.Return(bufferForGuid);
                }
            }
        }
    }

results

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
AMD Ryzen Threadripper 1920X, 1 CPU, 24 logical and 12 physical cores
.NET Core SDK=5.0.200
  [Host]   : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  ShortRun : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT

Job=ShortRun  IterationCount=3  LaunchCount=1  
WarmupCount=3  
Method Elements Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
ToByteArray 1 90.77 ns 101.354 ns 5.556 ns 1.00 0.00 0.0095 - - 40 B
Marshal 1 104.82 ns 17.793 ns 0.975 ns 1.16 0.07 - - - -
RentWholeArray 1 106.51 ns 59.081 ns 3.238 ns 1.18 0.08 - - - -
TryWriteBytes 1 112.65 ns 10.476 ns 0.574 ns 1.24 0.07 - - - -
ToByteArray 2 177.96 ns 5.671 ns 0.311 ns 1.00 0.00 0.0191 - - 80 B
Marshal 2 224.87 ns 14.184 ns 0.777 ns 1.26 0.01 - - - -
RentWholeArray 2 181.36 ns 43.272 ns 2.372 ns 1.02 0.01 - - - -
TryWriteBytes 2 253.45 ns 375.564 ns 20.586 ns 1.42 0.12 - - - -
ToByteArray 4 346.50 ns 102.204 ns 5.602 ns 1.00 0.00 0.0381 - - 160 B
Marshal 4 437.53 ns 69.684 ns 3.820 ns 1.26 0.02 - - - -
RentWholeArray 4 336.96 ns 71.407 ns 3.914 ns 0.97 0.03 - - - -
TryWriteBytes 4 437.43 ns 51.660 ns 2.832 ns 1.26 0.02 - - - -
ToByteArray 8 665.03 ns 144.989 ns 7.947 ns 1.00 0.00 0.0763 - - 320 B
Marshal 8 808.72 ns 217.813 ns 11.939 ns 1.22 0.02 - - - -
RentWholeArray 8 651.09 ns 292.807 ns 16.050 ns 0.98 0.03 - - - -
TryWriteBytes 8 840.26 ns 104.325 ns 5.718 ns 1.26 0.02 - - - -
ToByteArray 16 1,382.69 ns 187.206 ns 10.261 ns 1.00 0.00 0.1526 - - 640 B
Marshal 16 1,650.45 ns 255.747 ns 14.018 ns 1.19 0.02 - - - -
RentWholeArray 16 1,299.58 ns 58.239 ns 3.192 ns 0.94 0.01 - - - -
TryWriteBytes 16 1,798.06 ns 417.208 ns 22.869 ns 1.30 0.02 - - - -
ToByteArray 32 2,697.83 ns 912.335 ns 50.008 ns 1.00 0.00 0.3052 - - 1280 B
Marshal 32 3,296.75 ns 1,273.779 ns 69.820 ns 1.22 0.04 - - - -
RentWholeArray 32 2,593.21 ns 432.644 ns 23.715 ns 0.96 0.02 - - - -
TryWriteBytes 32 3,333.38 ns 558.803 ns 30.630 ns 1.24 0.03 - - - -
ToByteArray 64 5,372.84 ns 1,164.633 ns 63.837 ns 1.00 0.00 0.6104 - - 2560 B
Marshal 64 6,830.73 ns 543.035 ns 29.766 ns 1.27 0.02 - - - -
RentWholeArray 64 4,903.72 ns 694.072 ns 38.044 ns 0.91 0.01 - - - -
TryWriteBytes 64 6,913.90 ns 1,446.581 ns 79.292 ns 1.29 0.03 - - - -
@ghost ghost added needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Apr 6, 2021
@danielmarbach
Copy link
Contributor Author

// cc @jsquire @JoshLove-msft

@danielmarbach
Copy link
Contributor Author

TryWriteBytes is the native way with Guid when you have the proper API support (unfortunately not available with .NET Standard)

https://docs.microsoft.com/en-us/dotnet/api/system.guid.trywritebytes?view=net-5.0

@jsquire jsquire added Client This issue points to a problem in the data-plane of the library. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team Service Bus labels Apr 6, 2021
@ghost ghost removed the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Apr 6, 2021
@danielmarbach
Copy link
Contributor Author

I've sent in #20425 assuming you want to preserve the capability of disposing multiple lock tokens at the same time. Because the enumerables are always converted into an array underneath I know the length upfront and can use the rent a whole array approach

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Bus
Projects
None yet
3 participants