Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large Payload Unary Calls: Unable to clean up after use #1267

Closed
efrainsteinbach opened this issue Apr 13, 2021 · 9 comments
Closed

Large Payload Unary Calls: Unable to clean up after use #1267

efrainsteinbach opened this issue Apr 13, 2021 · 9 comments
Labels
question Further information is requested

Comments

@efrainsteinbach
Copy link

efrainsteinbach commented Apr 13, 2021

We're using gRPC to transfer large messages over the network, around 150MB each. I've noticed that in the client processes this tends to generate quite large, lasting memory footprints. This matters to us, because we run this on a k8s cluster: So I want to make sure the memory usage after receiving and digesting such a message is minimal. These messages are only exchanged once a day (at night), so we prefer lower memory consumption over speed.

I wrote a small test solution based on the Greeter app that illustrates this: https://github.com/efrainsteinbach/grpc-client-memory-test
It's simply requesting five messages sequentially which should be around ~200MB in size (uncompressed). After the five requests are through, the Gen2 Heap allocation stays at 400MB, even though I'm doing my best to dispose of any references I may still have to any gRPC components:

image

To me it looks as if some singleton or static components of Grpc.Net.Client or Grpc.Core are somehow holding to the last received message. (?)

Am I doing something wrong? Is there a way to clean this up?

(I tried explicitly shutting down and disposing the channel already)

@efrainsteinbach efrainsteinbach added the question Further information is requested label Apr 13, 2021
@JamesNK
Copy link
Member

JamesNK commented Apr 14, 2021

Is this measuring memory usage on the server? I suspect what you're seeing is the Kestrel memory pool increasing in size and then never decreasing.

@halter73 @davidfowl

@halter73
Copy link

Kestrel's nonshrinking memory pool (dotnet/aspnetcore#27394) is a possible culprit if we're looking at the server. But looking at the repro, it claims that it's intended to "investigate client memory usage", and the client project doesn't use Kestrel.

@efrainsteinbach Did you look types of the objects in gen 2? Or how they're referenced?

@JamesNK
Copy link
Member

JamesNK commented Apr 14, 2021

Yes it's the client.

The held onto memory is from the huge message result:

image

And it's just being held onto as a local variable:

image

I think an async state machine still has a reference to the task at the end of your program.

@JamesNK
Copy link
Member

JamesNK commented Apr 14, 2021

I think I'm right. await RunThingsInScope() hangs onto a huge Task<NumberReply> value in your Main(args) method.

static async Task Main(object[] args)
{
    await RunThingsInScope();
    Console.ReadAnyKey();
}

static async Task RunThingsInScope()
{
    var result = await client.DoGrpcCallThatReturnsHugeResult();
}

Adding await Task.Yield(); to the end of RunThingsInScope() causes memory to drop to zero once it is finished:

static async Task RunThingsInScope()
{
    var result = await client.DoGrpcCallThatReturnsHugeResult();
    await Task.Yield();
}

I'm guessing the task returned by Task.Yield is used in Main method's async state machine, not Task<NumberReply>. That allows Task<NumberReply> to be GCed.

After:

image

@JamesNK
Copy link
Member

JamesNK commented Apr 14, 2021

@stephentoub Hi Stephen, is the behavior described in the comment above expected? I searched for runtime issues and found dotnet/runtime#11189 which could be related, but you added a fix for it.

@efrainsteinbach
Copy link
Author

@JamesNK
Excellent, thanks! - I guess it makes some sense, but I find it really surprising. Does that mean that RunThingsInScope will somehow return the last awaited Task directly if the result is not used? - Anyway, I think I can resolve my memory issues using this approach.

@davidfowl
Copy link
Contributor

Hmm that issue was fixed. I'd be curious to see the repro

@stephentoub
Copy link
Contributor

> gcroot 000002292db26fd8
Thread 71a8:
    0000005959AFEED0 00007FFE4576D22D Client.Program+<RunThingsInScope>d__1.MoveNext() [C:\Users\stoub\Desktop\grpc-client-memory-test\Client\Program.cs @ 69]
        rbp-28: 0000005959afefc8
            ->  000002290C4AD970 System.Threading.Tasks.Task`1[[Greet.NumberReply, Client]]
            ->  000002290C4CAE50 Greet.NumberReply
            ->  000002290C4CAE70 Google.Protobuf.Collections.RepeatedField`1[[Greet.DoubleDataPoint, Client]]
            ->  000002292DB26FD8 Greet.DoubleDataPoint[]

When the RunThingsInScope operation completes, its compiler-generated MoveNext method is calling builder.SetResult(result), which is invoking the continuation synchronously, so the stack frame for MoveNext is still on the stack, which means any locals it has are still on the stack. The compiler generated MoveNext has code something like this after an await:

TaskAwaiter<TResult> awaiter = <>u__1;
<>u__1 = default;
... = awaiter.GetResult();

I believe what we're seeing is the JIT/GC considering that last awaiter local to still be alive, and while the awaiter field in the object was zero'd out, the local wasn't. That local references the Task<TResult> which in turn references the TResult, which here is the giant object.

@davidfowl
Copy link
Contributor

I know the compiler nulls out fields on exit but things may be stuck in registers (I was debugging this recently). At least this confirms my suspicions. Async unwinding is a bit mind bending 🙃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

5 participants