
Grpc time out for 15 minutes #7587

Closed
ptlanu22 opened this issue Nov 3, 2020 · 7 comments
ptlanu22 commented Nov 3, 2020
ptlanu22 commented Nov 3, 2020

Hey, I am using the versions of the gRPC libraries listed below. We have 2 services running on gRPC: service A creates a ManagedChannel during startup and uses it throughout the lifetime of the pod. Service A receives heavy load (100 req/s). Sometimes we keep receiving timeout errors for around 15 minutes and no call reaches service B; after 15 minutes this resolves automatically. After some browsing I found this could be because TCP_USER_TIMEOUT is set to around 15 minutes by default. Can someone confirm whether this is the same issue, or is there something else I am missing?

```xml
<dependency>
    <groupId>io.grpc</groupId>
    <artifactId>grpc-stub</artifactId>
    <version>1.20.0</version>
</dependency>
<dependency>
    <groupId>io.grpc</groupId>
    <artifactId>grpc-core</artifactId>
    <version>1.20.0</version>
</dependency>
<dependency>
    <groupId>io.grpc</groupId>
    <artifactId>grpc-netty-shaded</artifactId>
    <version>1.20.0</version>
</dependency>
```

Below is the code used to create the channel:

```java
ManagedChannel managedChannel = ManagedChannelBuilder.forAddress(host, port)
    .usePlaintext()
    .build();
stub = XServiceGrpc.newBlockingStub(managedChannel).withWaitForReady();
```
ejona86 (Member) commented Nov 3, 2020

This was posted to Gitter as well, and I was answering it there. In short, v1.20.0 is unsupported, as it is a year and a half old. But you can try setting keepAliveTime on the ManagedChannelBuilder to 5 minutes or so and see whether that changes the behavior.
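
A minimal sketch of that suggestion, reusing the host, port, stub, and XServiceGrpc names from the snippet in the issue (the 5-minute value is only the figure mentioned above, not a verified fix):

```java
import java.util.concurrent.TimeUnit;

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// Sketch only: same channel as in the issue, but with client-side keepalive enabled
// so a dead connection is noticed by the client instead of hanging until the
// OS-level TCP timeout (roughly 15 minutes) gives up.
ManagedChannel managedChannel = ManagedChannelBuilder.forAddress(host, port)
    .usePlaintext()
    .keepAliveTime(5, TimeUnit.MINUTES)  // send an HTTP/2 PING after 5 minutes without reads
    .build();
stub = XServiceGrpc.newBlockingStub(managedChannel).withWaitForReady();
```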

MrCookie0313 commented

We call a Go service over gRPC. In production, every time the Go server side updates its pods, requests fail for about 15 minutes and then recover on their own. Is this the same issue?

MrCookie0313 commented

(quotes the original issue description and code)

Different cluster

grpc.clients.my-grpc-server-app.address=dns:///my-grpc-server-app.example.svc.cluster.local:1234 (if the target is a domain name, you can follow this example and add the dns:/// prefix)
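
A minimal sketch of that suggestion in the Java client used in this thread; the service name and port are only the illustrative values from the line above, not real endpoints:

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// Sketch only: point the channel at a dns:/// target instead of forAddress(host, port),
// so gRPC's DNS name resolver is used for the in-cluster service name.
ManagedChannel channel = ManagedChannelBuilder
    .forTarget("dns:///my-grpc-server-app.example.svc.cluster.local:1234")
    .usePlaintext()
    .build();
```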

ptlanu22 (Author) commented Nov 6, 2020

@ejona86 I was going through this blog: https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
I suspected this behaviour because the timeout window was similar to the ~15 minutes described there. TCP_USER_TIMEOUT support was also added to grpc-java: https://github.com/grpc/proposal/blob/master/A18-tcp-user-timeout.md

If I set keepAliveTime to 5 minutes, how will that solve the problem if I have continuous traffic every second?
Also, I am using Java at both ends (client and server), inside a Kubernetes cluster, and connecting to the server through an ingress URL.

ejona86 (Member) commented Nov 9, 2020

Enabling keepAliveTime enables an HTTP/2-level PING, which would detect the broken connection, and it also enables TCP_USER_TIMEOUT.
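
As a rough sketch of how those two settings pair up on the builder (the address and the 20-second timeout are illustrative assumptions, not values from this thread):

```java
import java.util.concurrent.TimeUnit;

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// Sketch only: with keepalive enabled, the client sends an HTTP/2 PING after
// keepAliveTime without reads and drops the connection if no ack arrives within
// keepAliveTimeout; per gRFC A18, grpc-netty(-shaded) also sets TCP_USER_TIMEOUT
// on Linux so unacknowledged writes fail long before the ~15-minute OS default.
ManagedChannel channel = ManagedChannelBuilder.forAddress("service-b-host", 1234)
    .usePlaintext()
    .keepAliveTime(5, TimeUnit.MINUTES)      // idle time before sending a PING
    .keepAliveTimeout(20, TimeUnit.SECONDS)  // how long to wait for the PING ack
    .build();
```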

ejona86 (Member) commented Nov 17, 2020

Seems like this is resolved. If not, comment, and it can be reopened.

ejona86 closed this as completed Nov 17, 2020
github-actions bot locked as resolved and limited conversation to collaborators Jun 3, 2021