Hello,

To run some load tests on TorchServe, I sent 100 concurrent requests for one minute using the command below.
$ ./ghz --proto=./proto/inference.proto -i ./proto --insecure -c 100 -z1m -d '{"model_name": "distilroberta", "model_version": "1.0", "input": {"data": "c29tZURhdGFJbkJhc2U2NA==" }}' --call=org.pytorch.serve.grpc.inference.InferenceAPIsService.Predictions localhost:4089
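For reference, the input.data field in the -d payload is just the raw input bytes encoded in base64 (the sample value decodes to the placeholder string someDataInBase64). A minimal sketch of producing such a payload with plain java.util.Base64 (the input string here is only the placeholder from the command above):

```java
import java.util.Base64;

public class PayloadDemo {
    public static void main(String[] args) {
        // The "data" field sent to Predictions is base64-encoded input bytes.
        String encoded = Base64.getEncoder()
                .encodeToString("someDataInBase64".getBytes());
        System.out.println(encoded); // the value used in the ghz payload
    }
}
```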
Summary:
Count: 5509
Total: 60.04 s
Slowest: 1.46 s
Fastest: 123.48 ms
Average: 1.09 s
Requests/sec: 91.76
Response time histogram:
123.478 [1] |
257.027 [18] |
390.575 [14] |
524.123 [13] |
657.672 [12] |
791.220 [14] |
924.769 [33] |
1058.317 [127] |∎
1191.865 [5116] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1325.414 [59] |
1458.962 [5] |
Latency distribution:
10 % in 1.07 s
25 % in 1.09 s
50 % in 1.10 s
75 % in 1.12 s
90 % in 1.14 s
95 % in 1.15 s
99 % in 1.19 s
Status code distribution:
[OK] 5412 responses
[Unavailable] 97 responses
Error distribution:
[97] rpc error: code = Unavailable desc = error reading from server: read tcp 127.0.0.1:6110->127.0.0.1:4089: use of closed network connection
The server then crashes with the error below:
2024-04-17T08:29:23,524 [ERROR] W-9005-distilroberta_1.0 org.pytorch.serve.wlm.WorkerThread - IllegalStateException error
java.lang.IllegalStateException: Stream was terminated by error, no further calls are allowed
at com.google.common.base.Preconditions.checkState(Preconditions.java:502) ~[model-server.jar:?]
at io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:374) ~[model-server.jar:?]
at org.pytorch.serve.job.GRPCJob.response(GRPCJob.java:130) ~[model-server.jar:?]
at org.pytorch.serve.wlm.BatchAggregator.sendResponse(BatchAggregator.java:103) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:238) [model-server.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:840) [?:?]
2024-04-17T08:29:23,526 [INFO ] grpc-default-executor-52 ACCESS_LOG - /172.23.0.1:24928 "gRPC org.pytorch.serve.grpc.inference.InferenceAPIsService/Predictions HTTP/2.0" 1 930
2024-04-17T08:29:23,524 [WARN ] grpc-default-executor-44 org.pytorch.serve.grpcimpl.InferenceImpl - grpc client call already cancelled
2024-04-17T08:29:23,524 [ERROR] W-9005-distilroberta_1.0 org.pytorch.serve.wlm.WorkerThread - IllegalStateException error
java.lang.IllegalStateException: Stream was terminated by error, no further calls are allowed
at com.google.common.base.Preconditions.checkState(Preconditions.java:502) ~[model-server.jar:?]
at io.grpc.stub.ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:374) ~[model-server.jar:?]
at org.pytorch.serve.job.GRPCJob.response(GRPCJob.java:130) ~[model-server.jar:?]
at org.pytorch.serve.wlm.BatchAggregator.sendResponse(BatchAggregator.java:103) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:238) [model-server.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:840) [?:?]
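For what it's worth, in gRPC-Java this IllegalStateException is what onNext throws once the call has already been terminated on the client side (which lines up with the 97 Unavailable responses above): the client cancels or drops the connection, but the worker thread still tries to write its response. A self-contained toy model of that race (this is NOT TorchServe's actual code; Sink is a stand-in for gRPC's ServerCallStreamObserverImpl):

```java
public class StreamStateDemo {
    // Stand-in for gRPC's server-side stream observer: once the call is
    // terminated (client cancelled / connection closed), further writes
    // are rejected with the same IllegalStateException seen in the logs.
    static class Sink {
        boolean terminated;

        void onNext(String msg) {
            if (terminated) {
                throw new IllegalStateException(
                        "Stream was terminated by error, no further calls are allowed");
            }
            System.out.println("sent: " + msg);
        }
    }

    public static void main(String[] args) {
        Sink sink = new Sink();
        sink.terminated = true; // client gave up mid-benchmark
        try {
            sink.onNext("prediction"); // worker thread still tries to respond
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

In a real server the equivalent defensive check would be something like ServerCallStreamObserver.isCancelled() before writing; here the boolean flag only models that state.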
Configuration:
In case it helps, here are the hardware specs:
Does anyone have any idea why the server crashes?
Thank you for your help ❤️
By the way, I can provide a thread dump from when the crash occurs if needed.